# The <font color='red'>isna</font> and <font color='red'>notna</font> functions

**The first step in handling missing values is to find them.**

![image.png](attachment:f5c79da7-31f6-4804-aa6e-539256b40486.png)

We can use either the isna or notna function to detect missing values. 

The <font color='red'>isna</font> function evaluates each cell in a DataFrame and returns True to indicate a missing value. Let’s work through an example to see its output.

In [1]:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, np.nan, 7],
    "B": [2.4, np.nan, 5.1, np.nan, 2.6],
    "C": [np.nan, "foo","zoo","bar", np.nan],
    "D": [11.5, np.nan, 6.2, 21.1, 8.7]
})

print(df.isna())

       A      B      C      D
0  False  False   True  False
1  False   True  False   True
2  False  False  False  False
3   True   True  False  False
4  False  False   True  False


The result is a grid of True and False values, where True marks a missing value. Real-life datasets are often large, so we’re likely to have a DataFrame with thousands or millions of rows. In such cases, printing an entire DataFrame filled with True and False values is not practical. If we want to count the number of missing values in each column, we can use the <u>sum function along with the isna function</u>. Since True counts as 1 and False as 0, the sum of each column is its number of missing values.

In [2]:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, np.nan, 7],
    "B": [2.4, np.nan, 5.1, np.nan, 2.6],
    "C": [np.nan, "foo","zoo","bar", np.nan],
    "D": [11.5, np.nan, 6.2, 21.1, 8.7]
})

print(df.isna().sum())

A    1
B    2
C    2
D    1
dtype: int64


We can now see the number of missing values in each column. Adding another sum function returns the total number of missing values in the DataFrame:

In [3]:
df.isna().sum().sum()

6
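Raw counts are easier to judge relative to the number of rows. As a minimal sketch (the variable name `missing_share` is only illustrative), the mean of the Boolean grid gives the fraction of missing values per column, because the mean of True/False values is the share of True values:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, np.nan, 7],
    "B": [2.4, np.nan, 5.1, np.nan, 2.6],
    "C": [np.nan, "foo", "zoo", "bar", np.nan],
    "D": [11.5, np.nan, 6.2, 21.1, 8.7]
})

# isna() yields True/False per cell; mean() treats True as 1 and False as 0,
# so the result is the fraction of missing values in each column.
missing_share = df.isna().mean()
print(missing_share)
```

With five rows, columns A and D each have one missing value (a share of 0.2), and columns B and C each have two (a share of 0.4).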

If we’re interested in the number of missing values in each row, we can still use the sum function, but the axis parameter needs to be set to 1.

In [4]:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, np.nan, 7],
    "B": [2.4, np.nan, 5.1, np.nan, 2.6],
    "C": [np.nan, "foo","zoo","bar", np.nan],
    "D": [11.5, np.nan, 6.2, 21.1, 8.7]
})

print(df.isna().sum(axis=1))

0    1
1    2
2    0
3    2
4    1
dtype: int64
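The row-wise view can also be used to pull out the affected rows themselves. The following is a sketch rather than a required step; it assumes we want every row with at least one missing value, and the name `rows_with_missing` is only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, np.nan, 7],
    "B": [2.4, np.nan, 5.1, np.nan, 2.6],
    "C": [np.nan, "foo", "zoo", "bar", np.nan],
    "D": [11.5, np.nan, 6.2, 21.1, 8.7]
})

# any(axis=1) is True for a row if at least one of its cells is missing.
has_missing = df.isna().any(axis=1)

# Use the Boolean Series as a mask to keep only those rows.
rows_with_missing = df[has_missing]
print(rows_with_missing)
```

In this DataFrame, only the row with index 2 is complete, so the mask keeps the rows with index 0, 1, 3, and 4.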


In [6]:
df

     A    B    C     D
0  1.0  2.4  NaN  11.5
1  2.0  NaN  foo   NaN
2  3.0  5.1  zoo   6.2
3  NaN  NaN  bar  21.1
4  7.0  2.6  NaN   8.7


The notna function works the same way, but its output is inverted: it returns True for non-missing values and False for missing values. Thus, we can use the notna function to count the number of non-missing values in the columns or rows. Here’s an example using the notna function.

In [7]:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, np.nan, 7],
    "B": [2.4, np.nan, 5.1, np.nan, 2.6],
    "C": [np.nan, "foo","zoo","bar", np.nan],
    "D": [11.5, np.nan, 6.2, 21.1, 8.7]
})

print(df.notna().sum())

A    4
B    3
C    3
D    4
dtype: int64


The DataFrame contains five rows, and there’s only one missing value in column A, so the count of non-missing values for this column is 4. The isna and notna functions are an essential part of exploratory data analysis. The number of missing values has a large impact on how we handle them.
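As a quick sanity check, the relationship between the two functions can be verified directly. This sketch assumes the same example DataFrame; the names `is_inverse` and `counts_add_up` are only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, np.nan, 7],
    "B": [2.4, np.nan, 5.1, np.nan, 2.6],
    "C": [np.nan, "foo", "zoo", "bar", np.nan],
    "D": [11.5, np.nan, 6.2, 21.1, 8.7]
})

# notna() should be the element-wise negation of isna().
is_inverse = df.notna().equals(~df.isna())
print(is_inverse)

# For every column, missing plus non-missing counts should equal the row count.
counts_add_up = bool((df.isna().sum() + df.notna().sum() == len(df)).all())
print(counts_add_up)
```

Both checks print True here, which is why counting with one function is enough to derive the other count from the number of rows.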