# Dropping missing values with `dropna`

`data.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)`

1. axis (0 or 1)

    use 0 or 'index' to drop rows that contain missing values, or <br>
    use 1 or 'columns' to drop columns that contain missing values <br>

2. Drop rows or columns that have at least one missing value, or only those where all values are missing <br>

    how : 'any' or 'all' <br>
    'any' : drop if any NA values are present <br>
    'all' : drop only if all values are NA <br>

3. Set a threshold: the minimum number of non-NA values a row or column needs in order to be kept <br>

    thresh : int, optional

4. Drop rows based on missing values in selected columns only

    subset : column label or list of labels to check for missing values

5. Drop rows or columns temporarily or permanently

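    inplace : bool (True or False); True modifies the DataFrame itself, False returns a new DataFrame and leaves the original unchanged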





In [1]:
import pandas as pd

In [3]:
df_student = pd.read_csv("./datasets/df_m.csv")

df_student

Unnamed: 0,Name,Score,Grades
0,Paul,98.0,
1,Aaron,,AB
2,Krista,99.0,AA
3,Veronica,87.0,
4,Paxton,90.0,AC
5,Madison,,BA
6,Aurora,82.0,BB
7,,,
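
The CSV file itself is not included here, so the following is a minimal sketch that rebuilds the same table inline. It assumes the first displayed column (`Unnamed: 0`) is the row index rather than data, which is what the later `how='all'` result suggests. The `missing_per_column` name is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the table shown above; None / NaN mark the empty cells.
df_student = pd.DataFrame(
    {
        "Name": ["Paul", "Aaron", "Krista", "Veronica",
                 "Paxton", "Madison", "Aurora", None],
        "Score": [98.0, np.nan, 99.0, 87.0, 90.0, np.nan, 82.0, np.nan],
        "Grades": [None, "AB", "AA", None, "AC", "BA", "BB", None],
    }
)

# Count the missing values in each column before deciding what to drop.
missing_per_column = df_student.isna().sum()
print(missing_per_column)
```

Checking the per-column counts first makes it easier to predict what each `dropna` call will remove.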


## Drop rows where at least one value is missing

In [4]:
df_student.dropna(axis=0, how='any')

Unnamed: 0,Name,Score,Grades
2,Krista,99.0,AA
4,Paxton,90.0,AC
6,Aurora,82.0,BB
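
To see why only three rows survive, it can help to look at the boolean masks behind `how='any'` and `how='all'`. This is a sketch on an inline copy of the table (assuming the first displayed column is the index); `rows_with_any_na` and `rows_with_all_na` are illustrative names.

```python
import numpy as np
import pandas as pd

# Inline copy of the example table (index 0..7).
df_student = pd.DataFrame(
    {
        "Name": ["Paul", "Aaron", "Krista", "Veronica",
                 "Paxton", "Madison", "Aurora", None],
        "Score": [98.0, np.nan, 99.0, 87.0, 90.0, np.nan, 82.0, np.nan],
        "Grades": [None, "AB", "AA", None, "AC", "BA", "BB", None],
    }
)

# True for rows with at least one missing value (what how='any' drops).
rows_with_any_na = df_student.isna().any(axis=1)
# True for rows where every value is missing (what how='all' drops).
rows_with_all_na = df_student.isna().all(axis=1)

print(df_student.index[rows_with_any_na].tolist())  # [0, 1, 3, 5, 7]
print(df_student.index[rows_with_all_na].tolist())  # [7]

# Filtering with the inverted mask gives the same rows as dropna.
same_as_dropna = df_student[~rows_with_any_na].equals(
    df_student.dropna(axis=0, how="any")
)
print(same_as_dropna)  # True
```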


## Drop columns where at least one value is missing

In [6]:
df_student.dropna(axis=1, how='any')

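Every column contains at least one missing value, so no columns remain and only the row index is displayed:

0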
1
2
3
4
5
6
7


## Drop rows where all values are missing

In [7]:
df_student.dropna(axis=0, how='all')

Unnamed: 0,Name,Score,Grades
0,Paul,98.0,
1,Aaron,,AB
2,Krista,99.0,AA
3,Veronica,87.0,
4,Paxton,90.0,AC
5,Madison,,BA
6,Aurora,82.0,BB


## Drop the columns where all values are missing

In [8]:
df_student.dropna(axis='columns', how='all')

Unnamed: 0,Name,Score,Grades
0,Paul,98.0,
1,Aaron,,AB
2,Krista,99.0,AA
3,Veronica,87.0,
4,Paxton,90.0,AC
5,Madison,,BA
6,Aurora,82.0,BB
7,,,
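
The output above is unchanged because no column in the table is entirely missing. As a sketch of the case where `how='all'` does remove something, the example below adds a hypothetical, completely empty `Remarks` column (not part of the original data) to an inline copy of the table.

```python
import numpy as np
import pandas as pd

# Inline copy of the example table (index 0..7).
df_student = pd.DataFrame(
    {
        "Name": ["Paul", "Aaron", "Krista", "Veronica",
                 "Paxton", "Madison", "Aurora", None],
        "Score": [98.0, np.nan, 99.0, 87.0, 90.0, np.nan, 82.0, np.nan],
        "Grades": [None, "AB", "AA", None, "AC", "BA", "BB", None],
    }
)

# Hypothetical column in which every value is missing.
df_demo = df_student.assign(Remarks=np.nan)

# Only the all-NA column is removed; partially filled columns stay.
result = df_demo.dropna(axis="columns", how="all")
print(result.columns.tolist())  # ['Name', 'Score', 'Grades']
```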


## Keep only the rows with at least 2 non-NA values

In [9]:
df_student.dropna(axis=0, thresh=2)

Unnamed: 0,Name,Score,Grades
0,Paul,98.0,
1,Aaron,,AB
2,Krista,99.0,AA
3,Veronica,87.0,
4,Paxton,90.0,AC
5,Madison,,BA
6,Aurora,82.0,BB
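
`thresh` counts non-NA values, not missing ones. A small sketch of that counting on an inline copy of the table (assuming the first displayed column is the index); `non_na_counts` and `kept` are illustrative names.

```python
import numpy as np
import pandas as pd

# Inline copy of the example table (index 0..7).
df_student = pd.DataFrame(
    {
        "Name": ["Paul", "Aaron", "Krista", "Veronica",
                 "Paxton", "Madison", "Aurora", None],
        "Score": [98.0, np.nan, 99.0, 87.0, 90.0, np.nan, 82.0, np.nan],
        "Grades": [None, "AB", "AA", None, "AC", "BA", "BB", None],
    }
)

# Number of non-NA values in each row.
non_na_counts = df_student.notna().sum(axis=1)
print(non_na_counts.tolist())  # [2, 2, 3, 2, 3, 2, 3, 0]

# Keeping rows with at least 2 non-NA values matches dropna(thresh=2).
kept = df_student[non_na_counts >= 2]
print(kept.equals(df_student.dropna(axis=0, thresh=2)))  # True

# A stricter threshold keeps only the fully populated rows.
print(df_student.dropna(thresh=3).index.tolist())  # [2, 4, 6]
```

Note that recent pandas releases (2.0 and later, as far as I know) raise a `TypeError` when `how` and `thresh` are passed in the same call, so it is safest to use one or the other.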


## Drop rows with missing values in selected columns

In [11]:
df_student.dropna(subset=['Score', 'Grades'])

Unnamed: 0,Name,Score,Grades
2,Krista,99.0,AA
4,Paxton,90.0,AC
6,Aurora,82.0,BB
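
With `subset`, only the listed columns are checked; missing values elsewhere are ignored. The sketch below, on an inline copy of the table, compares `subset` with an equivalent boolean mask and shows a single-column subset. Names such as `by_mask` and `score_only` are illustrative.

```python
import numpy as np
import pandas as pd

# Inline copy of the example table (index 0..7).
df_student = pd.DataFrame(
    {
        "Name": ["Paul", "Aaron", "Krista", "Veronica",
                 "Paxton", "Madison", "Aurora", None],
        "Score": [98.0, np.nan, 99.0, 87.0, 90.0, np.nan, 82.0, np.nan],
        "Grades": [None, "AB", "AA", None, "AC", "BA", "BB", None],
    }
)

# Rows where both Score and Grades are present.
mask = df_student[["Score", "Grades"]].notna().all(axis=1)
by_mask = df_student[mask]
by_subset = df_student.dropna(subset=["Score", "Grades"])
print(by_mask.equals(by_subset))  # True

# Checking only Score keeps rows whose Grades value is missing.
score_only = df_student.dropna(subset=["Score"])
print(score_only.index.tolist())  # [0, 2, 3, 4, 6]
```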


## Using inplace

In [12]:
df_student.dropna(inplace=True)

In [13]:
df_student

Unnamed: 0,Name,Score,Grades
2,Krista,99.0,AA
4,Paxton,90.0,AC
6,Aurora,82.0,BB
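
Because `inplace=True` overwrites the original data, a common alternative is to assign the result to a new name. The sketch below contrasts the two on an inline copy of the table; `df_copy` and `cleaned` are illustrative names, and the `reset_index` step is an optional extra not used above.

```python
import numpy as np
import pandas as pd

# Inline copy of the example table (index 0..7).
df_student = pd.DataFrame(
    {
        "Name": ["Paul", "Aaron", "Krista", "Veronica",
                 "Paxton", "Madison", "Aurora", None],
        "Score": [98.0, np.nan, 99.0, 87.0, 90.0, np.nan, 82.0, np.nan],
        "Grades": [None, "AB", "AA", None, "AC", "BA", "BB", None],
    }
)

# In-place: the object itself shrinks; work on a copy to keep the original.
df_copy = df_student.copy()
df_copy.dropna(inplace=True)
print(len(df_copy))  # 3

# Not in-place: the original is untouched and a new DataFrame is returned.
cleaned = df_student.dropna()
print(len(df_student), len(cleaned))  # 8 3

# Optionally renumber the surviving rows 0, 1, 2.
cleaned = cleaned.reset_index(drop=True)
print(cleaned.index.tolist())  # [0, 1, 2]
```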
