
- **Pandas** will **recognise** a value as **null** if it is a **np.nan** object, which will print as **NaN in the DataFrame.**

- Your missing values are probably **empty strings**, which Pandas **doesn't recognise as null**.

- To fix this, you can **convert the empty strings** (or whatever is in your empty cells) to **np.nan** objects using **replace()**, and then call **dropna()** on your DataFrame to delete rows with null tenants.
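
The distinction in the bullets above can be checked directly with `isna()`. The snippet below is a minimal sketch; the sample values and the variable names `s` and `null_mask` are made up for illustration and are not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: an empty string, a real name, NaN and None.
# Object dtype is set explicitly so the values are stored as given.
s = pd.Series(['', 'Babar', np.nan, None], dtype=object)

# isna() flags NaN and None, but not the empty string.
null_mask = s.isna()
print(null_mask.tolist())  # [False, False, True, True]
```

Because the empty string is not flagged, calling `dropna()` before converting it would leave that row in place.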

In [1]:
import pandas as pd
import numpy as np

In [2]:
df = pd.DataFrame(np.random.randn(10, 2), columns=list('AB'))
df['Tenant'] = np.random.choice(['Babar', 'Rataxes', ''], 10)
df

Unnamed: 0,A,B,Tenant
0,-0.404992,-0.409367,
1,0.115631,-0.476095,Rataxes
2,0.984713,1.968267,Babar
3,0.272921,0.350391,Babar
4,-0.913832,-0.051186,Babar
5,0.83024,-0.212489,Rataxes
6,1.761064,0.969756,Babar
7,0.094864,-0.878637,Rataxes
8,0.338316,0.467382,Babar
9,0.5679,0.580283,Babar


In [3]:
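# Replace any empty strings in the Tenant column with np.nan.
# The result is assigned back to the column: calling replace(..., inplace=True)
# on a column selection triggers a chained-assignment warning in newer pandas.

df['Tenant'] = df['Tenant'].replace('', np.nan)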
df

Unnamed: 0,A,B,Tenant
0,-0.404992,-0.409367,NaN
1,0.115631,-0.476095,Rataxes
2,0.984713,1.968267,Babar
3,0.272921,0.350391,Babar
4,-0.913832,-0.051186,Babar
5,0.83024,-0.212489,Rataxes
6,1.761064,0.969756,Babar
7,0.094864,-0.878637,Rataxes
8,0.338316,0.467382,Babar
9,0.5679,0.580283,Babar


In [4]:
# Drop the rows whose Tenant value is null

df.dropna(subset=['Tenant'], inplace=True)
df

Unnamed: 0,A,B,Tenant
1,0.115631,-0.476095,Rataxes
2,0.984713,1.968267,Babar
3,0.272921,0.350391,Babar
4,-0.913832,-0.051186,Babar
5,0.83024,-0.212489,Rataxes
6,1.761064,0.969756,Babar
7,0.094864,-0.878637,Rataxes
8,0.338316,0.467382,Babar
9,0.5679,0.580283,Babar
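
If the "empty" cells may also contain only spaces, one option is a regex-based `replace()` followed by `dropna()`, chained so that a new DataFrame is returned instead of modifying the original in place. This is a sketch under assumptions: the frame `tenants`, its values, and the name `cleaned` are hypothetical and not taken from the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one empty string and one whitespace-only string.
tenants = pd.DataFrame({
    'A': [1.0, 2.0, 3.0, 4.0],
    'Tenant': ['Babar', '', '   ', 'Rataxes'],
})

# r'^\s*$' matches strings that are empty or contain only whitespace.
blank_as_nan = tenants['Tenant'].replace(r'^\s*$', np.nan, regex=True)

# assign() returns a copy with the cleaned column; dropna() then removes
# the rows where Tenant is null. The original frame is left untouched.
cleaned = tenants.assign(Tenant=blank_as_nan).dropna(subset=['Tenant'])
print(cleaned)
```

Only the rows with index 0 and 3 remain in `cleaned`, while `tenants` still has all four rows.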


-----------------

In [5]:
df1 = pd.DataFrame({
    'A': range(5),
    'B': ['foo', '', 'bar', '', 'xyz']
})

df1

Unnamed: 0,A,B
0,0,foo
1,1,
2,2,bar
3,3,
4,4,xyz


In [7]:
# An empty string is falsy, so astype(bool) maps '' to False and any other string to True
df1['B'].astype(bool)

0     True
1    False
2     True
3    False
4     True
Name: B, dtype: bool

In [8]:
# Use the boolean mask to keep only the rows where B is a non-empty string
df1[df1['B'].astype(bool)]

Unnamed: 0,A,B
0,0,foo
2,2,bar
4,4,xyz
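
One caveat with the `astype(bool)` filter: on an object column, `NaN` is truthy, so rows that are already null survive the filter. The sketch below shows this and combines the mask with `notna()` to drop both kinds of rows. The frame `demo` and the names `truthy_mask` and `kept` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical column mixing a real NaN with an empty string.
demo = pd.DataFrame({
    'A': range(4),
    'B': pd.Series(['foo', '', np.nan, 'xyz'], dtype=object),
})

# bool(np.nan) is True, so the NaN row is not filtered out here.
truthy_mask = demo['B'].astype(bool)
print(truthy_mask.tolist())  # [True, False, True, True]

# Requiring notna() as well removes both the empty string and the NaN.
kept = demo[demo['B'].notna() & truthy_mask]
print(kept['A'].tolist())  # [0, 3]
```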


##### **If your goal is to remove not only empty strings but also strings containing only whitespace, call str.strip first:**

In [9]:
df1[df1['B'].str.strip().astype(bool)]

Unnamed: 0,A,B
0,0,foo
2,2,bar
4,4,xyz
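
`df1` contains no whitespace-only strings, so the output above is the same as without `str.strip`. The difference shows up on data like the following, which is a hypothetical frame (`spaced`) made up for this sketch.

```python
import pandas as pd

# Hypothetical data: row 1 is empty, row 2 holds only spaces.
spaced = pd.DataFrame({
    'A': range(4),
    'B': pd.Series(['foo', '', '   ', 'xyz'], dtype=object),
})

# Without strip, '   ' is a non-empty string and therefore kept.
without_strip = spaced[spaced['B'].astype(bool)]
print(without_strip['A'].tolist())  # [0, 2, 3]

# With strip, '   ' becomes '' first and is filtered out.
with_strip = spaced[spaced['B'].str.strip().astype(bool)]
print(with_strip['A'].tolist())  # [0, 3]
```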
