# Missing Data
pandas primarily uses the value np.nan to represent missing data. By default, missing values are excluded from computations.
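
As a minimal sketch of how np.nan behaves in a reduction, the snippet below builds a small Series (the name s and its values are made up for illustration) and compares the default behaviour with skipna=False.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing value between two numbers.
s = pd.Series([1.0, np.nan, 3.0])

# Reductions skip NaN by default (skipna=True).
mean_skipping = s.mean()          # averages only 1.0 and 3.0
count_non_missing = s.count()     # counts only non-missing entries

# With skipna=False the missing value propagates into the result.
mean_strict = s.mean(skipna=False)

print(mean_skipping, count_non_missing, mean_strict)
```

The default is convenient for quick summaries, but it can hide gaps in the data, so it is worth checking count alongside any aggregate.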

Reindexing lets you change, add, or delete the labels on a specified axis. It returns a copy of the data, and any label that did not exist in the original is filled with np.nan.

In [3]:
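import pandas as pd
import numpy as np

# Ten consecutive days starting 2019-01-01, used as the row index.
dates = pd.date_range('20190101', periods=10)
# Random values, so the numbers shown below will differ from run to run.
df = pd.DataFrame(np.random.randn(10, 4),
                  index=dates,
                  columns=list('ABCD'))
# Keep the first four rows and add a new column E, which starts out as NaN.
df1 = df.reindex(index=dates[0:4], columns=list(df.columns) + ['E'])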

In [4]:
df1.loc[dates[0]:dates[1], 'E'] = 1

In [5]:
df1


| | A | B | C | D | E |
|---|---|---|---|---|---|
| 2019-01-01 | 0.199886 | 0.312388 | 1.394258 | -0.311931 | 1.0 |
| 2019-01-02 | 0.259445 | -0.377668 | -1.481911 | 1.805175 | 1.0 |
| 2019-01-03 | 1.452134 | -2.576209 | -0.246738 | -1.127367 | NaN |
| 2019-01-04 | 2.026428 | 0.183045 | 1.275433 | -0.834084 | NaN |
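
The reindexing step can be checked on a smaller frame. This is only a sketch: the frame small and its labels x, y, z, w are hypothetical and chosen so that the result is easy to read.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame, used only for illustration.
small = pd.DataFrame({"A": [1.0, 2.0, 3.0]}, index=["x", "y", "z"])

# Keep two existing rows, add a new row "w" and a new column "E".
reindexed = small.reindex(index=["x", "y", "w"], columns=["A", "E"])

# Labels that were absent from the original are filled with NaN.
new_cells_missing = bool(reindexed.loc["w"].isna().all()
                         and reindexed["E"].isna().all())

# The original frame keeps its own labels; reindex returned a new object.
original_unchanged = (list(small.index) == ["x", "y", "z"]
                      and list(small.columns) == ["A"])

print(reindexed)
print(new_cells_missing, original_unchanged)
```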


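To drop every row that contains at least one missing value, use dropna with how='any'. The result is a new DataFrame; df1 itself is not modified.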


In [6]:
df1.dropna(how='any')

| | A | B | C | D | E |
|---|---|---|---|---|---|
| 2019-01-01 | 0.199886 | 0.312388 | 1.394258 | -0.311931 | 1.0 |
| 2019-01-02 | 0.259445 | -0.377668 | -1.481911 | 1.805175 | 1.0 |
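
Besides how='any', dropna also accepts how='all' and a subset of columns to inspect. The sketch below compares the three on a hypothetical frame (the name frame and its values are assumptions made for the example, not part of the notebook above).

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 is partly missing, row 2 is entirely missing.
frame = pd.DataFrame({
    "A": [1.0, 2.0, np.nan],
    "E": [1.0, np.nan, np.nan],
})

# Drop a row if any of its values is missing: only row 0 survives.
any_missing = frame.dropna(how="any")

# Drop a row only if all of its values are missing: rows 0 and 1 survive.
all_missing = frame.dropna(how="all")

# Only look at column A when deciding: rows 0 and 1 survive.
subset_only = frame.dropna(subset=["A"])

print(list(any_missing.index), list(all_missing.index), list(subset_only.index))
```

Which variant fits depends on whether a partially filled row is still useful for the analysis at hand.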


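To fill missing data, use fillna, which replaces every missing value with the given value and likewise returns a new DataFrame.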

In [7]:
df1.fillna(value='FILL VALUE')

| | A | B | C | D | E |
|---|---|---|---|---|---|
| 2019-01-01 | 0.199886 | 0.312388 | 1.394258 | -0.311931 | 1 |
| 2019-01-02 | 0.259445 | -0.377668 | -1.481911 | 1.805175 | 1 |
| 2019-01-03 | 1.452134 | -2.576209 | -0.246738 | -1.127367 | FILL VALUE |
| 2019-01-04 | 2.026428 | 0.183045 | 1.275433 | -0.834084 | FILL VALUE |
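
A text placeholder such as the one above mixes strings into a numeric column, which will typically stop the column from being treated as numeric. When later arithmetic matters, a numeric fill value may be the safer choice. The following sketch assumes a hypothetical frame and arbitrary fill values; it shows a single scalar fill and a per-column fill passed as a dict.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in each column.
frame = pd.DataFrame({"A": [1.0, np.nan], "E": [np.nan, 4.0]})

# Fill every missing value with the same number; columns stay float64.
filled_scalar = frame.fillna(value=0.0)

# Fill each column with its own value by passing a dict keyed by column name.
filled_per_column = frame.fillna(value={"A": -1.0, "E": 99.0})

print(filled_scalar)
print(filled_per_column)
```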


To get a boolean mask that is True wherever a value is missing, use pd.isna.

In [8]:
pd.isna(df1)

| | A | B | C | D | E |
|---|---|---|---|---|---|
| 2019-01-01 | False | False | False | False | False |
| 2019-01-02 | False | False | False | False | False |
| 2019-01-03 | False | False | False | False | True |
| 2019-01-04 | False | False | False | False | True |
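
A boolean mask like this is mostly useful as an input to further steps. As a rough sketch on a hypothetical frame (names and values are assumptions for the example), the mask can be summed to count missing values per column or used to select the affected rows.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: column E is missing in rows 1 and 2.
frame = pd.DataFrame({"A": [1.0, 2.0, 3.0], "E": [1.0, np.nan, np.nan]})

mask = frame.isna()

# True counts as 1, so summing the mask counts missing values per column.
missing_per_column = mask.sum()

# Select the rows that have at least one missing value.
rows_with_missing = frame[mask.any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```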
