When working with DataFrames, we often want to remove rows or columns that contain **NA** (missing) values.

In [1]:
import pandas as pd
import numpy as np
import pandas.testing as pdt  # pandas.util.testing is deprecated and was removed in pandas 2.0

NA = np.nan
df = pd.DataFrame([[NA, 2, NA, 0],
                   [3, 4, NA, 1],
                   [NA, NA, NA, 5]])
df

Unnamed: 0,0,1,2,3
0,,2.0,,0
1,3.0,4.0,,1
2,,,,5


In Pandas, we can use the `.dropna()` DataFrame method to drop either rows or columns. The default behavior is to drop rows that have **NaNs** in _any_ of the columns.

In [2]:
explicit = df.dropna(axis=0, how='any')
defaults = df.dropna()

pdt.assert_frame_equal(explicit, defaults)
explicit

Unnamed: 0,0,1,2,3
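
The result above is empty because every row of `df` contains at least one NaN (column 2 is entirely missing). A minimal sketch of how one might check this before dropping anything, assuming the same `df` as above; the names `rows_with_nan` and `kept` are only illustrative:

```python
import numpy as np
import pandas as pd

NA = np.nan
df = pd.DataFrame([[NA, 2, NA, 0],
                   [3, 4, NA, 1],
                   [NA, NA, NA, 5]])

# True for each row that contains at least one NaN
rows_with_nan = df.isna().any(axis=1)
print(rows_with_nan)

# The rows kept by the default dropna() are the ones not flagged above
kept = df[~rows_with_nan]
print(len(kept))
```

Here every row is flagged, so nothing survives the default `how='any'`.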


We can just as easily do the same with columns by setting `axis=1` (though dropping rows is much more common).

In [3]:
df.dropna(axis=1)

Unnamed: 0,3
0,0
1,1
2,5


We can also specify the criterion for dropping a row or column with the `how` argument. For example, `how='all'` drops a row (or column) only if all of its values are NaN.

In [4]:
df.dropna(axis=1, how='all')

Unnamed: 0,0,1,3
0,,2.0,0
1,3.0,4.0,1
2,,,5


It can be quite useful to drop rows based on missing values in particular columns. To do that, we use the `subset` argument and pass in a list of column labels.

In [5]:
df.dropna(subset=[0])

Unnamed: 0,0,1,2,3
1,3.0,4.0,,1
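
Since `subset` takes a list, it can name more than one column, and it combines with `how`. The following is a small sketch on the same `df`; the variable names `any_missing` and `all_missing` are made up for illustration:

```python
import numpy as np
import pandas as pd

NA = np.nan
df = pd.DataFrame([[NA, 2, NA, 0],
                   [3, 4, NA, 1],
                   [NA, NA, NA, 5]])

# Drop rows where column 0 OR column 1 is NaN (how='any' is the default)
any_missing = df.dropna(subset=[0, 1])

# Drop rows only where column 0 AND column 1 are both NaN
all_missing = df.dropna(subset=[0, 1], how='all')

print(any_missing.index.tolist())
print(all_missing.index.tolist())
```

With `how='any'` only row 1 survives, while `how='all'` also keeps row 0, because its value in column 1 is present.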


As a side note, instead of using `.dropna()`, we could use `pd.isnull()` to build a boolean mask and subset the DataFrame with it. However, I much prefer the former because we can use it in conjunction with **method chaining**.

In [6]:
pdt.assert_frame_equal(df.dropna(subset=[0]), df[~pd.isnull(df[0])])
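
To make the method-chaining point concrete, here is a minimal sketch of what such a chain might look like on the same `df`. The particular steps (and the name `chained`) are only an illustration, not a recommended pipeline:

```python
import numpy as np
import pandas as pd

NA = np.nan
df = pd.DataFrame([[NA, 2, NA, 0],
                   [3, 4, NA, 1],
                   [NA, NA, NA, 5]])

chained = (
    df.dropna(axis=1, how='all')   # drop columns that are entirely NaN
      .dropna(subset=[0])          # then drop rows missing a value in column 0
      .reset_index(drop=True)      # then renumber the remaining rows
)
print(chained)
```

Each step returns a new DataFrame, so the calls read top to bottom without intermediate variables. The boolean-mask version would need `df` to be named again inside the brackets.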

The full documentation for `.dropna()`, including these examples, can be found [here](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html).