**Dropping Rows and Columns in a Pandas Dataframe**

https://chrisalbon.com/python/data_wrangling/pandas_dropping_column_and_rows/

In [1]:
import pandas as pd

**Create a Dataframe**

In [24]:
data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 
        'year': [2012, 2012, 2013, 2014, 2014], 
        'reports': [4, 24, 31, 2, 3]}
df = pd.DataFrame(data, index = ['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])
df1 = pd.DataFrame(data, index = ['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])
df

Unnamed: 0,name,reports,year
Cochice,Jason,4,2012
Pima,Molly,24,2012
Santa Cruz,Tina,31,2013
Maricopa,Jake,2,2014
Yuma,Amy,3,2014


**Drop an observation (row)**

In [4]:
df.drop(['Cochice','Pima'])

Unnamed: 0,name,reports,year
Santa Cruz,Tina,31,2013
Maricopa,Jake,2,2014
Yuma,Amy,3,2014
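
Note that `drop` returns a new dataframe and leaves the original untouched unless the result is assigned back. The sketch below illustrates this on a small hypothetical frame (`demo`, not the `df` above) and also shows the `index=` keyword, which is an alternative spelling of the same call:

```python
import pandas as pd

# Hypothetical three-row frame, used only for this illustration
demo = pd.DataFrame({'name': ['Jason', 'Molly', 'Tina'],
                     'reports': [4, 24, 31]},
                    index=['Cochice', 'Pima', 'Santa Cruz'])

# drop() returns a new frame; demo itself is not modified
dropped = demo.drop(['Cochice', 'Pima'])

# Same operation, naming the axis through the index= keyword
dropped_kw = demo.drop(index=['Cochice', 'Pima'])

print(list(dropped.index))  # ['Santa Cruz']
print(len(demo))            # 3
```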


**Drop a variable (column)**

**Note: axis = 1 tells drop to look for the label among the columns; the default, axis = 0, looks in the row index**

In [5]:
df.drop('reports', axis = 1)

Unnamed: 0,name,year
Cochice,Jason,2012
Pima,Molly,2012
Santa Cruz,Tina,2013
Maricopa,Jake,2014
Yuma,Amy,2014
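
As an alternative to `axis = 1`, `drop` also accepts a `columns=` keyword, which some readers find easier to scan. A minimal sketch on a hypothetical two-row frame (`demo` is an illustration, not the `df` above):

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration
demo = pd.DataFrame({'name': ['Jason', 'Molly'],
                     'year': [2012, 2012],
                     'reports': [4, 24]},
                    index=['Cochice', 'Pima'])

# These two calls remove the same column
by_axis = demo.drop('reports', axis=1)
by_keyword = demo.drop(columns='reports')

# A list of labels drops several columns at once
several = demo.drop(columns=['reports', 'year'])

print(list(several.columns))  # ['name']
```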


**Drop a row if it contains a specific value, in this case 'Tina'**

Specifically, reassign df to a new dataframe that includes only the rows where the value in the 'name' column does not equal 'Tina'.

In [6]:
df = df[df['name'] != 'Tina']

In [7]:
df

Unnamed: 0,name,reports,year
Cochice,Jason,4,2012
Pima,Molly,24,2012
Maricopa,Jake,2,2014
Yuma,Amy,3,2014
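
The filter above works by building a boolean mask and keeping the rows where it is True. The sketch below spells that out on a hypothetical frame (`demo`), and shows two variations that are assumptions of this example rather than part of the original notebook: `isin` for excluding several values at once, and `drop` applied to the labels of the matching rows.

```python
import pandas as pd

# Hypothetical four-row frame, used only for this illustration
demo = pd.DataFrame({'name': ['Jason', 'Molly', 'Tina', 'Jake'],
                     'reports': [4, 24, 31, 2]},
                    index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa'])

# Boolean mask: True marks the rows to keep
mask = demo['name'] != 'Tina'
kept = demo[mask]

# Several values at once: isin() marks the matches, ~ inverts the mask
kept_many = demo[~demo['name'].isin(['Tina', 'Jake'])]

# Same result as `kept`, via drop(): look up the labels of the matching rows
dropped = demo.drop(demo[demo['name'] == 'Tina'].index)

print(list(kept.index))       # ['Cochice', 'Pima', 'Maricopa']
print(list(kept_many.index))  # ['Cochice', 'Pima']
```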


**Drop a row by row number, in this case the third row**

**Note that pandas uses 0-based numbering, so 0 is the first row, 1 is the second row, and so on; df.index[2] is therefore the label of the third row.**

In [8]:
df.drop(df.index[2])

Unnamed: 0,name,reports,year
Cochice,Jason,4,2012
Pima,Molly,24,2012
Yuma,Amy,3,2014
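
One caveat worth keeping in mind: `df.index[2]` only translates a position into a label, and `drop` then removes every row carrying that label. With unique labels, as in this notebook, the two are the same thing. The sketch below uses a hypothetical frame with a repeated label to show the difference, along with a purely positional alternative based on `iloc`:

```python
import pandas as pd

# Hypothetical frame with a repeated index label
demo = pd.DataFrame({'reports': [4, 24, 31]},
                    index=['Pima', 'Yuma', 'Pima'])

label = demo.index[2]        # position 2 -> label 'Pima'
by_label = demo.drop(label)  # removes BOTH rows labelled 'Pima'

# Positional alternative: keep every position except 2
keep = [i for i in range(len(demo)) if i != 2]
by_position = demo.iloc[keep]

print(list(by_label.index))     # ['Yuma']
print(list(by_position.index))  # ['Pima', 'Yuma']
```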


**This can be extended to dropping several rows by position**

In [11]:
df.drop(df.index[[1,2]])

Unnamed: 0,name,reports,year
Cochice,Jason,4,2012
Yuma,Amy,3,2014
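
When the positions are contiguous, a slice of the index can stand in for the explicit list. A minimal sketch on a hypothetical frame (`demo`) with the same four labels as the output above:

```python
import pandas as pd

# Hypothetical four-row frame, used only for this illustration
demo = pd.DataFrame({'reports': [4, 24, 2, 3]},
                    index=['Cochice', 'Pima', 'Maricopa', 'Yuma'])

# Explicit list of positions, as in the cell above
explicit = demo.drop(demo.index[[1, 2]])

# Slice covering positions 1 and 2 (the stop value 3 is exclusive)
sliced = demo.drop(demo.index[1:3])

print(list(sliced.index))  # ['Cochice', 'Yuma']
```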


**Or dropping relative to the end of the dataframe df**

In [14]:
df.drop(df.index[-1])

Unnamed: 0,name,reports,year
Cochice,Jason,4,2012
Pima,Molly,24,2012
Maricopa,Jake,2,2014


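**Drop rows in place: with inplace=True, drop modifies df1 directly instead of returning a new dataframe**

In [25]: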
df1

Unnamed: 0,name,reports,year
Cochice,Jason,4,2012
Pima,Molly,24,2012
Santa Cruz,Tina,31,2013
Maricopa,Jake,2,2014
Yuma,Amy,3,2014


In [26]:
df1.drop(df1.index[-1],inplace=True)

In [27]:
df1

Unnamed: 0,name,reports,year
Cochice,Jason,4,2012
Pima,Molly,24,2012
Santa Cruz,Tina,31,2013
Maricopa,Jake,2,2014


In [28]:
df1.drop(df1.index[[2,3]],inplace=True)

In [29]:
df1

Unnamed: 0,name,reports,year
Cochice,Jason,4,2012
Pima,Molly,24,2012
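
A detail that is easy to miss: with `inplace=True`, `drop` returns `None`, so its result should not be assigned to anything. The sketch below shows this on a hypothetical frame (`demo`), followed by the reassignment style that achieves the same effect without `inplace`:

```python
import pandas as pd

# Hypothetical three-row frame, used only for this illustration
demo = pd.DataFrame({'reports': [4, 24, 31]},
                    index=['Cochice', 'Pima', 'Santa Cruz'])

# In-place drop: demo is modified and the call returns None
result = demo.drop(demo.index[-1], inplace=True)
print(result)     # None
print(len(demo))  # 2

# Reassignment alternative: drop() returns a new frame that replaces demo
demo = demo.drop(demo.index[-1])
print(len(demo))  # 1
```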


**You can also keep a range of rows from the top, or drop a range from the bottom, by slicing the dataframe**

In [30]:
df[:3]  # Keep the top 3 rows (df was re-created in In [24], so it has all five rows again)

Unnamed: 0,name,reports,year
Cochice,Jason,4,2012
Pima,Molly,24,2012
Santa Cruz,Tina,31,2013


In [None]:
df[:-3]  # Drop the bottom 3 rows
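
These slices are positional, so they can also be written with `iloc` or `head`, which some readers find more explicit. A minimal sketch on a hypothetical five-row frame (`demo`) that compares the spellings:

```python
import pandas as pd

# Hypothetical five-row frame, used only for this illustration
demo = pd.DataFrame({'reports': [4, 24, 31, 2, 3]},
                    index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])

top3 = demo[:3]             # keep the first three rows
without_bottom3 = demo[:-3]  # keep everything except the last three rows

# Equivalent spellings
top3_iloc = demo.iloc[:3]
top3_head = demo.head(3)
without_bottom3_head = demo.head(-3)  # negative n: all rows except the last 3

print(list(top3.index))             # ['Cochice', 'Pima', 'Santa Cruz']
print(list(without_bottom3.index))  # ['Cochice', 'Pima']
```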