# Dropping Features

<span>Dropping features is a common task when cleaning data. I use Pandas to drop the majority of features and observations in my workflow. Below is a collection of example drop statements in Python. You can find the full documentation on dropping data in the Pandas documentation (link below).</span>

Dropping Data Documentation:  https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop.html

### Import Preliminaries

In [1]:
# Import modules
import pandas as pd

### Import Data

In [62]:
# Import data
df = pd.read_csv('Data/Pokemon.csv')

# View the head of the dataframe 
df.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False


### Dropping Feature

In [63]:
# Drop the "Name" feature and view the head of the dataframe
df.drop('Name', axis=1).head()

Unnamed: 0,#,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Fire,,309,39,52,43,60,50,65,1,False


### Dropping Observations

In [64]:
# Drop the observation with index label 3 (the fourth row) from the DataFrame
df.drop(3, axis=0).head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False
5,5,Charmeleon,Fire,,405,58,64,58,80,65,80,1,False
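
Several observations can be dropped at once by passing a list of index labels. The sketch below uses a small hypothetical DataFrame (`toy`, not the Pokemon data) with a non-default index, to show that `drop` matches on index labels rather than row positions.

```python
import pandas as pd

# Hypothetical toy data standing in for the Pokemon DataFrame
toy = pd.DataFrame(
    {"Name": ["Bulbasaur", "Ivysaur", "Venusaur", "Charmander"],
     "HP": [45, 60, 80, 39]},
    index=[10, 11, 12, 13],
)

# Drop two observations by index label (labels, not positions)
dropped = toy.drop([11, 13], axis=0)

# The index= keyword is an equivalent way to say "labels on axis 0"
dropped_kw = toy.drop(index=[11, 13])

print(dropped)
```

Because `drop` returns a new DataFrame by default, `toy` itself still holds all four rows afterwards.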


##### Filtering Observations

While you can drop observations using the drop function in Pandas, it is usually quicker to just use loc or the boolean filtering available in the package.

In [65]:
# View all the observations where Type 1 is not 'Grass' via filtering
df[df['Type 1'] != 'Grass'].head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False
5,5,Charmeleon,Fire,,405,58,64,58,80,65,80,1,False
6,6,Charizard,Fire,Flying,534,78,84,78,109,85,100,1,False
7,6,CharizardMega Charizard X,Fire,Dragon,634,78,130,111,130,85,100,1,False
8,6,CharizardMega Charizard Y,Fire,Flying,634,78,104,78,159,115,100,1,False
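
To make the link between filtering and dropping concrete, here is a minimal sketch on a hypothetical toy DataFrame (`toy` and its values are made up for illustration). It shows that keeping the rows that pass a boolean mask gives the same result as looking up the labels of the unwanted rows and passing them to `drop`.

```python
import pandas as pd

# Hypothetical toy data standing in for the Pokemon DataFrame
toy = pd.DataFrame(
    {"Name": ["Bulbasaur", "Charmander", "Oddish", "Squirtle"],
     "Type 1": ["Grass", "Fire", "Grass", "Water"]}
)

# Boolean mask: True for the rows we want to keep
mask = toy["Type 1"] != "Grass"
filtered = toy[mask]

# Same result with drop: first look up the labels of the unwanted rows
grass_labels = toy.index[toy["Type 1"] == "Grass"]
dropped = toy.drop(grass_labels, axis=0)

print(filtered.equals(dropped))
```

The filtering version skips the label lookup entirely, which is why it tends to be the shorter statement to write.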


##### Loc Statements

In [66]:
# Select only a subset of the dataset using a loc statement (label slices include both endpoints)
df.loc[:2, 'Name':'HP']

Unnamed: 0,Name,Type 1,Type 2,Total,HP
0,Bulbasaur,Grass,Poison,318,45
1,Ivysaur,Grass,Poison,405,60
2,Venusaur,Grass,Poison,525,80
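
One detail worth spelling out: `loc` slices by label and includes both endpoints, while `iloc` slices by position and excludes the stop value. The sketch below illustrates the difference on a hypothetical toy DataFrame (not the Pokemon data).

```python
import pandas as pd

# Hypothetical toy data standing in for the Pokemon DataFrame
toy = pd.DataFrame(
    {"Name": ["Bulbasaur", "Ivysaur", "Venusaur", "Charmander"],
     "Type 1": ["Grass", "Grass", "Grass", "Fire"],
     "HP": [45, 60, 80, 39]}
)

# Label-based: rows 0, 1 and 2; columns "Name" through "Type 1" inclusive
by_label = toy.loc[:2, "Name":"Type 1"]

# Position-based: rows 0 and 1 only; columns at positions 0 and 1
by_position = toy.iloc[:2, 0:2]

print(by_label.shape, by_position.shape)
```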


### Dropping Multiple Features

In [67]:
# Drop several features at once and view the head of the dataframe
df.drop(['#', 'Name', 'Generation', 'Legendary'], axis=1).head()

Unnamed: 0,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed
0,Grass,Poison,318,45,49,49,65,65,45
1,Grass,Poison,405,60,62,63,80,80,60
2,Grass,Poison,525,80,82,83,100,100,80
3,Grass,Poison,625,80,100,123,122,120,80
4,Fire,,309,39,52,43,60,50,65
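
As an alternative to `axis=1`, `drop` also accepts a `columns=` keyword, and `errors='ignore'` skips labels that are not present instead of raising a `KeyError`. The sketch below uses a hypothetical toy DataFrame, and the `"Mega"` column name is made up to stand in for a label that does not exist.

```python
import pandas as pd

# Hypothetical toy data standing in for the Pokemon DataFrame
toy = pd.DataFrame(
    {"#": [1, 2],
     "Name": ["Bulbasaur", "Ivysaur"],
     "HP": [45, 60],
     "Legendary": [False, False]}
)

# columns= is equivalent to passing the labels with axis=1
trimmed = toy.drop(columns=["#", "Legendary"])

# "Mega" is a made-up label that is not in the DataFrame;
# errors="ignore" drops what exists and silently skips the rest
safe = toy.drop(columns=["#", "Mega"], errors="ignore")

print(list(trimmed.columns))
print(list(safe.columns))
```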


### Dropping Columns Inplace

By default, the inplace parameter of the drop function is set to False, so drop returns a new DataFrame and leaves the original untouched. You can pass the opposite Boolean value (inplace=True) to modify the DataFrame directly.

In [68]:
# Copy the DataFrame
df_copy = df.copy()

# Drop the Name column
df_copy.drop('Name', axis=1, inplace=True)

# View the head of the dataframe 
df_copy.head()

Unnamed: 0,#,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Fire,,309,39,52,43,60,50,65,1,False
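
A minimal sketch of the difference, using a hypothetical toy DataFrame: with the default `inplace=False`, `drop` returns a new DataFrame, whereas with `inplace=True` it modifies the object and returns `None`.

```python
import pandas as pd

# Hypothetical toy data standing in for the Pokemon DataFrame
toy = pd.DataFrame({"Name": ["Bulbasaur", "Ivysaur"], "HP": [45, 60]})

# Default behaviour: a new DataFrame comes back, toy is unchanged
result = toy.drop("Name", axis=1)

# inplace=True: the copy is modified directly and None is returned
in_place = toy.copy()
returned = in_place.drop("Name", axis=1, inplace=True)

print(list(toy.columns), list(result.columns), returned)
```

Since the in-place call returns `None`, assigning its result back to a variable (for example `df = df.drop(..., inplace=True)`) would lose the DataFrame.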


### Dropping Index

If you want to reset your index, the best way is to use the reset_index function. If you have a DataFrame with a MultiIndex, you can also use drop to remove the data points associated with a given index value.

In [75]:
# Filtered dataframe for Grass Pokemon
df[(df['Type 1'] == 'Grass') | (df['Type 2'] == 'Grass')].head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
48,43,Oddish,Grass,Poison,320,45,50,55,75,65,30,1,False


In [76]:
# Filtered dataframe for Grass Pokemon + reset index
df[(df['Type 1'] == 'Grass') | (df['Type 2'] == 'Grass')].reset_index().head()

Unnamed: 0,index,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,48,43,Oddish,Grass,Poison,320,45,50,55,75,65,30,1,False
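
Note that `reset_index()` keeps the old index as a new `index` column. If the old labels are not needed, passing `drop=True` discards them instead. A small sketch on a hypothetical toy DataFrame:

```python
import pandas as pd

# Hypothetical toy data standing in for the Pokemon DataFrame
toy = pd.DataFrame(
    {"Name": ["Bulbasaur", "Charmander", "Oddish"],
     "Type 1": ["Grass", "Fire", "Grass"]}
)

# Filtering leaves gaps in the index (labels 0 and 2 remain)
grass = toy[toy["Type 1"] == "Grass"]

# Default: the old index is kept as a column named "index"
kept = grass.reset_index()

# drop=True: the old index is discarded entirely
discarded = grass.reset_index(drop=True)

print(list(kept.columns))
print(list(discarded.columns))
```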


In [91]:
# Create a MultiIndex DataFrame via the groupby function
# (select the columns with a list; tuple-style selection fails in recent pandas)
sum_stats = df.groupby(['Type 1', 'Type 2'])[['Total', 'HP', 'Attack', 'Defense',
                                              'Sp. Atk', 'Sp. Def', 'Speed']].mean()
sum_stats.head(19)

Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed
Bug,Electric,395.5,60.0,62.0,55.0,77.0,55.0,86.5
Bug,Fighting,550.0,80.0,155.0,95.0,40.0,100.0,80.0
Bug,Fire,455.0,70.0,72.5,60.0,92.5,80.0,80.0
Bug,Flying,419.5,63.0,70.142857,61.571429,72.857143,69.071429,82.857143
Bug,Ghost,236.0,1.0,90.0,45.0,30.0,30.0,40.0
Bug,Grass,384.0,55.0,73.833333,76.666667,57.333333,76.666667,44.5
Bug,Ground,345.0,45.5,62.0,97.5,44.5,57.5,38.0
Bug,Poison,347.916667,53.75,68.333333,58.083333,42.5,59.333333,65.916667
Bug,Rock,435.0,46.666667,56.666667,146.666667,36.666667,113.333333,35.0
Bug,Steel,509.714286,67.714286,114.714286,112.428571,68.142857,83.285714,63.428571


In [105]:
# Drop the "Bug" type from the first level of the index
sum_stats.drop('Bug', axis=0, level=0).head(19)

Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed
Dark,Dragon,440.0,72.0,85.0,70.0,78.333333,70.0,64.666667
Dark,Fighting,418.0,57.5,82.5,92.5,40.0,92.5,53.0
Dark,Fire,476.666667,65.0,80.0,56.666667,110.0,73.333333,91.666667
Dark,Flying,494.0,93.2,92.2,73.8,84.2,70.4,80.2
Dark,Ghost,430.0,50.0,80.0,100.0,75.0,90.0,35.0
Dark,Ice,470.0,62.5,107.5,60.0,40.0,80.0,120.0
Dark,Psychic,385.0,69.5,73.0,70.5,52.5,60.5,59.0
Dark,Steel,415.0,55.0,105.0,85.0,50.0,55.0,65.0
Dragon,Electric,680.0,100.0,150.0,120.0,120.0,100.0,90.0
Dragon,Fairy,590.0,75.0,110.0,110.0,110.0,105.0,80.0


In [107]:
# Drop the "Fire" type from the second level of the index
sum_stats.drop('Fire', axis=0, level=1).head(19)

Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed
Bug,Electric,395.5,60.0,62.0,55.0,77.0,55.0,86.5
Bug,Fighting,550.0,80.0,155.0,95.0,40.0,100.0,80.0
Bug,Flying,419.5,63.0,70.142857,61.571429,72.857143,69.071429,82.857143
Bug,Ghost,236.0,1.0,90.0,45.0,30.0,30.0,40.0
Bug,Grass,384.0,55.0,73.833333,76.666667,57.333333,76.666667,44.5
Bug,Ground,345.0,45.5,62.0,97.5,44.5,57.5,38.0
Bug,Poison,347.916667,53.75,68.333333,58.083333,42.5,59.333333,65.916667
Bug,Rock,435.0,46.666667,56.666667,146.666667,36.666667,113.333333,35.0
Bug,Steel,509.714286,67.714286,114.714286,112.428571,68.142857,83.285714,63.428571
Bug,Water,269.0,40.0,30.0,32.0,50.0,52.0,65.0
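
The same pattern can be tried on a much smaller table. The sketch below builds a hypothetical four-row DataFrame (values made up for illustration), groups it into a MultiIndex, and drops by level. It also shows dropping one specific combination by passing a tuple to `index=`.

```python
import pandas as pd

# Hypothetical toy data standing in for the Pokemon DataFrame
toy = pd.DataFrame(
    {"Type 1": ["Bug", "Bug", "Dark", "Dark"],
     "Type 2": ["Fire", "Flying", "Fire", "Ghost"],
     "HP": [70, 63, 65, 50]}
)

# Grouping by two columns produces a two-level MultiIndex
stats = toy.groupby(["Type 1", "Type 2"])[["HP"]].mean()

# Drop every row whose first-level label is "Bug"
no_bug = stats.drop("Bug", level=0)

# Drop every row whose second-level label is "Fire"
no_fire = stats.drop("Fire", level=1)

# Drop one specific (Type 1, Type 2) combination
one_pair = stats.drop(index=("Dark", "Ghost"))

print(no_bug)
print(no_fire)
print(one_pair)
```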


Author: Kavi Sekhon