## Rename and Delete Columns
You will often need to rename columns or remove columns you do not need.

***Rename columns***

Here are two popular ways to rename dataframe columns.
1. ***Dictionary substitution***: very useful if you only want to rename a few of the columns.
2. ***List replacement***: requires a full list of names (this is more error prone).



In [1]:
# Import libraries
import pandas as pd
import numpy as np

In [19]:
# import data
filename = 'car_financing.xlsx'
df = pd.read_excel(filename)

In [3]:
df.head()

Unnamed: 0,Month,Starting Balance,Repayment,Interest Paid,Principal Paid,New Balance,term,interest_rate,car_type
0,1,34689.96,687.23,202.93,484.3,34205.66,60,0.0702,Toyota Sienna
1,2,34205.66,687.23,200.1,487.13,33718.53,60,0.0702,Toyota Sienna
2,3,33718.53,687.23,197.25,489.98,33228.55,60,0.0702,Toyota Sienna
3,4,33228.55,687.23,194.38,492.85,32735.7,60,0.0702,Toyota Sienna
4,5,32735.7,687.23,191.5,495.73,32239.97,60,0.0702,Toyota Sienna


In [5]:
# This won't work because there is a space in the column name
# I want to fix that
df.Principal Paid

SyntaxError: invalid syntax (1577420648.py, line 3)
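
Dot access only works when a column name is a valid Python identifier, while bracket indexing accepts any label. The sketch below illustrates this on a small made-up frame (`demo` is a hypothetical stand-in, not the car financing data):

```python
import pandas as pd

# Hypothetical two-row frame standing in for the spreadsheet
demo = pd.DataFrame({'Principal Paid': [484.30, 487.13],
                     'term': [60, 60]})

# Bracket indexing accepts any column label, including ones with spaces
principal = demo['Principal Paid']

# Dot access works only for labels that are valid identifiers
term = demo.term
```

Bracket indexing is a reasonable fallback when renaming is not an option, though renaming usually makes later code easier to read.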

In [6]:
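# Approach 1: dictionary substitution using the rename method
# Keys are the current names, values are the new names;
# columns not listed in the dictionary keep their names
df = df.rename(columns={'Starting Balance': 'starting_balance',
                        'Interest Paid': 'interest_paid',
                        'Principal Paid': 'principal_paid',
                        'New Balance': 'new_balance'})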

In [7]:
# dataframe after renaming columns
df.head()

Unnamed: 0,Month,starting_balance,Repayment,interest_paid,principal_paid,new_balance,term,interest_rate,car_type
0,1,34689.96,687.23,202.93,484.3,34205.66,60,0.0702,Toyota Sienna
1,2,34205.66,687.23,200.1,487.13,33718.53,60,0.0702,Toyota Sienna
2,3,33718.53,687.23,197.25,489.98,33228.55,60,0.0702,Toyota Sienna
3,4,33228.55,687.23,194.38,492.85,32735.7,60,0.0702,Toyota Sienna
4,5,32735.7,687.23,191.5,495.73,32239.97,60,0.0702,Toyota Sienna
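
When many names follow the same pattern (spaces, mixed case), `rename` can also take a function instead of a dictionary. This is a possible variation rather than something the notebook itself uses; the frame `demo` and its values are made up for illustration:

```python
import pandas as pd

# Hypothetical one-row frame with a mix of naming styles
demo = pd.DataFrame({'Starting Balance': [34689.96],
                     'Interest Paid': [202.93],
                     'car_type': ['Toyota Sienna']})

# Passing a function to rename applies it to every column label
demo = demo.rename(columns=lambda name: name.lower().replace(' ', '_'))

print(list(demo.columns))
# ['starting_balance', 'interest_paid', 'car_type']
```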


In [8]:
# Approach 2: list replacement
# Only Month -> month changes, but every other column must still be listed
df.columns = ['month',
              'starting_balance',
              'Repayment',
              'interest_paid',
              'principal_paid',
              'new_balance',
              'term',
              'interest_rate',
              'car_type']

In [9]:
df.head()

Unnamed: 0,month,starting_balance,Repayment,interest_paid,principal_paid,new_balance,term,interest_rate,car_type
0,1,34689.96,687.23,202.93,484.3,34205.66,60,0.0702,Toyota Sienna
1,2,34205.66,687.23,200.1,487.13,33718.53,60,0.0702,Toyota Sienna
2,3,33718.53,687.23,197.25,489.98,33228.55,60,0.0702,Toyota Sienna
3,4,33228.55,687.23,194.38,492.85,32735.7,60,0.0702,Toyota Sienna
4,5,32735.7,687.23,191.5,495.73,32239.97,60,0.0702,Toyota Sienna
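
To see why list replacement is the more error-prone approach, here is a small sketch on a made-up frame (`demo` is hypothetical). A list of the wrong length fails loudly, but a list in the wrong order is accepted and quietly mislabels the data:

```python
import pandas as pd

# Hypothetical frame with three columns
demo = pd.DataFrame({'Month': [1, 2],
                     'Repayment': [687.23, 687.23],
                     'term': [60, 60]})

# Too few names: pandas refuses the assignment with a ValueError
try:
    demo.columns = ['month', 'Repayment']
    length_error = None
except ValueError as err:
    length_error = err

# Right count, wrong order: no error, but two labels are now swapped,
# so the column called 'term' actually holds the repayment amounts
demo.columns = ['month', 'term', 'Repayment']
```

With dictionary substitution, each new name is tied to an existing one, so this kind of silent swap is much harder to make.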


***Delete columns***

In [17]:
# Approach 1: the drop method
# This approach allows you to drop multiple columns at a time
df = df.drop(columns=['term'])


In [18]:
df.head()

Unnamed: 0,Month,Starting Balance,Interest Paid,Principal Paid,New Balance,interest_rate,car_type
0,1,34689.96,202.93,484.3,34205.66,0.0702,Toyota Sienna
1,2,34205.66,200.1,487.13,33718.53,0.0702,Toyota Sienna
2,3,33718.53,197.25,489.98,33228.55,0.0702,Toyota Sienna
3,4,33228.55,194.38,492.85,32735.7,0.0702,Toyota Sienna
4,5,32735.7,191.5,495.73,32239.97,0.0702,Toyota Sienna
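
The cell above drops a single column. As a sketch of the multi-column case, the example below uses a made-up frame (`demo` and the label `no_such_column` are hypothetical) and also shows the `errors='ignore'` option of `drop`:

```python
import pandas as pd

# Hypothetical frame with four columns
demo = pd.DataFrame({'Month': [1, 2],
                     'Repayment': [687.23, 687.23],
                     'term': [60, 60],
                     'interest_rate': [0.0702, 0.0702]})

# drop returns a new DataFrame; demo itself is left unchanged
trimmed = demo.drop(columns=['term', 'interest_rate'])

# errors='ignore' skips labels that are not present instead of raising KeyError
safe = demo.drop(columns=['term', 'no_such_column'], errors='ignore')
```

Because `drop` returns a new dataframe, the result has to be assigned back (as in `df = df.drop(...)`) for the change to stick.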


In [15]:
# Approach 2: use the del statement (removes one column, in place)
del df['Repayment']


In [16]:
df.head()

Unnamed: 0,Month,Starting Balance,Interest Paid,Principal Paid,New Balance,term,interest_rate,car_type
0,1,34689.96,202.93,484.3,34205.66,60,0.0702,Toyota Sienna
1,2,34205.66,200.1,487.13,33718.53,60,0.0702,Toyota Sienna
2,3,33718.53,197.25,489.98,33228.55,60,0.0702,Toyota Sienna
3,4,33228.55,194.38,492.85,32735.7,60,0.0702,Toyota Sienna
4,5,32735.7,191.5,495.73,32239.97,60,0.0702,Toyota Sienna
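
Unlike `drop`, `del` changes the dataframe in place and handles one column per statement. The sketch below shows this on a made-up frame (`demo` is hypothetical), alongside `pop`, a related method that also removes in place but returns the removed column:

```python
import pandas as pd

# Hypothetical frame with three columns
demo = pd.DataFrame({'Month': [1, 2],
                     'Repayment': [687.23, 687.23],
                     'term': [60, 60]})

# del removes exactly one column and modifies demo in place
del demo['Repayment']

# pop also removes in place, but hands back the removed column as a Series
term = demo.pop('term')
```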
