# Pandas Tutorial
 

In [5]:
import pandas as pd

# Slicing the Data Frame

In [2]:
XYZ_web = {"Day": [1, 2, 3, 4, 5, 6], "Visitors": [1000, 700, 6000, 1000, 400, 350], "Bounce_Rate": [20, 20, 23, 15, 10, 34]}

df = pd.DataFrame(XYZ_web)

print(df)

   Day  Visitors  Bounce_Rate
0    1      1000           20
1    2       700           20
2    3      6000           23
3    4      1000           15
4    5       400           10
5    6       350           34


# Print the first 2 rows and the last 2 rows of the table

In [3]:
print(df.head(2))

   Day  Visitors  Bounce_Rate
0    1      1000           20
1    2       700           20


In [4]:
print(df.tail(2))

   Day  Visitors  Bounce_Rate
4    5       400           10
5    6       350           34
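The same rows can also be selected by position. The following is a minimal sketch, assuming the same `XYZ_web` data as above; the variable names `first_two`, `last_two`, and `middle` are illustrative and not part of the original tutorial.

```python
import pandas as pd

XYZ_web = {"Day": [1, 2, 3, 4, 5, 6],
           "Visitors": [1000, 700, 6000, 1000, 400, 350],
           "Bounce_Rate": [20, 20, 23, 15, 10, 34]}
df = pd.DataFrame(XYZ_web)

# Positional slices follow Python's start:stop convention (stop is excluded)
first_two = df.iloc[:2]    # same rows as df.head(2)
last_two = df.iloc[-2:]    # same rows as df.tail(2)
middle = df.iloc[2:4]      # rows at positions 2 and 3 (Day 3 and Day 4)

print(middle)
```

`head` and `tail` are convenient for the ends of a table, while `iloc` lets you pick any run of rows in between.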


# Merging & Joining

Merging combines two data frames into a single data frame. You can also decide which columns you want to make common. Let me implement that practically: first I will create two data frames, each built from key-value pairs, and then merge them together. Refer to the code below:

In [6]:
df1 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3], "IND_GDP": [50, 45, 45, 67]}, index=[2001, 2002, 2003, 2004])

df2 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3], "IND_GDP": [50, 45, 45, 67]}, index=[2005, 2006, 2007, 2008])

# No "on" argument: pandas merges on every column the two frames share
merged = pd.merge(df1, df2)
 
print(merged)

   HPI  Int_Rate  IND_GDP
0   80         2       50
1   90         1       45
2   70         2       45
3   60         3       67


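As you can see above, the two data frames have merged into a single data frame. Because no key was specified, the merge used every shared column, and the original year index was replaced by a default 0 to 3 index. You can also specify the column which you want to make common. For example, I want the “HPI” column to be common and for everything else, I want separate columns; pandas tells the overlapping names apart with the suffixes “_x” and “_y”. So, let me implement that practically: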

In [7]:
df1 = pd.DataFrame({"HPI":[80,90,70,60],"Int_Rate":[2,1,2,3], "IND_GDP":[50,45,45,67]}, index=[2001, 2002,2003,2004])
 
df2 = pd.DataFrame({"HPI":[80,90,70,60],"Int_Rate":[2,1,2,3],"IND_GDP":[50,45,45,67]}, index=[2005, 2006,2007,2008])
 
merged = pd.merge(df1, df2, on="HPI")
 
print(merged)

   HPI  Int_Rate_x  IND_GDP_x  Int_Rate_y  IND_GDP_y
0   80           2         50           2         50
1   90           1         45           1         45
2   70           2         45           2         45
3   60           3         67           3         67
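The default “_x” and “_y” suffixes in the output above say little about where each column came from. As a small sketch, the `suffixes` parameter of `pd.merge` can replace them; the suffix strings `"_2001_04"` and `"_2005_08"` are made up here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]},
                   index=[2001, 2002, 2003, 2004])
df2 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]},
                   index=[2005, 2006, 2007, 2008])

# suffixes is applied only to overlapping column names that are not merge keys
merged_named = pd.merge(df1, df2, on="HPI",
                        suffixes=("_2001_04", "_2005_08"))

print(merged_named.columns.tolist())
```

The key column “HPI” keeps its name, and only the overlapping non-key columns receive a suffix.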


In [15]:
df1 = pd.DataFrame({"Int_Rate":[2,1,2,3], "IND_GDP":[50,45,45,67]}, index=[2001, 2002,2003,2004])
 
df2 = pd.DataFrame({"Low_Tier_HPI":[50,45,67,34],"Unemployment":[1,3,5,6]}, index=[2001, 2003,2004,2005])
 
# Left join on the index: keeps df1's labels; labels missing from df2 give NaN
joined = df1.join(df2)

print(joined)

      Int_Rate  IND_GDP  Low_Tier_HPI  Unemployment
2001         2       50          50.0           1.0
2002         1       45           NaN           NaN
2003         2       45          45.0           3.0
2004         3       67          67.0           5.0


In [14]:
df1 = pd.DataFrame({"Int_Rate":[2,1,2,3], "IND_GDP":[50,45,45,67]}, index=[2001, 2002,2003,2004])
 
df2 = pd.DataFrame({"Low_Tier_HPI":[50,45,67,34],"Unemployment":[1,3,5,6]}, index=[2001, 2003,2004,2005])
 
# Same join from df2's side: keeps df2's labels instead
joined = df2.join(df1)

print(joined)

      Low_Tier_HPI  Unemployment  Int_Rate  IND_GDP
2001            50             1       2.0     50.0
2003            45             3       2.0     45.0
2004            67             5       3.0     67.0
2005            34             6       NaN      NaN
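Both join calls above use the default `how="left"`, which is why each result keeps only the index labels of the frame that `join` was called on. The sketch below, using the same two frames, shows how the `how` parameter changes which labels survive; the names `outer` and `inner` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"Int_Rate": [2, 1, 2, 3], "IND_GDP": [50, 45, 45, 67]},
                   index=[2001, 2002, 2003, 2004])
df2 = pd.DataFrame({"Low_Tier_HPI": [50, 45, 67, 34],
                    "Unemployment": [1, 3, 5, 6]},
                   index=[2001, 2003, 2004, 2005])

# Union of both indexes: 2001 through 2005, with NaN where a frame has no row
outer = df1.join(df2, how="outer")

# Intersection of both indexes: only 2001, 2003 and 2004, so no NaN appears
inner = df1.join(df2, how="inner")

print(outer)
print(inner)
```

An inner join is useful when you only want years present in both tables, while an outer join keeps everything and leaves the gaps visible as NaN.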


# Concatenation 

Concatenation glues the dataframes together. You can select the axis along which you want to concatenate: rows (axis=0, the default) or columns (axis=1). For that, just use “pd.concat” and pass in the list of dataframes to concatenate together. Consider the example below.

In [16]:
df1 = pd.DataFrame({"HPI":[80,90,70,60],"Int_Rate":[2,1,2,3], "IND_GDP":[50,45,45,67]}, index=[2001, 2002,2003,2004])
 
df2 = pd.DataFrame({"HPI":[80,90,70,60],"Int_Rate":[2,1,2,3],"IND_GDP":[50,45,45,67]}, index=[2005, 2006,2007,2008])
 
# Default axis=0 stacks the rows of df2 below the rows of df1
concat = pd.concat([df1, df2])
 
print(concat)

      HPI  Int_Rate  IND_GDP
2001   80         2       50
2002   90         1       45
2003   70         2       45
2004   60         3       67
2005   80         2       50
2006   90         1       45
2007   70         2       45
2008   60         3       67


In [17]:
df1 = pd.DataFrame({"HPI":[80,90,70,60],"Int_Rate":[2,1,2,3], "IND_GDP":[50,45,45,67]}, index=[2001, 2002,2003,2004])
 
df2 = pd.DataFrame({"HPI":[80,90,70,60],"Int_Rate":[2,1,2,3],"IND_GDP":[50,45,45,67]}, index=[2005, 2006,2007,2008])
 
# axis=1 places the frames side by side; index labels missing from one frame give NaN
concat = pd.concat([df1, df2], axis=1)
 
print(concat)

       HPI  Int_Rate  IND_GDP   HPI  Int_Rate  IND_GDP
2001  80.0       2.0     50.0   NaN       NaN      NaN
2002  90.0       1.0     45.0   NaN       NaN      NaN
2003  70.0       2.0     45.0   NaN       NaN      NaN
2004  60.0       3.0     67.0   NaN       NaN      NaN
2005   NaN       NaN      NaN  80.0       2.0     50.0
2006   NaN       NaN      NaN  90.0       1.0     45.0
2007   NaN       NaN      NaN  70.0       2.0     45.0
2008   NaN       NaN      NaN  60.0       3.0     67.0
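Two optional parameters of `pd.concat` are often handy with results like the ones above. The sketch below assumes the same `df1` and `df2`; the key labels `"first"` and `"second"` are invented for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]},
                   index=[2001, 2002, 2003, 2004])
df2 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]},
                   index=[2005, 2006, 2007, 2008])

# ignore_index=True discards the year labels and renumbers the rows 0..7
stacked = pd.concat([df1, df2], ignore_index=True)

# keys adds an outer column level, so the duplicated column names
# from the axis=1 result can be told apart
labeled = pd.concat([df1, df2], axis=1, keys=["first", "second"])

print(stacked)
print(labeled["second"])
```

With `keys`, selecting `labeled["second"]` returns just the columns that came from `df2`.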


# Change the index

Next in this pandas tutorial, we’ll see how to change the index values in a dataframe. For example, let us create a dataframe from a dictionary of key-value pairs and then change the index values. Consider the example below:

In [25]:
import pandas as pd
 
df= pd.DataFrame({"Day":[1,2,3,4], "Visitors":[200, 100,230,300], "Bounce_Rate":[20,45,60,10]}) 
 
# inplace=True modifies df directly instead of returning a new dataframe
df.set_index("Day", inplace=True)
 
print(df)

     Visitors  Bounce_Rate
Day                       
1         200           20
2         100           45
3         230           60
4         300           10


As you can see in the output above, the “Day” column has become the index, replacing the default 0 to 3 integer index.
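If you would rather keep the original dataframe untouched, `set_index` can be called without `inplace=True`, and `reset_index` turns the index back into an ordinary column. This is a minimal sketch on the same data; the names `indexed` and `restored` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Day": [1, 2, 3, 4],
                   "Visitors": [200, 100, 230, 300],
                   "Bounce_Rate": [20, 45, 60, 10]})

# Without inplace=True, set_index returns a new dataframe and leaves df as it was
indexed = df.set_index("Day")

# Rows can now be looked up by their "Day" label
visitors_day_3 = indexed.loc[3, "Visitors"]

# reset_index moves "Day" back into a regular column with a default integer index
restored = indexed.reset_index()

print(indexed)
print(restored)
```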

# Change the Column Headers

Let us now change the column headers. Let us take the same example, where I will change the column header from “Visitors” to “Users”. So, let me implement it practically.

In [26]:
import pandas as pd
 
df = pd.DataFrame({"Day":[1,2,3,4], "Visitors":[200, 100,230,300], "Bounce_Rate":[20,45,60,10]})
 
df = df.rename(columns={"Visitors":"Users"})
 
print(df)

   Day  Users  Bounce_Rate
0    1    200           20
1    2    100           45
2    3    230           60
3    4    300           10


As you can see above, the column header “Visitors” has been changed to “Users”. Next in this pandas tutorial, let us perform data munging.