In [1]:
import pandas as pd

In [2]:
topfive_2004 = pd.read_csv('topfive_2004.csv', index_col='Athlete')
topfive_2008 = pd.read_csv('topfive_2008.csv', index_col='Athlete')

In [3]:
topfive_2004

Athlete,Gold,Silver,Bronze
"PHELPS, Michael",6.0,0.0,2.0
"PEIRSOL, Aaron",3.0,0.0,0.0
"THORPE, Ian",2.0,1.0,1.0
"KITAJIMA, Kosuke",2.0,0.0,1.0
"HACKETT, Grant",1.0,2.0,0.0


In [4]:
topfive_2008

Athlete,Gold,Silver,bronze
"PHELPS, Michael",8.0,0.0,0.0
"GREVERS, Matt",2.0,1.0,0.0
"PEIRSOL, Aaron",2.0,1.0,0.0
"LOCHTE, Ryan",2.0,0.0,2.0
"KITAJIMA, Kosuke",2.0,0.0,1.0


If we simply add the two DataFrames with `+` to total the Gold, Silver and Bronze medals, pandas aligns them on both the row index and the column labels. Only cells whose index-column pair exists in both DataFrames get a value; every other cell comes back as NaN.

In [5]:
topfive_2004 + topfive_2008

Athlete,Bronze,Gold,Silver,bronze
"GREVERS, Matt",,,,
"HACKETT, Grant",,,,
"KITAJIMA, Kosuke",,4.0,0.0,
"LOCHTE, Ryan",,,,
"PEIRSOL, Aaron",,5.0,1.0,
"PHELPS, Michael",,14.0,0.0,
"THORPE, Ian",,,,
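
To see where these NaNs come from, it can help to look at the two frames after alignment but before the addition. The sketch below is an illustration, not part of the original notebook: it builds two small stand-in frames inline (the names `df_2004` and `df_2008` are hypothetical) and uses `DataFrame.align` with an outer join, which reindexes both frames to the union of their labels.

```python
import pandas as pd

# Small stand-in frames (hypothetical names; values taken from the tables above).
df_2004 = pd.DataFrame(
    {'Gold': [6.0, 2.0], 'Silver': [0.0, 1.0]},
    index=pd.Index(['PHELPS, Michael', 'THORPE, Ian'], name='Athlete'),
)
df_2008 = pd.DataFrame(
    {'Gold': [8.0, 2.0], 'Silver': [0.0, 0.0]},
    index=pd.Index(['PHELPS, Michael', 'LOCHTE, Ryan'], name='Athlete'),
)

# Reindex both frames to the union of their row and column labels.
left, right = df_2004.align(df_2008, join='outer')

print(left)          # Lochte's row is all NaN: he is not in the 2004 frame
print(right)         # Thorpe's row is all NaN: he is not in the 2008 frame
print(left + right)  # NaN wherever either side is NaN
```

Only the athlete present in both frames ends up with numeric totals, which mirrors the result of `topfive_2004 + topfive_2008` above.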


#### `.add()`
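Does the same as `df1 + df2`, but also accepts a `fill_value` argument. A value that is missing from only one of the DataFrames is replaced by `fill_value` before the addition, which avoids most of the NaN results; a cell that is missing from both DataFrames is still NaN.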

In [7]:
topfive_2004.add(topfive_2008, fill_value = 0)

Athlete,Bronze,Gold,Silver,bronze
"GREVERS, Matt",,2.0,1.0,0.0
"HACKETT, Grant",0.0,1.0,2.0,
"KITAJIMA, Kosuke",1.0,4.0,0.0,1.0
"LOCHTE, Ryan",,2.0,0.0,2.0
"PEIRSOL, Aaron",0.0,5.0,1.0,0.0
"PHELPS, Michael",2.0,14.0,0.0,0.0
"THORPE, Ian",1.0,2.0,1.0,


The result above still contains some NaNs because the column names do not match exactly: the 2004 data has `Bronze`, while the 2008 data has `bronze`. Renaming the column first gives the expected result.
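
A quick way to confirm which labels differ is to compare the two column indexes directly. This is a minimal sketch, assuming two stand-in frames built inline (`df_2004` and `df_2008` are hypothetical names, not the notebook's variables); `Index.symmetric_difference` returns the labels that appear in only one of the two frames.

```python
import pandas as pd

# Stand-in frames with the same column mismatch as the medal tables.
df_2004 = pd.DataFrame(
    {'Gold': [6.0, 3.0], 'Silver': [0.0, 0.0], 'Bronze': [2.0, 0.0]},
    index=pd.Index(['PHELPS, Michael', 'PEIRSOL, Aaron'], name='Athlete'),
)
df_2008 = pd.DataFrame(
    {'Gold': [8.0, 2.0], 'Silver': [0.0, 1.0], 'bronze': [0.0, 0.0]},
    index=pd.Index(['PHELPS, Michael', 'PEIRSOL, Aaron'], name='Athlete'),
)

# Column labels present in exactly one of the two frames.
mismatched = df_2004.columns.symmetric_difference(df_2008.columns)
print(list(mismatched))
```

An empty result would mean the column labels already line up.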

In [8]:
topfive_2008.rename(columns = {'bronze':'Bronze'}, inplace=True)

In [9]:
topfive_2004.add(topfive_2008, fill_value = 0)

Athlete,Gold,Silver,Bronze
"GREVERS, Matt",2.0,1.0,0.0
"HACKETT, Grant",1.0,2.0,0.0
"KITAJIMA, Kosuke",4.0,0.0,2.0
"LOCHTE, Ryan",2.0,0.0,2.0
"PEIRSOL, Aaron",5.0,1.0,0.0
"PHELPS, Michael",14.0,0.0,2.0
"THORPE, Ian",2.0,1.0,1.0


Similarly, the `.sub()` method subtracts one DataFrame from another, and `.mul()` and `.div()` are available for multiplication and division.

In [10]:
topfive_2004.sub(topfive_2008, fill_value = 0)

Athlete,Gold,Silver,Bronze
"GREVERS, Matt",-2.0,-1.0,0.0
"HACKETT, Grant",1.0,2.0,0.0
"KITAJIMA, Kosuke",0.0,0.0,0.0
"LOCHTE, Ryan",-2.0,0.0,-2.0
"PEIRSOL, Aaron",1.0,-1.0,0.0
"PHELPS, Michael",-2.0,0.0,2.0
"THORPE, Ian",2.0,1.0,1.0
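
`.mul()` and `.div()` follow the same pattern as `.add()` and `.sub()`. The sketch below is only an illustration on two small stand-in frames (`gold_2004` and `gold_2008` are hypothetical names; the values are the Gold counts from the tables above), chosen so that both frames share the same labels and no divisor is zero.

```python
import pandas as pd

athletes = pd.Index(['PHELPS, Michael', 'PEIRSOL, Aaron'], name='Athlete')

# Gold medal counts for two athletes who appear in both years.
gold_2004 = pd.DataFrame({'Gold': [6.0, 3.0]}, index=athletes)
gold_2008 = pd.DataFrame({'Gold': [8.0, 2.0]}, index=athletes)

# Element-wise product and quotient, aligned on index and columns.
product = gold_2004.mul(gold_2008)
ratio = gold_2004.div(gold_2008)

print(product)
print(ratio)
```

With `.div()`, a `fill_value` of 0 would put zeros in the divisor for athletes missing from the second frame, so the fill value needs more thought there than it does for addition.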
