In [1]:
# import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

## Appending pandas Series
In this exercise, you'll load sales data from the months January, February, and March into DataFrames. Then, you'll extract Series with the **`'Units'`** column from each and append them together with method chaining using **`.append()`**.

To check that the stacking worked, you'll print slices from these Series. Finally, you'll sum the result to find the total units sold in the first quarter.
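
The workflow can be sketched on a few made-up rows. The values and the `*_demo` names below are invented for illustration. `pd.concat()` stands in for the chained `.append()` calls because `Series.append()` was deprecated in pandas 1.4 and removed in pandas 2.0. Sorting the index first keeps label-based date slicing well defined.

```python
import pandas as pd

# Synthetic stand-ins for the monthly 'Units' Series (values are made up)
jan_demo = pd.Series([18, 5], name='Units',
                     index=pd.to_datetime(['2015-01-27 07:11:55', '2015-01-03 18:00:19']))
feb_demo = pd.Series([3, 9], name='Units',
                     index=pd.to_datetime(['2015-02-02 08:33:01', '2015-02-02 20:54:49']))
mar_demo = pd.Series([17], name='Units',
                     index=pd.to_datetime(['2015-03-06 10:11:45']))

# Stack the three Series end to end, then sort by timestamp
quarter_demo = pd.concat([jan_demo, feb_demo, mar_demo]).sort_index()

# Date strings select every timestamp in the range, including the whole end day
jan_feb_slice = quarter_demo.loc['2015-01-27':'2015-02-02']
print(jan_feb_slice)

# Total units across all three months
total_units = quarter_demo.sum()
print(total_units)
```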

In [2]:
# Load 'sales-jan-2015.csv' into a DataFrame: jan
jan = pd.read_csv('D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Sales/sales-jan-2015.csv', 
                  parse_dates=True, index_col='Date')

# Load 'sales-feb-2015.csv' into a DataFrame: feb
feb = pd.read_csv('D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Sales/sales-feb-2015.csv', 
                  parse_dates=True, index_col='Date')

# Load 'sales-mar-2015.csv' into a DataFrame: mar
mar = pd.read_csv('D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Sales/sales-mar-2015.csv', 
                  parse_dates=True, index_col='Date')

In [3]:
# Extract the 'Units' column from jan: jan_units
jan_units = jan['Units']

# Extract the 'Units' column from feb: feb_units
feb_units = feb['Units']

# Extract the 'Units' column from mar: mar_units
mar_units = mar['Units']

In [4]:
# Append feb_units and then mar_units to jan_units: quarter1
quarter1 = jan_units.append(feb_units).append(mar_units)

# Print the first slice from quarter1
print(quarter1.loc['jan 27, 2015':'feb 2, 2015'])

# Print the second slice from quarter1
print(quarter1.loc['feb 26, 2015':'mar 7, 2015'])

Date
2015-01-27 07:11:55    18
2015-02-02 08:33:01     3
2015-02-02 20:54:49     9
Name: Units, dtype: int64
Date
2015-02-26 08:57:45     4
2015-02-26 08:58:51     1
2015-03-06 10:11:45    17
2015-03-06 02:03:56    17
Name: Units, dtype: int64


In [5]:
# Compute & print total sales in quarter1
print(quarter1.sum())

642

## Concatenating pandas Series along row axis
Having learned how to append Series, you'll now learn how to achieve the same result by concatenating Series instead. You'll continue to work with the sales data you've seen previously.

Your job is to use **`pd.concat()`** with a list of Series to achieve the same result that you would get by chaining calls to **`.append()`**.

You may be wondering about the difference between **`pd.concat()`** and pandas' **`.append()`** method. One way to think of the difference is that **`.append()`** is a specific case of a concatenation, while **`pd.concat()`** gives you more flexibility, as you'll see in later exercises.
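
One way to see the extra flexibility is that the same list of Series can be stacked as rows or laid side by side as columns. The sketch below uses made-up Series; the names `s1` and `s2` are illustrative only.

```python
import pandas as pd

s1 = pd.Series([1, 2], index=['a', 'b'], name='first')
s2 = pd.Series([3, 4], index=['b', 'c'], name='second')

# Default axis=0: rows are stacked end to end, as chained appends would do
stacked = pd.concat([s1, s2])

# axis=1: each Series becomes a column, aligned on the union of the labels
side_by_side = pd.concat([s1, s2], axis=1)

print(stacked)
print(side_by_side)
```

Labels that appear in only one Series show up as NaN in the column built from the other.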

In [6]:
# Initialize empty list: units
units = []

# Build the list of Series
for month in [jan, feb, mar]:
    units.append(month.Units)
units

[Date
 2015-01-21 19:13:21    11
 2015-01-09 05:23:51     8
 2015-01-06 17:19:34    17
 2015-01-02 09:51:06    16
 2015-01-11 14:51:02    11
 2015-01-01 07:31:20    18
 2015-01-24 08:01:16     1
 2015-01-25 15:40:07     6
 2015-01-13 05:36:12     7
 2015-01-03 18:00:19    19
 2015-01-16 00:33:47    17
 2015-01-16 07:21:12    13
 2015-01-20 19:49:24    12
 2015-01-26 01:50:25    14
 2015-01-15 02:38:25    16
 2015-01-06 13:47:37    16
 2015-01-15 15:33:40     7
 2015-01-27 07:11:55    18
 2015-01-20 11:28:02    13
 2015-01-16 19:20:46     8
 Name: Units, dtype: int64, Date
 2015-02-26 08:57:45     4
 2015-02-16 12:09:19    10
 2015-02-03 14:14:18    13
 2015-02-02 08:33:01     3
 2015-02-25 00:29:00    10
 2015-02-05 01:53:06    19
 2015-02-09 08:57:30    19
 2015-02-11 20:03:08     7
 2015-02-04 21:52:45    14
 2015-02-09 13:09:55     7
 2015-02-07 22:58:10     1
 2015-02-11 22:50:44     4
 2015-02-26 08:58:51     1
 2015-02-05 22:05:03    10
 2015-02-04 15:36:29    13
 2015-02-19 16:0

In [7]:
# Concatenate the list: quarter1
quarter1 = pd.concat(units, axis='rows')
quarter1

Date
2015-01-21 19:13:21    11
2015-01-09 05:23:51     8
2015-01-06 17:19:34    17
2015-01-02 09:51:06    16
2015-01-11 14:51:02    11
2015-01-01 07:31:20    18
2015-01-24 08:01:16     1
2015-01-25 15:40:07     6
2015-01-13 05:36:12     7
2015-01-03 18:00:19    19
2015-01-16 00:33:47    17
2015-01-16 07:21:12    13
2015-01-20 19:49:24    12
2015-01-26 01:50:25    14
2015-01-15 02:38:25    16
2015-01-06 13:47:37    16
2015-01-15 15:33:40     7
2015-01-27 07:11:55    18
2015-01-20 11:28:02    13
2015-01-16 19:20:46     8
2015-02-26 08:57:45     4
2015-02-16 12:09:19    10
2015-02-03 14:14:18    13
2015-02-02 08:33:01     3
2015-02-25 00:29:00    10
2015-02-05 01:53:06    19
2015-02-09 08:57:30    19
2015-02-11 20:03:08     7
2015-02-04 21:52:45    14
2015-02-09 13:09:55     7
2015-02-07 22:58:10     1
2015-02-11 22:50:44     4
2015-02-26 08:58:51     1
2015-02-05 22:05:03    10
2015-02-04 15:36:29    13
2015-02-19 16:02:58    10
2015-02-19 10:59:33    16
2015-02-02 20:54:49     9
2015-02

In [8]:
# Print slices from quarter1
print(quarter1.loc['jan 27, 2015':'feb 2, 2015'])
print(quarter1.loc['feb 26, 2015':'mar 7, 2015'])

Date
2015-01-27 07:11:55    18
2015-02-02 08:33:01     3
2015-02-02 20:54:49     9
Name: Units, dtype: int64
Date
2015-02-26 08:57:45     4
2015-02-26 08:58:51     1
2015-03-06 10:11:45    17
2015-03-06 02:03:56    17
Name: Units, dtype: int64


## Appending DataFrames with ignore_index
In this exercise, you'll use the Baby Names Dataset (from data.gov) again. This time, **`names_1981`** and **`names_1881`** are loaded without specifying an index column, so both get a default RangeIndex.

You'll use the DataFrame **`.append()`** method to make a DataFrame **`combined_names`**. To distinguish rows from the original two DataFrames, you'll add a **`'year'`** column to each with the year (1881 or 1981 in this case). In addition, you'll specify **`ignore_index=True`** so that the index values are not used along the concatenation axis. The resulting axis will instead be labeled **`0, 1, ..., n-1`**, which is useful if you are concatenating objects where the concatenation axis does not have meaningful indexing information.
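
The effect of **`ignore_index=True`** can be sketched on two tiny tables. The frames `early` and `late` are illustrative stand-ins, and `pd.concat()` is used here because `DataFrame.append()` was removed in pandas 2.0; both accept the same `ignore_index` argument.

```python
import pandas as pd

# Tiny stand-ins for the two baby-name tables
early = pd.DataFrame({'name': ['Mary', 'Anna'], 'count': [6919, 2698]})
late = pd.DataFrame({'name': ['Jennifer', 'Jessica'], 'count': [57032, 42519]})
early['year'] = 1881
late['year'] = 1981

# Without ignore_index the original labels repeat: 0, 1, 0, 1
kept = pd.concat([early, late])

# With ignore_index=True the rows are relabeled 0, 1, 2, 3
relabeled = pd.concat([early, late], ignore_index=True)

print(kept.index.tolist())
print(relabeled.index.tolist())
```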

In [9]:
# Import the data files
names_1881 = pd.read_csv('D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Baby names/names1881.csv', 
                         header=None, names=['name', 'gender', 'count'])
names_1981 = pd.read_csv('D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Baby names/names1981.csv', 
                         header=None, names=['name', 'gender', 'count'])
print(names_1881.head())
print(names_1981.head())

        name gender  count
0       Mary      F   6919
1       Anna      F   2698
2       Emma      F   2034
3  Elizabeth      F   1852
4   Margaret      F   1658
       name gender  count
0  Jennifer      F  57032
1   Jessica      F  42519
2    Amanda      F  34370
3     Sarah      F  28162
4   Melissa      F  28003


In [10]:
# Add 'year' column to names_1881 and names_1981
names_1881['year'] = 1881
names_1981['year'] = 1981

print(names_1881.head(3))
print(names_1981.head(2))

   name gender  count  year
0  Mary      F   6919  1881
1  Anna      F   2698  1881
2  Emma      F   2034  1881
       name gender  count  year
0  Jennifer      F  57032  1981
1   Jessica      F  42519  1981


In [11]:
# Append names_1981 after names_1881 with ignore_index=True: combined_names
combined_names = names_1881.append(names_1981, ignore_index=True)

# Print shapes of names_1981, names_1881, and combined_names
print(names_1981.shape)
print(names_1881.shape)
print(combined_names.shape)

(19455, 4)
(1935, 4)
(21390, 4)


In [12]:
# Print all rows that contain the name 'Morgan'
combined_names.loc[combined_names.name == 'Morgan', :]

         name gender  count  year
1283   Morgan      M     23  1881
2096   Morgan      F   1769  1981
14390  Morgan      M    766  1981


## Concatenating pandas DataFrames along column axis
The function **`pd.concat()`** can concatenate DataFrames horizontally as well as vertically (vertical is the default). To make the DataFrames stack horizontally, you have to specify the keyword argument **`axis=1`** or **`axis='columns'`**.

In this exercise, you'll use weather data with maximum and mean daily temperatures sampled at different rates (quarterly versus monthly). You'll concatenate the two DataFrames horizontally and see that, where rows are missing in the coarser DataFrame, null values are inserted in the concatenated DataFrame. This corresponds to an *outer join* (which you will explore in more detail in later exercises).
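
A minimal sketch of that alignment, using made-up readings rather than the weather data (the names `monthly` and `sparse` are illustrative):

```python
import pandas as pd

# Made-up readings: one table has every month, the other only some of them
monthly = pd.DataFrame({'mean_temp': [30.0, 35.0, 50.0]}, index=['Jan', 'Feb', 'Mar'])
sparse = pd.DataFrame({'max_temp': [60.0, 75.0]}, index=['Jan', 'Mar'])

# Horizontal concatenation aligns on the row labels (an outer join by default)
combined = pd.concat([sparse, monthly], axis='columns')

# 'Feb' exists only in the monthly table, so its max_temp is NaN
print(combined)
```

The row order of the result can differ between pandas versions, since the default for the `sort` argument has changed over time.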

In [13]:
# Create the weather_max and weather_mean Dataframes

weather_max = pd.DataFrame({'Max TemperatureF': [68,89,91,84]}, index= ['Jan', 'Apr', 'Jul', 'Oct'] )
weather_max.index.name = 'Month'
weather_mean = pd.DataFrame({'Mean TemperatureF':[32.354839, 28.714286,35.0,53.1,62.612903,70.133333,72.870968,
                                                  70.0,63.766667,55.451613,39.8,34.935484]}, 
                            index=['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'])
weather_mean.index.name = 'Month'


In [14]:
# Concatenate weather_max and weather_mean horizontally: weather
weather = pd.concat([weather_max,weather_mean], axis='columns') # or axis=1

# Print weather
weather

     Max TemperatureF  Mean TemperatureF
Apr              89.0          53.100000
Aug               NaN          70.000000
Dec               NaN          34.935484
Feb               NaN          28.714286
Jan              68.0          32.354839
Jul              91.0          72.870968
Jun               NaN          70.133333
Mar               NaN          35.000000
May               NaN          62.612903
Nov               NaN          39.800000


## Reading multiple files to build a DataFrame
It is often convenient to build a large DataFrame by parsing many files as DataFrames and concatenating them all at once. You'll do this here with three files, but, in principle, this approach can be used to combine data from dozens or hundreds of files.
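
The pattern can be sketched without touching the disk. Here in-memory text stands in for the files, and the file names and contents in `fake_files` are made up for illustration.

```python
import io
import pandas as pd

# In-memory text stands in for files on disk; names and contents are made up
fake_files = {
    'a.csv': "Country,Total\nFrance,1\nItaly,2\n",
    'b.csv': "Country,Total\nFrance,3\nSpain,4\n",
}

# Parse each source into a DataFrame, renaming the value column after its source
frames = []
for label, text in fake_files.items():
    frame = pd.read_csv(io.StringIO(text), header=0,
                        index_col='Country', names=['Country', label])
    frames.append(frame)

# Concatenate once, after the loop, rather than growing a DataFrame step by step
combined_files = pd.concat(frames, axis='columns')
print(combined_files)
```

Collecting the pieces in a list and calling `pd.concat()` a single time avoids copying the accumulated data on every iteration.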

In [16]:
'''
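DataCamp version, kept for reference. It uses bare file names and relies on medal_types and medals
being defined beforehand, which would explain why it ran on DataCamp but not in this notebook.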
for medal in medal_types:

    # Create the file name: file_name
    file_name = "%s_top5.csv" % medal
    
    # Create list of column names: columns
    columns = ['Country', medal]
    
    # Read file_name into a DataFrame: df
    medal_df = pd.read_csv(file_name, header=0, index_col='Country', names=columns)

    # Append medal_df to medals
    medals.append(medal_df)

# Concatenate medals horizontally: medals
medals = pd.concat(medals, axis='columns')

# Print medals
print(medals)
'''
medal_types = ['bronze','silver','gold']
medals = []
for medal in medal_types:

    # Create the file name: file_name
    file_name = 'D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Summer Olympic medals/%s_top5.csv' % medal
    
    # Create list of column names: columns
    columns = ['Country', medal]
    
    # Read file_name into a DataFrame: medal_df
    medal_df = pd.read_csv(file_name, header=0, index_col='Country', names=columns)

    # Append medal_df to medals
    medals.append(medal_df)

# Concatenate medals horizontally: medals
medals = pd.concat(medals, axis='columns')

# Print medals
print(medals)

                bronze  silver    gold
France           475.0   461.0     NaN
Germany          454.0     NaN   407.0
Italy              NaN   394.0   460.0
Soviet Union     584.0   627.0   838.0
United Kingdom   505.0   591.0   498.0
United States   1052.0  1195.0  2088.0


## Concatenating vertically to get MultiIndexed rows
When stacking a sequence of DataFrames vertically, it is sometimes desirable to construct a MultiIndex to indicate the DataFrame from which each row originated. 

This can be done by specifying the **`keys`** parameter in the call to **`pd.concat()`**, which generates a hierarchical index with the labels from **`keys`** as the outermost index level. This means you don't have to rename the columns of each DataFrame as you load it. Instead, only the Index column needs to be specified.
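
A small sketch of the idea, using two made-up tables that share a column name (the frames `first` and `second` are illustrative):

```python
import pandas as pd

# Two small made-up tables that share a column name
first = pd.DataFrame({'Total': [10, 20]}, index=pd.Index(['A', 'B'], name='Country'))
second = pd.DataFrame({'Total': [30, 40]}, index=pd.Index(['B', 'C'], name='Country'))

# keys adds an outer index level recording which table each row came from
labeled = pd.concat([first, second], keys=['first', 'second'])

print(labeled)

# A (key, label) tuple addresses one row unambiguously, even though 'B' appears twice
print(labeled.loc[('second', 'B'), 'Total'])
```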

Here, you'll continue working with DataFrames compiled from The Guardian's Olympic medal dataset.

In [17]:
'''
DataCamp code (that executed):
for medal in medal_types:

    file_name = "%s_top5.csv" % medal
    
    # Read file_name into a DataFrame: medal_df
    medal_df = pd.read_csv(file_name, index_col='Country')
    
    # Append medal_df to medals
    medals.append(medal_df)
    
# Concatenate medals: medals
medals = pd.concat(medals, keys=['bronze','silver','gold'])

# Print medals in entirety
print(medals)'''

# Local version using full file paths
medal_types = ['bronze','silver','gold']
medals = []
for medal in medal_types:

    # Create the file name: file_name
    file_name = 'D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Summer Olympic medals/%s_top5.csv' % medal
    
    # Read file_name into a DataFrame: medal_df
    medal_df = pd.read_csv(file_name, index_col='Country')
    
    # Append medal_df to medals
    medals.append(medal_df)
    
# Concatenate medals: medals
medals = pd.concat(medals, keys=['bronze','silver','gold'])

# Print medals in entirety
print(medals)


                        Total
       Country               
bronze United States   1052.0
       Soviet Union     584.0
       United Kingdom   505.0
       France           475.0
       Germany          454.0
silver United States   1195.0
       Soviet Union     627.0
       United Kingdom   591.0
       France           461.0
       Italy            394.0
gold   United States   2088.0
       Soviet Union     838.0
       United Kingdom   498.0
       Italy            460.0
       Germany          407.0


## Slicing MultiIndexed DataFrames
This exercise picks up where the last ended (again using The Guardian's Olympic medal dataset).

You are provided with the MultiIndexed DataFrame produced at the end of the preceding exercise. Your task is to sort the DataFrame and to use **`pd.IndexSlice`** to extract specific slices.
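
The sort-then-slice pattern can be sketched on a made-up table; the labels and totals below are invented for illustration.

```python
import pandas as pd

# Made-up MultiIndexed table: outer level is the medal, inner level the country
index = pd.MultiIndex.from_tuples(
    [('silver', 'X'), ('bronze', 'Y'), ('gold', 'X'), ('bronze', 'X')],
    names=['medal', 'Country'],
)
table = pd.DataFrame({'Total': [5, 7, 9, 11]}, index=index)

# Sort the index before slicing on the MultiIndex
table_sorted = table.sort_index()

idx = pd.IndexSlice

# Every outer label (:) combined with the inner label 'X'
only_x = table_sorted.loc[idx[:, 'X'], :]
print(only_x)
```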

In [18]:
'''
This should work when we get the 'medals' situation figured out
'''
# Sort the entries of medals: medals_sorted
medals_sorted = medals.sort_index(level=0)

# Print the number of Bronze medals won by Germany
print(medals_sorted.loc[('bronze','Germany')])

# Print data about silver medals
print(medals_sorted.loc['silver'])

# Create alias for pd.IndexSlice: idx
idx = pd.IndexSlice

# Print all the data on medals won by the United Kingdom
print(medals_sorted.loc[idx[:,'United Kingdom'], :])

Total    454.0
Name: (bronze, Germany), dtype: float64
                 Total
Country               
France           461.0
Italy            394.0
Soviet Union     627.0
United Kingdom   591.0
United States   1195.0
                       Total
       Country              
bronze United Kingdom  505.0
gold   United Kingdom  498.0
silver United Kingdom  591.0


## Concatenating horizontally to get MultiIndexed columns
It is also possible to construct a DataFrame with hierarchically indexed columns. For this exercise, you'll start with pandas imported and a list of three DataFrames called **`dataframes`**. 

All three DataFrames contain **`'Company'`**, **`'Product'`**, and **`'Units'`** columns, with a **`'Date'`** column as the index, pertaining to sales transactions during February 2015. The first DataFrame describes **Hardware** transactions, the second describes **Software** transactions, and the third, **Service** transactions.

Your task is to concatenate the DataFrames horizontally and to create a MultiIndex on the columns. From there, you can summarize the resulting DataFrame and slice some information from it.
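
As a rough sketch of the same steps on made-up data (the frames `hw` and `sw` and their values are illustrative only):

```python
import pandas as pd

dates = pd.to_datetime(['2015-02-02', '2015-02-05', '2015-02-10'])
hw = pd.DataFrame({'Company': ['A', 'B'], 'Units': [1, 2]}, index=dates[:2])
sw = pd.DataFrame({'Company': ['C', 'D'], 'Units': [3, 4]}, index=dates[1:])

# axis=1 with keys puts the keys on the outer level of the columns
wide = pd.concat([hw, sw], axis=1, keys=['Hardware', 'Software'])

idx = pd.IndexSlice

# Rows from Feb 2 through Feb 8, and the 'Company' column under every outer key
companies = wide.loc['2015-02-02':'2015-02-08', idx[:, 'Company']]
print(companies)
```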

In [19]:
# Create the list of file names: filenames
filenames = ['D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Sales/feb-sales-Hardware.csv',
             'D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Sales/feb-sales-Software.csv',
             'D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Sales/feb-sales-Service.csv']
# Create the list of three DataFrames: dataframes
dataframes = []
for filename in filenames:
    dataframes.append(pd.read_csv(filename, index_col='Date', parse_dates=True))
    
dataframes

[                             Company   Product  Units
 Date                                                 
 2015-02-04 21:52:45  Acme Coporation  Hardware     14
 2015-02-07 22:58:10  Acme Coporation  Hardware      1
 2015-02-19 10:59:33        Mediacore  Hardware     16
 2015-02-02 20:54:49        Mediacore  Hardware      9
 2015-02-21 20:41:47            Hooli  Hardware      3,
                              Company   Product  Units
 Date                                                 
 2015-02-16 12:09:19            Hooli  Software     10
 2015-02-03 14:14:18          Initech  Software     13
 2015-02-02 08:33:01            Hooli  Software      3
 2015-02-05 01:53:06  Acme Coporation  Software     19
 2015-02-11 20:03:08          Initech  Software      7
 2015-02-09 13:09:55        Mediacore  Software      7
 2015-02-11 22:50:44            Hooli  Software      4
 2015-02-04 15:36:29        Streeplex  Software     13
 2015-02-21 05:01:26        Mediacore  Software      3,
        

In [20]:
# Concatenate dataframes: february
february = pd.concat(dataframes, axis=1, keys=['Hardware','Software','Service'])
february.head()

                            Hardware                    Software                   Service
                             Company   Product  Units    Company   Product  Units  Company  Product  Units
Date
2015-02-02 08:33:01              NaN       NaN    NaN      Hooli  Software    3.0      NaN      NaN    NaN
2015-02-02 20:54:49        Mediacore  Hardware    9.0        NaN       NaN    NaN      NaN      NaN    NaN
2015-02-03 14:14:18              NaN       NaN    NaN    Initech  Software   13.0      NaN      NaN    NaN
2015-02-04 15:36:29              NaN       NaN    NaN  Streeplex  Software   13.0      NaN      NaN    NaN
2015-02-04 21:52:45  Acme Coporation  Hardware   14.0        NaN       NaN    NaN      NaN      NaN    NaN


In [21]:
february.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 20 entries, 2015-02-02 08:33:01 to 2015-02-26 08:58:51
Data columns (total 9 columns):
(Hardware, Company)    5 non-null object
(Hardware, Product)    5 non-null object
(Hardware, Units)      5 non-null float64
(Software, Company)    9 non-null object
(Software, Product)    9 non-null object
(Software, Units)      9 non-null float64
(Service, Company)     6 non-null object
(Service, Product)     6 non-null object
(Service, Units)       6 non-null float64
dtypes: float64(3), object(6)
memory usage: 1.6+ KB


In [22]:
# Assign pd.IndexSlice: idx
idx = pd.IndexSlice

# Create the slice: slice_2_8
slice_2_8 = february.loc['2015-02-02':'2015-02-08', idx[:, 'Company']]

# Print slice_2_8
slice_2_8

                            Hardware         Software  Service
                             Company          Company  Company
Date
2015-02-02 08:33:01              NaN            Hooli      NaN
2015-02-02 20:54:49        Mediacore              NaN      NaN
2015-02-03 14:14:18              NaN          Initech      NaN
2015-02-04 15:36:29              NaN        Streeplex      NaN
2015-02-04 21:52:45  Acme Coporation              NaN      NaN
2015-02-05 01:53:06              NaN  Acme Coporation      NaN
2015-02-05 22:05:03              NaN              NaN    Hooli
2015-02-07 22:58:10  Acme Coporation              NaN      NaN


## Concatenating DataFrames from a dict
You're now going to revisit the sales data you worked with earlier in the chapter. Three DataFrames, **`jan`**, **`feb`**, and **`mar`**, have been pre-loaded for you. Your task is to aggregate the sum of all sales over the **`'Company'`** column into a single DataFrame. You'll do this by constructing a dictionary of these DataFrames and then concatenating them.
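
A minimal sketch of the dict-based approach, using made-up tables (`jan_small` and `feb_small` are illustrative names). Selecting `[['Units']]` before summing is an assumption made here so that only the numeric column is aggregated.

```python
import pandas as pd

# Made-up monthly tables with a text column and a numeric column
jan_small = pd.DataFrame({'Company': ['A', 'B', 'A'], 'Units': [1, 2, 3]})
feb_small = pd.DataFrame({'Company': ['B', 'B'], 'Units': [4, 5]})

# Sum the units per company for each month, keyed by month name
per_month = {
    'january': jan_small.groupby('Company')[['Units']].sum(),
    'february': feb_small.groupby('Company')[['Units']].sum(),
}

# Passing a dict to pd.concat uses its keys as the outer index level
totals = pd.concat(per_month)
print(totals)
```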


In [23]:
print(jan.head())
print(feb.head())
print(mar.head())

                       Company   Product  Units
Date                                           
2015-01-21 19:13:21  Streeplex  Hardware     11
2015-01-09 05:23:51  Streeplex   Service      8
2015-01-06 17:19:34    Initech  Hardware     17
2015-01-02 09:51:06      Hooli  Hardware     16
2015-01-11 14:51:02      Hooli  Hardware     11
                       Company   Product  Units
Date                                           
2015-02-26 08:57:45  Streeplex   Service      4
2015-02-16 12:09:19      Hooli  Software     10
2015-02-03 14:14:18    Initech  Software     13
2015-02-02 08:33:01      Hooli  Software      3
2015-02-25 00:29:00    Initech   Service     10
                       Company   Product  Units
Date                                           
2015-03-22 14:42:25  Mediacore  Software      6
2015-03-12 18:33:06    Initech   Service     19
2015-03-22 03:58:28  Streeplex  Software      8
2015-03-15 00:53:12      Hooli  Hardware     19
2015-03-17 19:25:37      Hooli  Hardware

In [24]:
# Make the list of tuples: month_list
month_list = [('january', jan),('february', feb), ('march', mar)]
month_list

[('january',                              Company   Product  Units
  Date                                                 
  2015-01-21 19:13:21        Streeplex  Hardware     11
  2015-01-09 05:23:51        Streeplex   Service      8
  2015-01-06 17:19:34          Initech  Hardware     17
  2015-01-02 09:51:06            Hooli  Hardware     16
  2015-01-11 14:51:02            Hooli  Hardware     11
  2015-01-01 07:31:20  Acme Coporation  Software     18
  2015-01-24 08:01:16          Initech  Software      1
  2015-01-25 15:40:07          Initech   Service      6
  2015-01-13 05:36:12            Hooli   Service      7
  2015-01-03 18:00:19            Hooli   Service     19
  2015-01-16 00:33:47            Hooli  Hardware     17
  2015-01-16 07:21:12          Initech   Service     13
  2015-01-20 19:49:24  Acme Coporation  Hardware     12
  2015-01-26 01:50:25  Acme Coporation  Software     14
  2015-01-15 02:38:25  Acme Coporation   Service     16
  2015-01-06 13:47:37  Acme Coporatio

In [25]:
# Create an empty dictionary: month_dict
month_dict = {}

for month_name, month_data in month_list:

    # Group month_data: month_dict[month_name]
    month_dict[month_name] = month_data.groupby('Company')[['Units']].sum()

# Concatenate data in month_dict: sales
sales = pd.concat(month_dict)

# Print sales
sales

                          Units
         Company
february Acme Coporation     34
         Hooli               30
         Initech             30
         Mediacore           45
         Streeplex           37
january  Acme Coporation     76
         Hooli               70
         Initech             37
         Mediacore           15
         Streeplex           50


In [26]:
# Print all sales by Mediacore
idx = pd.IndexSlice
sales.loc[idx[:, 'Mediacore'], :]

                    Units
         Company
february Mediacore     45
january  Mediacore     15
march    Mediacore     68


## Concatenating DataFrames with inner join
Here, you'll continue working with DataFrames compiled from **The Guardian's Olympic medal dataset.**

The DataFrames **`bronze`**, **`silver`**, and **`gold`** have been pre-loaded for you.

Your task is to compute an inner join.
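
An inner join keeps only the row labels that every input shares. A minimal sketch with made-up tables (`bronze_demo` and `gold_demo` are illustrative names and values):

```python
import pandas as pd

# Made-up medal tables; only 'US' and 'UK' appear in both
bronze_demo = pd.DataFrame({'Total': [10, 20, 30]}, index=['US', 'UK', 'FR'])
gold_demo = pd.DataFrame({'Total': [40, 50, 60]}, index=['US', 'IT', 'UK'])

# join='inner' drops any label that is missing from at least one input
shared = pd.concat([bronze_demo, gold_demo], axis=1, join='inner',
                   keys=['bronze', 'gold'])
print(shared)
```

Because no label is missing from either side, the result contains no NaN values.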

In [31]:
#Import data
bronze = pd.read_csv('D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Summer Olympic medals/bronze_top5.csv', index_col='Country')
silver = pd.read_csv('D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Summer Olympic medals/silver_top5.csv', index_col='Country')
gold = pd.read_csv('D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/Summer Olympic medals/gold_top5.csv', index_col='Country')

In [32]:
# Create the list of DataFrames: medal_list
medal_list = [bronze, silver, gold]

# Concatenate medal_list horizontally using an inner join: medals
medals = pd.concat(medal_list, axis=1, join='inner', keys=['bronze','silver','gold'])

# Print medals
medals

Unnamed: 0_level_0,bronze,silver,gold
Unnamed: 0_level_1,Total,Total,Total
Country,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2
United States,1052.0,1195.0,2088.0
Soviet Union,584.0,627.0,838.0
United Kingdom,505.0,591.0,498.0


## Resampling & concatenating DataFrames with inner join
In this exercise, you'll compare the historical 10-year GDP (Gross Domestic Product) growth in the US and in China. The data for the US starts in 1947 and is recorded quarterly; by contrast, the data for China starts in 1960 and is recorded annually.

You'll need to use a combination of resampling and an inner join to align the index labels. You'll need an appropriate offset alias for resampling, and the method **`.resample()`** must be chained with an aggregation method (**`.last()`** in this case), after which **`.pct_change()`** converts the resampled values into growth rates.
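
The combination can be sketched on made-up numbers. The figures below are not real GDP data, and the yearly `'YS'` (year start) alias is an assumption chosen to keep the example small; the exercise itself works with 10-year growth.

```python
import pandas as pd

# Made-up quarterly and annual series; real GDP figures are not used here
quarterly = pd.DataFrame(
    {'US': [100.0, 110.0, 120.0, 130.0, 140.0, 150.0, 160.0, 170.0]},
    index=pd.date_range('2000-01-01', periods=8, freq='QS'),
)
annual = pd.DataFrame(
    {'China': [50.0, 75.0]},
    index=pd.to_datetime(['2000-01-01', '2001-01-01']),
)

# Downsample to one row per year (last quarter's value), then compute growth
us_growth = quarterly.resample('YS').last().pct_change().dropna()

# The annual series is already yearly, so only the growth step is needed
china_growth = annual.pct_change().dropna()

# The inner join keeps only the dates present in both growth tables
growth = pd.concat([us_growth, china_growth], axis=1, join='inner')
print(growth)
```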

In [42]:
# Import the datasets
us = pd.read_csv('D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/GDP/gdp_usa.csv', 
                 index_col='Year', parse_dates=True, header=0, names=['Year', 'US'])
china = pd.read_csv('D:/Springboard_DataCamp/data/Merging_DataFrames_with_Pandas/GDP/gdp_china.csv', 
                    index_col='Year', parse_dates=True, header=0, names=['Year', 'China'])

In [44]:
print(china.head())
print(us.head())

                China
Year                 
1960-01-01  59.184116
1961-01-01  49.557050
1962-01-01  46.685179
1963-01-01  50.097303
1964-01-01  59.062255
               US
Year             
1947-01-01  243.1
1947-04-01  246.3
1947-07-01  250.1
1947-10-01  260.3
1948-01-01  266.2
