
# Introduction to Pandas and Dataframes

**Pandas** is a popular Python library, used for statistical analysis. We operate on **dataframes**, which you can think of as "Excel sheets as variables in Python". 

In [142]:
import pandas as pd

from google.colab import drive
drive.mount('/content/drive')

driveLoc = "/content/drive/My Drive"

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Importing data

Now, let's import some data! We do this using the `read_csv` function within the `pd` library, so we call it with `pd.read_csv`. We tell it one thing: the location of the file to be read. 

Let's get started by importing a CSV file, which is a plain-text file of comma-separated data values! Some sample CSV files can be found in this server's `Resources` folder. To use them, you'll need to go to the homepage, download the CSV files from the `Resources` folder, and **re-upload to the same folder as this notebook**. 

Once you're ready, run this line to store all that data into a single variable:

Google Drive is mounted at /content/drive

/content/drive/My Drive/Colab Notebooks/T4 Data Analysis /Original


In [0]:
data = pd.read_csv(r"/content/drive/My Drive/Colab Notebooks/T4 Data Analysis /Original/gdp_asia.csv")

Printing is a bit unwieldy because of all the data. To read just a bit of info from the beginning, which lets you check if everything imported correctly, use `.head()`:

In [0]:
data.head(10) # number of lines to be printed out

In [0]:
data.tail()

Notice that the column headings are each of the columns in the CSV file, and the row headings are just numbers. This is OK, but not quite right--we want each row heading to be the country. In pandas terminology, we want to **set the country column as the index**. To do this, we re-import, and specify an `index_col`:

In [0]:
data = pd.read_csv(r"/content/drive/My Drive/Colab Notebooks/T4 Data Analysis /Original/gdp_asia.csv", index_col="country")

In [12]:
data.head()

,gdpPercap_1952,gdpPercap_1957,gdpPercap_1962,gdpPercap_1967,gdpPercap_1972,gdpPercap_1977,gdpPercap_1982,gdpPercap_1987,gdpPercap_1992,gdpPercap_1997,gdpPercap_2002,gdpPercap_2007
country,,,,,,,,,,,,
Afghanistan,779.445314,820.85303,853.10071,836.197138,739.981106,786.11336,978.011439,852.395945,649.341395,635.341351,726.734055,974.580338
Bahrain,9867.084765,11635.79945,12753.27514,14804.6727,18268.65839,19340.10196,19211.14731,18524.02406,19035.57917,20292.01679,23403.55927,29796.04834
Bangladesh,684.244172,661.637458,686.341554,721.186086,630.233627,659.877232,676.981866,751.979403,837.810164,972.770035,1136.39043,1391.253792
Cambodia,368.469286,434.038336,496.913648,523.432314,421.624026,524.972183,624.475478,683.895573,682.303175,734.28517,896.226015,1713.778686
China,400.448611,575.987001,487.674018,612.705693,676.900092,741.23747,962.42138,1378.904018,1655.784158,2289.234136,3119.280896,4959.114854


Now, the first two rows look a bit weird, but that's pandas' way of telling us that "country" is an index column. You can have multiple index columns, but that's beyond the scope of this class.
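To see what `index_col` changes without needing any file on disk, here is a minimal sketch that reads a tiny CSV from a string. The three rows and their rounded values are made up for illustration, and `io.StringIO` is only a stand-in for a real file path.

```python
import io
import pandas as pd

# Hypothetical three-row CSV, inlined so the example runs without any file.
csv_text = """country,gdpPercap_1952,gdpPercap_2007
Afghanistan,779.4,974.6
Bahrain,9867.1,29796.0
China,400.4,4959.1
"""

# Without index_col, rows are labelled 0, 1, 2 and "country" is an ordinary column.
plain = pd.read_csv(io.StringIO(csv_text))
print(plain.index.tolist())    # [0, 1, 2]
print(plain.columns.tolist())  # ['country', 'gdpPercap_1952', 'gdpPercap_2007']

# With index_col, the country names become the row labels.
indexed = pd.read_csv(io.StringIO(csv_text), index_col="country")
print(indexed.index.tolist())    # ['Afghanistan', 'Bahrain', 'China']
print(indexed.columns.tolist())  # ['gdpPercap_1952', 'gdpPercap_2007']
```

Once the countries are the index, a row can be looked up by name, for example `indexed.loc["China"]`.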

<hr>

## Some DataFrame operations

Below are some useful methods you can call on DataFrames. Try them out, and see what they do!

In [13]:
data.describe()

,gdpPercap_1952,gdpPercap_1957,gdpPercap_1962,gdpPercap_1967,gdpPercap_1972,gdpPercap_1977,gdpPercap_1982,gdpPercap_1987,gdpPercap_1992,gdpPercap_1997,gdpPercap_2002,gdpPercap_2007
count,33.0,33.0,33.0,33.0,33.0,33.0,33.0,33.0,33.0,33.0,33.0,33.0
mean,5195.484004,5787.73294,5729.369625,5971.173374,8187.468699,7791.31402,7434.135157,7608.226508,8639.690248,9834.093295,10174.090397,12473.02687
std,18634.890865,19506.515959,16415.857196,14062.591362,19087.502918,11815.777923,8701.176499,8090.262765,9727.431088,11094.180481,11150.719203,14154.937343
min,331.0,350.0,388.0,349.0,357.0,371.0,424.0,385.0,347.0,415.0,611.0,944.0
25%,749.681655,793.577415,825.623201,836.197138,1049.938981,1175.921193,1443.429832,1704.686583,1785.402016,1902.2521,2092.712441,2452.210407
50%,1206.947913,1547.944844,1649.552153,2029.228142,2571.423014,3195.484582,4106.525293,4106.492315,3726.063507,3645.379572,4090.925331,4471.061906
75%,3035.326002,3290.257643,4187.329802,5906.731805,8597.756202,11210.08948,12954.79101,11643.57268,15215.6579,19702.05581,19233.98818,22316.19287
max,108382.3529,113523.1329,95458.11176,80894.88326,109347.867,59265.47714,33693.17525,28118.42998,34932.91959,40300.61996,36023.1054,47306.98978


In [14]:
data.info()

<class 'pandas.core.frame.DataFrame'>
Index: 33 entries, Afghanistan to Yemen Rep.
Data columns (total 12 columns):
gdpPercap_1952    33 non-null float64
gdpPercap_1957    33 non-null float64
gdpPercap_1962    33 non-null float64
gdpPercap_1967    33 non-null float64
gdpPercap_1972    33 non-null float64
gdpPercap_1977    33 non-null float64
gdpPercap_1982    33 non-null float64
gdpPercap_1987    33 non-null float64
gdpPercap_1992    33 non-null float64
gdpPercap_1997    33 non-null float64
gdpPercap_2002    33 non-null float64
gdpPercap_2007    33 non-null float64
dtypes: float64(12)
memory usage: 4.6+ KB


In [15]:
data.mean()

gdpPercap_1952     5195.484004
gdpPercap_1957     5787.732940
gdpPercap_1962     5729.369625
gdpPercap_1967     5971.173374
gdpPercap_1972     8187.468699
gdpPercap_1977     7791.314020
gdpPercap_1982     7434.135157
gdpPercap_1987     7608.226508
gdpPercap_1992     8639.690248
gdpPercap_1997     9834.093295
gdpPercap_2002    10174.090397
gdpPercap_2007    12473.026870
dtype: float64

In [16]:
data.max()

gdpPercap_1952    108382.35290
gdpPercap_1957    113523.13290
gdpPercap_1962     95458.11176
gdpPercap_1967     80894.88326
gdpPercap_1972    109347.86700
gdpPercap_1977     59265.47714
gdpPercap_1982     33693.17525
gdpPercap_1987     28118.42998
gdpPercap_1992     34932.91959
gdpPercap_1997     40300.61996
gdpPercap_2002     36023.10540
gdpPercap_2007     47306.98978
dtype: float64

In [17]:
data.min()

gdpPercap_1952    331.0
gdpPercap_1957    350.0
gdpPercap_1962    388.0
gdpPercap_1967    349.0
gdpPercap_1972    357.0
gdpPercap_1977    371.0
gdpPercap_1982    424.0
gdpPercap_1987    385.0
gdpPercap_1992    347.0
gdpPercap_1997    415.0
gdpPercap_2002    611.0
gdpPercap_2007    944.0
dtype: float64

In [19]:
data.columns

Index(['gdpPercap_1952', 'gdpPercap_1957', 'gdpPercap_1962', 'gdpPercap_1967',
       'gdpPercap_1972', 'gdpPercap_1977', 'gdpPercap_1982', 'gdpPercap_1987',
       'gdpPercap_1992', 'gdpPercap_1997', 'gdpPercap_2002', 'gdpPercap_2007'],
      dtype='object')

In [20]:
data.index

Index(['Afghanistan', 'Bahrain', 'Bangladesh', 'Cambodia', 'China',
       'Hong Kong China', 'India', 'Indonesia', 'Iran', 'Iraq', 'Israel',
       'Japan', 'Jordan', 'Korea Dem. Rep.', 'Korea Rep.', 'Kuwait', 'Lebanon',
       'Malaysia', 'Mongolia', 'Myanmar', 'Nepal', 'Oman', 'Pakistan',
       'Philippines', 'Saudi Arabia', 'Singapore', 'Sri Lanka', 'Syria',
       'Taiwan', 'Thailand', 'Vietnam', 'West Bank and Gaza', 'Yemen Rep.'],
      dtype='object', name='country')
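The summary methods above work column by column by default. The sketch below, on a made-up three-row table (the names and numbers are assumptions for illustration), also shows `shape` and `len` for counting rows, and `axis=1` for summarising across each row instead.

```python
import pandas as pd

# Toy DataFrame standing in for one of the GDP tables (values are made up).
toy = pd.DataFrame(
    {
        "gdpPercap_1952": [100.0, 200.0, 600.0],
        "gdpPercap_2007": [150.0, 800.0, 700.0],
    },
    index=pd.Index(["A", "B", "C"], name="country"),
)

print(toy.shape)  # (3, 2)
print(len(toy))   # 3

col_means = toy.mean()        # one mean per column
row_means = toy.mean(axis=1)  # one mean per row (country)
top = toy.idxmax()            # row label of the largest value in each column

print(col_means)
print(row_means)
print(top)
```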

### <font color="red">Exercise 1: Import and check</font>

Import the data from the CSV files of the other continents, and check how many rows of data they each have.

In [65]:
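folder = "/content/drive/My Drive/Colab Notebooks/T4 Data Analysis /Original/"
continents = ["africa", "americas", "asia", "europe", "oceania", "pop_all"]  # "pop_all" is not a continent: it loads gdp_pop_all.csv

# Load each gdp_<name>.csv into a dictionary instead of building variable names with exec
frames = {name: pd.read_csv(folder + "gdp_" + name + ".csv", index_col="country") for name in continents}
for name, frame in frames.items():
  print(frame.head())  # to check if data is correctly imported
  print(len(frame))    # number of rows

# Keep the data_<name> variables that the later cells use
data_africa, data_americas, data_asia, data_europe, data_oceania, data_pop_all = frames.values()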

              gdpPercap_1952  gdpPercap_1957  ...  gdpPercap_2002  gdpPercap_2007
country                                       ...                                
Algeria          2449.008185     3013.976023  ...     5288.040382     6223.367465
Angola           3520.610273     3827.940465  ...     2773.287312     4797.231267
Benin            1062.752200      959.601080  ...     1372.877931     1441.284873
Botswana          851.241141      918.232535  ...    11003.605080    12569.851770
Burkina Faso      543.255241      617.183465  ...     1037.645221     1217.032994

[5 rows x 12 columns]
52
          continent  gdpPercap_1952  ...  gdpPercap_2002  gdpPercap_2007
country                              ...                                
Argentina  Americas     5911.315053  ...     8797.640716    12779.379640
Bolivia    Americas     2677.326347  ...     3413.262690     3822.137084
Brazil     Americas     2108.944355  ...     8131.212843     9065.800825
Canada     Americas    11367.161120

,country,gdpPercap_1952,gdpPercap_1957,gdpPercap_1962,gdpPercap_1967,gdpPercap_1972,gdpPercap_1977,gdpPercap_1982,gdpPercap_1987,gdpPercap_1992,gdpPercap_1997,gdpPercap_2002,gdpPercap_2007
0,Afghanistan,779.445314,820.85303,853.10071,836.197138,739.981106,786.11336,978.011439,852.395945,649.341395,635.341351,726.734055,974.580338
1,Bahrain,9867.084765,11635.79945,12753.27514,14804.6727,18268.65839,19340.10196,19211.14731,18524.02406,19035.57917,20292.01679,23403.55927,29796.04834
2,Bangladesh,684.244172,661.637458,686.341554,721.186086,630.233627,659.877232,676.981866,751.979403,837.810164,972.770035,1136.39043,1391.253792
3,Cambodia,368.469286,434.038336,496.913648,523.432314,421.624026,524.972183,624.475478,683.895573,682.303175,734.28517,896.226015,1713.778686
4,China,400.448611,575.987001,487.674018,612.705693,676.900092,741.23747,962.42138,1378.904018,1655.784158,2289.234136,3119.280896,4959.114854
5,Hong Kong China,3054.421209,3629.076457,4692.648272,6197.962814,8315.928145,11186.14125,14560.53051,20038.47269,24757.60301,28377.63219,30209.01516,39724.97867
6,India,546.565749,590.061996,658.347151,700.770611,724.032527,813.337323,855.723538,976.512676,1164.406809,1458.817442,1746.769454,2452.210407
7,Indonesia,749.681655,858.900271,849.28977,762.431772,1111.107907,1382.702056,1516.872988,1748.356961,2383.140898,3119.335603,2873.91287,3540.651564
8,Iran,3035.326002,3290.257643,4187.329802,5906.731805,9613.818607,11888.59508,7608.334602,6642.881371,7235.653188,8263.590301,9240.761975,11605.71449
9,Iraq,4129.766056,6229.333562,8341.737815,8931.459811,9576.037596,14688.23507,14517.90711,11643.57268,3745.640687,3076.239795,4390.717312,4471.061906


In [45]:
data_americas.head()

<hr>

## Reading and filtering data in DataFrames

There are many, many ways of accessing data in DataFrames. Here are a few ways--you can read up on other ways at the [Pandas DataFrame documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) page. It's worth clicking over just to see the variety of functions you can call to handle DataFrames!

Here, though, we'll start with accessing information the way you'd expect, by row and column:

In [30]:
#         Row, Column
data.iloc[0,0]

779.4453145

In [31]:
data.iloc[0:4,0:2]

,gdpPercap_1952,gdpPercap_1957
country,,
Afghanistan,779.445314,820.85303
Bahrain,9867.084765,11635.79945
Bangladesh,684.244172,661.637458
Cambodia,368.469286,434.038336


In [34]:
data.loc["Afghanistan","gdpPercap_1952"]

779.4453145

In [35]:
data.loc["Afghanistan":"China","gdpPercap_1952":"gdpPercap_1962"]

,gdpPercap_1952,gdpPercap_1957,gdpPercap_1962
country,,,
Afghanistan,779.445314,820.85303,853.10071
Bahrain,9867.084765,11635.79945,12753.27514
Bangladesh,684.244172,661.637458,686.341554
Cambodia,368.469286,434.038336,496.913648
China,400.448611,575.987001,487.674018


In [40]:
data.loc["China","gdpPercap_1962"]

487.6740183

In [42]:
rowsToShow = ["China","India","Singapore"]
colsToShow = ["gdpPercap_1952","gdpPercap_2002"]

data.loc[rowsToShow,colsToShow]

,gdpPercap_1952,gdpPercap_2002
country,,
China,400.448611,3119.280896
India,546.565749,1746.769454
Singapore,2315.138227,36023.1054


In [43]:
data.loc[["China","India","Singapore"],["gdpPercap_1952","gdpPercap_2002"]]

,gdpPercap_1952,gdpPercap_2002
country,,
China,400.448611,3119.280896
India,546.565749,1746.769454
Singapore,2315.138227,36023.1054
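One detail in the outputs above is easy to miss: `iloc[0:4, 0:2]` returned four rows, while `loc["Afghanistan":"China", ...]` returned five, China included. Position slices with `iloc` leave out the end point, whereas label slices with `loc` include it. A minimal sketch on a made-up table (row and column names are assumptions for illustration):

```python
import pandas as pd

# Toy table with four rows and three columns (values are made up).
toy = pd.DataFrame(
    {"y1952": [1, 2, 3, 4], "y1957": [5, 6, 7, 8], "y1962": [9, 10, 11, 12]},
    index=["A", "B", "C", "D"],
)

# iloc slices by position: the end position is excluded.
by_position = toy.iloc[0:2, 0:2]
print(by_position.shape)  # (2, 2)

# loc slices by label: the end label is included.
by_label = toy.loc["A":"C", "y1952":"y1957"]
print(by_label.shape)  # (3, 2)
```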


In [48]:
for i in range(1, 6):
  exec("line_dict_%s = { 'a': 1 }" % i)

print (line_dict_1)

{'a': 1}


### <font color="red">Exercise 2: Get data</font>

Import the necessary CSV file, and set up a DataFrame for the GDP data of Canada, the United States, and Mexico for the last decade in the data (2002 and 2007). Your result should look like this:

               gdpPercap_2002  gdpPercap_2007
country                                      
Canada            33328.96507     36319.23501
United States     39097.09955     42951.65309
Mexico            10742.44053     11977.57496

In [68]:
data_americas

rowsToShow = ["Canada","United States","Mexico"]
colsToShow = ["gdpPercap_2002","gdpPercap_2007"]

data_americas.loc[rowsToShow,colsToShow]

,gdpPercap_2002,gdpPercap_2007
country,,
Canada,33328.96507,36319.23501
United States,39097.09955,42951.65309
Mexico,10742.44053,11977.57496


<hr>

## Filtering data

Here's one of the most powerful features of DataFrames--being able to quickly work with large chunks of data. If you had to do this with for loops, it'd be a bit of a pain to filter everything out item by item, not to mention having to reconstruct your lists one by one.

In [69]:
subset = data.loc["Afghanistan":"China", "gdpPercap_1952"]
subset

country
Afghanistan     779.445314
Bahrain        9867.084765
Bangladesh      684.244172
Cambodia        368.469286
China           400.448611
Name: gdpPercap_1952, dtype: float64

In [70]:
subset > 500

country
Afghanistan     True
Bahrain         True
Bangladesh      True
Cambodia       False
China          False
Name: gdpPercap_1952, dtype: bool

In [72]:
condition = subset > 500
subset[condition]

country
Afghanistan     779.445314
Bahrain        9867.084765
Bangladesh      684.244172
Name: gdpPercap_1952, dtype: float64
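A condition like the one above is itself a Series of `True`/`False` values, so it can be counted as well as used for filtering. The sketch below uses rounded copies of the five values shown above; the variable names `gdp` and `mask` are assumptions for illustration.

```python
import pandas as pd

# Rounded copies of the five 1952 values shown above.
gdp = pd.Series(
    [779.4, 9867.1, 684.2, 368.5, 400.4],
    index=["Afghanistan", "Bahrain", "Bangladesh", "Cambodia", "China"],
)

mask = gdp > 500  # True/False for every country

# True counts as 1, so summing the mask counts the matching rows.
n_matches = int(mask.sum())
print(n_matches)  # 3

# Indexing with the mask keeps only the rows where it is True.
matches = gdp[mask]
print(matches.index.tolist())  # ['Afghanistan', 'Bahrain', 'Bangladesh']
```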

## More Filtering

Take a look at what's being done, and try to figure it out, particularly when it comes to the two-condition criteria!
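As a warm-up for the two-condition criteria, here is a small sketch on made-up data (the countries `A` to `D` and their values are assumptions). Conditions are combined with `&` (and), `|` (or) and `~` (not). Each condition needs its own parentheses, because `&` and `|` bind more tightly than `==` and `>` in Python.

```python
import pandas as pd

# Toy table (values are made up for illustration).
toy = pd.DataFrame(
    {
        "continent": ["Asia", "Asia", "Europe", "Asia"],
        "gdpPercap_2007": [12000.0, 3000.0, 30000.0, 9500.0],
    },
    index=pd.Index(["A", "B", "C", "D"], name="country"),
)

in_asia = toy["continent"] == "Asia"
rich = toy["gdpPercap_2007"] > 9000

both = toy[in_asia & rich]    # in Asia AND above 9000
either = toy[in_asia | rich]  # in Asia OR above 9000
not_asia = toy[~in_asia]      # NOT in Asia

# The same "and" filter written in one line: note the parentheses.
same = toy.loc[(toy["continent"] == "Asia") & (toy["gdpPercap_2007"] > 9000), ["gdpPercap_2007"]]

print(both.index.tolist())      # ['A', 'D']
print(either.index.tolist())    # ['A', 'B', 'C', 'D']
print(not_asia.index.tolist())  # ['C']
```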

In [0]:
# Can you figure out what's being done in the below code? 

#dataAll = pd.read_csv("gdp_pop_all.csv", index_col = "country")
dataAll = data_pop_all

print(len(dataAll))

print(dataAll["continent"])       # 1st method
print(dataAll.loc[:,"continent"]) # 2nd method
print(dataAll.continent)          # 3rd method

"""
DataFrame syntax:
and - &
or  - |

Add a new column:
df["columnName"] = values
"""


# Two conditions: the country is in Asia AND its GDP per capita in 2007 is more than 9000
criteria = (dataAll["continent"] == "Asia") & (dataAll["gdpPercap_2007"] > 9000)

# Create a new column called gdp_2007 holding the total GDP in 2007:
# GDP per capita multiplied by the total population in 2007
dataAll["gdp_2007"] = dataAll["gdpPercap_2007"] * dataAll["pop_2007"]

print(dataAll["gdp_2007"])
# Filter the DataFrame by the criteria, then show only 3 columns
dataAll[criteria][["gdpPercap_2007", "pop_2007", "gdp_2007"]]

"""
In simpler words:
colsToShow = ["gdpPercap_2007", "pop_2007", "gdp_2007"]
subset = dataAll[criteria]
subset[colsToShow]
"""


In [102]:
rows_to_show = ["Singapore", "Malaysia"]
cols_to_show = ["gdpPercap_1952", "gdpPercap_1957"]
subset1 = data.loc[rows_to_show, cols_to_show]

rows_to_show = ["Thailand", "Indonesia"]
cols_to_show = ["gdpPercap_1952", "gdpPercap_1957"]
subset2 = data.loc[rows_to_show, cols_to_show]

rows_to_show = ["Thailand", "Indonesia"]
cols_to_show = ["gdpPercap_2002", "gdpPercap_2007"]
subset3 = data.loc[rows_to_show, cols_to_show]

# DataFrame.append was removed in pandas 2.0; pd.concat stacks the rows instead
newSubset = pd.concat([subset1, subset2])
newSubset

newSubset2 = pd.concat([subset1, subset3], sort=False)
newSubset2

,gdpPercap_1952,gdpPercap_1957,gdpPercap_2002,gdpPercap_2007
country,,,,
Singapore,2315.138227,2843.104409,,
Malaysia,1831.132894,1810.066992,,
Thailand,,,5913.187529,7458.396327
Indonesia,,,2873.91287,3540.651564
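The empty cells in the output above are `NaN` values. When DataFrames are stacked, rows are added underneath and columns are matched by name, so any column missing from one of the pieces is filled with `NaN`. A minimal sketch with `pd.concat` on made-up numbers (the column names `y1952` and `y2007` are assumptions for illustration):

```python
import pandas as pd

# Three tiny tables (values are made up).
left = pd.DataFrame({"y1952": [1.0, 2.0]}, index=["Singapore", "Malaysia"])
same_cols = pd.DataFrame({"y1952": [3.0, 4.0]}, index=["Thailand", "Indonesia"])
other_cols = pd.DataFrame({"y2007": [5.0, 6.0]}, index=["Thailand", "Indonesia"])

# Same columns: rows are simply stacked, nothing is missing.
stacked = pd.concat([left, same_cols])
print(stacked.shape)  # (4, 1)

# Different columns: the result has both columns, with NaN where a piece had no data.
mixed = pd.concat([left, other_cols])
print(mixed.shape)                    # (4, 2)
print(int(mixed.isna().sum().sum()))  # 4

# axis=1 places tables side by side instead, matching rows by index label.
side_by_side = pd.concat([same_cols, other_cols], axis=1)
print(side_by_side.shape)  # (2, 2)
```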


### <font color="red">Exercise 3: African data and filtering</font>

* Read data from the Africa file
* Find the Per Capita GDP of Egypt in 2007
* Find countries whose Per Capita GDP exceeded Egypt's that year. 

There should be 9 countries.

In [0]:
data_africa

In [113]:
data_africa.loc["Egypt","gdpPercap_2007"]

5581.180998

In [132]:
subset = data_africa.loc[:,"gdpPercap_2007"]
condition = subset > data_africa.loc["Egypt","gdpPercap_2007"]
countriesExEgypt = subset [condition]
for i in countriesExEgypt.index:
  print (i)
print ("\nTotal number of countries:",len(countriesExEgypt))


Algeria
Botswana
Equatorial Guinea
Gabon
Libya
Mauritius
Reunion
South Africa
Tunisia

Total number of countries: 9


Teacher's solution:

In [140]:
#africa = pd.read_csv("gdp_africa.csv", index_col="country")
africa = data_africa

egypt2007 = africa.loc["Egypt", "gdpPercap_2007"]

condition = africa["gdpPercap_2007"] > egypt2007
africa[condition]

,gdpPercap_1952,gdpPercap_1957,gdpPercap_1962,gdpPercap_1967,gdpPercap_1972,gdpPercap_1977,gdpPercap_1982,gdpPercap_1987,gdpPercap_1992,gdpPercap_1997,gdpPercap_2002,gdpPercap_2007
country,,,,,,,,,,,,
Algeria,2449.008185,3013.976023,2550.81688,3246.991771,4182.663766,4910.416756,5745.160213,5681.358539,5023.216647,4797.295051,5288.040382,6223.367465
Botswana,851.241141,918.232535,983.653976,1214.709294,2263.611114,3214.857818,4551.14215,6205.88385,7954.111645,8647.142313,11003.60508,12569.85177
Equatorial Guinea,375.643123,426.096408,582.841971,915.596003,672.412257,958.566812,927.825343,966.896815,1132.055034,2814.480755,7703.4959,12154.08975
Gabon,4293.476475,4976.198099,6631.459222,8358.761987,11401.94841,21745.57328,15113.36194,11864.40844,13522.15752,14722.84188,12521.71392,13206.48452
Libya,2387.54806,3448.284395,6757.030816,18772.75169,21011.49721,21951.21176,17364.27538,11770.5898,9640.138501,9467.446056,9534.677467,12057.49928
Mauritius,1967.955707,2034.037981,2529.067487,2475.387562,2575.484158,3710.982963,3688.037739,4783.586903,6058.253846,7425.705295,9021.815894,10956.99112
Reunion,2718.885295,2769.451844,3173.72334,4021.175739,5047.658563,4319.804067,5267.219353,5303.377488,6101.255823,6071.941411,6316.1652,7670.122558
South Africa,4725.295531,5487.104219,5768.729717,7114.477971,7765.962636,8028.651439,8568.266228,7825.823398,7225.069258,7479.188244,7710.946444,9269.657808
Tunisia,1468.475631,1395.232468,1660.30321,1932.360167,2753.285994,3120.876811,3560.233174,3810.419296,4332.720164,4876.798614,5722.895655,7092.923025


### <font color="red">Exercise 4: What does this do?</font>

What does each line in this chunk of code do? Run it, find out, and explain to someone sitting next to you.

In [141]:
#first = pd.read_csv('gdp_pop_all.csv', index_col='country')
first = data_pop_all                                # reuse the DataFrame imported earlier (index column: country)

second = first[first['continent'] == 'Americas']    # keep only the rows whose continent is Americas

third = second.drop('Puerto Rico')                  # drop the Puerto Rico row (no axis given, so the default is rows)

fourth = third.drop('continent', axis = 1)          # drop the continent column (axis = 1 means columns)

fourth.to_csv('/content/drive/My Drive/result.csv')     # export the data to a CSV file named result.csv
fourth.to_excel("/content/drive/My Drive/result.xlsx")  # export the same data to an Excel file
print (first)
print (second)
print (third)
print (fourth)

                         continent  gdpPercap_1952  ...  pop_2007      gdp_2007
country                                             ...                        
Algeria                     Africa     2449.008185  ...  33333216  2.074449e+11
Angola                      Africa     3520.610273  ...  12420476  5.958390e+10
Benin                       Africa     1062.752200  ...   8078314  1.164315e+10
Botswana                    Africa      851.241141  ...   1639131  2.060363e+10
Burkina Faso                Africa      543.255241  ...  14326203  1.743546e+10
Burundi                     Africa      339.296459  ...   8390505  3.608510e+09
Cameroon                    Africa     1172.667655  ...  17696293  3.613752e+10
Central African Republic    Africa     1071.310713  ...   4369038  3.084613e+09
Chad                        Africa     1178.665927  ...  10238807  1.744758e+10
Comoros                     Africa     1102.990936  ...    710960  7.011117e+08
Congo Dem. Rep.             Africa      
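A detail worth noticing in the exercise above: `drop` returns a new DataFrame and leaves the original untouched, which is why each result is stored in a new variable. A minimal sketch on a made-up two-row table (the values are assumptions for illustration):

```python
import pandas as pd

# Toy table (values are made up).
toy = pd.DataFrame(
    {"continent": ["Americas", "Americas"], "gdpPercap_2007": [1.0, 2.0]},
    index=pd.Index(["Canada", "Puerto Rico"], name="country"),
)

without_row = toy.drop("Puerto Rico")        # default axis=0: drop a row by its label
without_col = toy.drop("continent", axis=1)  # axis=1: drop a column by its name
same_thing = toy.drop(columns="continent")   # keyword form of the line above

print(toy.shape)          # (2, 2): the original is unchanged
print(without_row.shape)  # (1, 2)
print(without_col.shape)  # (2, 1)
```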

## Inserting data

Inserting column data into your DataFrames is straightforward. Just add it in:
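A minimal sketch on a made-up two-row table (all names and values here are assumptions for illustration). Assigning to a column name that does not exist yet creates the column; the value can be computed from other columns, a single value repeated for every row, or a list with one entry per row.

```python
import pandas as pd

# Toy table (values are made up).
toy = pd.DataFrame(
    {"gdpPercap_2007": [1000.0, 2000.0], "pop_2007": [10, 20]},
    index=pd.Index(["A", "B"], name="country"),
)

# New column computed from existing columns, row by row.
toy["gdp_2007"] = toy["gdpPercap_2007"] * toy["pop_2007"]

# A single value is repeated for every row.
toy["source"] = "toy data"

# A list must have exactly one entry per row.
toy["rank"] = [2, 1]

print(toy.columns.tolist())
# ['gdpPercap_2007', 'pop_2007', 'gdp_2007', 'source', 'rank']
```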

### <font color="red">Exercise 5: European data analysis</font>

Import the GDP data for Europe. Write an expression to select each of the following:

* GDP per capita for all countries in 1982.
* GDP per capita for Denmark for all years.
* GDP per capita for all countries for years after 1985.
* GDP per capita for each country in 2007 as a multiple of GDP per capita for that country in 1952. Show a DataFrame with 1952, 2007, and "2007 vs. 1952", for example:

                        gdpPercap_1952  gdpPercap_2007  2007/1952
country                                                          
Albania                    1601.056136     5937.029526   3.708196
Austria                    6137.076492    36126.492700   5.886596
Belgium                    8343.105127    33692.605080   4.038377
Bosnia and Herzegovina      973.533195     7446.298803   7.648736
Bulgaria                   2444.286648    10680.792820   4.369697
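As a hint for the last two items, the sketch below works on a made-up table rather than the Europe file (the countries `A` and `B` and all values are assumptions). It assumes the column names follow the `gdpPercap_<year>` pattern used in this notebook, turns the year part into numbers to pick columns, and then builds a ratio column.

```python
import pandas as pd

# Toy table with the same column naming pattern (values are made up).
toy = pd.DataFrame(
    {
        "gdpPercap_1952": [100.0, 200.0],
        "gdpPercap_1982": [150.0, 400.0],
        "gdpPercap_2007": [300.0, 1000.0],
    },
    index=pd.Index(["A", "B"], name="country"),
)

# Strip the prefix from each column name and convert what is left to an integer year.
years = toy.columns.str.replace("gdpPercap_", "", regex=False).astype(int)

# A True/False array over the columns selects the ones after 1985.
after_1985 = toy.loc[:, years > 1985]
print(after_1985.columns.tolist())  # ['gdpPercap_2007']

# Copy the two columns of interest, then add the ratio as a third column.
ratio = toy[["gdpPercap_1952", "gdpPercap_2007"]].copy()
ratio["2007/1952"] = ratio["gdpPercap_2007"] / ratio["gdpPercap_1952"]
print(ratio)
```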

In [0]:
#europe = pd.read_csv("gdp_europe.csv", index_col="country")
europe = data_europe