# Programming for Data Science 2024
## Homework Assignment Two
Homework activities aim at testing not only your ability to put into practice the concepts you have learned during the Lectures and Labs, but also your ability to explore the Python documentation as a resource. Above all, it is an opportunity for you to challenge yourself and practice. If you are having difficulties with the assignment, reach out for support using Moodle's Discussion Forum.

### Description

This homework assignment will test your capacity to **load and manipulate data with Pandas**. 
    
The goal is to develop intuition on filtering, arranging, and merging data, which will be useful for the next homework assignment.<br/>
Fill the empty cells with your code and deliver a copy of this notebook to Moodle. <br/>
    
Your submission will be graded according to the following guidelines:
1. **Execution** (does your program do what the exercise asks?)
2. **Objectivity** (are you using the adequate libraries? are you using a library ... )
3. **Readability** of your code (including comments, variables naming, supporting text, etc ...)

**Comment your code properly**, which includes naming your variables in a meaningful manner. **Badly documented code will be penalized.**

This assignment is to be done in pairs, and remember that you can't have the same pair from the previous and subsequent assignments.

**Students that are caught cheating will obtain a score of 0 points.** <br>

Homework 2 is worth 25% of your final grade.    

The submission package should correspond to a .zip archive (.rar files are not accepted) with the following files:
1. Jupyter Notebook with the output of all the cells;
2. HTML print of your Jupyter Notebook (in Jupyter go to File -> Download as -> HTML);
3. All text or .csv files that are exported as part of the exercises. **Please don't upload the files downloaded/imported as part of the exercises.**

**Please change the name of the notebook to "H2.\<student_1_id\>_\<student_2_id\>.ipynb", replacing \<student_id\> by your student_id.** <br>

Submission is done through the respective Moodle activity, and only one of the group members should submit the files. <br>
The deadline is the **19th of October at 12:00**. <br>
A penalty of 1 point per day late will be applied to late deliveries. <br>
**In this notebook, you are allowed to use Pandas and Numpy.**

In [3]:
import numpy as np
import pandas as pd

# <span style="color:brown"> Start Here </span> 

[Please Complete the following form with your details]

Student Name - <br> Pedro Trindade
Student id - <br> 20240573
Contact e-mail - <br> 20240573@novaims.unl.pt


Student Name - <br> João Marques
Student id - <br>20240342
Contact e-mail - <br> 20240342@novaims.unl.pt

# <span style="color:brown"> Part 1 - Get the Data </span>

## Download and Load the World Development Indicators Dataset

We will work with the **World Development Indicators dataset**, which should be downloaded from the world bank data catalog.<br/>
Hence, the first step is to unzip the data given on Moodle. You can do this by running the cell below. <br/>

In [4]:
import zipfile

# Open the archive and extract every file into the working folder
with zipfile.ZipFile("WDI_csv.zip") as z:
    z.extractall()

*The above code opens the zip archive, which is expected to be in the working folder (by default, the location of this notebook on your computer), and extracts the documents it contains. <br/>
The contents include multiple .csv files; however, we will be working only with the document 'WDICSV.csv'. <br/>*

**1.** In the cell below, use Pandas to open the file "WDICSV.csv" and **save** it to a variable called **wdi**.<br/>

In [5]:
wdi = pd.read_csv("WDICSV.csv")

**2.** Check the top of the dataframe to ensure it was loaded correctly.

In [6]:
wdi.head()

Unnamed: 0,Country Name,Country Code,Indicator Name,Indicator Code,1960,1961,1962,1963,1964,1965,...,2014,2015,2016,2017,2018,2019,2020,2021,2022,2023
0,Africa Eastern and Southern,AFE,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.ZS,,,,,,,...,17.40141,17.911234,18.463874,18.924037,19.437054,20.026254,20.647969,21.165877,21.863139,
1,Africa Eastern and Southern,AFE,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.RU.ZS,,,,,,,...,6.728819,7.005877,7.308571,7.547226,7.875917,8.243018,8.545483,8.906711,9.26132,
2,Africa Eastern and Southern,AFE,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.UR.ZS,,,,,,,...,38.080931,38.422282,38.722108,38.993157,39.337872,39.695279,40.137847,40.522209,41.011132,
3,Africa Eastern and Southern,AFE,Access to electricity (% of population),EG.ELC.ACCS.ZS,,,,,,,...,31.860474,33.9038,38.854624,40.199898,43.017148,44.381259,46.264875,48.100862,48.711995,
4,Africa Eastern and Southern,AFE,"Access to electricity, rural (% of rural popul...",EG.ELC.ACCS.RU.ZS,,,,,,,...,17.619475,16.500171,24.605861,25.396929,27.037528,29.137914,31.001049,32.77791,33.747907,


## Download and Load the Penn World Table V9.0

We will additionally use data from the Penn World Table (PWT) v9.0. <br/> 

**Run the following cell to download the dataset**.

In [7]:
import urllib.request

# Download the Penn World Table 9.0 workbook into the working folder
urllib.request.urlretrieve("https://www.rug.nl/ggdc/docs/pwt90.xlsx", "pwt90.xlsx")

('pwt90.xlsx', <http.client.HTTPMessage at 0x281cedd2420>)

**3.** In the following cell, open and read the file 'pwt90.xlsx' and **save** it into variable **pwt**. <br/>

In [8]:
pwt = pd.read_excel("pwt90.xlsx", sheet_name="Data")

**4.** Check the top of the dataframe to ensure it was loaded correctly.

In [9]:
pwt.head()

Unnamed: 0,countrycode,country,currency_unit,year,rgdpe,rgdpo,pop,emp,avh,hc,...,csh_g,csh_x,csh_m,csh_r,pl_c,pl_i,pl_g,pl_x,pl_m,pl_k
0,ABW,Aruba,Aruban Guilder,1950,,,,,,,...,,,,,,,,,,
1,ABW,Aruba,Aruban Guilder,1951,,,,,,,...,,,,,,,,,,
2,ABW,Aruba,Aruban Guilder,1952,,,,,,,...,,,,,,,,,,
3,ABW,Aruba,Aruban Guilder,1953,,,,,,,...,,,,,,,,,,
4,ABW,Aruba,Aruban Guilder,1954,,,,,,,...,,,,,,,,,,


# <span style="color:brown"> Part 2 - Data Processing </span>

## Data Wrangling

Now that we have loaded our data we are ready to start playing with it. <br/>

**5.** Start by printing all the column names in the cell below.

In [10]:
print("Columns from pwt:")
print(pwt.columns)
print("Columns from wdi:")
print(wdi.columns)

Columns from pwt:
Index(['countrycode', 'country', 'currency_unit', 'year', 'rgdpe', 'rgdpo',
       'pop', 'emp', 'avh', 'hc', 'ccon', 'cda', 'cgdpe', 'cgdpo', 'ck',
       'ctfp', 'cwtfp', 'rgdpna', 'rconna', 'rdana', 'rkna', 'rtfpna',
       'rwtfpna', 'labsh', 'delta', 'xr', 'pl_con', 'pl_da', 'pl_gdpo',
       'i_cig', 'i_xm', 'i_xr', 'i_outlier', 'cor_exp', 'statcap', 'csh_c',
       'csh_i', 'csh_g', 'csh_x', 'csh_m', 'csh_r', 'pl_c', 'pl_i', 'pl_g',
       'pl_x', 'pl_m', 'pl_k'],
      dtype='object')
Columns from wdi:
Index(['Country Name', 'Country Code', 'Indicator Name', 'Indicator Code',
       '1960', '1961', '1962', '1963', '1964', '1965', '1966', '1967', '1968',
       '1969', '1970', '1971', '1972', '1973', '1974', '1975', '1976', '1977',
       '1978', '1979', '1980', '1981', '1982', '1983', '1984', '1985', '1986',
       '1987', '1988', '1989', '1990', '1991', '1992', '1993', '1994', '1995',
       '1996', '1997', '1998', '1999', '2000', '2001', '2002', '2003', '200

**6.** List the values in the column 'Country Name'. You will get a list with repeated values; **delete all duplicates** to ease your analysis. <br/>

*Tip: There is a method in the pandas library that allows you to do this easily.*

In [11]:
# List the values in the 'Country Name' column and remove duplicates
unique_country_names = wdi['Country Name'].drop_duplicates().tolist()

# Display the unique country names
print(unique_country_names)


['Africa Eastern and Southern', 'Africa Western and Central', 'Arab World', 'Caribbean small states', 'Central Europe and the Baltics', 'Early-demographic dividend', 'East Asia & Pacific', 'East Asia & Pacific (excluding high income)', 'East Asia & Pacific (IDA & IBRD countries)', 'Euro area', 'Europe & Central Asia', 'Europe & Central Asia (excluding high income)', 'Europe & Central Asia (IDA & IBRD countries)', 'European Union', 'Fragile and conflict affected situations', 'Heavily indebted poor countries (HIPC)', 'High income', 'IBRD only', 'IDA & IBRD total', 'IDA blend', 'IDA only', 'IDA total', 'Late-demographic dividend', 'Latin America & Caribbean', 'Latin America & Caribbean (excluding high income)', 'Latin America & the Caribbean (IDA & IBRD countries)', 'Least developed countries: UN classification', 'Low & middle income', 'Low income', 'Lower middle income', 'Middle East & North Africa', 'Middle East & North Africa (excluding high income)', 'Middle East & North Africa (IDA &

You might notice that while the bottom rows represent Countries, the top rows represent aggregates of countries (e.g., world regions). <br/> We are only interested in **working with country-level data**, and as such we need to filter out all the unnecessary rows.

**7.** Save all the values of column 'Country Name' in a variable called **cnames**. <br/>

In [12]:
cnames = wdi["Country Name"]
cnames

0         Africa Eastern and Southern
1         Africa Eastern and Southern
2         Africa Eastern and Southern
3         Africa Eastern and Southern
4         Africa Eastern and Southern
                     ...             
395803                       Zimbabwe
395804                       Zimbabwe
395805                       Zimbabwe
395806                       Zimbabwe
395807                       Zimbabwe
Name: Country Name, Length: 395808, dtype: object

**7.1.** Delete all duplicate values.<br>

In [13]:
cnames = wdi["Country Name"].drop_duplicates()
cnames

0            Africa Eastern and Southern
1488          Africa Western and Central
2976                          Arab World
4464              Caribbean small states
5952      Central Europe and the Baltics
                       ...              
388368             Virgin Islands (U.S.)
389856                West Bank and Gaza
391344                       Yemen, Rep.
392832                            Zambia
394320                          Zimbabwe
Name: Country Name, Length: 266, dtype: object

**7.2.** Print the names that do not represent countries.

In [14]:
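# Aggregates sit at the top of cnames and end with "World": keep every entry up to and including it
world_position = cnames[cnames == "World"].index[0]
cnames_aggregated = cnames[cnames.index <= world_position].tolist()
print(cnames_aggregated)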


['Africa Eastern and Southern', 'Africa Western and Central', 'Arab World', 'Caribbean small states', 'Central Europe and the Baltics', 'Early-demographic dividend', 'East Asia & Pacific', 'East Asia & Pacific (excluding high income)', 'East Asia & Pacific (IDA & IBRD countries)', 'Euro area', 'Europe & Central Asia', 'Europe & Central Asia (excluding high income)', 'Europe & Central Asia (IDA & IBRD countries)', 'European Union', 'Fragile and conflict affected situations', 'Heavily indebted poor countries (HIPC)', 'High income', 'IBRD only', 'IDA & IBRD total', 'IDA blend', 'IDA only', 'IDA total', 'Late-demographic dividend', 'Latin America & Caribbean', 'Latin America & Caribbean (excluding high income)', 'Latin America & the Caribbean (IDA & IBRD countries)', 'Least developed countries: UN classification', 'Low & middle income', 'Low income', 'Lower middle income', 'Middle East & North Africa', 'Middle East & North Africa (excluding high income)', 'Middle East & North Africa (IDA &

You can take advantage of the structure of the dataset to realize that aggregates (Continents, Regions, etc.) are all located at the top of the series 'cnames'. Moreover, since the series is small, you can easily validate this assumption manually and then use that information to extract a slice of all the entries that represent non-country entities.<br/>

**8.** In the next cell filter out, from **wdi**, the rows in which 'Country Name' represents an aggregate of countries.<br/>

In [15]:
# Filter wdi to keep countries that are NOT in cnames_aggregated
wdi_filtered_out = wdi[~wdi['Country Name'].isin(cnames_aggregated)]



**9.** Check that the top of the **wdi** dataframe now only has countries and not aggregates of countries.

In [16]:
# Display the filtered DataFrame
wdi_filtered_out.head()

Unnamed: 0,Country Name,Country Code,Indicator Name,Indicator Code,1960,1961,1962,1963,1964,1965,...,2014,2015,2016,2017,2018,2019,2020,2021,2022,2023
72912,Afghanistan,AFG,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.ZS,,,,,,,...,26.1,27.6,28.8,30.3,31.4,32.6,33.8,34.9,36.1,
72913,Afghanistan,AFG,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.RU.ZS,,,,,,,...,10.2,11.4,12.6,13.5,14.5,15.6,16.4,17.4,18.5,
72914,Afghanistan,AFG,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.UR.ZS,,,,,,,...,78.0,79.5,80.5,81.6,82.6,83.2,83.8,84.5,85.0,
72915,Afghanistan,AFG,Access to electricity (% of population),EG.ELC.ACCS.ZS,,,,,,,...,89.5,71.5,97.7,97.7,93.4,97.7,97.7,97.7,85.3,
72916,Afghanistan,AFG,"Access to electricity, rural (% of rural popul...",EG.ELC.ACCS.RU.ZS,,,,,,,...,86.5,64.6,97.1,97.1,91.6,97.1,97.1,97.1,81.7,


**10.** Reset the indexes of **wdi**. Perform this operation in-place.

In [17]:
# Reset the indexes of wdi in place
wdi_filtered_out.reset_index(drop=True, inplace=True)



**11.** Show that the indexes have been reset.

In [18]:
# Display the updated DataFrame
wdi_filtered_out.head()

Unnamed: 0,Country Name,Country Code,Indicator Name,Indicator Code,1960,1961,1962,1963,1964,1965,...,2014,2015,2016,2017,2018,2019,2020,2021,2022,2023
0,Afghanistan,AFG,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.ZS,,,,,,,...,26.1,27.6,28.8,30.3,31.4,32.6,33.8,34.9,36.1,
1,Afghanistan,AFG,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.RU.ZS,,,,,,,...,10.2,11.4,12.6,13.5,14.5,15.6,16.4,17.4,18.5,
2,Afghanistan,AFG,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.UR.ZS,,,,,,,...,78.0,79.5,80.5,81.6,82.6,83.2,83.8,84.5,85.0,
3,Afghanistan,AFG,Access to electricity (% of population),EG.ELC.ACCS.ZS,,,,,,,...,89.5,71.5,97.7,97.7,93.4,97.7,97.7,97.7,85.3,
4,Afghanistan,AFG,"Access to electricity, rural (% of rural popul...",EG.ELC.ACCS.RU.ZS,,,,,,,...,86.5,64.6,97.1,97.1,91.6,97.1,97.1,97.1,81.7,


*Note that when resetting the index, pandas by default inserts a new column at the beginning of the data frame, which holds the previous index values; passing `drop=True` discards the previous index instead.*
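
As a small illustration of the two behaviours of `reset_index`, the sketch below uses a made-up two-row frame. The name `toy` and its index values are hypothetical and are not taken from **wdi**.

```python
import pandas as pd

# Toy frame with a non-contiguous index, similar to what row filtering leaves behind
toy = pd.DataFrame(
    {"Country Name": ["Afghanistan", "Albania"]},
    index=[72912, 74400],
)

# Default behaviour: the old index is kept as a new first column named 'index'
kept = toy.reset_index()

# With drop=True the old index is discarded and no column is added
dropped = toy.reset_index(drop=True)

print(kept.columns.tolist())
print(dropped.columns.tolist())
```

In both cases the new index runs from 0 upwards; the only difference is whether the old labels survive as a column.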

## Indicator Codes and Indicator Names

**12.** Select the columns 'Indicator Name' and 'Indicator Code'. Then, delete all the duplicates, and print the top 5 and bottom 5 values. <br/>

*Note: You should be able to do everything in a single line of code for the top 5 values and a single line for the bottom 5 values.*

In [19]:
#TOP 5 VALUES
wdi_filtered_out[['Indicator Name', 'Indicator Code']].drop_duplicates().head()


Unnamed: 0,Indicator Name,Indicator Code
0,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.ZS
1,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.RU.ZS
2,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.UR.ZS
3,Access to electricity (% of population),EG.ELC.ACCS.ZS
4,"Access to electricity, rural (% of rural popul...",EG.ELC.ACCS.RU.ZS


In [20]:
#BOTTOM 5 VALUES
wdi_filtered_out[['Indicator Name', 'Indicator Code']].drop_duplicates().tail()


Unnamed: 0,Indicator Name,Indicator Code
1483,Women who believe a husband is justified in be...,SG.VAW.REFU.ZS
1484,Women who were first married by age 15 (% of w...,SP.M15.2024.FE.ZS
1485,Women who were first married by age 18 (% of w...,SP.M18.2024.FE.ZS
1486,Women's share of population ages 15+ living wi...,SH.DYN.AIDS.FE.ZS
1487,Young people (ages 15-24) newly infected with HIV,SH.HIV.INCD.YG


**13.** Create a new DataFrame named **indicators** made up of the columns 'Indicator Name' and 'Indicator Code'. Then, delete all the duplicated entries. Finally, set the column 'Indicator Code' as the index of **indicators**. 

*Note: Try to perform all these steps in a single line of code.*

In [21]:
indicators = wdi[['Indicator Name', 'Indicator Code']].drop_duplicates().set_index('Indicator Code')


**The 'indicators' DataFrame can operate now as a dictionary. <br/>**
By passing an 'Indicator Code' (key) it returns the associated 'Indicator Name' (value).<br/>
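
A minimal sketch of this dictionary-like lookup, using a two-row stand-in for **indicators**. The name `toy_indicators` is hypothetical; the two code and name pairs are ones that appear in the outputs of this notebook.

```python
import pandas as pd

# Two-row stand-in for the indicators DataFrame, indexed by 'Indicator Code'
toy_indicators = pd.DataFrame(
    {
        "Indicator Name": ["Gini index", "Access to electricity (% of population)"],
        "Indicator Code": ["SI.POV.GINI", "EG.ELC.ACCS.ZS"],
    }
).set_index("Indicator Code")

# Look up a name by its code, as one would look up a value by key in a dict
name = toy_indicators.loc["SI.POV.GINI", "Indicator Name"]
print(name)
```

Using `.loc` with both a row label and a column label returns the single cell value rather than a one-row DataFrame.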

**14.** Using the **indicators** DataFrame, find the 'Indicator Code' associated with the following observables:
1. 'Population', find the 'Indicator Code' of the total population in a country;
2. 'GDP', find the GDP measured in current US Dollars;
3. 'GINI index'

*Hint: You can use the method STRING.str.contains('substring') to check whether a string contains a substring.*

In [22]:
population_code = indicators[indicators['Indicator Name'].str.contains('Population', case=False)]
population_code


Unnamed: 0_level_0,Indicator Name
Indicator Code,Unnamed: 1_level_1
EG.CFT.ACCS.ZS,Access to clean fuels and technologies for coo...
EG.CFT.ACCS.RU.ZS,Access to clean fuels and technologies for coo...
EG.CFT.ACCS.UR.ZS,Access to clean fuels and technologies for coo...
EG.ELC.ACCS.ZS,Access to electricity (% of population)
EG.ELC.ACCS.RU.ZS,"Access to electricity, rural (% of rural popul..."
...,...
SP.URB.TOTL.IN.ZS,Urban population (% of total population)
SP.URB.GROW,Urban population growth (annual %)
EN.POP.EL5M.UR.ZS,Urban population living in areas where elevati...
SH.MLR.NETS.ZS,Use of insecticide-treated bed nets (% of unde...


In [23]:
gdp_code = indicators[indicators['Indicator Name'].str.contains('GDP', case=False)]
gdp_code


Unnamed: 0_level_0,Indicator Name
Indicator Code,Unnamed: 1_level_1
NV.AGR.TOTL.ZS,"Agriculture, forestry, and fishing, value adde..."
FM.LBL.BMNY.GD.ZS,Broad money (% of GDP)
GC.DOD.TOTL.GD.ZS,"Central government debt, total (% of GDP)"
FS.AST.CGOV.GD.ZS,"Claims on central government, etc. (% GDP)"
FS.AST.DOMO.GD.ZS,Claims on other sectors of the domestic econom...
...,...
GC.TAX.TOTL.GD.ZS,Tax revenue (% of GDP)
NY.GDP.TOTL.RT.ZS,Total natural resources rents (% of GDP)
NE.TRD.GNFS.ZS,Trade (% of GDP)
BG.GSR.NFSV.GD.ZS,Trade in services (% of GDP)


In [24]:
gini_code = indicators[indicators['Indicator Name'].str.contains('GINI', case=False)]
gini_code


Unnamed: 0_level_0,Indicator Name
Indicator Code,Unnamed: 1_level_1
SI.POV.GINI,Gini index
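
The first two searches return many rows because 'Population' and 'GDP' occur in many indicator names. One possible way to narrow a search down to a single code is to match a more specific name literally. The sketch below assumes the names 'Population, total' and 'GDP (current US$)' and the codes attached to them; treat these as assumptions and confirm them against your own **indicators** DataFrame. `toy_indicators` and the `*_str` variables are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the indicators DataFrame
toy_indicators = pd.DataFrame(
    {
        "Indicator Name": [
            "Population, total",                        # assumed name
            "Urban population (% of total population)",
            "GDP (current US$)",                        # assumed name
            "Trade (% of GDP)",
            "Gini index",
        ],
        "Indicator Code": [
            "SP.POP.TOTL",      # assumed code
            "SP.URB.TOTL.IN.ZS",
            "NY.GDP.MKTP.CD",   # assumed code
            "NE.TRD.GNFS.ZS",
            "SI.POV.GINI",
        ],
    }
).set_index("Indicator Code")

names = toy_indicators["Indicator Name"]

# regex=False matches the text literally, so '(', ')' and '$' are not treated as regex syntax
population_mask = names.str.contains("Population, total", regex=False)
gdp_mask = names.str.contains("GDP (current US$)", regex=False)

# The index holds the codes, so the first index label of each match is the code string
population_code_str = toy_indicators[population_mask].index[0]
gdp_code_str = toy_indicators[gdp_mask].index[0]

print(population_code_str, gdp_code_str)
```

Keeping the result as a plain string, rather than as a filtered DataFrame, makes it directly usable in later filters.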


## Extracting and Cleaning Data from WDI and PWT

**15.** From **wdi** extract the columns 'Indicator Code', 'Country Code', and '2012'.
Save the output in variable **wdi_sample**.

*Note: You should be able to perform all operations in a single line of code. <br/>*

In [25]:
wdi_sample = wdi[['Indicator Code', 'Country Code', '2012']]
wdi_sample


Unnamed: 0,Indicator Code,Country Code,2012
0,EG.CFT.ACCS.ZS,AFE,16.466945
1,EG.CFT.ACCS.RU.ZS,AFE,6.202221
2,EG.CFT.ACCS.UR.ZS,AFE,37.485980
3,EG.ELC.ACCS.ZS,AFE,31.666038
4,EG.ELC.ACCS.RU.ZS,AFE,19.369552
...,...,...,...
395803,SG.VAW.REFU.ZS,ZWE,
395804,SP.M15.2024.FE.ZS,ZWE,
395805,SP.M18.2024.FE.ZS,ZWE,
395806,SH.DYN.AIDS.FE.ZS,ZWE,59.241862


**16.** Select from **wdi_sample** the rows associated with the Indicator Codes that you found in question 14, which concern the data of the 'Population total', 'GDP', and 'GINI index'.

In [26]:
wdi_sample_filtered = wdi_sample[wdi_sample['Indicator Code'].isin([population_code, gdp_code, gini_code])]
wdi_sample_filtered.head()


Unnamed: 0,Indicator Code,Country Code,2012
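
The empty result above is consistent with `isin` having received three DataFrames (the search results from question 14) instead of indicator code strings, so no row can match. A minimal sketch of the same selection with plain strings follows. `toy_sample` and its values are made up, and the codes 'SP.POP.TOTL' and 'NY.GDP.MKTP.CD' are assumptions that should be checked against **indicators**.

```python
import pandas as pd

# Hypothetical stand-in for wdi_sample (values are invented)
toy_sample = pd.DataFrame(
    {
        "Indicator Code": ["SP.POP.TOTL", "NY.GDP.MKTP.CD", "SI.POV.GINI", "EG.ELC.ACCS.ZS"],
        "Country Code": ["AAA", "AAA", "AAA", "AAA"],
        "2012": [1.0e7, 5.0e10, 35.0, 90.0],
    }
)

# isin compares each cell against the listed values, so the list must hold code strings
wanted_codes = ["SP.POP.TOTL", "NY.GDP.MKTP.CD", "SI.POV.GINI"]
selected = toy_sample[toy_sample["Indicator Code"].isin(wanted_codes)]

print(selected)
```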


**17.** Create a pivot table, in which the **values** are column '2012', the **index** is 'Country Code', and the **columns** are the Indicator Codes. <br/>

*Hint: Pandas has a very useful method to create pivot tables.*

In [27]:
pivot_table = wdi_sample_filtered.pivot_table(values='2012', index='Country Code', columns='Indicator Code')
pivot_table.head()


Indicator Code
Country Code
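
To show what the pivot is expected to look like once the selection is non-empty, here is a small sketch on invented long-format data. `toy_long`, the country codes 'AAA' and 'BBB', and all values are hypothetical.

```python
import pandas as pd

# Long format: one row per (country, indicator) pair
toy_long = pd.DataFrame(
    {
        "Indicator Code": ["SP.POP.TOTL", "SI.POV.GINI", "SP.POP.TOTL", "SI.POV.GINI"],
        "Country Code": ["AAA", "AAA", "BBB", "BBB"],
        "2012": [1.0e7, 35.0, 2.0e6, 41.0],
    }
)

# Wide format: one row per country, one column per indicator
toy_wide = toy_long.pivot_table(values="2012", index="Country Code", columns="Indicator Code")

print(toy_wide)
```

Note that `pivot_table` aggregates duplicate (index, column) pairs with the mean by default; with one value per pair, as here, the values pass through unchanged.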


In [28]:
wdi_sample.head()

Unnamed: 0,Indicator Code,Country Code,2012
0,EG.CFT.ACCS.ZS,AFE,16.466945
1,EG.CFT.ACCS.RU.ZS,AFE,6.202221
2,EG.CFT.ACCS.UR.ZS,AFE,37.48598
3,EG.ELC.ACCS.ZS,AFE,31.666038
4,EG.ELC.ACCS.RU.ZS,AFE,19.369552


**18.** Rename the column names of **wdi_sample** to 'Population', 'GDP', and 'GINI', accordingly.
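
One way to approach the renaming is `DataFrame.rename` with a mapping from old to new column names. The sketch below works on a made-up pivoted frame; `toy_wide`, its values, and the codes 'SP.POP.TOTL' and 'NY.GDP.MKTP.CD' are assumptions used only for illustration.

```python
import pandas as pd

# Hypothetical pivoted frame whose columns are indicator codes
toy_wide = pd.DataFrame(
    {
        "SP.POP.TOTL": [1.0e7, 2.0e6],
        "NY.GDP.MKTP.CD": [5.0e10, 8.0e9],
        "SI.POV.GINI": [35.0, 41.0],
    },
    index=pd.Index(["AAA", "BBB"], name="Country Code"),
)

# Map each indicator code to a readable column name
new_names = {
    "SP.POP.TOTL": "Population",
    "NY.GDP.MKTP.CD": "GDP",
    "SI.POV.GINI": "GINI",
}
toy_renamed = toy_wide.rename(columns=new_names)

print(toy_renamed.columns.tolist())
```

A mapping is safer than assigning a new list to `.columns`, because it does not depend on the order of the columns.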

**19.** From **pwt** select only the values of the year 2012. <br/>
Then, extract the columns 'countrycode' and 'hc' into a new variable **pwt_sample**. <br/>
Rename 'countrycode' to 'Country Code', so that it matches the same column in **wdi_sample**.<br/>

*Note: in this case 'hc' stands for the Human Capital Index.<br/>*
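
The three steps of question 19 can be sketched on a tiny stand-in for **pwt**. `toy_pwt` and its values are invented; only the column names 'countrycode', 'year', and 'hc' come from the dataset.

```python
import pandas as pd

# Hypothetical stand-in for pwt with just the relevant columns
toy_pwt = pd.DataFrame(
    {
        "countrycode": ["AAA", "AAA", "BBB"],
        "year": [2011, 2012, 2012],
        "hc": [2.0, 2.1, 3.0],
    }
)

# Keep the 2012 rows, select two columns, then rename the key column
toy_pwt_sample = (
    toy_pwt.loc[toy_pwt["year"] == 2012, ["countrycode", "hc"]]
    .rename(columns={"countrycode": "Country Code"})
)

print(toy_pwt_sample)
```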

**20.** Finally, create a new dataframe named **data** that contains the columns from **wdi_sample** and **pwt_sample**, matched by 'Country Code'. 

*Hint: Use the method concat(), and make sure both dataframes have the same index ('Country Code').*
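
A short sketch of why the shared index matters: with `axis=1`, `pd.concat` aligns rows by index label rather than by position. The frames `toy_left` and `toy_right` and their values are hypothetical.

```python
import pandas as pd

# Two hypothetical frames indexed by 'Country Code', deliberately in different row order
toy_left = pd.DataFrame(
    {"Population": [1.0e7, 2.0e6]},
    index=pd.Index(["AAA", "BBB"], name="Country Code"),
)
toy_right = pd.DataFrame(
    {"hc": [3.0, 2.1]},
    index=pd.Index(["BBB", "AAA"], name="Country Code"),
)

# axis=1 places the frames side by side and matches rows on the index labels
toy_data = pd.concat([toy_left, toy_right], axis=1)

print(toy_data)
```

By default the join is an outer join, so a country present in only one of the frames is kept and receives NaN in the columns coming from the other frame.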

# <span style="color:brown"> Part 3 - Analysing a Dataset </span>

**21.** Perform the necessary manipulations to answer the following questions. Unless otherwise stated, you can use the country codes to represent the countries in your solutions:
1. Which countries have a **population size of 10 million inhabitants +/- 1 million**?
2. What are the **average** and the **standard deviation of the GDP** for the countries listed in 1?
3. What are the **average** and the **standard deviation of the GDP per capita** for the countries listed in 1?
4. Consider the following classification of country size: <br/>
    Tiny - population < 1 000 000 <br/>
    Very Small - 1 000 000 <= population < 5 000 000 <br/>
    Small - 5 000 000 <= population < 15 000 000 <br/>
    Medium - 15 000 000 <= population < 30 000 000 <br/>
    Large - 30 000 000 <= population < 100 000 000 <br/>
    Huge - 100 000 000 <= population <br/>
   What are the **average** and the **standard deviation of the GDP per capita for the countries in each size classification**?   
5. Create a **function** that will take a dataframe and a column name. This function should **return** a series with binary values indicating whether the **values from the column are above the mean value of that column** (indicated with a value of 1 or 0 otherwise). If the value in the column is missing (NaN) the value in the series should also be missing (NaN). Test your function. *Hint:* search how to check if something is None so that we can return None. <br/>
6. What is the **average GDP per capita of the countries after being grouped by size classification and whether the human capital was above or below average**? *Hint: as an example, two of the groups should be (1) tiny and human capital below average, (2) tiny and human capital above average.*
7. What is the **average GDP per capita of the countries after being grouped by whether the human capital was above or below average and whether the gini coefficient was above or below average?**
8. What is the **name of the country** that has the **highest GDP per capita, a Gini coefficient below average and a level of human capital below average**?
9. What is the **name of the country** that has the **highest GDP per capita, a Gini coefficient below average for its size classification, and a level of human capital below average for its size classification**?
10. What is the **name of the country** that has the **largest % increase in GDP between 1980 and 2010?** *HINT: You will need to use the wdi dataframe.*

Write the necessary code to answer each question in a single cell. <br/>
Print the answer at the end of that cell.
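
As a hedged starting point for items 1 to 4, the sketch below works on invented data. `toy`, its country codes, and all values are hypothetical; using `pd.cut` for the size classes is one possible approach, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical data: seven made-up countries
toy = pd.DataFrame(
    {
        "Population": [5.0e5, 3.0e6, 1.0e7, 2.0e7, 5.0e7, 2.0e8, 9.5e6],
        "GDP": [1.0e9, 6.0e9, 4.0e10, 1.0e11, 5.0e11, 4.0e12, 1.9e10],
    },
    index=pd.Index(list("ABCDEFG"), name="Country Code"),
)
toy["GDP per capita"] = toy["GDP"] / toy["Population"]

# Item 1: population within 10 million +/- 1 million (bounds are inclusive)
near_10m = toy[toy["Population"].between(9_000_000, 11_000_000)]

# Item 4: right=False makes each bin closed on the left, i.e. lower <= population < upper
bins = [0, 1_000_000, 5_000_000, 15_000_000, 30_000_000, 100_000_000, np.inf]
labels = ["Tiny", "Very Small", "Small", "Medium", "Large", "Huge"]
toy["Size"] = pd.cut(toy["Population"], bins=bins, labels=labels, right=False)

# Mean and standard deviation of GDP per capita within each size class
summary = toy.groupby("Size", observed=True)["GDP per capita"].agg(["mean", "std"])

print(near_10m.index.tolist())
print(summary)
```

A size class with a single country has an undefined sample standard deviation, so pandas reports NaN for it.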

In [29]:
#1

In [30]:
#2

In [31]:
#3

In [32]:
#4

In [33]:
#5

In [34]:
#6

In [35]:
#7

In [36]:
#8

In [37]:
#9

In [38]:
#10
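
For item 5 of question 21, one possible shape for the function is sketched below. The name `above_mean`, the frame `toy`, and its values are hypothetical; the sketch relies on `Series.mean` skipping missing values and on `Series.where` to restore NaN where the input was missing.

```python
import numpy as np
import pandas as pd


def above_mean(df, column):
    """Return 1.0 where df[column] is above its mean, 0.0 otherwise, NaN where missing."""
    values = df[column]
    # Comparisons with NaN evaluate to False, so missing entries become 0.0 here...
    flags = (values > values.mean()).astype(float)
    # ...and are set back to NaN in this step
    return flags.where(values.notna())


# Hypothetical test data with one missing value; the mean of the rest is 2.0
toy = pd.DataFrame({"hc": [1.0, 3.0, np.nan, 2.0]})
result = above_mean(toy, "hc")

print(result)
```

The result is a float series because an integer series cannot hold NaN.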