# GDP Data Analysis with NumPy

-  We will analyze the [World Bank national GDP data](https://data.worldbank.org/indicator/NY.GDP.MKTP.CD) from 2012 to 2017.
-  Data file `GDP.csv` is downloadable from the class repository.

In [3]:
# import pandas
import pandas as pd
# import Numpy
import numpy as np

## Load the Data Set

-  The code snippet below reads the data set into a pandas DataFrame `df` and extracts the year columns as `gdp`, a NumPy ndarray.

In [39]:
# load the data set
df = pd.read_csv('GDP.csv')
# gdp is a numpy ndarray
gdp = df.loc[:, '2012':'2017'].values
df

| Unnamed: 0 | Country Name | Country Code | Region | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | AFG | South Asia | 0.0200 | 0.0206 | 0.0205 | 0.0199 | 0.0194 | 0.0202 |
| 1 | Albania | ALB | Europe & Central Asia | 0.0123 | 0.0128 | 0.0132 | 0.0114 | 0.0119 | 0.0130 |
| 2 | Algeria | DZA | Middle East & North Africa | 0.2091 | 0.2098 | 0.2138 | 0.1660 | 0.1601 | 0.1676 |
| 3 | American Samoa | ASM | East Asia & Pacific | 0.0006 | 0.0006 | 0.0006 | 0.0007 | 0.0007 | 0.0006 |
| 4 | Andorra | AND | Europe & Central Asia | 0.0032 | 0.0033 | 0.0034 | 0.0028 | 0.0029 | 0.0030 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 192 | Virgin Islands (U.S.) | VIR | Latin America & Caribbean | 0.0041 | 0.0038 | 0.0036 | 0.0037 | 0.0039 | 0.0039 |
| 193 | West Bank and Gaza | PSE | Middle East & North Africa | 0.0113 | 0.0125 | 0.0127 | 0.0127 | 0.0134 | 0.0145 |
| 194 | Yemen, Rep. | YEM | Middle East & North Africa | 0.0354 | 0.0404 | 0.0432 | 0.0426 | 0.0310 | 0.0268 |
| 195 | Zambia | ZMB | Sub-Saharan Africa | 0.0255 | 0.0280 | 0.0272 | 0.0212 | 0.0210 | 0.0259 |
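
The loading step relies on label-based slicing with `.loc`, which includes both endpoints of the slice. A minimal sketch on a hypothetical three-country table (the names and values in `toy` are illustrative and are not taken from `GDP.csv`) shows the idea. It uses `.to_numpy()`, which the pandas documentation recommends over `.values`; both return an ndarray here.

```python
import numpy as np
import pandas as pd

# Hypothetical table; values are made up for illustration only.
toy = pd.DataFrame({
    "Country Name": ["A", "B", "C"],
    "2012": [1.0, 2.0, 3.0],
    "2013": [1.5, 2.5, 3.5],
    "2014": [2.0, 3.0, 4.0],
})

# Label slices in .loc include both endpoints, so all three year columns are kept.
toy_gdp = toy.loc[:, "2012":"2014"].to_numpy()

print(type(toy_gdp))  # <class 'numpy.ndarray'>
print(toy_gdp.shape)  # (3, 3)
```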


## Explore Data

-  In this section and the next, explore and manipulate the data in the NumPy array `gdp`.
-  The array holds national GDP values (in trillions of US dollars) from 2012 through 2017. Each row is a country, and each column holds the GDP values for one year.
-  Write a Python code snippet with NumPy to answer each question. Do **not** use any explicit loop.
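
To illustrate what "no explicit loop" means in practice, here is a small sketch on a hypothetical array `toy` (three countries, two years; the values are made up). It contrasts a nested Python loop with the equivalent single NumPy call.

```python
import numpy as np

# Hypothetical array: rows = countries, columns = years.
toy = np.array([[1.0, 2.0],
                [3.0, 4.0],
                [5.0, 6.0]])

# Explicit loop: the style the exercise rules out.
total_loop = 0.0
for row in toy:
    for value in row:
        total_loop += value

# Vectorized equivalent: one call, no Python-level loop.
total_vec = toy.sum()

print(total_loop, total_vec)  # 21.0 21.0
```

The vectorized form is shorter and typically much faster on large arrays, because the iteration happens inside NumPy's compiled code.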

### Question 1. How many rows (countries) are there in array `gdp`?

In [42]:
gdp.shape[0]

197

### Question 2. How many columns (years) are there in array `gdp`?

In [37]:
gdp.shape[1]

6

### Question 3. What is the data type of array `gdp`?

In [321]:
# gdp is already an ndarray, so its dtype can be read directly
print(gdp.dtype)

float64


### Question 4. Output the first five countries' GDPs from 2013 through 2016 (from the second column through the fifth column). 

In [53]:
# rows 0-4 (first five countries), columns 1-4 (2013 through 2016)
gdp[:5, 1:5]

array([[0.0206, 0.0205, 0.0199, 0.0194],
       [0.0128, 0.0132, 0.0114, 0.0119],
       [0.2098, 0.2138, 0.166 , 0.1601],
       [0.0006, 0.0006, 0.0007, 0.0007],
       [0.0033, 0.0034, 0.0028, 0.0029]])

### Question 5. Output the last ten countries' GDPs in 2017 (the last column).

In [296]:
# last ten rows (countries), last column (2017)
gdp[-10:, -1]

### Question 6. Was the eighth country's GDP in 2017 higher than 0.5 trillion US Dollars? (True or false?)

In [301]:
# the eighth country is at row index 7 (indices start at 0); column 5 is 2017
gdp[7, 5] > 0.5

True

## Manipulate and Aggregate Data

### Question 7. How many GDP values in the array are higher than 0.5 trillion US Dollars?

*Hint: use `np.sum()` to count the number of True elements.*
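
As a quick illustration of the hint, the sketch below counts `True` values in a boolean array built from a hypothetical 3×2 array `toy` (values made up). In a sum, `True` counts as 1 and `False` as 0.

```python
import numpy as np

# Hypothetical values, not taken from GDP.csv.
toy = np.array([[0.2, 0.7],
                [0.9, 0.1],
                [0.6, 0.8]])

mask = toy > 0.5      # boolean array with the same shape as toy
count = np.sum(mask)  # True counts as 1, False as 0

print(count)  # 4
```

`np.count_nonzero(mask)` gives the same count and states the intent more directly.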

In [289]:
np.sum(gdp > 0.5)

142

### Question 8. How many countries had a GDP higher than 0.5 trillion US Dollars in 2017? 

In [291]:
np.sum(gdp[:, 5] > 0.5)

23

### Question 9. Out of those countries that had a GDP higher than 0.5 trillion US Dollars in 2017, how many countries' GDP in 2016 was lower than 0.5 trillion US Dollars?

In [281]:
np.sum(gdp[gdp[:, 5] > 0.5, 4] < 0.5)

1
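
The two-step filter in Question 9 combines a boolean row mask with a column index. A minimal sketch on a hypothetical four-country, two-year array `toy` (values made up) shows the same pattern, along with an equivalent form that combines both conditions with `&`.

```python
import numpy as np

# Hypothetical array: column 0 = earlier year, column 1 = later year.
toy = np.array([[0.40, 0.60],
                [0.70, 0.90],
                [0.20, 0.30],
                [0.45, 0.55]])

above_later = toy[:, 1] > 0.5         # rows above the threshold in the later year
earlier_values = toy[above_later, 0]  # earlier-year values of just those rows
newly_above = np.sum(earlier_values < 0.5)

# Equivalent: combine both conditions element-wise.
newly_above_alt = np.sum((toy[:, 1] > 0.5) & (toy[:, 0] < 0.5))

print(newly_above, newly_above_alt)  # 2 2
```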

### Question 10. How many countries had a lower GDP in 2017 than in 2016?

In [285]:
# count the countries whose 2017 value is below their 2016 value
np.sum(gdp[:, 5] < gdp[:, 4])

### Question 11. Output the row index of the country with the highest GDP in 2015.

*Hint: use `np.argmax()`.*
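
For reference, `np.argmax()` returns the index of the largest element, not the element itself. A small sketch on a hypothetical 1-D array `toy_2015` (values made up):

```python
import numpy as np

# Hypothetical one-year column of values.
toy_2015 = np.array([0.3, 2.5, 1.1, 0.9])

idx = np.argmax(toy_2015)  # position of the maximum

print(idx, toy_2015[idx])  # 1 2.5
```

If the maximum occurs more than once, `np.argmax()` returns the index of its first occurrence.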

In [293]:
np.argmax(gdp[:, 3])

187

### Question 12. Output the first fifteen countries' respective average yearly GDP from 2012 through 2017.

*Hint: use the `axis` parameter of `np.mean()`.*
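
The `axis` parameter names the dimension that gets collapsed. The sketch below, on a hypothetical array `toy` with three countries and two years (values made up), shows both directions.

```python
import numpy as np

# Hypothetical array: rows = countries, columns = years.
toy = np.array([[1.0, 3.0],
                [2.0, 6.0],
                [3.0, 9.0]])

per_country = np.mean(toy, axis=1)  # average across columns: one value per row
per_year = np.mean(toy, axis=0)     # average across rows: one value per column

print(per_country)  # [2. 4. 6.]
print(per_year)     # [2. 6.]
```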

In [297]:
np.mean(gdp[:15], axis=1)

array([2.01000000e-02, 1.24333333e-02, 1.87733333e-01, 6.33333333e-04,
       3.10000000e-03, 1.24983333e-01, 1.35000000e-03, 5.69866667e-01,
       1.09833333e-02, 2.61666667e-03, 1.41370000e+00, 4.12366667e-01,
       5.85000000e-02, 1.13500000e-02, 3.25666667e-02])

### Question 13. What was the global GDP in 2016 and 2017, respectively?

In [295]:
# sum over countries (axis=0) for the last two columns: 2016 and 2017
np.sum(gdp[:, -2:], axis=0)

array([74.9563, 79.663 ])

### Question 14. What was the fifth highest national GDP in 2017?

*Hint: use `np.sort()`.*
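
As a reminder of how the hint applies, `np.sort()` returns a sorted copy in ascending order, so negative indices pick values from the top. A minimal sketch on a hypothetical array `toy_2017` (values made up):

```python
import numpy as np

# Hypothetical one-year column of values.
toy_2017 = np.array([0.3, 2.5, 1.1, 0.9, 4.0])

ascending = np.sort(toy_2017)   # sorted copy; toy_2017 itself is unchanged
second_highest = ascending[-2]  # -1 is the largest, -2 the second largest

print(second_highest)  # 2.5
```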

In [300]:
np.sort(gdp[:, 5])[-5]

2.6526

### Question 15. How many countries' GDP increased by at least 30% from 2012 to 2017?

In [304]:
# a rise of at least 30% means the 2017 value is at least 1.3 times the 2012 value
np.sum(gdp[:, 5] >= 1.3 * gdp[:, 0])
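
A growth threshold such as the 30% in Question 15 can be written as an element-wise comparison between two columns. The sketch below uses hypothetical `start` and `end` arrays (values made up) and counts the entries whose end value is at least 1.3 times the start value.

```python
import numpy as np

# Hypothetical start-year and end-year values for four countries.
start = np.array([1.0, 2.0, 4.0, 0.5])
end = np.array([1.5, 2.5, 6.0, 0.6])

# Growth of at least 30% means end >= 1.3 * start, compared element by element.
grew_30 = end >= 1.3 * start
count = np.sum(grew_30)

print(count)  # 2
```

Comparing against `1.3 * start` rather than dividing `end` by `start` avoids division-by-zero warnings if a start value happens to be 0.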