# GDP Data Analysis with NumPy

-  We will analyze the [World Bank national GDP data](https://data.worldbank.org/indicator/NY.GDP.MKTP.CD) from 2012 to 2017.
-  Data file `GDP.csv` is downloadable from the class repository.

In [1]:
# import pandas
import pandas as pd
# import Numpy
import numpy as np

## Load the Data Set

-  The code snippet below reads the data set into a pandas DataFrame and extracts the year columns as `gdp`, a NumPy ndarray.

In [2]:
# load the data set
df = pd.read_csv('GDP.csv')
# gdp is a numpy ndarray
gdp = df.loc[:, '2012':'2017'].values
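
To see what the label slice in the cell above does, here is a minimal sketch on a tiny in-memory CSV. The three rows, the `Country` column, and the names `csv_text`, `toy_df`, and `toy_gdp` are made up for illustration; the real `GDP.csv` may be laid out differently.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical three-row stand-in for GDP.csv (layout is an assumption).
csv_text = """Country,2012,2013,2014,2015,2016,2017
A,1.0,1.1,1.2,1.3,1.4,1.5
B,2.0,2.1,2.2,2.3,2.4,2.5
C,0.1,0.2,0.3,0.4,0.5,0.6
"""
toy_df = pd.read_csv(io.StringIO(csv_text))

# Label-based slices with .loc include BOTH endpoints,
# so '2012':'2017' keeps all six year columns and drops 'Country'.
toy_gdp = toy_df.loc[:, '2012':'2017'].values

print(type(toy_gdp))   # a NumPy ndarray
print(toy_gdp.shape)   # (3, 6)
print(toy_gdp.dtype)   # float64
```

Because the slice leaves out the text column, every remaining value is numeric and the resulting array gets a single floating-point dtype.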

## Explore Data

-  In this section and the next, you will explore and manipulate the data in the NumPy array `gdp`.
-  The array contains national GDP data (in trillions of US dollars) from 2012 through 2017. Each row is a country, and each column holds the GDP values for one year.
-  Write a Python code snippet with NumPy to answer each question. Do **not** use any explicit loop.
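
As a rough illustration of what "no explicit loop" means, the sketch below counts values above a threshold twice: once with nested `for` loops and once with a single vectorized expression. The array `toy` and its values are invented for this example.

```python
import numpy as np

# Hypothetical toy array: 3 countries (rows) x 2 years (columns).
toy = np.array([[0.2, 0.7],
                [1.5, 1.6],
                [0.4, 0.3]])

# Loop version: the style the exercises ask you to avoid.
count_loop = 0
for row in toy:
    for value in row:
        if value > 0.5:
            count_loop += 1

# Vectorized version: the comparison builds a boolean array,
# and summing it counts the True entries.
count_vec = int(np.sum(toy > 0.5))

print(count_loop, count_vec)  # 3 3
```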

### Question 1. How many rows (countries) are there in array `gdp`?

In [6]:
gdp.shape[0]

197

### Question 2. How many columns (years) are there in array `gdp`?

In [7]:
gdp.shape[1]

6

### Question 3. What is the data type of array `gdp`?

In [8]:
gdp.dtype

dtype('float64')

### Question 4. Output the first five countries' GDPs from 2013 through 2016 (from the second column through the fifth column). 

In [29]:
gdp[:5, 1:5]

array([[0.0206, 0.0205, 0.0199, 0.0194],
       [0.0128, 0.0132, 0.0114, 0.0119],
       [0.2098, 0.2138, 0.166 , 0.1601],
       [0.0006, 0.0006, 0.0007, 0.0007],
       [0.0033, 0.0034, 0.0028, 0.0029]])
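
The slicing rules used here can be checked on a small array whose values make the selected positions obvious. This is only a sketch; `toy`, `block`, and `tail` are hypothetical names, and the values 0 to 23 are not GDP data.

```python
import numpy as np

# Hypothetical 4x6 array standing in for `gdp`; row r holds 6*r .. 6*r+5.
toy = np.arange(24).reshape(4, 6)

# First two rows, columns 1 through 4: the stop index of a slice is excluded,
# so 1:5 selects the second through fifth columns.
block = toy[:2, 1:5]

# Last two rows, last column: negative indices count from the end.
tail = toy[-2:, -1]

print(block)  # rows [1 2 3 4] and [7 8 9 10]
print(tail)   # [17 23]
```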

### Question 5. Output the last ten countries' GDPs in 2017 (the last column).

In [40]:
gdp[-10:, -1]

array([1.94854e+01, 5.65000e-02, 5.92000e-02, 8.00000e-04, 2.23800e-01,
       3.90000e-03, 1.45000e-02, 2.68000e-02, 2.59000e-02, 2.28000e-02])

### Question 6. Was the eighth country's GDP in 2017 higher than 0.5 trillion US Dollars? (True or false?)

In [44]:
gdp[7, -1] > 0.5

True

## Manipulate and Aggregate Data

### Question 7. How many GDP values in the array are higher than 0.5 trillion US Dollars?

*Hint: use `np.sum()` to count the number of True elements.*

In [45]:
np.sum(gdp > 0.5)

142

### Question 8. How many countries had a GDP higher than 0.5 trillion US Dollars in 2017? 

In [47]:
np.sum(gdp[:, -1] > 0.5)

23

### Question 9. Out of those countries that had a GDP higher than 0.5 trillion US Dollars in 2017, how many countries' GDP in 2016 was lower than 0.5 trillion US Dollars?

In [50]:
np.sum((gdp[:, -1] > 0.5) & (gdp[:, -2] < 0.5))

1
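
Combining two conditions works element by element with `&`, and each comparison needs its own parentheses because `&` binds more tightly than `>` and `<`. A minimal sketch with invented numbers (the array `toy` and the mask names are assumptions for illustration):

```python
import numpy as np

# Hypothetical toy data: columns are "previous year" and "latest year".
toy = np.array([[0.4, 0.6],   # crossed above the threshold
                [0.7, 0.8],   # above in both years
                [0.3, 0.2],   # below in both years
                [0.9, 0.4]])  # fell below the threshold

above_now = toy[:, -1] > 0.5       # [ True,  True, False, False]
below_before = toy[:, -2] < 0.5    # [ True, False,  True, False]

# Element-wise AND keeps only rows where both conditions hold.
crossed = above_now & below_before

print(np.sum(crossed))          # 1
print(np.flatnonzero(crossed))  # [0], the row index that qualifies
```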

### Question 10. How many countries had a lower GDP in 2017 than in 2016?

In [52]:
np.sum(gdp[:, -1] < gdp[:, -2])

14

### Question 11. Output the row index of the country with the highest GDP in 2015.

*Hint: use `np.argmax()`.*

In [56]:
# Column -3 is 2015 (the columns run from 2012 to 2017).
np.argmax(gdp[:, -3])

187
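
`np.argmax()` returns a position rather than a value, which is what lets you look up the matching row afterwards. A small sketch with made-up numbers (`toy_year` is a hypothetical one-year column):

```python
import numpy as np

# Hypothetical single-year column for four countries.
toy_year = np.array([0.3, 2.5, 1.1, 2.5])

# Index of the largest value; for ties, the first occurrence is returned.
top_row = np.argmax(toy_year)

print(top_row)            # 1
print(toy_year[top_row])  # 2.5, the same as np.max(toy_year)
```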

### Question 12. Output the first fifteen countries' respective average yearly GDP from 2012 through 2017.

*Hint: use the `axis` parameter of `np.mean()`.*

In [57]:
np.mean(gdp[:15, :], axis=1)

array([2.01000000e-02, 1.24333333e-02, 1.87733333e-01, 6.33333333e-04,
       3.10000000e-03, 1.24983333e-01, 1.35000000e-03, 5.69866667e-01,
       1.09833333e-02, 2.61666667e-03, 1.41370000e+00, 4.12366667e-01,
       5.85000000e-02, 1.13500000e-02, 3.25666667e-02])
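
The `axis` argument names the dimension that gets collapsed: `axis=1` aggregates across the columns of each row, and `axis=0` aggregates down the rows of each column. A minimal sketch on an invented 2x3 array (`toy` and the result names are assumptions):

```python
import numpy as np

# Hypothetical toy array: 2 countries (rows) x 3 years (columns).
toy = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])

# axis=1 collapses the columns: one average per country.
per_country_mean = np.mean(toy, axis=1)

# axis=0 collapses the rows: one total per year.
per_year_total = np.sum(toy, axis=0)

print(per_country_mean)  # one value per row: 2.0 and 5.0
print(per_year_total)    # one value per column: 5.0, 7.0 and 9.0
```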

### Question 13. What was the global GDP in 2016 and 2017, respectively?

In [65]:
# Sum each year's column; the last two columns are 2016 and 2017.
np.sum(gdp[:, -2:], axis=0)

### Question 14. What was the fifth highest national GDP in 2017?

*Hint: use `np.sort()`.*

In [71]:
# np.sort() is ascending, so the fifth highest is fifth from the end.
np.sort(gdp[:, -1])[-5]

2.6526
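
`np.sort()` returns an ascending copy, so "the k-th highest" can be read off either by indexing from the end or by sorting the negated values. The sketch below checks that both routes agree on invented numbers (`values` is a hypothetical array):

```python
import numpy as np

# Hypothetical values for six countries in one year.
values = np.array([0.5, 3.0, 1.2, 2.2, 0.1, 4.0])

# Route 1: ascending sort, then count from the end.
third_highest_a = np.sort(values)[-3]

# Route 2: sort the negated values, take index 2, then negate back.
# Indexing binds tighter than unary minus, so this is -(np.sort(-values)[2]).
third_highest_b = -np.sort(-values)[2]

print(third_highest_a, third_highest_b)  # 2.2 2.2
```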

### Question 15. How many countries' GDP increased by at least 30% from 2012 to 2017?

In [72]:
# Relative change from 2012 (column 0) to 2017 (last column).
np.sum((gdp[:, -1] - gdp[:, 0]) / gdp[:, 0] >= 0.3)
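
Relative growth is computed as (end - start) / start, with the first column as the start year and the last column as the end year. Here is a minimal sketch on made-up data (`toy`, `start`, `end`, and `growth` are hypothetical names). Note that a start value of zero would make the division produce `inf` or `nan`, which this sketch does not handle.

```python
import numpy as np

# Hypothetical toy data: first column is the start year, last is the end year.
toy = np.array([[1.0, 1.5],   # +50%
                [2.0, 2.2],   # +10%
                [0.5, 1.0],   # +100%
                [4.0, 3.0]])  # -25%

start = toy[:, 0]
end = toy[:, -1]

# Element-wise relative change for every row at once.
growth = (end - start) / start

# "At least 30%" means >= 0.3.
n_fast = int(np.sum(growth >= 0.3))

print(n_fast)  # 2
```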