# GDP Data Analysis with NumPy

-  We will analyze the [World Bank national GDP data](https://data.worldbank.org/indicator/NY.GDP.MKTP.CD) from 2012 to 2017.
-  Data file `GDP.csv` is downloadable from the class repository.

In [35]:
# import pandas
import pandas as pd
# import Numpy
import numpy as np

## Load the Data Set

-  The code snippet below reads the data set and generates `gdp`, a NumPy ndarray.

In [3]:
# load the data set
df = pd.read_csv('GDP.csv')
# gdp is a numpy ndarray
gdp = df.loc[:, '2012':'2017'].values
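The loading step relies on label-based slicing, which behaves differently from positional slicing. The sketch below illustrates this on a made-up miniature table; the frame `toy_df` and its values are hypothetical stand-ins, not the contents of `GDP.csv`.

```python
import pandas as pd

# Hypothetical miniature stand-in for GDP.csv; names and values are made up.
toy_df = pd.DataFrame({
    "Country": ["A", "B"],
    "2012": [1.0, 2.0],
    "2013": [1.5, 2.5],
    "2014": [2.0, 3.0],
})

# Label-based slices in .loc include BOTH endpoints, so "2014" is kept.
toy = toy_df.loc[:, "2012":"2014"].to_numpy()

print(type(toy).__name__)  # ndarray
print(toy.shape)           # (2, 3)
```

`.to_numpy()` returns the same kind of array as `.values` for an all-numeric selection like this one.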

## Explore Data

-  In this section and the next, explore and manipulate the data in the NumPy array `gdp`.
-  The array contains the national GDP data (in trillions of US dollars) from 2012 through 2017. Each row is a country, and each column holds the national GDP values for one year.
-  Write a Python code snippet with NumPy to answer each question. Do **not** use explicit loops.

### Question 1. How many rows (countries) are there in array `gdp`?

In [5]:
gdp.shape[0]

197

### Question 2. How many columns (years) are there in array `gdp`?

In [6]:
gdp.shape[1]

6

### Question 3. What is the data type of array `gdp`?

In [7]:
gdp.dtype

dtype('float64')
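As a quick illustration of the three attributes used in Questions 1 to 3, here is a minimal sketch on a made-up array (`toy` and its values are assumptions for demonstration only).

```python
import numpy as np

# Made-up values: 3 countries (rows) by 2 years (columns).
toy = np.array([[1.0, 1.1],
                [2.0, 2.2],
                [3.0, 3.3]])

n_rows, n_cols = toy.shape  # shape is a (rows, columns) tuple
print(n_rows, n_cols)       # 3 2
print(toy.dtype)            # float64
print(toy.size)             # 6 (total number of elements)
```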

In [14]:
full_data =gdp[:, :]
full_data

array([[0.02  , 0.0206, 0.0205, 0.0199, 0.0194, 0.0202],
       [0.0123, 0.0128, 0.0132, 0.0114, 0.0119, 0.013 ],
       [0.2091, 0.2098, 0.2138, 0.166 , 0.1601, 0.1676],
       ...,
       [0.0354, 0.0404, 0.0432, 0.0426, 0.031 , 0.0268],
       [0.0255, 0.028 , 0.0272, 0.0212, 0.021 , 0.0259],
       [0.0171, 0.0191, 0.0195, 0.02  , 0.0205, 0.0228]])

### Question 4. Output the first five countries' GDPs from 2013 through 2016 (from the second column through the fifth column). 

In [12]:
# rows 0-4 are the first five countries; columns 1-4 are 2013 through 2016
# the result has shape (5, 4)
c = gdp[:5, 1:5]
c
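NumPy slices are half-open: `start:stop` includes `start` but excludes `stop`. The sketch below shows this on a made-up 6x6 array whose entries encode their own position, so the selected block is easy to check by eye.

```python
import numpy as np

# Made-up 6x6 array; entry [r, c] equals 10*r + c.
toy = np.arange(6)[:, None] * 10 + np.arange(6)

# Rows 0-4 (the first five) and columns 1-4 (second through fifth).
block = toy[:5, 1:5]

print(block.shape)          # (5, 4)
print(block[0].tolist())    # [1, 2, 3, 4]
print(block[-1].tolist())   # [41, 42, 43, 44]
```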

### Question 5. Output the last ten countries' GDPs in 2017 (the last column).

In [42]:
last_ten = gdp[-10:, -1:]
last_ten

array([[1.94854e+01],
       [5.65000e-02],
       [5.92000e-02],
       [8.00000e-04],
       [2.23800e-01],
       [3.90000e-03],
       [1.45000e-02],
       [2.68000e-02],
       [2.59000e-02],
       [2.28000e-02]])
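The output above is a column with shape `(10, 1)` because the column index is a slice (`-1:`). Using an integer index (`-1`) would give a flat array instead. A minimal sketch of the difference, on a made-up array:

```python
import numpy as np

# Made-up 4x3 array holding the integers 0..11.
toy = np.arange(12).reshape(4, 3)

col_2d = toy[-2:, -1:]  # slice for the column keeps two dimensions
col_1d = toy[-2:, -1]   # integer for the column drops that dimension

print(col_2d.shape)     # (2, 1)
print(col_1d.shape)     # (2,)
print(col_1d.tolist())  # [8, 11]
```

Both forms select the same values; which one is preferable depends on whether later code expects a 1-D or a 2-D array.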

### Question 6. Was the eighth country's GDP in 2017 higher than 0.5 trillion US Dollars? (True or false?)

In [88]:
# row index 7 is the eighth country; column index 5 is 2017
gdp[7, 5] > 0.5

True
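A true-or-false question about a single value calls for integer indices on both axes, which yields one boolean. Slices yield an array of booleans instead. A small sketch with made-up numbers:

```python
import numpy as np

# Made-up values: 2 countries by 3 years.
toy = np.array([[0.2, 0.4, 0.6],
                [0.7, 0.3, 0.9]])

single = toy[1, 2] > 0.5     # one element -> one boolean
many = toy[1:2, 0:2] > 0.5   # a 1x2 block -> a 1x2 boolean array

print(bool(single))   # True
print(many.shape)     # (1, 2)
print(many.tolist())  # [[True, False]]
```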

## Manipulate and Aggregate Data

### Question 7. How many GDP values in the array are higher than 0.5 trillion US Dollars?

*Hint: use `np.sum()` to count the number of True elements.*

In [50]:
np.sum(gdp>0.5)

142
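Counting works because `True` is treated as 1 and `False` as 0 when a boolean array is summed. The sketch below, on made-up values, shows the overall count, the per-column count obtained with `axis=0`, and `np.count_nonzero()` as an equivalent alternative.

```python
import numpy as np

# Made-up values: 3 countries by 2 years.
toy = np.array([[0.2, 0.6],
                [0.7, 0.1],
                [0.9, 0.8]])

mask = toy > 0.5                 # boolean array, same shape as toy

total = np.sum(mask)             # count over the whole array
per_year = np.sum(mask, axis=0)  # one count per column
same = np.count_nonzero(mask)    # alternative to np.sum on a mask

print(total)              # 4
print(per_year.tolist())  # [2, 2]
print(same)               # 4
```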

### Question 8. How many countries had a GDP higher than 0.5 trillion US Dollars in 2017? 

In [44]:
np.sum(gdp>0.5, axis=0)[5]

23

### Question 9. Out of those countries that had a GDP higher than 0.5 trillion US Dollars in 2017, how many countries' GDP in 2016 was lower than 0.5 trillion US Dollars?

In [100]:
np.sum(gdp[gdp[:,5]>0.5,4]<0.5)


1
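The answer above filters rows with a boolean mask and then tests a second column of the surviving rows. A sketch of the same idea on made-up values, together with an equivalent form that combines both conditions with `&`:

```python
import numpy as np

# Made-up values: column 0 is the earlier year, column 1 the later year.
toy = np.array([[0.40, 0.60],   # crossed the 0.5 threshold
                [0.70, 0.80],   # above in both years
                [0.20, 0.30],   # below in both years
                [0.45, 0.55]])  # crossed the 0.5 threshold

above_later = toy[:, 1] > 0.5       # mask over rows
earlier_vals = toy[above_later, 0]  # earlier-year values of those rows only
crossed = np.sum(earlier_vals < 0.5)

# Equivalent: combine both conditions element-wise with &.
crossed_alt = np.sum((toy[:, 1] > 0.5) & (toy[:, 0] < 0.5))

print(crossed, crossed_alt)  # 2 2
```

The parentheses around each comparison are required, because `&` binds more tightly than `>` and `<` in Python.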

### Question 10. How many countries had a lower GDP in 2017 than in 2016?

In [48]:
np.sum(gdp[:, 5] < gdp[:, 4])

14

### Question 11. Output the row index of the country with the highest GDP in 2015.

*Hint: use `np.argmax()`.*

In [64]:
np.argmax(gdp[:,3])

187
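`np.argmax()` returns a position, not a value, so the result can be used to index back into the array. A minimal sketch on made-up numbers:

```python
import numpy as np

# Made-up values: 3 countries by 2 years.
toy = np.array([[1.0, 5.0],
                [9.0, 2.0],
                [3.0, 7.0]])

idx = np.argmax(toy[:, 0])        # row index of the largest value in column 0
per_col = np.argmax(toy, axis=0)  # one row index per column

print(idx)               # 1
print(toy[idx, 0])       # 9.0
print(per_col.tolist())  # [1, 2]
```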

### Question 12. Output the first fifteen countries' respective average yearly GDP from 2012 through 2017.

- Hint: use parameter `axis` in `np.mean()`.

In [51]:
np.mean(gdp[:15], axis=1)  # axis=1 averages across the columns (years), giving one mean per country

array([2.01000000e-02, 1.24333333e-02, 1.87733333e-01, 6.33333333e-04,
       3.10000000e-03, 1.24983333e-01, 1.35000000e-03, 5.69866667e-01,
       1.09833333e-02, 2.61666667e-03, 1.41370000e+00, 4.12366667e-01,
       5.85000000e-02, 1.13500000e-02, 3.25666667e-02])

### Question 13. What was the global GDP in 2016 and 2017, respectively?

In [41]:
# sum each of the last two columns (2016 and 2017) over all countries
np.sum(gdp[:, -2:], axis=0)

### Question 14. What was the fifth highest national GDP in 2017?

*Hint: use `np.sort()`.*

In [46]:
np.sort(gdp[:, 5])[-5]

2.6526
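`np.sort()` sorts in ascending order, so the k-th highest value sits at index `-k`. The sketch below checks this on made-up values and also shows `np.partition()` as one possible alternative that avoids a full sort; it is offered as an option, not as the method the exercise expects.

```python
import numpy as np

# Made-up values.
vals = np.array([3.0, 9.0, 1.0, 7.0, 5.0, 8.0])

asc = np.sort(vals)      # ascending copy; vals itself is unchanged
third_highest = asc[-3]  # third from the end of the ascending order

# Alternative: partition so that index -3 holds its sorted-order value.
third_alt = np.partition(vals, -3)[-3]

print(asc.tolist())    # [1.0, 3.0, 5.0, 7.0, 8.0, 9.0]
print(third_highest)   # 7.0
print(third_alt)       # 7.0
```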

### Question 15. How many countries' GDP increased by at least 30% from 2012 to 2017?

In [102]:
# compare the 2017 column with the 2012 column, country by country
np.sum(gdp[:, 5] >= gdp[:, 0] * 1.3)
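Note that `gdp[5]` selects a row (the sixth country), whereas `gdp[:, 5]` selects a column (2017). A growth comparison across years needs columns. The sketch below illustrates the column-wise comparison on made-up numbers; `toy` and its values are assumptions for demonstration only.

```python
import numpy as np

# Made-up values: 3 countries by 3 years (first column = start, last = end).
toy = np.array([[1.0, 1.2, 1.5],   # +50% over the period
                [2.0, 2.1, 2.2],   # +10% over the period
                [4.0, 5.0, 6.0]])  # +50% over the period

start = toy[:, 0]   # first-year column, one value per country
end = toy[:, -1]    # last-year column, one value per country

grew = end >= 1.3 * start  # True where growth is at least 30%

print(toy[0].shape, toy[:, 0].shape)  # (3,) (3,)  row vs. column selection
print(np.sum(grew))                   # 2
print(np.flatnonzero(grew).tolist())  # [0, 2]
```

`np.flatnonzero()` returns the row indices where the condition holds, which is handy for checking which countries were counted.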