# STEP 1: Motivation Pitch

* The theory of economic complexity, developed at the [Center for International Development](http://www.cid.harvard.edu) at Harvard University, tries to explain why some countries are poor and unstable while others are rich and prosperous.
* Over the past decade we have worked to develop a theory of growth that explains variations in countries' growth rates.


In [11]:
import urllib.request
import json

In [12]:
# Loading WDI data on countries (1995 - 2013)
with open("sourceData/countries_wdi.json") as data_file:    
    countries_wdi = json.load(data_file)

In [13]:
# Dictionary with some WDI indicators
countries_wdi[0]

{'continent': 'Africa',
 'country_id': 4,
 'id_topo': '24',
 'latitude': -8.81155,
 'longitude': 13.242,
 'name': 'Angola',
 'name_2char': 'ao',
 'name_3char': 'ago',
 'years': [{'GDP (current US$)': 5039534776.4902,
   'GDP per capita (constant 2005 US$)': 1045.35860686813,
   'Life expectancy at birth, total (years)': 42.0514634146341,
   'Population, total': 12104952.0,
   'Scientific and technical journal articles': 2.9,
   'balance': 1471258357.2939997,
   'eci': -2.244171,
   'gdp': 12650000000.0,
   'population': 12104952,
   'total_export_value': 3128656139.2939997,
   'total_import_value': 1657397782.0,
   'year': 1995},
  {'GDP (current US$)': 7526446605.51712,
   'GDP per capita (constant 2005 US$)': 1130.04558917706,
   'Life expectancy at birth, total (years)': 42.5016341463415,
   'Population, total': 12451945.0,
   'Scientific and technical journal articles': 1.0,
   'balance': 2230857716.224998,
   'eci': -1.71968,
   'gdp': 14070000000.0,
   'population': 12451945,
   

In [14]:
# Temporal data are nested under the 'years' attribute.
# Keep only the first year (1995), copying each record so countries_wdi stays intact
# and the cell can be re-run without a KeyError on 'years'.
countries_wdi_flat = {}
for index, item in enumerate(countries_wdi):
    first_year = item['years'][0]
    flat = {key: value for key, value in item.items() if key != 'years'}
    flat['total_export_value'] = first_year['total_export_value']
    flat['total_import_value'] = first_year['total_import_value']
    flat['year'] = first_year['year']
    countries_wdi_flat[index] = flat

In [15]:
# Flattened record for the first year (1995)
countries_wdi_flat[0]

{'continent': 'Africa',
 'country_id': 4,
 'id_topo': '24',
 'latitude': -8.81155,
 'longitude': 13.242,
 'name': 'Angola',
 'name_2char': 'ao',
 'name_3char': 'ago',
 'total_export_value': 3128656139.2939997,
 'total_import_value': 1657397782.0,
 'year': 1995}
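
The flattening above keeps only the first year. A minimal sketch of the same idea applied to every year is shown below, assuming each record has the shape of `countries_wdi[0]`. The helper name `flatten_country_years` and the sample record are hypothetical; the sample values are made up for illustration and are not WDI data.

```python
def flatten_country_years(countries,
                          fields=("total_export_value", "total_import_value", "year")):
    """Return one flat record per (country, year) without modifying the input."""
    rows = []
    for country in countries:
        # Country-level attributes: everything except the nested 'years' list.
        base = {key: value for key, value in country.items() if key != "years"}
        for yearly in country.get("years", []):
            row = dict(base)
            for field in fields:
                row[field] = yearly.get(field)
            rows.append(row)
    return rows


# Made-up sample shaped like one entry of countries_wdi (not real data).
sample = [{
    "name": "Examplia",
    "name_3char": "exa",
    "years": [
        {"year": 1995, "total_export_value": 10.0, "total_import_value": 5.0},
        {"year": 1996, "total_export_value": 12.0, "total_import_value": 6.0},
    ],
}]

rows = flatten_country_years(sample)
print(rows)
```

Because the function builds new dictionaries, the original list keeps its `years` key, and the resulting list of rows could be passed straight to `pd.DataFrame(rows)`.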

# Why do some countries grow while others don't?

In [16]:
# TODO: MAP OF THE WORLD BY GDP
# geomap = vistk.Geomap(id='name', color='continent', name='name',
#                                x='avg_products', y='nb_products', 
#                                r='value')
# geomap.draw(flat_data)

from IPython.display import IFrame
IFrame('http://cid-harvard.github.io/vis-toolkit/examples/geomap.html', width=700, height=350)

# GDP per capita
* Source: http://www.theworldeconomy.org/statistics.htm

In [10]:
# sourceData/world_gdp_0_1998.json

from IPython.display import IFrame
IFrame('http://127.0.0.1/rv/Dev/atlas-labs/davos/world_gdp.html', width=700, height=350)

# STEP 2

This is a picture of the world.

<img src='img/MCP_matrix.png'>

* Each dot on the matrix represents a significant export for one of the world’s countries.
* Each column in the graph is one country.
* Each row in the graph is one exported product.
* A strong color indicates that the product is ‘significant’ for that country.
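
The notebook does not say how ‘significant’ is decided. A common convention in the economic complexity literature is Balassa's revealed comparative advantage (RCA), with a product counted as significant when RCA is at least 1. The sketch below assumes that convention, and assumes rows are countries and columns are products. The function names and the toy export values are hypothetical.

```python
import numpy as np


def rca_matrix(exports):
    """Balassa RCA for a (countries x products) matrix of export values.

    Assumes every country and every product has non-zero total exports.
    """
    exports = np.asarray(exports, dtype=float)
    country_totals = exports.sum(axis=1, keepdims=True)  # total exports per country
    product_totals = exports.sum(axis=0, keepdims=True)  # world exports per product
    world_total = exports.sum()
    # Product's share in the country's basket divided by its share in world trade.
    return (exports / country_totals) / (product_totals / world_total)


def significant_exports(exports, threshold=1.0):
    """Binary matrix: 1 where RCA >= threshold, 0 elsewhere."""
    return (rca_matrix(exports) >= threshold).astype(int)


# Made-up export values: 3 countries (rows) x 3 products (columns).
toy_exports = np.array([
    [100.0, 50.0, 10.0],
    [20.0, 80.0, 0.0],
    [5.0, 5.0, 90.0],
])
toy_mcp = significant_exports(toy_exports)
print(toy_mcp)
```

In this toy example each country ends up with exactly one significant product, the one that dominates its export basket relative to world trade.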

Finding

* Countries that make few products tend to make products that many other countries also make.
* Countries that make many products also make products that few other countries make.

Takeaway

* The rarest products are typically found in the most highly diversified countries.
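
One way to make this pattern visible is to reorder the binary matrix so that the most diversified countries and the most widely exported products come first. The sketch below assumes rows are countries and columns are products; `sort_by_diversity_and_ubiquity` is a hypothetical helper and the toy matrix is made up.

```python
import numpy as np


def sort_by_diversity_and_ubiquity(m):
    """Reorder rows by descending row sum and columns by descending column sum."""
    m = np.asarray(m)
    row_order = np.argsort(-m.sum(axis=1), kind="stable")  # most diversified countries first
    col_order = np.argsort(-m.sum(axis=0), kind="stable")  # most widely exported products first
    return m[row_order][:, col_order]


# Made-up binary matrix: 4 countries (rows) x 4 products (columns).
toy = np.array([
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 1, 0],
])
sorted_toy = sort_by_diversity_and_ubiquity(toy)
print(sorted_toy)
```

After sorting, the ones collect in the upper-left triangle: the rarest product (last column) appears only in the most diversified country (first row). The sorted array can be passed to `plt.imshow` in the same way as the matrices plotted in this notebook.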

TODO
* Find SITC-4 Products matrix data
* 772 products for 129 countries (year 2000)


In [None]:
from IPython.display import Image
Image('img/MCP_matrix.png')

In [None]:
# NumPy is used to load and manipulate the country-product matrix
import numpy as np

In [None]:
mcp = np.fromfile("sourcedata/atlas-mcp-matrix.txt", dtype=int).reshape((124,1355))

In [None]:
len(mcp)

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

In [None]:
plt.imshow(mcp, interpolation='nearest', cmap=plt.cm.ocean, extent=(0.5,10.5,0.5,10.5), 
           aspect='auto')

In [None]:
mcp = mcp[mcp[:,0].argsort()]

In [None]:
plt.imshow(mcp, interpolation='nearest', cmap=plt.cm.ocean, extent=(0.5,10.5,0.5,10.5), 
           aspect='auto')

In [None]:
mcp

In [None]:
mcp_sum = mcp.sum(axis=1, dtype=int)

In [None]:
mcp_sum

In [None]:
np.sort(mcp_sum, axis=None)

In [None]:

# Generate a random matrix
matrix = np.random.rand(1161, 146)

# Sort rows by the first column; indexing returns a new array, so assign the result
matrix = matrix[matrix[:,0].argsort()]

In [None]:

plt.imshow(matrix, interpolation='nearest', cmap=plt.cm.ocean, extent=(0.5,10.5,0.5,10.5), 
           aspect='auto')
plt.colorbar()
plt.show()

## Ubiquity vs Diversity

In [17]:
# sourceData/diversification_ubiquity_hs4_1995_2012.json

from IPython.display import IFrame
IFrame('http://127.0.0.1/rv/Dev/atlas-labs/davos/scatterplot.html', width=700, height=350)

# STEP 3

Transition

* To understand which products offer countries the best export opportunities, we take a closer look at the products themselves.

Finding
* Diversity: the number of products each country exports
* Ubiquity: the number of countries that export a given product
* When a country exports a certain product, it shows that the country has the knowledge and capabilities needed to make it.
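
Both measures are sums over a binary country-product matrix. The sketch below assumes rows are countries and columns are products, with 1 marking a significant export. The function names and the toy matrix are hypothetical.

```python
import numpy as np


def diversity_and_ubiquity(m):
    """Return (diversity per country, ubiquity per product) for a binary matrix."""
    m = np.asarray(m)
    diversity = m.sum(axis=1)  # number of products each country exports
    ubiquity = m.sum(axis=0)   # number of countries exporting each product
    return diversity, ubiquity


def average_ubiquity_of_exports(m):
    """Mean ubiquity of the products in each country's export basket."""
    m = np.asarray(m, dtype=float)
    diversity, ubiquity = diversity_and_ubiquity(m)
    return (m @ ubiquity) / diversity


# Made-up binary matrix: 4 countries (rows) x 4 products (columns).
toy = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
])
diversity, ubiquity = diversity_and_ubiquity(toy)
avg_ubiquity = average_ubiquity_of_exports(toy)
print(diversity, ubiquity, avg_ubiquity)
```

In the toy matrix, average ubiquity rises as diversity falls: the least diversified country exports only the product that everyone else exports too.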

Takeaway
* Countries should focus on gathering the capabilities that will allow them to make more “valuable” products: products exported by few, highly diversified countries, which we refer to as more “complex” products.

TODO
* Formula
* Bipartite graph


In [None]:
from IPython.display import Image
Image('img/CP_bipartite.png')

In [None]:
from IPython.display import display, Math, Latex
display(Math(r'F(k) = \int_{-\infty}^{\infty} f(x) e^{2\pi i k} dx'))

# STEP 4: The product space

What we see

* When we plot all of the products and the export links between countries, we get a “network” of the world’s exports, which we call the “product space”.

Finding

* To increase their complexity and grow, countries should diversify and develop capabilities in many nodes at the center of the network.
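
The notebook does not state how links in the network are weighted. In the product space literature, two products are usually linked by their proximity: the number of countries exporting both, divided by the larger of the two ubiquities. The sketch below assumes that definition and a binary matrix with countries as rows and products as columns. `product_proximity` and the toy matrix are hypothetical.

```python
import numpy as np


def product_proximity(m):
    """Proximity between products: co-exports / max(ubiquity_p, ubiquity_q).

    Assumes every product is exported by at least one country.
    """
    m = np.asarray(m, dtype=float)
    co_exports = m.T @ m                  # entry (p, q): countries exporting both p and q
    ubiquity = m.sum(axis=0)
    larger_ubiquity = np.maximum.outer(ubiquity, ubiquity)
    proximity = co_exports / larger_ubiquity
    np.fill_diagonal(proximity, 0.0)      # drop self-links
    return proximity


# Made-up binary matrix: 4 countries (rows) x 4 products (columns).
toy = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
])
proximity = product_proximity(toy)
print(proximity.round(2))
```

The resulting symmetric matrix can serve as a weighted adjacency matrix; a network drawing would typically keep only the strongest links.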

TODO
* Load product space
* Split in two to compare years

In [None]:
from IPython.display import Image
Image('img/PS_comparison.png')

# STEP 5: Economic Complexity

Finding
* CID researchers have developed a model suggesting that present and future growth is driven by the ‘complexity’ of a country’s product space.
* Our research shows that economic complexity is highly correlated with income per capita.
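
The `eci` values in the data are precomputed, and the notebook does not show how. As a rough sketch, assuming the eigenvector formulation commonly used for the Economic Complexity Index, the index can be taken from the eigenvector with the second-largest eigenvalue of a country-by-country matrix built from the binary export matrix. The function name and toy matrix are hypothetical, and the sign is fixed so that the index correlates positively with diversity.

```python
import numpy as np


def economic_complexity_index(m):
    """Sketch of an ECI from a binary (countries x products) matrix.

    Assumes no empty rows or columns.
    """
    m = np.asarray(m, dtype=float)
    diversity = m.sum(axis=1)
    ubiquity = m.sum(axis=0)
    # m_tilde[c, d] = sum_p m[c, p] * m[d, p] / (diversity[c] * ubiquity[p])
    m_tilde = (m / diversity[:, None]) @ (m / ubiquity[None, :]).T
    eigenvalues, eigenvectors = np.linalg.eig(m_tilde)
    order = np.argsort(-eigenvalues.real)
    k = eigenvectors[:, order[1]].real    # eigenvector of the second-largest eigenvalue
    if np.corrcoef(k, diversity)[0, 1] < 0:
        k = -k                            # eigenvector sign is arbitrary; align with diversity
    return (k - k.mean()) / k.std()       # standardize to mean 0, std 1


# Made-up binary matrix: 4 countries (rows) x 4 products (columns).
toy = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
])
eci = economic_complexity_index(toy)
print(eci)
```

The first eigenvector of this matrix is constant across countries, which is why the second one is used.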

TODO
* Retrieve Data
* Plot scatterplot
* Add regression line
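
A minimal sketch of the regression step listed above, assuming ECI is compared with the logarithm of GDP per capita (the notebook does not specify the scale). The helper name `fit_log_income_on_eci` is hypothetical, and the numbers are made up to lie exactly on a line; they are not country data.

```python
import numpy as np


def fit_log_income_on_eci(eci, gdp_per_capita):
    """Least-squares line of log(GDP per capita) on ECI, plus their correlation."""
    eci = np.asarray(eci, dtype=float)
    log_income = np.log(np.asarray(gdp_per_capita, dtype=float))
    slope, intercept = np.polyfit(eci, log_income, deg=1)
    correlation = np.corrcoef(eci, log_income)[0, 1]
    return slope, intercept, correlation


# Made-up values for illustration only.
example_eci = [-2.0, -1.0, 0.0, 1.0, 2.0]
example_income = np.exp([6.0, 7.0, 8.0, 9.0, 10.0])

slope, intercept, correlation = fit_log_income_on_eci(example_eci, example_income)
print(slope, intercept, correlation)
```

With real data, the `eci` and `GDP per capita (constant 2005 US$)` fields of each yearly record in `countries_wdi` could supply the two inputs, and the fitted line can be drawn over a `plt.scatter` of the same points with `plt.plot`.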

In [None]:
from IPython.display import Image
Image('img/ECI_scatterplot.png')

In [None]:
# STEP 6: DIVERSIFICATION

In [None]:
#

In [19]:
# Data from the Atlas of Economic Complexity API
url_atlas = "http://atlas.cid.harvard.edu/api/export/fra/all/show/2013/?lang=en&product_classification=hs4&data_type=json"

#response = urllib.request.urlopen(url_atlas)

opener = urllib.request.build_opener()
f = opener.open(url_atlas)
json_file = json.loads(f.read().decode('utf-8'))
print(json_file['data'])

#data_atlas = json.load(response)
    

[{'rca': 0.8433344, 'value': 53800000.0, 'id': '0101', 'year': 1995, 'distance': 0.5653976, 'community_id': 106, 'pci': 0.9957224, 'share': 0.0006916276055926732, 'community_name': 'Animal & Animal Products', 'opp_gain': -0.0400303, 'item_id': 1, 'abbrv': '0101', 'name': 'Live horses, asses, mules or hinnies', 'code': '0101', 'color': '#FFE999'}, {'rca': 4.397496, 'value': 1360000000.0, 'id': '0102', 'year': 1995, 'distance': 0.518837, 'community_id': 106, 'pci': 1.335456, 'share': 0.01748352311535382, 'community_name': 'Animal & Animal Products', 'opp_gain': 0.0, 'item_id': 2, 'abbrv': '0102', 'name': 'Live bovine animals', 'code': '0102', 'color': '#FFE999'}, {'rca': 0.6037365, 'value': 57700000.0, 'id': '0103', 'year': 1995, 'distance': 0.5854357, 'community_id': 106, 'pci': 1.600859, 'share': 0.0007417641792322908, 'community_name': 'Animal & Animal Products', 'opp_gain': 0.054815, 'item_id': 3, 'abbrv': '0103', 'name': 'Live swine', 'code': '0103', 'color': '#FFE999'}, {'rca': 1.0
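
The query string can also be assembled with `urllib.parse.urlencode`, which avoids hand-written separators. The sketch below assumes the country code and year sit where `fra` and `2013` appear in `url_atlas`; both helper functions are hypothetical, and nothing is fetched from the network. The three sample records are trimmed copies of the first entries in the output above.

```python
from urllib.parse import urlencode


def build_atlas_export_url(country="fra", year=2013, classification="hs4"):
    """Rebuild the export URL used above from its parts (URL layout is an assumption)."""
    base = "http://atlas.cid.harvard.edu/api/export/{}/all/show/{}/".format(country, year)
    query = urlencode({
        "lang": "en",
        "product_classification": classification,
        "data_type": "json",
    })
    return base + "?" + query


def exports_with_comparative_advantage(records, threshold=1.0):
    """Names of products whose 'rca' field is at least the threshold."""
    return [record["name"] for record in records if record["rca"] >= threshold]


# Trimmed copies of the first three records printed above.
sample_records = [
    {"name": "Live horses, asses, mules or hinnies", "rca": 0.8433344},
    {"name": "Live bovine animals", "rca": 4.397496},
    {"name": "Live swine", "rca": 0.6037365},
]

url = build_atlas_export_url()
strong_products = exports_with_comparative_advantage(sample_records)
print(url)
print(strong_products)
```

Applied to the full response, the second helper would take `json_file['data']` as its `records` argument.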

In [21]:
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Set some Pandas options
pd.set_option('display.notebook_repr_html', False)
pd.set_option('display.max_columns', 20)
pd.set_option('display.max_rows', 25)

# Quick scatter of random points to check that inline plotting works
plt.plot(np.random.normal(size=100), np.random.normal(size=100), 'ro')

In [None]:
# https://github.com/ipython/ipywidgets/blob/477cb8046e3217b134762f53c66816c45d688a20/examples/Using%20Interact.ipynb

from ipywidgets import interact, interactive, fixed
import ipywidgets as widgets

In [None]:
def f(x):
    return plt.plot(np.random.normal(size=x), np.random.normal(size=x), 'ro')

In [None]:
interact(f, x=[0, 1000000]);

In [None]:
def f(x):
    return x

In [None]:
interact(f, x=True);

In [None]:
@interact(x=True, y=1.0)
def g(x, y):
    return (x, y)

In [None]:
interact(f, x=widgets.IntSlider(min=-10,max=30,step=1,value=10));

In [None]:
interact(f, x=['a', 'b'])

## Other datasets


https://www.quandl.com/api/v3/datasets/CHRIS/ICE_B1.json

In [None]:
url_atlas_countries = "http://atlas.cid.harvard.edu/api/dropdowns/countries/?lang=en"
opener = urllib.request.build_opener()
f = opener.open(url_atlas_countries)
file_atlas_countries = json.loads(f.read().decode('utf-8'))

In [None]:
file_atlas_countries

In [None]:
# Map the first element of each entry to the second
dict_atlas_countries = {}
for item in file_atlas_countries:
    dict_atlas_countries[item[0]] = item[1]

In [None]:
interact(f, x=dict_atlas_countries)

In [None]:
dict_atlas_countries

In [None]:
# TODO: update treemaps

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo('2FeugaLv5Bo')