# Review Session 3 - Part 2

Linear algebra and complexity variables

Course: Tools of Economic Complexity

**Outline**

- Linear algebra basics
- Networks - construction and metrics
	- Adjacency matrix
	- Edge lists
	- Degree distributions
	- Centrality metrics
	- “Network backboning” - Michele / Frank
- Complexity metrics
	- ECI / PCI / Density
	- Industry spaces - co-production / co-location / co-coordination
	- Predicting product appearances
	- Backing out country CCA's
- Density regressions
	- Growth vs density, country FE
	- Product appearances vs density, country FE
- Growth regressions
	- Growth vs ECI etc.

In [1]:
%reset -f

In [2]:
# Helps while coding up modules to import
%reload_ext autoreload
%autoreload 2

In [3]:
# Basics
import os
import re
import sys
from pathlib import Path

# Data and plotting
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

In [4]:
# Set file paths
PROJ = Path(os.path.realpath("."))
ROOT = PROJ.parent
DATA = ROOT / "data/"

# Complexity Measures

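Proximity is calculated with the minimum conditional probability method, where $M_{cp}$ is the binary country-product matrix and $U_p = \sum_c M_{cp}$ is the ubiquity of product $p$:
$$\phi_{pp'} = \frac{\sum_c M_{cp} M_{cp'}}{\max(U_p, U_{p'})}$$

Density of product $p$ in country $c$ is the proximity-weighted share of related products $p'$ that the country already exports:
$$d_{cp} = \frac{\sum_{p'} M_{cp'}\phi_{pp'}}{\sum_{p'} \phi_{pp'}}$$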
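
To make the two formulas concrete, here is a minimal sketch on a toy binary matrix. The matrix values and the variable names (`M`, `U`, `cooc`, `phi`, `density`) are made up for illustration, and zeroing the diagonal is an assumption chosen to match the Atlas proximity file, which lists 0.0 for a product paired with itself.

```python
import numpy as np

# Toy binary matrix M (3 countries x 4 products); values are illustrative only
M = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
])

# Ubiquity U_p: number of countries that export each product
U = M.sum(axis=0)

# Co-export counts: entry (p, p') equals sum_c M_cp * M_cp'
cooc = M.T @ M

# Minimum conditional probability: divide by the larger of the two ubiquities
phi = cooc / np.maximum.outer(U, U)
np.fill_diagonal(phi, 0.0)  # assumption: no self-proximity

# Density d_cp: proximity-weighted share of related products country c exports
density = (M @ phi) / phi.sum(axis=0)

print(np.round(phi, 3))
print(np.round(density, 3))
```

Dividing by the larger ubiquity is what makes this the minimum of the two conditional probabilities, so `phi` is symmetric by construction.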

In [6]:
# Download HS-4 trade data
data_url = "https://intl-atlas-downloads.s3.amazonaws.com/country_hsproduct4digit_year.csv.zip"
trade = pd.read_csv(data_url, compression="zip", low_memory=False)
# Select years to include
trade = trade[trade.year >= 2010]
trade.head()

Unnamed: 0,year,export_value,import_value,export_rca,cog,distance,hs_eci,hs_coi,sitc_eci,sitc_coi,pci,location_code,location_name_short_en,hs_product_code,hs_product_name_short_en
15,2010,0.0,5350.0,0.0,0.067729,0.935516,1.262046,-0.145284,1.156783,0.269547,-0.00562,ABW,Aruba,101,Horses
16,2011,0.0,18966.0,0.0,0.224593,0.988999,-0.153565,-1.006434,0.074068,-0.815275,0.433045,ABW,Aruba,101,Horses
17,2012,0.0,29648.0,0.0,0.080129,0.982993,0.16714,-0.967966,0.372045,-0.742569,-0.183913,ABW,Aruba,101,Horses
18,2013,6199.0,110883.0,0.080352,0.115505,0.975545,0.487088,-0.83203,0.268743,-0.656154,0.079526,ABW,Aruba,101,Horses
19,2014,0.0,7500.0,0.0,0.144265,0.982488,-0.066792,-1.000541,-0.217015,-0.913747,0.270862,ABW,Aruba,101,Horses


In [7]:
# Proximities from the atlas
proxurl = (
    "http://intl-atlas-downloads.s3.amazonaws.com/atlas_2_16_6/hs92_proximities.csv"
)
proxdf = pd.read_csv(
    proxurl, dtype={"commoditycode_1": str, "commoditycode_2": str, "proximity": float}
)
proxdf.head()

Unnamed: 0,commoditycode_1,commoditycode_2,proximity
0,101,101,0.0
1,101,102,0.277778
2,101,103,0.352941
3,101,104,0.26087
4,101,105,0.296296
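
The proximities arrive as an edge list, with one row per product pair. For matrix operations such as the density formula, it can help to pivot the edge list into a square product-by-product matrix. The sketch below uses a two-product toy frame (`edges`, a hypothetical stand-in for `proxdf`), and it assumes the file contains both directions of each pair.

```python
import pandas as pd

# Toy edge list mimicking the layout of the proximity file (values illustrative)
edges = pd.DataFrame({
    "commoditycode_1": ["101", "101", "102", "102"],
    "commoditycode_2": ["101", "102", "101", "102"],
    "proximity": [0.0, 0.277778, 0.277778, 0.0],
})

# Pivot the long edge list into a square matrix indexed by product code
phi = edges.pivot(
    index="commoditycode_1", columns="commoditycode_2", values="proximity"
)
print(phi)
```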


In [8]:
set(trade.hs_product_code.unique()) - set(proxdf.commoditycode_1.unique())

{'9999', 'XXXX', 'financial', 'ict', 'transport', 'travel', 'unspecified'}

In [9]:
# Filter trade data to include "valid" products
trade = trade[trade.hs_product_code.isin(proxdf.commoditycode_1.unique())]

In [10]:
# Rectangularize
def fillin(df, entities):
    """STATA style 'fillin', makes sure all combinations of entities in the
    index are in the dataset."""
    df = df.set_index(entities)
    df = df.reindex(pd.MultiIndex.from_product(df.index.levels, names=df.index.names))
    return df.reset_index()
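
A small usage sketch shows what `fillin` does. The toy frame below is made up; the function body is repeated so the example runs on its own. Note that the added rows carry `NaN` rather than zero in the value columns, and that the function needs at least two entity columns, because `index.levels` exists only on a `MultiIndex`.

```python
import pandas as pd


def fillin(df, entities):
    """Add rows for every combination of the entity columns."""
    df = df.set_index(entities)
    df = df.reindex(pd.MultiIndex.from_product(df.index.levels, names=df.index.names))
    return df.reset_index()


# Toy data: the (2011, "ABW") combination is missing
toy = pd.DataFrame({
    "year": [2010, 2010, 2011],
    "location_code": ["ABW", "USA", "USA"],
    "export_value": [1.0, 2.0, 3.0],
})

filled = fillin(toy, ["year", "location_code"])
print(filled)
```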

In [11]:
len(trade)

2035610

In [12]:
# Rectangularize - fill in missing combinations
trade = fillin(trade, ["year", "location_code", "hs_product_code"])

In [13]:
len(trade)

2057160

In [14]:
from ecomplexity import ecomplexity, proximity

# Parameters
trade_cols = {
    "time": "year",
    "loc": "location_code",
    "prod": "hs_product_code",
    "val": "export_value",
}

# Calculate complexity
trade_complexity = ecomplexity(trade[list(trade_cols.values())], trade_cols)
trade_complexity.head()



2010
2011
2012
2013
2014
2015
2016


Unnamed: 0,location_code,hs_product_code,export_value,year,diversity,ubiquity,mcp,eci,pci,density,coi,cog,rca
0,ABW,101,0.0,2010,77.0,21.0,0.0,1.287289,0.465101,0.065219,0.011302,0.372326,0.0
1,ABW,102,0.0,2010,77.0,41.0,0.0,1.287289,-0.435292,0.071094,0.011302,-0.059898,0.0
2,ABW,103,0.0,2010,77.0,21.0,0.0,1.287289,1.996005,0.063716,0.011302,0.571354,0.0
3,ABW,104,0.0,2010,77.0,34.0,0.0,1.287289,-2.108885,0.070731,0.011302,-0.292291,0.0
4,ABW,105,2342.0,2010,77.0,31.0,0.0,1.287289,1.034382,0.069503,0.011302,0.420561,0.039898


In [15]:
# What do we know about these countries?
