# Trade network analysis
**Brian Dew (brianwdew@gmail.com)**

**03_deg_dist.ipynb**

This notebook estimates an alpha value for the degree distribution for each product in each year.

The distribution of weighted indegree and outdegree is estimated as $p(x) = Cx^{-\alpha}$.
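
For intuition about what the fit returns, here is a minimal sketch of the continuous maximum-likelihood estimator of $\alpha$ for a fixed lower cutoff, $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{min})$. The function name `alpha_mle`, the cutoff `xmin`, and the synthetic sample are illustrative assumptions, not part of the notebook. The `powerlaw` package also chooses the cutoff itself, which this sketch does not attempt.

```python
import numpy as np

def alpha_mle(x, xmin):
    """Continuous MLE of alpha for p(x) = C * x**(-alpha) on x >= xmin (xmin taken as given)."""
    x = np.asarray(x, dtype=float)
    tail = x[x >= xmin]                       # only observations at or above the cutoff are fitted
    return 1.0 + tail.size / np.sum(np.log(tail / xmin))

# Synthetic check: draw from a power law with known alpha by inverse-transform sampling.
rng = np.random.default_rng(0)
true_alpha, xmin = 2.5, 1.0                   # assumed values for the illustration
u = rng.random(100_000)
samples = xmin * (1.0 - u) ** (-1.0 / (true_alpha - 1.0))

alpha_hat = alpha_mle(samples, xmin)
print(round(alpha_hat, 2))                    # should land close to true_alpha
```

With a sample this large the estimate should sit within a few hundredths of the true value. Real degree data are far smaller per product, so the estimates will be noisier.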

Goal: optimize the code for speed. The notebook currently takes 76 seconds per year of data: roughly 30 seconds reading the pandas dataframes and splitting them by product, 13 seconds building the networks, and another 30 or so estimating the distribution.
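
A per-stage breakdown like the one above could be collected with a small timing helper. This is only a sketch: the `timed` context manager and the `timings` dictionary are hypothetical names, and the two timed bodies are stand-ins for the real read and fit steps.

```python
import time
from contextlib import contextmanager

timings = {}  # hypothetical accumulator: stage name -> total seconds spent

@contextmanager
def timed(stage):
    """Add the wall-clock time of the enclosed block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = timings.get(stage, 0.0) + (time.perf_counter() - start)

with timed('read'):
    time.sleep(0.01)                      # stand-in for reading and splitting the csv
with timed('fit'):
    sum(i * i for i in range(10000))      # stand-in for the distribution fit

for stage, seconds in timings.items():
    print('%s: %.3f s' % (stage, seconds))
```

Because the helper accumulates by stage name, it can wrap code inside the per-product loop and still report one total per stage for each year.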

In [1]:
import pandas as pd
import numpy as np
import networkx as nx
import powerlaw
import os
import time
os.chdir('C:/Working/trade_network/data/')
if not os.path.exists('summary'):
    os.makedirs('summary')

In [2]:
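def deg_dist(prod):
    """Fit a power law to one product's weighted out-degrees and store alpha in pld.

    Relies on the globals df (one year of trade flows) and pld (results dict).
    """
    try:
        # Directed network for this product: edges run i -> j, weighted by v
        # (from_pandas_dataframe and dict-style degrees are the networkx 1.x API)
        G = nx.from_pandas_dataframe(df[df.index == prod], 'i', 'j', 'v', nx.DiGraph())
        deg = G.out_degree(weight='v').values()
        pld[prod] = powerlaw.Fit(deg).power_law.alpha
    except Exception:
        # Skip products whose network cannot be built or fitted
        pass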

In [4]:
%%capture 
for y in map(str, range(2008,2015)):         # start year & end year + 1 
    # read csv file for year
    df = pd.read_csv('clean/baci07_' + y + '_clean.csv', index_col='hs6', header=0)
    df = df[['i','j','v']]         # take only relevant columns
    pld = {}                        # blank dictionary
    for prod in df.index.unique():  # fit alpha for every product (map() would be lazy in Python 3)
        deg_dist(prod)
    pld = pd.Series(pld)            # One series from all dictionaries (fast)
    # Save as csv (fast)
    pld.to_csv('summary/deg_dist_alpha_' + y + '.csv', index=True, float_format='%g')
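
One possible way to cut the splitting and network-building time is to compute weighted out-degrees directly in pandas with a single `groupby`, instead of filtering `df` once per product and building a graph each time. The sketch below uses a toy dataframe, and `out_strength` and `deg_by_prod` are hypothetical names. It assumes the BACI convention that `i` is the exporter and `j` the importer, and that each (`hs6`, `i`, `j`) combination appears at most once. A `DiGraph` would keep only the last of any duplicated edge, whereas the `groupby` sums them. Countries that only import have out-degree zero in the graph and are simply absent here, so the two approaches agree only if the fitting step ignores zeros.

```python
import pandas as pd

# Toy stand-in for one year's cleaned file: product (hs6), exporter (i), importer (j), value (v).
df = pd.DataFrame({
    'hs6': [10, 10, 10, 20, 20],
    'i':   [1, 1, 2, 1, 3],
    'j':   [2, 3, 3, 2, 2],
    'v':   [5.0, 7.0, 1.0, 2.0, 4.0],
}).set_index('hs6')

# Weighted out-degree of exporter i in product hs6 = sum of v over its outgoing flows.
out_strength = df.groupby(['hs6', 'i'])['v'].sum()

# Split once by product: product code -> array of exporter out-degrees.
deg_by_prod = {prod: s.values for prod, s in out_strength.groupby(level='hs6')}

for prod, deg in deg_by_prod.items():
    print(prod, deg)
```

Each array in `deg_by_prod` could then be passed to the fitting step in place of the graph-derived degrees. Whether this actually reduces the runtime on the full data would need to be measured.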