<a id="ID_top"></a>
## UNCOMTRADE API extractor

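This notebook downloads annual total trade data from the UN Comtrade API for each BRI reference country, merges the per-country downloads into one dataframe, and exports a cleaned version to the storage and live folders.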

#### Notebook sections:
    
|| [0|Top](#ID_top) || [1|Part1](#ID_part1) || [2|Part2](#ID_part2) || [3|Part3](#ID_part3) || [4|Part4](#ID_part4) || [5|Part5](#ID_part5) ||

#### Import all required packages

In [1]:
# %load s_package_import.py
# package library, use to ensure consistency across notebooks, refresh periodically
# general packages
import os # use with os.listdir(_path_)
import requests
import csv
import time
from datetime import datetime
from shutil import copyfile

# data analysis packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# custom scripts
import s_file_export
import s_filepaths
import s_un_comtrade_extract as s_un

#=== network analysis
import networkx as nx


In [2]:
# import ref file (in case above script is not triggered)
import s_filepaths

# declare local variables to work with
path_raw = s_filepaths.path_raw
path_raw_dl = s_filepaths.path_raw_dl
path_store = s_filepaths.path_store
path_live = s_filepaths.path_live

**Notebook settings**

In [3]:
create_ref_doc = True

<a id="ID_part1"></a>
### Set up fetching URL

Reference links: key to the URL query parameters [here](https://comtrade.un.org/api/swagger/ui/index#!/Data/Data_GetData) || reporter explanation [here](https://comtrade.un.org/data/doc/api/#reporters) || reporter IDs [here](https://comtrade.un.org/Data/cache/reporterAreas.json)
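
The request URLs printed in the run logs of this notebook have the form `https://comtrade.un.org/api/get?r=...&p=...&freq=A&ps=...&cc=TOTAL`. As a minimal sketch of how such a URL could be assembled with the standard library, the block below uses a hypothetical helper, `build_comtrade_url`. It is an illustration only and not the implementation inside `s_un_comtrade_extract`.

```python
from urllib.parse import urlencode

BASE_URL = "https://comtrade.un.org/api/get"  # endpoint as it appears in the run log


def build_comtrade_url(reporters, partners, years, commodity="TOTAL", freq="A"):
    """Hypothetical helper: join each list with commas and URL-encode the query."""
    params = {
        "r": ",".join(reporters),   # reporter codes
        "p": ",".join(partners),    # partner codes, or "all"
        "freq": freq,               # "A" = annual, as in the logged URLs
        "ps": ",".join(years),      # period(s)
        "cc": commodity,            # commodity code, TOTAL in this notebook
    }
    return f"{BASE_URL}?{urlencode(params)}"


url_single = build_comtrade_url(["4"], ["all"], ["2010"])
url_multi = build_comtrade_url(["8", "51"], ["all"], ["2013"])
print(url_single)
print(url_multi)
```

`urlencode` percent-encodes the comma between several codes as `%2C`, which is the same separator the notebook later stores in `url_comma`.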

In [4]:
if create_ref_doc:
    # test run, country 4 (AFG) as test case with only TOTAL trade for one year (2010), with no copy of file
    un_extract = s_un.f_un_comtrade_data(p_r_country = ["4"],p_p_country = ["all"],p_ps_years=["2010"],p_extra = "cc=TOTAL")
    s_file_export.f_df_export(un_extract[0][0],"un_com_0_test_ref",p_copy=False,p_loc1 = path_raw_dl)
else:
    print("Skipped")

WORKING ON | Country 4| URL https://comtrade.un.org/api/get?r=4&p=all&freq=A&ps=2010&cc=TOTAL
OBLIGATORY PAUSE
Export | ../Data/0_raw/1_auto_download/store_un_com_0_test_ref_20200623_2147.csv | COMPLETE
COPY   | SKIP


<a id="ID_part3"></a>
### Match to BRI reference countries

Take the UN Comtrade reference file of country codes and match it against the list of BRI countries on the ISO3 code.

In [8]:
#=== import (1) UN codes reference list
un_ref_data = pd.read_csv(f"{path_live}input_un_codes_ref.csv.gzip", compression="gzip")

# drop the stray index column if it is present (no error if it is absent)
un_ref_data = un_ref_data.drop(columns="Unnamed: 0", errors="ignore")

un_ref_data_clean = un_ref_data.reset_index(drop = True)

#=== import (2) BRI reference list
df_bri_list = pd.read_csv(f"{path_live}input_bri_countries_Dumor_Yao.csv.gzip", compression="gzip")

#=== match the two (1) & (2)
df_bri_matched = df_bri_list.merge(un_ref_data_clean,left_on = "iso_3",right_on = "pt3ISO")
df_bri_matched.head()

Unnamed: 0.1,Unnamed: 0,BRI_Country,iso_3,pt3ISO,ptCode,ptTitle
0,0,Albania,ALB,ALB,8,Albania
1,1,Armenia,ARM,ARM,51,Armenia
2,2,Austria,AUT,AUT,40,Austria
3,3,Azerbaijan,AZE,AZE,31,Azerbaijan
4,4,Bangladesh,BGD,BGD,50,Bangladesh
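
The merge above is an inner join, so any BRI country whose `iso_3` code has no counterpart in `pt3ISO` is dropped without warning. A small sketch of how unmatched rows could be surfaced, using toy data rather than the real input files (the country "Atlantis" and code "XXX" are made up for illustration):

```python
import pandas as pd

# toy stand-ins for df_bri_list and un_ref_data_clean
bri = pd.DataFrame({
    "BRI_Country": ["Albania", "Armenia", "Atlantis"],
    "iso_3": ["ALB", "ARM", "XXX"],
})
un = pd.DataFrame({
    "pt3ISO": ["ALB", "ARM", "AUT"],
    "ptCode": [8, 51, 40],
})

# a left join with indicator=True adds a "_merge" column showing where each row came from
checked = bri.merge(un, left_on="iso_3", right_on="pt3ISO", how="left", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "BRI_Country"].tolist()
print(unmatched)
```

An empty `unmatched` list would mean every BRI country found a UN code.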


<a id="ID_part4"></a>
### Loop through download

In [10]:
# URL settings
url_comma = "%2C"
url_add = "&"

extra_cc = "cc=TOTAL"

In [20]:
# for every BRI country download data

df_collection = []
length = len(df_bri_matched.ptCode)

for index,entry in enumerate(list(df_bri_matched.ptCode)):
    #=== reporting
    temp_entry_name = list(df_bri_matched.BRI_Country)[index]
    print(f"Working on | {temp_entry_name} | {index+1}/{length} (~{round(((index+1)/length)*100)}%)")
    
    #=== run functions to extract
    dl_year = "2013"
    un_extract = s_un.f_un_comtrade_data(p_r_country = [str(entry)],p_p_country = ["all"],p_ps_years=[dl_year],p_extra = extra_cc)
    
    try:
        s_file_export.f_df_export(un_extract[0][0],f"un_com_{temp_entry_name}_{dl_year}_ref",p_copy=False,p_loc1=path_raw_dl,p_loc1_pre="dl_")
        df_collection.append(un_extract[0][0])
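    except Exception:
        # a bare "except:" would also swallow KeyboardInterrupt; record the failed reporter code instead
        df_collection.append(("Missing",entry))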

Working on | Albania | 1/93 (~1%)
WORKING ON | Country 8| URL https://comtrade.un.org/api/get?r=8&p=all&freq=A&ps=2013&cc=TOTAL
OBLIGATORY PAUSE
Export | ../Data/0_raw/1_auto_download/dl_un_com_Albania_2013_ref_20200623_2232.csv | COMPLETE
COPY   | SKIP
Working on | Armenia | 2/93 (~2%)
WORKING ON | Country 51| URL https://comtrade.un.org/api/get?r=51&p=all&freq=A&ps=2013&cc=TOTAL
OBLIGATORY PAUSE
Export | ../Data/0_raw/1_auto_download/dl_un_com_Armenia_2013_ref_20200623_2232.csv | COMPLETE
COPY   | SKIP
Working on | Austria | 3/93 (~3%)
WORKING ON | Country 40| URL https://comtrade.un.org/api/get?r=40&p=all&freq=A&ps=2013&cc=TOTAL
OBLIGATORY PAUSE
Export | ../Data/0_raw/1_auto_download/dl_un_com_Austria_2013_ref_20200623_2232.csv | COMPLETE
COPY   | SKIP
Working on | Azerbaijan | 4/93 (~4%)
WORKING ON | Country 31| URL https://comtrade.un.org/api/get?r=31&p=all&freq=A&ps=2013&cc=TOTAL
OBLIGATORY PAUSE
Export | ../Data/0_raw/1_auto_download/dl_un_com_Azerbaijan_2013_ref_20200623_2232.c

DL ATTEMPT | Country 352| URL https://comtrade.un.org/api/get?r=352&p=all&freq=A&ps=2013&cc=TOTAL | not ok, data not processed further
OBLIGATORY PAUSE
Working on | Malaysia | 31/93 (~33%)
WORKING ON | Country 458| URL https://comtrade.un.org/api/get?r=458&p=all&freq=A&ps=2013&cc=TOTAL
DL ATTEMPT | Country 458| URL https://comtrade.un.org/api/get?r=458&p=all&freq=A&ps=2013&cc=TOTAL | not ok, data not processed further
OBLIGATORY PAUSE
Working on | India | 32/93 (~34%)
WORKING ON | Country 699| URL https://comtrade.un.org/api/get?r=699&p=all&freq=A&ps=2013&cc=TOTAL
DL ATTEMPT | Country 699| URL https://comtrade.un.org/api/get?r=699&p=all&freq=A&ps=2013&cc=TOTAL | not ok, data not processed further
OBLIGATORY PAUSE
Working on | Indonesia | 33/93 (~35%)
WORKING ON | Country 360| URL https://comtrade.un.org/api/get?r=360&p=all&freq=A&ps=2013&cc=TOTAL
DL ATTEMPT | Country 360| URL https://comtrade.un.org/api/get?r=360&p=all&freq=A&ps=2013&cc=TOTAL | not ok, data not processed further
OBLIGA

DL ATTEMPT | Country 508| URL https://comtrade.un.org/api/get?r=508&p=all&freq=A&ps=2013&cc=TOTAL | not ok, data not processed further
OBLIGATORY PAUSE
Working on | Nepal | 60/93 (~65%)
WORKING ON | Country 524| URL https://comtrade.un.org/api/get?r=524&p=all&freq=A&ps=2013&cc=TOTAL
DL ATTEMPT | Country 524| URL https://comtrade.un.org/api/get?r=524&p=all&freq=A&ps=2013&cc=TOTAL | not ok, data not processed further
OBLIGATORY PAUSE
Working on | Netherlands | 61/93 (~66%)
WORKING ON | Country 528| URL https://comtrade.un.org/api/get?r=528&p=all&freq=A&ps=2013&cc=TOTAL
DL ATTEMPT | Country 528| URL https://comtrade.un.org/api/get?r=528&p=all&freq=A&ps=2013&cc=TOTAL | not ok, data not processed further
OBLIGATORY PAUSE
Working on | Norway | 62/93 (~67%)
WORKING ON | Country 579| URL https://comtrade.un.org/api/get?r=579&p=all&freq=A&ps=2013&cc=TOTAL
DL ATTEMPT | Country 579| URL https://comtrade.un.org/api/get?r=579&p=all&freq=A&ps=2013&cc=TOTAL | not ok, data not processed further
OBLIGA

Working on | Zambia | 88/93 (~95%)
WORKING ON | Country 894| URL https://comtrade.un.org/api/get?r=894&p=all&freq=A&ps=2013&cc=TOTAL
DL ATTEMPT | Country 894| URL https://comtrade.un.org/api/get?r=894&p=all&freq=A&ps=2013&cc=TOTAL | not ok, data not processed further
OBLIGATORY PAUSE
Working on | Zimbabwe | 89/93 (~96%)
WORKING ON | Country 716| URL https://comtrade.un.org/api/get?r=716&p=all&freq=A&ps=2013&cc=TOTAL
DL ATTEMPT | Country 716| URL https://comtrade.un.org/api/get?r=716&p=all&freq=A&ps=2013&cc=TOTAL | not ok, data not processed further
OBLIGATORY PAUSE
Working on | South Africa | 90/93 (~97%)
WORKING ON | Country 710| URL https://comtrade.un.org/api/get?r=710&p=all&freq=A&ps=2013&cc=TOTAL
DL ATTEMPT | Country 710| URL https://comtrade.un.org/api/get?r=710&p=all&freq=A&ps=2013&cc=TOTAL | not ok, data not processed further
OBLIGATORY PAUSE
Working on | Spain | 91/93 (~98%)
WORKING ON | Country 724| URL https://comtrade.un.org/api/get?r=724&p=all&freq=A&ps=2013&cc=TOTAL
DL AT

KeyboardInterrupt: 
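
Many requests in the log above end in `not ok, data not processed further`. One possible cause is the API throttling repeated requests, although the log itself does not confirm this. Under that assumption, a retry with a growing pause could be sketched as follows. `fetch_with_retry` and the `fetch` callable are hypothetical names; the real download function in `s_un_comtrade_extract` may behave differently.

```python
import time


def fetch_with_retry(fetch, max_attempts=3, base_pause=1.0, sleep=time.sleep):
    """Call fetch() until it returns something other than None.

    Pauses base_pause, 2*base_pause, 4*base_pause, ... seconds between attempts.
    Returns None if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if result is not None:
            return result
        if attempt < max_attempts:
            sleep(base_pause * 2 ** (attempt - 1))
    return None


# demo with a fake download that fails twice before succeeding
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    return "data" if calls["n"] >= 3 else None


pauses = []  # record the pauses instead of actually sleeping
result = fetch_with_retry(flaky_fetch, sleep=pauses.append)
print(result, pauses)
```

Passing the `sleep` function as a parameter keeps the demo instant; in the notebook the default `time.sleep` would apply.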

In [15]:
# check number of entries (should be 93 regardless)
#len(df_collection)
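
Because failed downloads are stored as `("Missing", code)` tuples, the length of `df_collection` alone does not show how many downloads succeeded. A short sketch of splitting the collection, shown on toy data (the values are illustrative, not real results):

```python
import pandas as pd

# toy stand-in for df_collection: two successful downloads and one failure placeholder
collection = [
    pd.DataFrame({"rtCode": [8], "TradeValue": [100]}),
    ("Missing", 352),
    pd.DataFrame({"rtCode": [51], "TradeValue": [200]}),
]

frames = [item for item in collection if isinstance(item, pd.DataFrame)]
missing_codes = [item[1] for item in collection if not isinstance(item, pd.DataFrame)]
print(f"{len(frames)} downloaded, {len(missing_codes)} missing: {missing_codes}")
```

The `missing_codes` list could then serve as the input for a second download pass.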

In [17]:
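# merge all dataframes, skipping the ("Missing", code) placeholders left by failed downloads
df_un_com_master = pd.concat([item for item in df_collection if isinstance(item, pd.DataFrame)])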
# save entire download to download folder
s_file_export.f_df_export(df_un_com_master,f"un_com_master_{dl_year}",p_copy= False,p_loc1=path_raw_dl,p_loc1_pre="dl_")

Export | ../Data/0_raw/1_auto_download/dl_un_com_master_2012_20200623_2230.csv | COMPLETE
COPY   | SKIP


#### Clean dataframe and export to storage and live

In [18]:
columns = [
    # Partner / reporter info (6)
    "rtCode", "rt3ISO", "rtTitle", "ptCode", "pt3ISO", "ptTitle",
    # period and trade category and value information (3)
    "period", "rgDesc", "yr",
    # duplicate info? (6)
    "rgCode", "cmdCode", "TradeValue", "periodDesc", "pfCode", "cmdDescE",
]

df_un_com_focused = df_un_com_master.loc[:,columns]
df_un_com_focused.head()

Unnamed: 0,rtCode,rt3ISO,rtTitle,ptCode,pt3ISO,ptTitle,period,rgDesc,yr,rgCode,cmdCode,TradeValue,periodDesc,pfCode,cmdDescE
0,8,ALB,Albania,0,WLD,World,2012,Import,2012,1,TOTAL,4879829648,2012,H4,All Commodities
1,8,ALB,Albania,0,WLD,World,2012,Export,2012,2,TOTAL,1967918947,2012,H4,All Commodities
2,8,ALB,Albania,4,AFG,Afghanistan,2012,Import,2012,1,TOTAL,2853,2012,H4,All Commodities
3,8,ALB,Albania,12,DZA,Algeria,2012,Import,2012,1,TOTAL,4304512,2012,H4,All Commodities
4,8,ALB,Albania,12,DZA,Algeria,2012,Export,2012,2,TOTAL,118342,2012,H4,All Commodities
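
The column list marks some fields as possibly duplicated. One way to test that suspicion before dropping anything is sketched below on a few toy rows shaped like the output above (an illustration, not a check that was run on the full download):

```python
import pandas as pd

# toy rows mirroring the structure of df_un_com_focused
sample = pd.DataFrame({
    "period": [2012, 2012, 2012],
    "yr": [2012, 2012, 2012],
    "rgCode": [1, 2, 1],
    "rgDesc": ["Import", "Export", "Import"],
})

# "period" duplicates "yr" if the two columns agree on every row
period_matches_yr = bool((sample["period"] == sample["yr"]).all())

# "rgCode" duplicates "rgDesc" if each code maps to exactly one description
one_desc_per_code = bool((sample.groupby("rgCode")["rgDesc"].nunique() == 1).all())

print(period_matches_yr, one_desc_per_code)
```

If both checks hold on the real data, one column from each pair carries no extra information.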


In [19]:
# save year specific data frame
s_file_export.f_df_export(df_un_com_focused,f"un_com_{df_un_com_focused.yr.unique()[0]}")

Export | ../Data/1_raw_processed_backup/store_un_com_2012_20200623_2232.csv | COMPLETE
COPY   | ../Data/2_raw_processed_input/input_un_com_2012.csv.gzip | COMPLETE


<a id="ID_part5"></a>
### Part 5