<br><br><br>
<span style="font-size:36px;color:#ffffff;font-weight:bold;background-color:#134DD1">Building a Network Graph of Global Trade</span>
<br><br><br>

<span style="font-size:20px;color:#ffffff;font-weight:bold;background-color:#000000">What is this graph for?</span>

<b>It allows you to quickly gain insight into how international trade is organized for a particular product, identify trade links that countries have not reported, and identify countries and territories that have closed trade reporting or do not report trade at all.</b>

The network connects each country/territory to its top N largest (in USD) commodity suppliers. It displays:

- the direction of trade flow between the country/territory and its supplier (the difference between country-supplier exports and supplier-country exports);
- total commodity trade (exports + imports) in each country/territory;
- a parameter by choice:

    - the region each country/territory belongs to (Africa, Americas, Antarctica, Asia, Europe, Oceania, Special categories, i.e. bunkers, free zones);
    - the balance between exports and imports of the commodity for each country/territory (10/90%, 50/50%, 70/30%, etc.);
    - each country/territory's rank inside this network, depending on the number of its trade links (aka Google PageRank).
    
<br>
<span style="font-size:20px;color:#ffffff;font-weight:bold;background-color:#000000">What do we get by executing the code?</span>

An interactive network colored by one of the parameters listed above. Clicking a node displays the country's name and highlights the node together with all its trade connections. The network can be saved in .png format.
 
<br>
<span style="font-size:20px;color:#ffffff;font-weight:bold;background-color:#000000">Limitations</span>

To use the code, you need to register a Comtrade B2C account and get a free subscription key. [REGISTRATION GUIDE](https://unstats.un.org/wiki/display/comtrade/New+Comtrade+User+Guide#NewComtradeUserGuide-UNComtradeAPIManagement).

<br>
<br>
<span style="font-size:20px;color:#ffffff;font-weight:bold;background-color:#000000">Contents</span><a class="anchor" id="contents"></a>

* [Imports, Data Requests](#0)
* [Data Processing](#1)
* [Creating the layout in PyGraphviz](#2)
* [From PyGraphviz to Cytoscape layout](#3)
* [Building the graph](#4)


<br>
<br>
<span style="font-size:20px;color:#ffffff;font-weight:bold;background-color:#000000">Imports, Data Requests</span><a class="anchor" id="0"></a>

#### PACKAGES

In [1]:
import pandas as pd
import numpy as np
import math

import comtradeapicall                                      # To extract the data 
import time                                                 # from UN Comtrade

import pygraphviz as pgv                                    # To calculate the initial network positions

import networkx as nx                                       # To calculate nodes' PageRanks

import dash_cytoscape as cyto                               # To build the network
cyto.load_extra_layouts()
from plotly.colors import hex_to_rgb

from dash import Dash, html, Input, Output, callback, ctx   # To build the final layout
import dash_bootstrap_components as dbc
from datetime import datetime

import warnings

warnings.filterwarnings('ignore')

#### COMTRADE SUBSCRIPTION KEY

In [2]:
subscription_key = 'YOUR_SUBSCRIPTION_CODE'

#### COMMODITY AND COUNTRY CODES

In [3]:
commodities = pd.read_csv('comtrade_codes/harmonized-system.csv')  # Commodity HS-codes used by Comtrade. Source: 
                                                                   # https://github.com/datasets/harmonized-system
commodities.head()

Unnamed: 0,section,hscode,description,parent,level
0,I,1,Animals; live,TOTAL,2
1,I,101,"Horses, asses, mules and hinnies; live",01,4
2,I,10121,"Horses; live, pure-bred breeding animals",0101,6
3,I,10129,"Horses; live, other than pure-bred breeding an...",0101,6
4,I,10130,Asses; live,0101,6


In [4]:
reporters = pd.read_csv('comtrade_codes/reporterAreas.csv')  # Comtrade reporters, partners, and their codes. Source: 
partners = pd.read_csv('comtrade_codes/partnerAreas.csv')    # https://unstats.un.org/wiki/display/comtrade/Comtrade+Country+Code+and+Name

reporters.head()

Unnamed: 0,id,text
0,all,All
1,4,Afghanistan
2,8,Albania
3,12,Algeria
4,20,Andorra


#### REQUESTING THE DATA

In this notebook, I'll build a wheat and meslin (Comtrade code 1001) global trade network for the year 2022.

Export and import data for each country are requested separately.

The Comtrade API imposes rate limits, which is why I put a 10-second pause between my two requests here.
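The pausing pattern can be sketched as a small helper. This is only an illustration: `call_with_pause` and the stand-in callables are hypothetical names, not part of `comtradeapicall`, and the pause is set to zero so the demo runs instantly.

```python
import time


def call_with_pause(calls, pause_seconds=10):
    """Run zero-argument callables in order, sleeping between consecutive calls."""
    results = []
    for position, call in enumerate(calls):
        if position > 0:
            time.sleep(pause_seconds)  # wait before every call except the first
        results.append(call())
    return results


# Stand-in callables (assumption); in the notebook they would wrap
# the export request and the import request.
demo_results = call_with_pause(
    [lambda: 'exports', lambda: 'imports'], pause_seconds=0)
print(demo_results)
```

With more than two requests (several years or commodities), the same helper keeps the spacing between calls uniform.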

In [5]:
comtrade_exp = comtradeapicall.getFinalData(
    subscription_key,
    typeCode='C',                                   # C = commodities
    freqCode='A',                                   # A = annual
    clCode='HS',
    period=
    '2022',                                         # year; several years should be listed in order and separated by commas
    reporterCode=None,                              # all exporters
    cmdCode=
    '1001',                                         # wheat and meslin hs-code; several codes should be separated by commas
    flowCode='X',                                   # X = export
    partnerCode=None,                               # all importers
    partner2Code='0',
    customsCode='C00',
    motCode='0',
    maxRecords=250000)

time.sleep(10)

comtrade_imp = comtradeapicall.getFinalData(
    subscription_key,
    typeCode='C',
    freqCode='A',
    clCode='HS',
    period='2022',
    reporterCode=None,                              # all importers
    cmdCode='1001',
    flowCode='M',                                   # M = import
    partnerCode=None,                               # all exporters
    partner2Code='0',
    customsCode='C00',
    motCode='0',
    maxRecords=250000)

comtrade_imp = comtrade_imp[
    comtrade_imp['reporterCode'] != comtrade_imp['partnerCode']]  # Getting rid of a country's trade with itself

In [6]:
comtrade_exp.head()

Unnamed: 0,typeCode,freqCode,refPeriodId,refYear,refMonth,period,reporterCode,reporterISO,reporterDesc,flowCode,...,netWgt,isNetWgtEstimated,grossWgt,isGrossWgtEstimated,cifvalue,fobvalue,primaryValue,legacyEstimationFlag,isReported,isAggregate
0,C,A,20220101,2022,52,2022,24,,,X,...,333.053,True,0.0,False,,1506.098,1506.098,6,False,True
1,C,A,20220101,2022,52,2022,24,,,X,...,1.0,False,0.0,False,,62.966,62.966,0,False,True
2,C,A,20220101,2022,52,2022,24,,,X,...,177.0,False,0.0,False,,1030.987,1030.987,0,False,True
3,C,A,20220101,2022,52,2022,24,,,X,...,153.053,True,0.0,False,,404.934,404.934,6,False,True
4,C,A,20220101,2022,52,2022,24,,,X,...,2.0,False,0.0,False,,7.211,7.211,0,False,True


In [7]:
comtrade_imp.head()

Unnamed: 0,typeCode,freqCode,refPeriodId,refYear,refMonth,period,reporterCode,reporterISO,reporterDesc,flowCode,...,netWgt,isNetWgtEstimated,grossWgt,isGrossWgtEstimated,cifvalue,fobvalue,primaryValue,legacyEstimationFlag,isReported,isAggregate
0,C,A,20220101,2022,52,2022,20,,,M,...,66183.47,True,0.0,False,30892.92,,30892.92,6,False,True
1,C,A,20220101,2022,52,2022,20,,,M,...,66183.47,True,0.0,False,30892.92,,30892.92,6,False,True
2,C,A,20220101,2022,52,2022,24,,,M,...,,False,0.0,False,385749200.0,328298400.0,385749200.0,0,False,True
3,C,A,20220101,2022,52,2022,24,,,M,...,202555000.0,True,0.0,False,107330600.0,87659320.0,107330600.0,6,False,True
4,C,A,20220101,2022,52,2022,24,,,M,...,29080500.0,False,0.0,False,29922170.0,25451280.0,29922170.0,0,False,True



<br>
<span style="font-size:20px;color:#ffffff;font-weight:bold;background-color:#000000">Data Processing</span><a class="anchor" id="1"></a>

[UP](#contents)

<br>
<br>
We have two raw Comtrade dataframes now: 

* Each reporter's exports (in USD): total and to each of its partners. Exports are reported in Free On Board (FOB) terms.
* Each reporter's imports (in USD): total and from each of its partners. Imports are reported in Cost, Insurance and Freight (CIF) terms.

<b>Countries might not declare imports or exports to certain partners, partly or totally. For the network to be as complete as possible, we need to identify undeclared trade links:</b>

* If country A didn't report exports to country B but country B declared imports from A, I rely on B's data. 
* If country A didn't report any exports at all but other countries declared imports from A, I sum these countries' imports as A's total exports.
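The two fallback rules can be illustrated on a toy example in plain Python. The countries and values below are made up, and the dictionaries only stand in for the notebook's dataframes; this is a simplified sketch of the idea, not the processing code itself.

```python
# (exporter, importer) -> value in USD; all figures are invented for illustration.
reported_exports = {('A', 'B'): 100.0, ('A', 'C'): 40.0}   # declared by exporters
mirrored_imports = {('A', 'B'): 110.0, ('D', 'B'): 25.0}   # declared by importers

# Exporters' data takes priority where both sides reported a flow;
# importers' data fills in the flows the exporters left out.
trade_flows = {**mirrored_imports, **reported_exports}


def total_exports(country, flows):
    """Sum all flows in which `country` is the exporter."""
    return sum(value for (exporter, _), value in flows.items()
               if exporter == country)


print(trade_flows[('A', 'B')])          # A's own figure is kept
print(total_exports('D', trade_flows))  # D reported nothing; B's imports are used
```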

In [8]:
export_totals = comtrade_exp[comtrade_exp['partnerCode'] == 0][[             # Total exports by country
    'reporterCode', 'partnerCode', 'fobvalue'
]].groupby(['reporterCode', 'partnerCode']).agg('sum').reset_index()

import_totals = comtrade_imp[comtrade_imp['partnerCode'] == 0][[             # Total imports by country
    'reporterCode', 'partnerCode', 'cifvalue'
]].groupby(['reporterCode', 'partnerCode']).agg('sum').reset_index()

export_by_country = comtrade_exp[comtrade_exp['partnerCode'] != 0][[         # Exports by exporter & importer
    'reporterCode', 'partnerCode', 'fobvalue'
]].groupby(['reporterCode', 'partnerCode']).agg('sum').reset_index()

import_by_country = comtrade_imp[comtrade_imp['partnerCode'] != 0][[         # Imports by importer & exporter
    'reporterCode', 'partnerCode', 'cifvalue'
]].groupby(['reporterCode', 'partnerCode']).agg('sum').reset_index()


# I'll compute the initial node positions with PyGraphviz. When sizing the
# nodes, this tool takes the lengths of their indexes into account; to
# neutralize this factor, I create equal-length 3-character string indexes.
for dataset in [
        export_totals, import_totals, export_by_country, import_by_country
]:
    for col in ['reporterCode', 'partnerCode']:
        dataset[col] = dataset[col].astype(str).str.zfill(3)

export_by_country.columns = ['exporter', 'importer', 'value']
import_by_country.columns = ['importer', 'exporter', 'value']

#### NODES 

For PyGraphviz, we need to set each node's width and height; for a circular node, both equal its diameter (2\*sqrt(area/pi)). I'll make each node's area proportional to its country's total trade, with a minimum area so that small nodes stay visible.
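As a quick check of the sizing rule, here is a minimal sketch of the area-to-diameter conversion. The function name `node_diameter` is hypothetical; the constants 60 and 0.15 are the maximum and minimum areas used in the cell below.

```python
import math


def node_diameter(trade, max_trade, max_area=60, min_area=0.15):
    """Scale trade to an area in [min_area, max_area], then convert to a diameter."""
    area = max(trade / max_trade * max_area, min_area)
    return 2 * math.sqrt(area / math.pi)


largest = node_diameter(100, 100)    # the country with the largest trade
quarter = node_diameter(25, 100)     # a quarter of the trade -> half the diameter
smallest = node_diameter(1e-6, 100)  # clamped to the minimum area
print(round(largest, 3), round(quarter, 3), round(smallest, 3))
```

Because area, not diameter, is proportional to trade, a country with four times the trade gets a node only twice as wide.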

In [9]:
total_trade = export_totals[[                                                    # Exports and imports reported officially 
    'reporterCode',                                                              # by countries
    'fobvalue'                                                               
]].set_index('reporterCode').join(
    import_totals[['reporterCode', 'cifvalue']].set_index('reporterCode'),
    how='outer').reset_index().rename(columns={
        'reporterCode': 'country_code',
        'fobvalue': 'export',
        'cifvalue': 'import'
    }).set_index('country_code')

for col in ['export', 'import']:                                                 # Marking the rows with missing official
    total_trade[col + '_note'] = [                                               # exports or imports data
        1 if math.isnan(x) else 0 for x in total_trade[col].tolist()
    ]

imports_to_add = export_by_country[[                                             # Partners' data: country's total exports 
    'exporter',                                                                  # according to its importers, and country's
    'importer',                                                                  # total imports according to its exporters         
    'value'
]].groupby('importer').agg('sum')                                            
exports_to_add = import_by_country[[
    'importer',
    'exporter',                   
    'value'
]].groupby('exporter').agg('sum')

trade_to_add = exports_to_add.rename(columns={
    'value': 'export_by_partners'
}).join(imports_to_add.rename(columns={'value': 'import_by_partners'}),
        how='outer')

df_nodes = total_trade.join(trade_to_add, how='outer')[[
    'export', 'import', 'export_by_partners', 'import_by_partners'
]]

df_nodes[['export', 'import', 'export_by_partners',
          'import_by_partners']] = df_nodes[[
              'export', 'import', 'export_by_partners', 'import_by_partners'
          ]].fillna(0)

df_nodes['trade'] = [                                                             # Total trade (sizing parameter)
    ex + im if ex != 0 and im != 0                                                # Counting by partners' data if official 
    else exp + im if ex == 0 and im != 0                                          # data is lacking
    else ex + imp if ex != 0 and im == 0 else exp + imp for ex, im, exp, imp in
    zip(df_nodes['export'], df_nodes['import'], df_nodes['export_by_partners'],
        df_nodes['import_by_partners'])
]

df_nodes = df_nodes[df_nodes['trade'] >                                           # Keep only the countries with non-zero trade
                    0]  

df_nodes['trade_rescaled'] = df_nodes['trade'] / df_nodes['trade'].max(
) * 60                                                                            # Rescaling the sizes: max = 60 and min = 0.15
df_nodes['trade_rescaled'] = [
    0.15 if x <= 0.15 else x for x in df_nodes['trade_rescaled'].tolist()
]

df_nodes['diameter'] = [                                                          # Nodes' diameters
    np.sqrt(x / np.pi) * 2 for x in df_nodes['trade_rescaled'].tolist()
]

diameter_dict = df_nodes['diameter'].to_dict()

#### LINKS 

At the moment, our country-to-country trade data represents country A's exports to country B and country B's exports to country A as two separate rows. We need to unify these rows into a single "A-B trade link" row and define which country is the "source" and which is the "target" of this trade link. The source is the side that exports more than it gets back.
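The orientation rule can be sketched in a few lines. `orient_link` is a hypothetical helper with made-up values; it uses the same tie rule as the cell below, where a zero difference keeps the first country as the source.

```python
def orient_link(country_a, country_b, a_to_b, b_to_a):
    """Return (source, target): the source is the side that exports more."""
    if a_to_b - b_to_a >= 0:
        return country_a, country_b
    return country_b, country_a


print(orient_link('A', 'B', a_to_b=500.0, b_to_a=120.0))  # A is the source
print(orient_link('A', 'B', a_to_b=10.0, b_to_a=300.0))   # B is the source
```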

In [10]:
export_by_country = export_by_country[                                            # Keep only countries with non-zero trade
    (export_by_country['exporter'].isin(df_nodes.index.tolist()))
    & (export_by_country['importer'].isin(df_nodes.index.tolist()))]
import_by_country = import_by_country[
    (import_by_country['exporter'].isin(df_nodes.index.tolist()))
    & (import_by_country['importer'].isin(df_nodes.index.tolist()))]


df_list = []                                                                      # Creating a dataset of mutual trade flows
                                                                                  # out of export and import dataframes
for dataset in [export_by_country, import_by_country]:

    dataset['source_target'] = [                                                  # a unique country-to-country link index
        '_'.join(sorted([exporter, importer]))
        for exporter, importer in zip(dataset['exporter'], dataset['importer'])
    ] 

    dataset['order'] = dataset.groupby('source_target').cumcount()

    df = dataset[dataset['order'] == 0].set_index('source_target')[[
        'exporter', 'importer', 'value'
    ]].rename(columns={
        'value': 'to'
    }).join(dataset[dataset['order'] == 1].set_index('source_target')[[
        'value'
    ]].rename(columns={'value': 'back'}))

    df_list.append(df)


source_target_df = df_list[0].combine_first(df_list[1])                           # Filling the missing values/interactions 
                                                                                  # from exporters' data with corresponding 
source_target_df = source_target_df.fillna(0)                                     # values/interactions declared by the importers

source_target_df[[                                                                # Defining the source and target side, 
    'source', 'target'                                                            # depending on exports worth
]] = [[exporter, importer] if to - back >= 0 else [importer, exporter]
      for exporter, importer, to, back in
      zip(source_target_df['exporter'], source_target_df['importer'],
          source_target_df['to'], source_target_df['back'])]

df_links = source_target_df.reset_index()[['source', 'target']]


rank_df = export_by_country[['importer', 'exporter', 'value']].set_index([       # Ranking trade partners for each country 
    'importer', 'exporter'                                                       # depending on the total cost of the commodity 
]).combine_first(import_by_country[['importer', 'exporter', 'value'              # delivered
                                    ]].set_index(['importer',
                                                  'exporter'])).reset_index()

rank_df['supplier_rank'] = rank_df.sort_values(
    by='value', ascending=False).groupby('importer').cumcount() + 1

supplier_rank_dict = rank_df.set_index(['importer', 'exporter'
                                        ])['supplier_rank'].to_dict()

df_links['supplier_rank_source'] = df_links.set_index(
    ['target', 'source']).index.map(supplier_rank_dict)

df_links['supplier_rank_target'] = df_links.set_index(
    ['source', 'target']).index.map(supplier_rank_dict)


link_dict = dict()                                                               # In the network, each country will be 
                                                                                 # connected to only its two largest suppliers. 
for e in export_by_country.exporter:                                             # At the same time, one of our ways of coloring
    link_dict[e] = []                                                            # network nodes is the number of trade links. 
for i in export_by_country.importer:                                             # Here we create a dictionary in which we save 
    if i not in link_dict.keys():                                                # the real, and not truncated, number of 
        link_dict[i] = []                                                        # connections for each country.
for i in import_by_country.importer:
    if i not in link_dict.keys():
        link_dict[i] = []
for e in import_by_country.exporter:
    if e not in link_dict.keys():
        link_dict[e] = []
        
for e in export_by_country.exporter:
    data = export_by_country[export_by_country.exporter == e]
    importers_list = data['importer'].unique().tolist()
    for i in importers_list:
        link_dict[e].append(i)
        
for i in export_by_country.importer:
    data = export_by_country[export_by_country.importer == i]
    exporters_list = data['exporter'].unique().tolist()
    for e in exporters_list:
        link_dict[i].append(e)
        
for e in import_by_country.exporter:
    data = import_by_country[import_by_country.exporter == e]
    importers_list = data['importer'].unique().tolist()
    for i in importers_list:
        link_dict[e].append(i)
        
for i in import_by_country.importer:
    data = import_by_country[import_by_country.importer == i]
    exporters_list = data['exporter'].unique().tolist()
    for e in exporters_list:
        link_dict[i].append(e)
        
for key in link_dict.keys():
    link_dict[key] = len(list(dict.fromkeys(link_dict[key])))

<br>
<span style="font-size:20px;color:#ffffff;font-weight:bold;background-color:#000000">Creating the layout in PyGraphviz</span><a class="anchor" id="2"></a>

[UP](#contents)
<br>
<br>
I'll connect each country to its top two wheat suppliers. You can control this number, depending on the particular network's density: sometimes there are too many unrelated countries, and sometimes they are so interconnected that three links already look messy.

I'll build the network graph in several steps: first, I'll create a network from my links dataframe; then I'll set the node parameters; finally, I'll compute a force-directed layout similar to the spring layout in NetworkX.
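The top-N filter keeps a link when either side ranks among the other side's N largest suppliers. A plain-Python sketch on made-up links follows; the dictionary keys `rank_source` and `rank_target` are shorthand for the notebook's `supplier_rank_source` and `supplier_rank_target` columns, and `None` marks a missing rank.

```python
n_links_demo = 2  # illustrative value, same as the notebook's n_links

# rank_source: the source's rank among the target's suppliers;
# rank_target: the target's rank among the source's suppliers.
links = [
    {'source': 'A', 'target': 'B', 'rank_source': 1, 'rank_target': None},
    {'source': 'A', 'target': 'C', 'rank_source': 3, 'rank_target': 5},
    {'source': 'D', 'target': 'A', 'rank_source': 4, 'rank_target': 2},
]


def is_top_n(rank, n):
    """True if the rank exists and falls within 1..n."""
    return rank is not None and 1 <= rank <= n


kept_links = [
    link for link in links
    if is_top_n(link['rank_source'], n_links_demo)
    or is_top_n(link['rank_target'], n_links_demo)
]
kept_pairs = [(link['source'], link['target']) for link in kept_links]
print(kept_pairs)
```

Raising `n_links_demo` to 3 would also keep the A-C link, which is how the density of the drawn network is controlled.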

In [11]:
n_links = 2 

In [12]:
df = df_links[
    (df_links['supplier_rank_source'].isin(list(np.arange(n_links + 1))[1:])) |
    (df_links['supplier_rank_target'].isin(list(np.arange(n_links +
                                                          1))[1:]))].copy()

layout_dict = dict()                                                              # A dict we'll pass to PyGraphviz

for s in df['source'].unique().tolist():                                          # For each country, set a list of trade  
    s_list = df[df['source'] == s]['target'].unique().tolist()                    # partners to connect with
    layout_dict[s] = s_list

G = pgv.AGraph(layout_dict)                                                       # The network is created

for i, node in enumerate(G.iternodes()):                                          # Some node attributes
    node.attr['shape'] = 'circle'
    node.attr['width'] = diameter_dict[node]
    node.attr['height'] = diameter_dict[node]
    node.attr['fixedsize'] = True
    node.attr[                                                                    # Minimizing the label sizes so that they
        'fontsize'] = 1                                                           # don't affect the nodes' sizes

G.layout(prog='fdp')                                                              # Layout: a force-directed graph

graph_width = int(float(G.graph_attr['bb'].split(',')[2]))                        # The width and height of a graph
graph_height = int(float(G.graph_attr['bb'].split(',')[3]))                       # to fit its elements on a page ('bb' values may be fractional)

<br>
<span style="font-size:20px;color:#ffffff;font-weight:bold;background-color:#000000">From PyGraphviz to Cytoscape layout</span><a class="anchor" id="3"></a>

[UP](#contents)
<br>
<br>
Now I'll extract the PyGraphviz coordinates and add to them all the attributes needed to build the network chart in Dash Cytoscape. 

#### NODE COORDINATES

To draw the nodes, we just extract their coordinates and diameters from PyGraphviz.

In [13]:
node_dict = dict()

for node in G.nodes():
    node_dict[node] = dict()
    node_dict[node]['x'] = float(node.attr['pos'].split(',')[0])
    node_dict[node]['y'] = float(node.attr['pos'].split(',')[1])
    node_dict[node]['diameter'] = float(node.attr['width']) * 63.5

node_data = pd.DataFrame.from_dict(node_dict, orient='index')

for dataset in [reporters, partners]:                                       # Comtrade country codes -> 3-character strings
    dataset['id'] = dataset['id'].astype(str).str.zfill(3)

country_codes = reporters.set_index('id')['text'].to_dict()
country_codes.update(partners.set_index('id')['text'].to_dict())
country_codes['380'] = 'Italy'

node_data['country'] = node_data.index.map(country_codes)                   # defining the country/territory for each node

#### COLORING PARAMETERS

The graph can be colored by one of the following parameters:

- the region each country or territory belongs to;
- the balance between exports and imports;
- each country's or territory's rank inside the network (PageRank).

Here we'll add all the necessary parameters to the nodes dataframe.
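To make the export/import balance coding concrete, here is a plain-Python sketch of the ten-bin scale. The helper names are hypothetical, and `bisect_left` only approximates what `pd.cut` does in the cell below; values that fall exactly on a bin edge may be treated differently because of floating-point rounding.

```python
from bisect import bisect_left

BIN_EDGES = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0


def export_share(export, import_):
    """Share of exports in a country's total trade of the commodity."""
    return export / (export + import_)


def share_to_bin(share):
    """Map a share in [0, 1] to a zero-padded label from '01' to '10'."""
    position = bisect_left(BIN_EDGES, share)
    return f'{max(position, 1):02d}'  # a share of exactly 0 belongs to the first bin


# Kazakhstan's officially reported figures from the node table in this notebook.
kazakhstan_bin = share_to_bin(export_share(1920362000.0, 331504200.0))
print(kazakhstan_bin)
```

A share of about 0.85 lands in the ninth bin, in line with the value shown for Kazakhstan in the node table.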

In [14]:
# UN Region

ccode_to_region_dict = dict()                                                  # Defining world regions: 
                                                                               # country code -> UN Region dictionary
regions = [
    'Africa', 'Oceania', 'Antarctica', 'Americas', 'Asia', 'Europe',
    'Special categories and unspecified areas'
]

ccode_lists = [
    [
        '012', '024', '072', '086', '108', '120', '132', '140', '148', '174', 
        '175', '178', '180', '204', '226', '231', '232', '260', '262', '266', 
        '270', '288', '324', '384', '404', '426', '430', '434', '450', '454',
        '466', '478', '480', '504', '508', '516', '562', '566', '577', '624',
        '638', '646', '654', '678', '686', '690', '694', '706', '710', '716', 
        '728', '729', '732', '736', '748', '768', '788', '800', '818', '834', 
        '854', '894'
    ],
    [
        '016', '036', '090', '162', '166', '184', '242', '258', '296', '316',
        '334', '520', '527', '540', '548', '554', '570', '574', '580', '581', 
        '583', '584', '585', '598', '612', '772', '776', '798', '876', '882'
    ], ['010'], 
    [
        '028', '032', '044', '052', '060', '068', '074', '076', '084', '092',
        '124', '136', '152', '170', '188', '192', '212', '214', '218', '222',
        '238', '239', '254', '304', '308', '312', '320', '328', '332', '340',
        '388', '473', '474', '484', '500', '531', '533', '534', '535', '558', 
        '591', '600', '604', '630', '636', '637', '652', '659', '660', '662',
        '663', '666', '670', '740', '780', '796', '840', '842', '850', '858', 
        '862'
    ], 
    [
        '004', '031', '048', '050', '051', '064', '096', '104', '116', '144',
        '156', '196', '268', '275', '344', '356', '360', '364', '368', '376', 
        '392', '398', '400', '408', '410', '414', '417', '418', '422', '446',
        '458', '462', '490', '496', '512', '524', '586', '608', '626', '634', 
        '682', '699', '702', '704', '760', '762', '764', '784', '792', '795',
        '860', '887'
    ], 
    [
        '008', '020', '040', '056', '070', '100', '112', '191', '203', '208',
        '233', '234', '246', '248', '250', '251', '276', '292', '300', '336', 
        '348', '352', '372', '380', '428', '438', '440', '442', '470', '492', 
        '498', '499', '528', '568', '578', '579', '616', '620', '642', '643',
        '674', '680', '688', '703', '705', '724', '744', '752', '756', '757',
        '804', '807', '826', '831', '832', '833'
    ], ['837', '838', '839', '899']
]

for r, c in zip(regions, ccode_lists):
    ccode_to_region_dict.update(dict.fromkeys(c, r))

node_data = node_data[node_data.index.isin(                                    # Filtering out undefined codes (EU-28, etc.)
    [item for sublist in ccode_lists for item in sublist])]

node_data['region'] = node_data.index.map(ccode_to_region_dict)


# Export/Import Balance

node_data = node_data.join(                                                    # appending some parameters from nodes dataframe
    df_nodes[['export', 'import',                                              # to the new dataframe
              'export_by_partners', 'import_by_partners']])

node_data['export_for_val'] = [
    ex if ex != 0 else exp for ex, exp in zip(node_data['export'], 
                                              node_data['export_by_partners'])
]
node_data['import_for_val'] = [
    im if im != 0 else imp for im, imp in zip(node_data['import'], 
                                              node_data['import_by_partners']) # Export/import balance in the country's
]                                                                              # trade: if export >= 50%, then we qualify 
node_data['export_share_val'] = node_data['export_for_val'] / node_data[[      # the country as an exporter; otherwise, we   
    'import_for_val', 'export_for_val']].sum(axis=1)                           # consider it an importer    
                                                                                      
node_data['export_share'] = pd.cut(                                            # Coding the values on a scale from 1 to 10
    node_data['export_share_val'], list(np.linspace(0, 1, 11)),                
    labels=['0' + str(n) if len(str(n)) == 1 else str(n)                       
            for n in np.arange(1, 11, 1)], include_lowest=True)               
                   
    
# Pagerank

node_data['pagerank_val'] = node_data.index.map(                               # Country's network rank, aka Google PageRank,
    nx.pagerank(nx.from_pandas_edgelist(df_links, 'source', 'target')))        # based on the number of links

node_data['pagerank'] = pd.cut(                                                # Coding the values on a scale from 1 to 10.
    node_data['pagerank_val'], list(np.linspace(
        node_data['pagerank_val'].min(),
        node_data['pagerank_val'].max(),
        11)),
    labels=['0' + str(n) if len(str(n)) == 1 else str(n)
            for n in np.arange(1, 11, 1)], include_lowest=True)

node_data['trade_links'] = node_data.index.map(link_dict)                      # Trade links number we'll use to decide
                                                                               # which nodes to label
node_data = node_data.drop(['export_for_val',
                            'import_for_val', 
                            'export_share_val',
                            'pagerank_val'], axis=1)

In [15]:
node_data.head()

Unnamed: 0,x,y,diameter,country,region,export,import,export_by_partners,import_by_partners,export_share,pagerank,trade_links
398,666.44,3189.5,263.7028,Kazakhstan,Asia,1920362000.0,331504200.0,1416165000.0,587054.2,9,3,36
4,1616.2,3033.5,80.25765,Afghanistan,Asia,0.0,0.0,0.0,208199900.0,1,1,5
31,873.57,3575.0,115.5319,Azerbaijan,Asia,15750.0,436551200.0,0.0,96445890.0,1,1,6
268,929.31,3458.6,44.978955,Georgia,Asia,19.34,64905950.0,1062091.0,2490408.0,1,1,10
364,346.28,2944.4,141.1097,Iran,Asia,0.0,0.0,6286.929,649441400.0,1,1,11


#### LINKS

Cytoscape doesn't need link coordinates; we only need to extract the source and target ID of each link.

In [16]:
sources = []
targets = []

for edge in G.edges():
    sources.append(edge[0])
    targets.append(edge[1])

edge_data = pd.DataFrame([sources, targets]).T
edge_data.columns = ['source', 'target']

for side in ['source', 'target']:                                    # Source and target parameters we take from the node data
    edge_data['country_' + side] = edge_data[side].map(
        node_data['country'].drop_duplicates().to_dict())

for parameter in ['region', 'export_share', 'pagerank']:
    for side in ['source', 'target']:
        edge_data[parameter + '_' + side] = edge_data[side].map(
            node_data[parameter].to_dict())

In [17]:
edge_data.head()

Unnamed: 0,source,target,country_source,country_target,region_source,region_target,export_share_source,export_share_target,pagerank_source,pagerank_target
0,398,4,Kazakhstan,Afghanistan,Asia,Asia,9,1,3,1
1,398,31,Kazakhstan,Azerbaijan,Asia,Asia,9,1,3,1
2,398,268,Kazakhstan,Georgia,Asia,Asia,9,1,3,1
3,398,364,Kazakhstan,Iran,Asia,Asia,9,1,3,1
4,398,417,Kazakhstan,Kyrgyzstan,Asia,Asia,9,1,3,1


#### LABELS

Labeling every node would create a mess, so I'll label only the largest ones (25%+ of maximum size) and also those with 50 trade links and more.
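The labeling rule boils down to a single condition. A minimal sketch, with a hypothetical helper name and made-up diameters:

```python
def should_label(diameter, max_diameter, trade_links,
                 size_share=0.25, min_links=50):
    """Label a node if it is large enough or has enough trade links."""
    return diameter >= max_diameter * size_share or trade_links >= min_links


print(should_label(diameter=300, max_diameter=1000, trade_links=5))   # large node
print(should_label(diameter=40, max_diameter=1000, trade_links=62))   # many links
print(should_label(diameter=40, max_diameter=1000, trade_links=5))    # neither
```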

In [18]:
label_dict = dict()

label_dict_short = {                                                              # Some country names should be shortened
    'Bolivia (Plurinational State of)': 'Bolivia',                                # for labeling
    'Brunei Darussalam': 'Brunei',
    'Central African Rep.': 'CAR',
    'China, Hong Kong SAR': 'Hong Kong',
    'China, Macao SAR': 'Macao',
    "Dem. People's Rep. of Korea": 'North Korea',
    'Dem. Rep. of the Congo': 'Congo-Kinshasa',
    'Falkland Isds (Malvinas)': 'Falkland Isds',
    'Holy See (Vatican City State)': 'Holy See',
    "Lao People's Dem. Rep.": 'Laos',
    'Neth. Antilles and Aruba': 'Neth. Antilles',
    'Rep. of Korea': 'Korea',
    'Rep. of Moldova': 'Moldova',
    'Russian Federation': 'Russia',
    'Saint Barthelemy': 'St Barthelemy',
    'Saint Helena': 'St Helena',
    'Saint Lucia': 'St Lucia',
    'Saint Maarten': 'St Maarten',
    'Saint Kitts and Nevis': 'St Kitts and Nevis',
    'Saint Kitts, Nevis and Anguilla': 'St Kitts and Nevis',
    'Saint Pierre and Miquelon': 'St Pierre and Miquelon',
    'Saint Vincent and the Grenadines': 'St Vincent',
    'So. African Customs Union': 'SACU',
    'United Arab Emirates': 'UAE',
    'United Rep. of Tanzania': 'Tanzania',
    'USA (before 1981)': 'USA',
    'Africa CAMEU region, nes': 'Africa CAMEU, nes',
    'Br. Antarctic Terr.': 'Br. Antarctic',
    'Br. Indian Ocean Terr.': 'Br. Indian Ocean',
    'Fr. South Antarctic Terr.': 'Fr. South Antarctic',
    'Heard Island and McDonald Islands': 'Heard and McDonald Isds',
    'North America and Central America, nes': 'North and Centr America, nes',
    'South Georgia and the South Sandwich Islands': 'SGSSI',
    'United States Minor Outlying Islands': 'US Minor Outlying Isds'
}

for n in node_data['country'].sort_values().unique().tolist():
    if len(n) > 10 and n in label_dict_short.keys():
        label_dict[n] = label_dict_short[n]
    else:
        label_dict[n] = n
        
nlinks_filter = []                                                                # Countries with 50+ trade links

for index in node_data.index.unique().tolist():
    if node_data.loc[index]['trade_links'] >= 50:
        nlinks_filter.append(index)
        
max_diameter = node_data['diameter'].max()                                        # Maximum node size

node_data['label'] = [                                                            # Filtering the labels
    label_dict[country] if diameter >= max_diameter * 0.25
    or index in nlinks_filter else np.nan for country, diameter, index in zip(
        node_data['country'], node_data['diameter'], node_data.index)
]

node_data['label_selected'] = [                                                   # A column with all the labels
        label_dict[country] for country in node_data['country']
    ]
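
As a minimal sketch of the labeling rule described above, the check can be written as a single predicate. The helper name `should_label`, its default thresholds as parameters, and the toy node values are assumptions made for illustration; they are not part of the notebook.

```python
# Hypothetical helper illustrating the labeling rule; not part of the notebook.
def should_label(diameter, trade_links, max_diameter,
                 size_share=0.25, min_links=50):
    """Return True if a node is large enough or connected enough to get a label."""
    return diameter >= max_diameter * size_share or trade_links >= min_links


# Toy nodes: (name, diameter, number of trade links). The values are made up.
nodes = [
    ("A", 100, 10),   # largest node: labeled
    ("B", 20, 60),    # small, but 60 trade links: labeled
    ("C", 20, 5),     # small and few links: no label
    ("D", 25, 0),     # exactly 25% of the maximum diameter: labeled
]
max_diameter = max(d for _, d, _ in nodes)
labeled = [name for name, d, links in nodes
           if should_label(d, links, max_diameter)]
print(labeled)  # ['A', 'B', 'D']
```

Nodes that fail the check still keep their full name in `label_selected`, so the name can appear on hover or selection.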

#### COLORING

In the graph, we need to color four groups of elements:
* nodes
* node borders (the `*_edge_color` columns below)
* node labels
* links (the edges of the network)

In addition, the user can select a specific country: that country and its trade links are brightened, while all other elements are dimmed.

We'll color each link with a gradient that shows the change of category between its two ends. To do this, we pass multiple HEX codes to Cytoscape as a single space-separated string.

I'll set the palette for the node borders and derive the colors of the other elements from it.

In [19]:
region_color_dict = {
    'Antarctica': '#7E6EBD',
    'Africa': '#FB9038',
    'Asia': '#D1085C',
    'Europe': '#08B0D1',
    'Americas': '#d4e2e8',
    'Oceania': '#134DD1',
    'Special categories and unspecified areas': '#F9F871'
}

export_share_color_dict = {
    '01': '#08b4d1',
    '02': '#39c3da',
    '03': '#6bd2e3',
    '04': '#9ce1ed',
    '05': '#cef0f6',
    '06': '#ffe5cd',
    '07': '#ffcc9b',
    '08': '#ffb268',
    '09': '#ff9936',
    '10': '#ff7f04'
}

pagerank_color_dict = {
    '01': '#3e26a8',
    '02': '#4746eb',
    '03': '#3e70ff',
    '04': '#2797eb',
    '05': '#08b4d1',
    '06': '#32c69f',
    '07': '#81cc59',
    '08': '#dbbd28',
    '09': '#fcd030',
    '10': '#f9fb15'
}

background_color = '#010103'

def rgb_to_hex(rgb):
    r = rgb[0]
    g = rgb[1]
    b = rgb[2]
    return '#{:02x}{:02x}{:02x}'.format(r, g, b)

def hex_gradient_list(hex_color1, hex_color2, n_colors):

    assert n_colors > 1
    color1_rgb = np.array(hex_to_rgb(hex_color1)) / 255
    color2_rgb = np.array(hex_to_rgb(hex_color2)) / 255
    ordered = np.linspace(0, 1, n_colors)
    gradient = [list(((1 - order) * color1_rgb + (order * color2_rgb)))
                for order in ordered]
    gradient_transformed = [[int(round(val * 255)) for val in color] for color in gradient]
    return [rgb_to_hex(color) for color in gradient_transformed]

def hex_gradient_str(hex_color1, hex_color2, n_colors):
    # Same gradient as hex_gradient_list, joined into one space-separated string for Cytoscape
    return " ".join(hex_gradient_list(hex_color1, hex_color2, n_colors))


# Node borders (called "edges" in the column names)

node_data['region_edge_color'] = node_data['region'].map(region_color_dict)
node_data['export_share_edge_color'] = node_data['export_share'].map(export_share_color_dict)
node_data['pagerank_edge_color'] = node_data['pagerank'].map(pagerank_color_dict)

# Nodes

node_data['region_color'] = [
    hex_gradient_list(color, background_color, 3)[1] for color in node_data['region_edge_color']
]
node_data['export_share_color'] = [
    hex_gradient_list(color, background_color, 3)[1] for color in node_data['export_share_edge_color']
]
node_data['pagerank_color'] = [
    hex_gradient_list(color, background_color, 3)[1] for color in node_data['pagerank_edge_color']
]

# Chosen Nodes

node_data['chosen_region_color'] = [
    hex_gradient_list(color, '#ffffff', 7)[3] for color in node_data['region_color']
    ]

node_data['chosen_export_share_color'] = [
    hex_gradient_list(color, '#ffffff', 7)[3] for color in node_data['export_share_color']
    ]

node_data['chosen_pagerank_color'] = [
    hex_gradient_list(color, '#ffffff', 7)[3] for color in node_data['pagerank_color']
    ]

# Node Labels

node_data['region_label_color'] = [
    hex_gradient_list(color, '#ffffff', 5)[3] for color in node_data['region_edge_color']
]
node_data['export_share_label_color'] = [
    hex_gradient_list(color, '#ffffff', 5)[3] for color in node_data['export_share_edge_color']
]
node_data['pagerank_label_color'] = [
    hex_gradient_list(color, '#ffffff', 5)[3] for color in node_data['pagerank_edge_color']
]

# Chosen Node Labels

node_data['chosen_region_label_color'] = [
    hex_gradient_list(color, '#ffffff', 6)[5] for color in node_data['region_edge_color']
]
node_data['chosen_export_share_label_color'] = [
    hex_gradient_list(color, '#ffffff', 6)[5] for color in node_data['export_share_edge_color']
]
node_data['chosen_pagerank_label_color'] = [
    hex_gradient_list(color, '#ffffff', 6)[5] for color in node_data['pagerank_edge_color']
]

# Edges

for parameter, d in zip(['region', 'export_share', 'pagerank'],
                        [region_color_dict, export_share_color_dict, pagerank_color_dict]):
    for side in ['source', 'target']:
        edge_data[parameter + '_color_' + side] = edge_data[parameter + '_' + side].map(d)
        
for parameter in ['region', 'export_share', 'pagerank']:
    edge_data[parameter + '_colors_source'] = [
        hex_gradient_list(background_color, color, 11)[2] for color in edge_data[parameter + '_color_source']
    ]
    edge_data[parameter + '_colors_target'] = [
        hex_gradient_list(background_color, color, 11)[7] for color in edge_data[parameter + '_color_target']
    ]

for parameter in ['region', 'export_share', 'pagerank']:
    edge_data[parameter + '_colors'] = [
        hex_gradient_str(color_source, color_target, 10) for color_source, color_target in zip(
            edge_data[parameter + '_colors_source'], edge_data[parameter + '_colors_target'])
    ]
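
To see what the gradient helpers produce, here is a standalone sketch of the same linear blend in pure Python. The `hex_to_rgb` defined here is a local stand-in (the notebook takes its own from elsewhere), and `gradient` is a hypothetical name. Because it rounds slightly differently from the NumPy version, results may differ by one unit at rounding ties.

```python
def hex_to_rgb(hex_color):
    """Local stand-in: '#rrggbb' -> (r, g, b) as integers."""
    h = hex_color.lstrip('#')
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    return '#{:02x}{:02x}{:02x}'.format(*rgb)


def gradient(hex_color1, hex_color2, n_colors):
    """n_colors evenly spaced colors from hex_color1 to hex_color2, inclusive."""
    assert n_colors > 1
    c1, c2 = hex_to_rgb(hex_color1), hex_to_rgb(hex_color2)
    steps = []
    for i in range(n_colors):
        t = i / (n_colors - 1)                      # 0.0 at color 1, 1.0 at color 2
        blended = tuple(round(a + t * (b - a)) for a, b in zip(c1, c2))
        steps.append(rgb_to_hex(blended))
    return steps


black_to_grey = gradient('#000000', '#646464', 5)
print(black_to_grey)
# ['#000000', '#191919', '#323232', '#4b4b4b', '#646464']

# The form passed to Cytoscape: all stops in one space-separated string
gradient_string = ' '.join(black_to_grey)
print(gradient_string)
```

Picking a single element of such a list, as the cell above does with `[1]`, `[3]` or `[5]`, yields one color at a fixed fraction of the way from the first color to the second.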

<br>
<span style="font-size:20px;color:#ffffff;font-weight:bold;background-color:#000000">Building the graph</span><a class="anchor" id="4"></a>

[UP](#contents)
<br>

#### ELEMENTS

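Here we build the list of dictionaries that we will pass to Cytoscape: one dictionary per node, holding its display parameters and position, and one per link.

I will load the PageRank color codes into the dictionaries, so the graph is colored by each country's rank in the network, which depends on the number of its trade links.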

In [20]:
elements = []

# Node parameters and positions

for i in node_data.index:
    el_dict = dict()
    el_dict['data'] = dict()
    el_dict['position'] = dict()
    
    el_dict['data']['id'] = i
    el_dict['data']['label'] = node_data.loc[i]['label']
    el_dict['data']['label_selected'] = node_data.loc[i]['label_selected']
    el_dict['data']['color'] = node_data.loc[i]['pagerank_label_color']
    el_dict['data']['color_chosen'] = node_data.loc[i]['chosen_pagerank_label_color']
    el_dict['data']['background_color'] = hex_gradient_list(node_data.loc[i]['pagerank_color'], background_color, 4)[2] + ' ' + node_data.loc[i]['pagerank_color'] + ' ' + node_data.loc[i]['pagerank_edge_color']
    el_dict['data']['background_color_chosen'] = hex_gradient_list(node_data.loc[i]['chosen_pagerank_color'], background_color, 4)[2] + ' ' + node_data.loc[i]['pagerank_edge_color'] + ' ' + node_data.loc[i]['pagerank_edge_color']
    el_dict['data']['border_color'] = node_data.loc[i]['pagerank_edge_color']
    el_dict['data']['size'] = node_data.loc[i]['diameter']
    el_dict['data']['opacity'] = 0.75   
    el_dict['data']['font_size'] = 40
    
    el_dict['position']['x'] = node_data.loc[i]['x']
    el_dict['position']['y'] = node_data.loc[i]['y']
    
    elements.append(el_dict)

# Edge parameters

for i in edge_data[['source', 'target']].drop_duplicates().index:
    el_dict_e = dict()
    el_dict_e['data'] = dict()
    el_dict_e['data']['id'] = edge_data.loc[i]['source'] + '_' + edge_data.loc[i]['target']
    el_dict_e['data']['source'] = edge_data.loc[i]['source']
    el_dict_e['data']['target'] = edge_data.loc[i]['target']
    el_dict_e['data']['colors'] = edge_data.loc[i]['pagerank_colors']
    elements.append(el_dict_e)
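
For reference, here is a stripped-down sketch of the shape of this list: a node element carries `data` and `position`, while a link element carries `data` with `source` and `target`. The helper names `node_element` and `edge_element`, the coordinates, the sizes and the color string are illustrative assumptions; only the two country codes come from the table above.

```python
# Hypothetical helpers; names and values are illustrative only.
def node_element(node_id, label, x, y, size):
    return {
        'data': {'id': str(node_id), 'label': label, 'size': size},
        'position': {'x': x, 'y': y},
    }


def edge_element(source, target, colors):
    return {
        'data': {
            'id': f'{source}_{target}',
            'source': str(source),
            'target': str(target),
            'colors': colors,   # space-separated HEX codes for the link gradient
        }
    }


elements_demo = [
    node_element('398', 'Kazakhstan', 0, 0, 60),       # made-up position and size
    node_element('4', 'Afghanistan', 120, 80, 20),     # made-up position and size
    edge_element('398', '4', '#1a1a2e #3e70ff'),       # made-up gradient
]

for element in elements_demo:
    print(element)
```

Any extra key stored under `data` (such as `size` or `colors` here) can then be referenced from the stylesheet as `data(size)` or `data(colors)`.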

#### FINAL LAYOUT

Here we assemble the final page. A small panel at the top shows the country under the cursor and the country the user clicked on, and holds a button that downloads the graph in .png format.

We need to set the default network stylesheet, as well as styles for the HTML elements of our page (buttons, text, etc.).

The page has three callbacks:
- display the name of the country under the cursor;
- highlight the selected country together with all its trading partners and display its name;
- save the network as an image by clicking the button.
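
The core of the second callback is a neighbor lookup, which can be sketched without Dash. The function name `highlighted_ids` and the sample values are hypothetical; the dictionary shape (`data` plus `edgesData`) is assumed from what the callback reads from Cytoscape's `tapNode` property.

```python
# Hypothetical, Dash-free sketch of the neighbor lookup used for highlighting.
def highlighted_ids(tapped_node):
    """Return (node ids, edge ids) to draw at full opacity for a tapped node."""
    node_id = tapped_node['data']['id']
    nodes, edges = {node_id}, set()
    for edge in tapped_node['edgesData']:
        if edge['source'] == node_id:       # link leaving the tapped node
            nodes.add(edge['target'])
            edges.add(edge['id'])
        if edge['target'] == node_id:       # link arriving at the tapped node
            nodes.add(edge['source'])
            edges.add(edge['id'])
    return nodes, edges


# Shape assumed from the `tapNode` property; the values are made up.
tapped = {
    'data': {'id': '398'},
    'edgesData': [
        {'id': '398_4', 'source': '398', 'target': '4'},
        {'id': '643_398', 'source': '643', 'target': '398'},
    ],
}
nodes, edges = highlighted_ids(tapped)

# One selector per element that should override the dimmed default style
selectors = [f'node[id = "{n}"]' for n in sorted(nodes)] + \
            [f'edge[id = "{e}"]' for e in sorted(edges)]
print(selectors)
```

In the callback, each such selector is paired with a full-opacity style and appended after the dimmed `node` and `edge` rules, so the later, more specific rules take effect for the selected country and its partners.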

In [21]:
app = Dash(__name__)

default_stylesheet = [{                                                         # Default network attributes
    'selector': 'node',
    'style': {
        'label': 'data(label)',
        "background-fill": "radial-gradient",
        "background-gradient-stop-colors": 'data(background_color)',
        "background-gradient-stop-positions": '0, 80, 90, 100',
        'color': 'data(color)',
        'text-valign': 'center',
        'text-halign': 'center',
        'font-size': 'data(font_size)',
        'border-color': 'data(border_color)',
        'border-width': 1.5,
        "border-opacity": 1,
        'width': 'data(size)',
        'height': 'data(size)',
        'opacity': 0.98
    }
}, {
    'selector': 'edge',
    'style': {
        "line-fill": "linear-gradient",
        "line-gradient-stop-colors": 'data(colors)',
        "line-gradient-stop-positions": "10, 20, 30, 40, 50, 60, 70, 80, 90",
        'width': 2.5,
        'curve-style': 'bezier',
        'source-endpoint': 'outside-to-node',
        'target-endpoint': 'outside-to-node'
    }
}]

styles = {                                                                      # HTML Styling
    'Lab': {
        'height': '100%',
        'color': '#ffffff',
        'font-family': 'Courier New',
        'font-size': '80%',
        'padding-top': '10%',
        'padding-bottom': '10%',
        'padding-left': '10%'
    },
    'Output': {
        'height': '100%',
        'color': '#08B0D1',
        'font-family': 'Courier New',
        'font-size': '80%',
        'padding-top': '16%',
        'padding-bottom': '10%',
        'padding-left': '10%'
    },
    'Button': {
        'height': '100%',
        'width': '100%',
        'color': '#ffffff',
        'background-color': '#010103',
        'border-color': '#010103',
        'border-width': 0,
        'padding-left': '20%',
        'font-family': 'Courier New',
        'font-size': '80%',
        'text-align': 'left',
        'cursor': 'pointer'
    }
}

app.layout = dbc.Container([                                                    # Page structure                                           
    html.Div([
        html.Div([
            html.Div(html.P('Country Hovered:', style=styles['Lab']), 
                     style={'width': '15%', 
                            'height': '100%',
                            'margin-left': '2%'}),
            html.Div(html.Div(id='mouseoverNodeData',
                              style=styles['Output']), 
                     style={'width': '25%'}),
            html.Div(html.P('Country Highlighted:', 
                            style=styles['Lab']),
                     style={'width': '15%', 
                            'height': '100%', 
                            'margin-left': '2%'}),
            html.Div(html.Div(id='output-country', 
                              style=styles['Output']), 
                     style={'width': '25%'}),
            html.Div(html.Button("Save PNG", id="btn-get-png",
                                 style=styles['Button']), 
                     style={'width': '20%', 
                            'background-color': '#010103', 
                            'margin-left': '2%'})
        ], style={'width': '100%', 'display': 'flex', 'background-color': '#010103'}),
        html.Div([
            cyto.Cytoscape(
                id='cytoscape',
                layout={'name': 'cola'},
                style={
                    'width': '800px',
                    'height': str(graph_height*800/graph_width) + 'px', 
                    'background-color': '#010103'},
                elements=elements,
                stylesheet=default_stylesheet)
        ], style={'width': '100%', 'background-color': '#010103'})
    ], style={'display': 'inline-block', 'width': '800px'})
], style={'display': 'flex', 
          'justify-content': 'center',
          'margin-bottom': '50px'})
   
    
@app.callback(                                                                  # User Callbacks
    Output('mouseoverNodeData', 'children'),
    Input('cytoscape', 'mouseoverNodeData'))
def display_hover_data(data):
    if data:
        return data['label_selected']

    
@app.callback(
    [
        Output('cytoscape', 'stylesheet'),
        Output('output-country', 'children')
    ],
    [
        Input('cytoscape', 'tapNode'),
        Input('cytoscape', 'selectedNodeData')
    ]
)
def generate_stylesheet(node, data_list):

    if not data_list or not node:                                               # Nothing selected, or no node tapped yet
        return default_stylesheet, 'No country selected'

    elif node:
        node_id = node['data']['id']
    
        stylesheet = [
            {
                "selector": 'node',
                'style': {
                    'label': 'data(label)',
                    "background-fill": "radial-gradient",
                    "background-gradient-stop-colors": 'data(background_color)',
                    "background-gradient-stop-positions": "0, 80, 90, 100",
                    'color': 'data(color)',
                    'text-valign': 'center',
                    'text-halign': 'center',
                    'font-size': 'data(font_size)',
                    'border-color': 'data(border_color)',
                    'border-width': 1.5,
                    "border-opacity": 1,
                    'width': 'data(size)',
                    'height': 'data(size)',
                    'opacity': 0.3
                }
            }, {
                'selector': 'edge',
                'style': {
                    "line-fill": "linear-gradient",
                    "line-gradient-stop-colors": 'data(colors)',
                    "line-gradient-stop-positions": "10, 20, 30, 40, 50, 60, 70, 80, 90",
                    'width': 2,
                    'curve-style': 'bezier',
                    'source-endpoint': 'outside-to-node',
                    'target-endpoint': 'outside-to-node',
                    'opacity': 0.2
                }
            }, {
                "selector": 'node[id = "{}"]'.format(node_id),
                "style": {
                    'label': 'data(label_selected)',
                    "background-fill": "radial-gradient",
                    "background-gradient-stop-colors": 'data(background_color_chosen)',
                    "background-gradient-stop-positions": "0, 98, 99, 100",
                    'color': 'data(color_chosen)',
                    'text-valign': 'center',
                    'text-halign': 'center',
                    'font-size': 'data(font_size)',
                    'border-color': 'data(border_color)',
                    'border-width': 1.5,
                    "border-opacity": 1,
                    'width': 'data(size)',
                    'height': 'data(size)',
                    'opacity': 0.98,
                    'z-index': 9999
                }
            }
        ]

        for edge in node["edgesData"]:
            if edge['source'] == node_id:
                stylesheet.append(
                    {
                        "selector": 'node[id = "{}"]'.format(edge['target']),
                        "style": {
                            'label': 'data(label)',
                            "background-fill": "radial-gradient",
                            "background-gradient-stop-colors": 'data(background_color)',
                            "background-gradient-stop-positions": "0, 80, 90, 100",
                            'color': 'data(color)',
                            'text-valign': 'center',
                            'text-halign': 'center',
                            'font-size': 'data(font_size)',
                            'border-color': 'data(border_color)',
                            'border-width': 1.5,
                            "border-opacity": 1,
                            'width': 'data(size)',
                            'height': 'data(size)',
                            'opacity': 0.98
                        }
                    })
                stylesheet.append(
                    {
                        "selector": 'edge[id= "{}"]'.format(edge['id']),
                        "style": {
                            "line-fill": "linear-gradient",
                            "line-gradient-stop-colors": 'data(colors)',
                            "line-gradient-stop-positions": "10, 20, 30, 40, 50, 60, 70, 80, 90",
                            'width': 7,
                            'curve-style': 'bezier',
                            'source-endpoint': 'outside-to-node',
                            'target-endpoint': 'outside-to-node',
                            'opacity': 0.98,
                            'z-index': 5000
                        }
                    })

            if edge['target'] == node_id:
                stylesheet.append(
                    {
                        "selector": 'node[id = "{}"]'.format(edge['source']),
                        "style": {
                            'label': 'data(label)',
                            "background-fill": "radial-gradient",
                            "background-gradient-stop-colors": 'data(background_color)',
                            "background-gradient-stop-positions": "0, 80, 90, 100",
                            'color': 'data(color)',
                            'text-valign': 'center',
                            'text-halign': 'center',
                            'font-size': 'data(font_size)',
                            'border-color': 'data(border_color)',
                            'border-width': 1.5,
                            "border-opacity": 1,
                            'width': 'data(size)',
                            'height': 'data(size)',
                            'opacity': 0.98,
                            'z-index': 9999
                        }
                    })
                stylesheet.append(
                    {
                        "selector": 'edge[id= "{}"]'.format(edge['id']),
                        "style": {
                            "line-fill": "linear-gradient",
                            "line-gradient-stop-colors": 'data(colors)',
                            "line-gradient-stop-positions": "10, 20, 30, 40, 50, 60, 70, 80, 90",
                            'width': 6,
                            'curve-style': 'bezier',
                            'source-endpoint': 'outside-to-node',
                            'target-endpoint': 'outside-to-node',
                            'opacity': 0.98,
                            'z-index': 5000
                        }
                    })

        return stylesheet, node['data']['label_selected']

    
@callback(
    Output("cytoscape", "generateImage"),
    Input("btn-get-png", "n_clicks"),
    prevent_initial_call=True
)
def get_image(get_png_clicks):
    
    if ctx.triggered_id == 'btn-get-png':
        now = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        return {'type': "png",
                'action': "download",
                'options': {'bg': '#010103'}, 
                'filename': f'wheat_and_meslin_{now}'}

    
if __name__ == "__main__":
    app.run(debug=True)                                                         # app.run_server is deprecated in recent Dash versions