First time use / Primeira utilização

Menu: Terminal -> New Terminal
pip install -r requirements.txt

# Teste da API Comtrade

Site: https://comtradedeveloper.un.org
Manual metodológico: https://comtrade.un.org/data/MethodologyGuideforComtradePlus.pdf


API specs: https://comtradedeveloper.un.org/api-details#api=comtrade-v1&operation=get-get



## Descrições dos códigos HS

Obter tabela em https://github.com/datasets/harmonized-system/blob/master/data/harmonized-system.csv

Copiar para directoria `support`


### Ler a tabela e criar dicionários para descodificação

In [37]:
import pandas as pd

hs_codes_df = pd.read_csv('support/harmonized-system.csv') # read table
hs_codes_map = dict(zip(hs_codes_df.hscode, hs_codes_df.description)) #  dict for decoding
hs_codes_l2 = hs_codes_df[hs_codes_df.level == 2]  # create subset of level 2 codes
hs_l2_map = dict(zip(hs_codes_l2.hscode, hs_codes_l2.description)) # dict for decodung


### Obter os códigos de países e regiões

Ver lista em https://unstats.un.org/wiki/display/comtrade/Country+Code

Em formato JSON disponível em:
 * https://comtrade.un.org/data/cache/reporterAreas.json
 * https://comtrade.un.org/data/cache/partnerAreas.json

Aqui descarregamos e guardamos em `support` para evitar
ligação próxima vez.


In [44]:
import os
import json
import requests
import pandas as pd

fname = 'support/reporter_codes.json'
if os.path.isfile(fname):
    with open(fname) as cached:
        reporter_codes = json.load(cached)
else:
    resp = requests.get("https://comtrade.un.org/data/cache/reporterAreas.json")
    codes = json.loads(resp.content)['results']
    reporter_codes = dict([(a['id'], a['text'])  for a in codes])
    with open(fname, mode="x") as outfile:
        json.dump(reporter_codes,outfile,indent=2)

fname = 'support/partner_codes.json'
if os.path.isfile(fname):
    with open(fname) as cached:
        partner_codes = json.load(cached)
else:
    resp = requests.get("https://comtrade.un.org/data/cache/partnerAreas.json")
    codes = json.loads(resp.content)['results']
    partner_codes = dict([(a['id'], a['text'])  for a in codes])
    with open(fname,mode="x") as outfile:
        json.dump(reporter_codes,outfile,indent=2)

# convert codes to int
m49_codes_map = { int(k):v for (k,v) in reporter_codes.items() if k!='all'}

Os dados devolvidos pela API incluem códigos no campo `Partner2` que não estão na lista oficial de códigos M49.

Outros utilizadores têm o mesmo problema.

Ver  https://rstudio-pubs-static.s3.amazonaws.com/92321_70509e47e7f041e68f383253cb85751b.html onde se encontra o resultado de cruzamentos dos códigos dos dados e várias versões da lista
M49 e que é útil para completar a lista da FAO.

Ver também esta nota da unstats: https://unstats.un.org/wiki/display/comtrade/Reporter+country+codes+and+their+customs+areas

__Códigos atualmente em falta:__
* __473__ China-Angola Import 2016 partner2Code.


In [52]:
m49_codes_map.get(473,"Not found")

'Not found'

## Obter os dados de comtrade.un.org

### Parâmetros gerais que não mudam



In [53]:
m49_angola = 24
m49_brazil = 76
m49_cabo_verde = 132
m49_china = 156
m49_hong_kong = 344
m49_macau = 446
m49_guine_equatorial = 226
m49_guine_bissau = 624
m49_mozambique = 508
m49_portugal = 620
m49_stome_principe = 678
m49_timor = 626

# make list of Portuguese Speaking Countries
m49_plp = [m49_angola,m49_brazil,m49_cabo_verde,m49_guine_bissau,
            m49_guine_equatorial,m49_mozambique,m49_portugal,
            m49_stome_principe,m49_timor]
m49_plp_list = ",".join(map(str,m49_plp))




### Função auxiliar para aceder à API un.comtrade


In [54]:
import json
import requests
import pandas as pd


def call_uncomtrade(typeCode: str, freqCode: str, 
                    reporterCode: str = '49', 
                    partnerCode: str = '024,076,132,226,624,508,620,678,626',
                    partner2Code: str = '0',
                    period: str = None,
                    clCode: str = "HS",
                    cmdCode: str = "TOTAL",
                    flowCode: str = "M,X",
                    timeout: int = 10,
                    echo_url: bool = False
                    )->pd.DataFrame:
    """ Makes a request to UN Comtrade API (public), returns a pandas DataFrame
    
    Parameters
        typeCode: required, C for commodities, S for Services
        freqCode: required, A for annual and M for monthly
        reporterCode: optional, list of M49 codes, default "049" (China)
        partnerCode: optional, list of M49 codes None for all countries, default PLP codes
        partner2Code: optional, list of M49 codes, None for all countries, 0 for agregate, default 0
        period:  optional, aaaa or aaaamm default None (all available periods)
        clCode: Trade classifications: HS, SITC, BEC or EBOPS.
                Available values : HS, SS, B4, B5, EB, EB10, EB02, EBSDMX
        cmdCode: optional, default, "TOTAL"
        flowCode: optional, M=import, X=export,more: RX, RM, MIP, XIP, MOP, XOP, MIF, XIF, DX, FM; default "M,X"
        timeOut: int, max wait time in seconds. Default 10
        echo_url: bool, print url of call, default False
     """

    baseUrl = "https://comtradeapi.un.org/public/v1"

    requestUrl=f"{baseUrl}/preview/{typeCode}/{freqCode}/{clCode}"
    resp = requests.get(requestUrl,
            {
            'reporterCode':reporterCode,
            'period':period,
            'partnerCode':partnerCode,
            'partner2Code':partner2Code,
            'cmdCode':cmdCode,
            'flowCode':flowCode
            },
            timeout=timeout)
    if echo_url:
        print(resp.url)
    results = json.loads(resp.content)['data']
    df = pd.DataFrame(results)

    # Convert the country codes to country names
    df.reporterDesc = df.reporterCode.map(m49_codes_map)
    df.partnerDesc = df.partnerCode.map(m49_codes_map)
    df.partner2Desc = df.partner2Code.map(m49_codes_map)
    # Convert the HS codes
    df.cmdDesc = df.cmdCode.map(hs_codes_map)
    # Generate a formated version of the value for readability here
    df['primaryValueFormated'] = df.primaryValue.map('{:,}'.format)
    # return the DataFrame
    return df

## Parâmetros para visualização

Colunas a visualizar, ordem das linhas



In [55]:
# Colunas mais interessantes do resultado
# escolher de 
#        'typeCode', 'freqCode', 'refPeriodId', 'refYear', 'refMonth',
#        'period', 'reporterCode', 'reporterISO', 'reporterDesc',
#        'flowCode', 'flowDesc', 'partnerCode', 'partnerISO', 'partnerDesc',
#        'partner2Code', 'partner2ISO', 'partner2Desc',
#        'classificationCode', 'classificationSearchCode',
#        'isOriginalClassification', 'cmdCode', 'cmdDesc', 'aggrLevel',
#        'isLeaf', 'customsCode', 'customsDesc', 'mosCode', 'motCode',
#        'motDesc', 'qtyUnitCode', 'qtyUnitAbbr', 'qty', 'isQtyEstimated',
#        'altQtyUnitCode', 'altQtyUnitAbbr', 'altQty', 'isAltQtyEstimated',
#        'netWgt', 'isNetWgtEstimated', 'grossWgt', 'isGrossWgtEstimated',
#        'cifvalue', 'fobvalue', 'primaryValue', 'legacyEstimationFlag',
#        'isReported', 'isAggregate', 'primaryValueFormated']

cols = ['typeCode','freqCode','reporterDesc','partnerDesc','partner2Code','partner2Desc','refYear','cmdCode','cmdDesc','flowCode','isReported','primaryValueFormated','primaryValue']
sort_order = ['reporterDesc','partnerDesc','refYear','refMonth']

## Notas de utilização

### Mais do que uma linha por par de países nos anos de 2015,2016,2017

Alguns anos produzem mais do que uma linha por par _reporter/partner_  com diferentes valores:
2015, 2016, 2017
* Nesses anos existe uma linha por cada `partner2Code`, incluindo uma linha para o próprio partnerCode.
* Uma linha adicional com `partner2Code` igual a zero que contém o total agregado das outras linhas com `partner2Code`explícito.
* Isso significa que existe duplicação do total.
  
|    | reporterDesc   | partnerDesc       |   partner2Code | partner2Desc         |   refYear | cmdCode   | flowCode   | primaryValueFormated   |
|---:|:---------------|:------------------|---------------:|:---------------------|----------:|:----------|:-----------|:-----------------------|
|  3 | China          | Equatorial Guinea |            344 | China, Hong Kong SAR |      2015 | TOTAL     | M          | 59.0                   |
|  1 | China          | Equatorial Guinea |             56 | Belgium              |      2015 | TOTAL     | M          | 2,435.0                |
|  2 | China          | Equatorial Guinea |            226 | Equatorial Guinea    |      2015 | TOTAL     | M          | 1,166,493,970.0        |
|  0 | China          | Equatorial Guinea |              0 | nan                  |      2015 | TOTAL     | M          | 1,166,496,464.0        |


Para evitar isso tem de se chamar a API com partner2Code = 0, para que os resultados de 2015,2016,2017 excluam
a decomposição. Se partner2Code=None as linhas adicionais aparecem.


Exemplo de resultados se o `partner2Code` for None


In [56]:
pd.set_option('display.max_columns', 100)
pd.set_option('display.max_rows', 500)

cols2 = ['reporterDesc','partnerDesc','partner2Code','partner2Desc','refYear','cmdCode','flowCode','primaryValueFormated']
period = "2016" ## if freqCode M  use aaaamm
flow = "M"
partnerCode = m49_guine_equatorial
df = call_uncomtrade("C",# C for commodities, S for Services
                     "A",# (freqCode) A for annual and M for monthly
                     flowCode=flow,
                     reporterCode=m49_china,
                     partnerCode=partnerCode,
                     partner2Code=None,
                     cmdCode='TOTAL',
                     period=period,
                     timeout=30, echo_url=True
                     )
result = df.sort_values(['partnerDesc','flowCode'])[cols2]
# print(result.to_markdown())
result

https://comtradeapi.un.org/public/v1/preview/C/A/HS?reporterCode=156&period=2016&partnerCode=226&cmdCode=TOTAL&flowCode=M


Unnamed: 0,reporterDesc,partnerDesc,partner2Code,partner2Desc,refYear,cmdCode,flowCode,primaryValueFormated
0,China,Equatorial Guinea,0,,2016,TOTAL,M,631851506.0
1,China,Equatorial Guinea,24,Angola,2016,TOTAL,M,396344.0
2,China,Equatorial Guinea,178,Congo,2016,TOTAL,M,1457849.0
3,China,Equatorial Guinea,226,Equatorial Guinea,2016,TOTAL,M,589959003.0
4,China,Equatorial Guinea,251,France,2016,TOTAL,M,1341.0
5,China,Equatorial Guinea,344,"China, Hong Kong SAR",2016,TOTAL,M,634.0
6,China,Equatorial Guinea,458,Malaysia,2016,TOTAL,M,40036318.0
7,China,Equatorial Guinea,490,"Other Asia, nes",2016,TOTAL,M,17.0


Isto não acontece senão nos anos de 2015,2016 e 2017.

Por exemplo a mesma chamada para 2018 dá apenas uma linha


In [8]:
period = "2018" ## if freqCode M  use aaaamm
df = call_uncomtrade("C",# C for commodities, S for Services
                     "A",# (freqCode) A for annual and M for monthly
                     flowCode=flow,
                     reporterCode=m49_china,
                     partnerCode=partnerCode,
                     partner2Code=None,
                     cmdCode='TOTAL',
                     period=period,
                     timeout=None,
                     echo_url=True
                     )
result = df.sort_values(['partnerDesc','flowCode'])[cols2]
# print(result.to_markdown())
result

https://comtradeapi.un.org/public/v1/preview/C/A/HS?reporterCode=156&period=2018&partnerCode=226&cmdCode=TOTAL&flowCode=M


Unnamed: 0,reporterDesc,partnerDesc,partner2Code,partner2Desc,refYear,cmdCode,flowCode,primaryValueFormated
0,China,Equatorial Guinea,0,,2018,TOTAL,M,2139372096.0


A função call_uncomtrade passou a colocar partner2Code = 0 quando não especificado,
para evitar o problema.

Neste exemplo não se inclui o parâmetro `partner2Code` e a função coloca a zero para obter o resultado
correcto.

In [9]:
pd.set_option('display.max_columns', 100)
pd.set_option('display.max_rows', 500)

cols2 = ['reporterDesc','partnerDesc','partner2Code','partner2Desc','refYear','cmdCode','flowCode','primaryValueFormated']
period = "2016" ## if freqCode M  use aaaamm
flow = "M"
partnerCode = m49_guine_equatorial
df = call_uncomtrade("C",# C for commodities, S for Services
                     "A",# (freqCode) A for annual and M for monthly
                     flowCode=flow,
                     reporterCode=m49_china,
                     partnerCode=partnerCode,
                     cmdCode='TOTAL',
                     period=period,
                     timeout=60,
                     echo_url=True
                     )
result = df.sort_values(['partnerDesc','flowCode'])[cols2]
# print(result.to_markdown())
result

https://comtradeapi.un.org/public/v1/preview/C/A/HS?reporterCode=156&period=2016&partnerCode=226&partner2Code=0&cmdCode=TOTAL&flowCode=M


Unnamed: 0,reporterDesc,partnerDesc,partner2Code,partner2Desc,refYear,cmdCode,flowCode,primaryValueFormated
0,China,Equatorial Guinea,0,,2016,TOTAL,M,631851506.0


Para obter as linhas referentes a partner2 nos anos outros que 2015-2017

In [10]:
k = m49_codes_map.keys()
m49_all_list=",".join(map(str,k))
m49_all_list

'-1,4,-20,-2,8,12,16,20,24,660,10,28,32,51,533,-3,36,40,31,44,48,50,52,-4,112,56,-24,84,204,60,64,-15,68,535,70,72,74,76,86,92,96,100,854,108,132,116,120,124,136,140,148,-25,152,156,344,446,158,-26,162,166,170,174,178,184,188,191,192,531,196,203,384,-21,408,180,208,262,212,214,218,818,222,226,232,233,748,231,-5,238,234,242,246,250,254,258,260,266,270,268,276,288,292,-6,300,304,308,312,316,320,831,324,624,328,332,-9,-10,334,336,340,348,352,-11,356,360,364,368,372,833,376,380,388,-12,392,832,400,-7,398,404,296,-14,414,417,418,428,422,426,430,434,438,440,442,450,454,458,462,466,470,584,474,478,480,175,484,583,492,496,499,500,504,508,104,516,520,524,528,540,554,558,562,566,570,574,807,580,578,512,586,585,275,591,598,-16,600,604,608,612,616,620,630,634,410,498,642,643,646,638,652,654,659,662,666,670,663,882,674,678,682,-19,686,-18,688,690,694,702,534,703,705,90,706,710,239,728,724,-17,144,729,740,744,752,756,760,762,764,626,768,772,776,780,-8,788,792,795,796,798,581,800,804,784,826,834,850,

In [15]:
period = "2016" ## if freqCode M  use aaaamm

df = call_uncomtrade("C",# C for commodities, S for Services
                     "A",# (freqCode) A for annual and M for monthly
                     flowCode=flow,
                     reporterCode=m49_china,
                     partnerCode=partnerCode,
                     partner2Code=m49_all_list,
                     cmdCode='TOTAL',
                     period=period,
                     timeout=None,
                     echo_url=True
                     )
result = df.sort_values(['partnerDesc','flowCode'])[cols2]
# print(result.to_markdown())
result

https://comtradeapi.un.org/public/v1/preview/C/A/HS?reporterCode=156&period=2016&partnerCode=226&partner2Code=-1%2C4%2C-20%2C-2%2C8%2C12%2C16%2C20%2C24%2C660%2C10%2C28%2C32%2C51%2C533%2C-3%2C36%2C40%2C31%2C44%2C48%2C50%2C52%2C-4%2C112%2C56%2C-24%2C84%2C204%2C60%2C64%2C-15%2C68%2C535%2C70%2C72%2C74%2C76%2C86%2C92%2C96%2C100%2C854%2C108%2C132%2C116%2C120%2C124%2C136%2C140%2C148%2C-25%2C152%2C156%2C344%2C446%2C158%2C-26%2C162%2C166%2C170%2C174%2C178%2C184%2C188%2C191%2C192%2C531%2C196%2C203%2C384%2C-21%2C408%2C180%2C208%2C262%2C212%2C214%2C218%2C818%2C222%2C226%2C232%2C233%2C748%2C231%2C-5%2C238%2C234%2C242%2C246%2C250%2C254%2C258%2C260%2C266%2C270%2C268%2C276%2C288%2C292%2C-6%2C300%2C304%2C308%2C312%2C316%2C320%2C831%2C324%2C624%2C328%2C332%2C-9%2C-10%2C334%2C336%2C340%2C348%2C352%2C-11%2C356%2C360%2C364%2C368%2C372%2C833%2C376%2C380%2C388%2C-12%2C392%2C832%2C400%2C-7%2C398%2C404%2C296%2C-14%2C414%2C417%2C418%2C428%2C422%2C426%2C430%2C434%2C438%2C440%2C442%2C450%2C454%2C458%2C462%2C466%2

AttributeError: 'DataFrame' object has no attribute 'reporterCode'

## Testes

In [None]:


pd.set_option('display.max_columns', 100)
pd.set_option('display.max_rows', 500)

cols2 = ['reporterDesc','partnerDesc','partner2Code','partner2Desc','refYear','cmdCode','flowCode','primaryValueFormated']
period = "2016" ## if freqCode M  use aaaamm
flow = "M,X"
partnerCode = m49_plp_list
df = call_uncomtrade("C",# C for commodities, S for Services
                     "A",# (freqCode) A for annual and M for monthly
                     flowCode=flow,
                     reporterCode=m49_china,
                     partnerCode=partnerCode,
                     cmdCode='TOTAL',
                     period=period,
                     timeout=30
                     )
result = df.sort_values(['partnerDesc','flowCode'])[cols2]
# print(result.to_markdown())
result

In [None]:
pd.options.display.float_format = '${:,.2f}'.format

temp=df.pivot(index=['partnerDesc','refYear'],columns='flowCode',values='primaryValue')
temp['C'] = temp['M'] + temp['X']
forum_type = temp[['C','X','M']]
forum_type.columns=['Trocas','Exportações','Importações']
forum_type.div(10)


In [None]:
total = df.primaryValue.sum()
total_via_partner2 = df[df.partner2Code != 0].primaryValue.sum()
total_sem_partner2 = df[df.partner2Code == 0 ].primaryValue.sum()


print(f"Total para {m49_codes_map[partnerCode]} {flow} {period} ",
                    f"{total:,}")
print(f"Total para {m49_codes_map[partnerCode]} {flow} {period} com partner2Code",
                    f"{total_via_partner2:,}")
print(f"Total para {m49_codes_map[partnerCode]} {flow} {period} sem partner2Code",
                    f"{total_sem_partner2:,}")
print(f"Soma verificiação  {m49_codes_map[partnerCode]} {flow} {period} sem partner2Code",
                    f"{total_sem_partner2+total_via_partner2:,}")

In [None]:
pd.set_option('display.max_columns', 100)
pd.set_option('display.max_rows', 500)

cols3 = ['reporterDesc','partnerDesc','partner2Code','partner2Desc','refYear','cmdDesc','flowCode','primaryValueFormated']

df = call_uncomtrade("C",# C for commodities, S for Services
                     "A",# (freqCode) A for annual and M for monthly
                     flowCode=flow,
                     reporterCode=m49_china,
                     partnerCode=partnerCode,
                     cmdCode='AG2',
                     period=period,
                     timeout=20
                     )
result = df.sort_values(['primaryValue','partner2Desc',])[cols3]
# print(result.to_markdown())
result

In [None]:
total = df.primaryValue.sum()
total_via_partner2 = df[df.partner2Code != 0].primaryValue.sum()
total_sem_partner2 = df[df.partner2Code == 0 ].primaryValue.sum()


print(f"Total para {m49_codes_map[partnerCode]} {flow} {period} ",
                    f"{total:,}")
print(f"Total para {m49_codes_map[partnerCode]} {flow} {period} com partner2Code",
                    f"{total_via_partner2:,}")
print(f"Total para {m49_codes_map[partnerCode]} {flow} {period} sem partner2Code",
                    f"{total_sem_partner2:,}")
print(f"Soma verificiação  {m49_codes_map[partnerCode]} {flow} {period} sem partner2Code",
                    f"{total_sem_partner2+total_via_partner2:,}")

### Filtrar resultados para partner2Code = 0

In [None]:
pd.set_option('display.max_columns', 100)
pd.set_option('display.max_rows', 500)

cols2 = ['reporterDesc','partnerDesc','partner2Code','partner2Desc','refYear','cmdCode','flowCode','primaryValueFormated']
period = "2018" ## if freqCode M  use aaaamm
flow = "M,X"
partnerCode = m49_angola
df = call_uncomtrade("C",# C for commodities, S for Services
                     "A",# (freqCode) A for annual and M for monthly
                     flowCode=flow,
                     reporterCode=m49_china,
                     partnerCode=partnerCode,
                     cmdCode='TOTAL',
                     period=period,
                     timeout=30
                     )
result = df.sort_values(['flowCode','primaryValue','partner2Desc',])[cols2]
# print(result.to_markdown())
result

### Gravar em Excel

In [None]:
filename_note=period+flow.replace(",","_")  # change to append to filename
df[cols].to_excel(f"./downloads/dados_comtrade_{filename_note}.xlsx")

## Cobertura dos dados: importações/exportações China-PLP anos disponíveis

In [None]:
import time

pd.set_option('display.max_columns', 100)
pd.set_option('display.max_rows', 500)

flow = "M,X"
for country_code in m49_plp:
    df = call_uncomtrade("C",# C for commodities, S for Services
                     "A",# (freqCode) A for annual and M for monthly
                     flowCode=flow,
                     reporterCode=m49_china,
                     partnerCode=country_code,
                     period=None # period None returns all the available period
                     )
    print(m49_codes_cn_plp[country_code],  df.refYear.unique())
    time.sleep(1)  # avoid stressing the UN server.



## China, importações mais importantes dos PLP

In [None]:
rank_filter = 5  # número de importações mais relevantes
years = "2020,2021"
pco_cols = ['reporterDesc','partnerDesc','refYear','rank','cmdDesc',
            'flowCode','primaryValueFormated']
df = call_uncomtrade("C",# C for commodities, S for Services
                     "A",# (freqCode) A for annual and M for monthly
                     flowCode="M",
                     cmdCode="AG2",
                     reporterCode=m49_china,
                     partnerCode=m49_plp_list,
                     period=years 
                     )

pco = df.sort_values(['partnerDesc','refYear','primaryValue'], ascending=[True,True,False])
pco['rank'] = pco.groupby(['partnerDesc','refYear'])["primaryValue"].rank(method="dense", ascending=False)
pco_top5 = pco[pco['rank'] <= rank_filter]
pco_top5[pco_cols].set_index(['reporterDesc','partnerDesc','refYear'])

### Guardar formato excel


In [None]:
filename_note=years  # change to append to filename
pco_top5.to_excel(f"./downloads/china_plp_import_top5_{filename_note}.xlsx")

## China exportações mais importantes para os PLP

In [None]:
rank_filter = 5  # número de importações mais relevantes
years = "2021"
pco_cols = ['reporterDesc','partnerDesc','refYear','rank','cmdDesc',
            'flowCode','primaryValueFormated']
df = call_uncomtrade("C",# C for commodities, S for Services
                     "A",# (freqCode) A for annual and M for monthly
                     flowCode="X",
                     cmdCode="AG2",
                     reporterCode=m49_china,
                     partnerCode=m49_plp_list,
                     period=years 
                     )

pco = df.sort_values(['partnerDesc','refYear','primaryValue'], ascending=[True,True,False])
pco['rank'] = pco.groupby(['partnerDesc','refYear'])["primaryValue"].rank(method="dense", ascending=False)
pco_top5 = pco[pco['rank'] <= rank_filter]
pco_top5[pco_cols].set_index(['reporterDesc','partnerDesc','refYear'])

### Guardar formato excel


In [None]:
filename_note=years  # change to append to filename
pco_top5.to_excel(f"./downloads/china_plp_export_top5_{filename_note}.xlsx")

### China, detalhe das exportações mais importants dos PLP para a China

Em dois passos: 
* obtém as categorias de nível 2 mais importantes de cada país
* pesquisa todos as subcategorias de cada

In [None]:
rank_filter = 5  # número de importações mais relevantes
years = "2020,2021"
pco_cols = ['reporterDesc','partnerDesc','refYear','rank','cmdDesc',
            'flowCode','primaryValueFormated']
df = call_uncomtrade("C",# C for commodities, S for Services
                     "A",# (freqCode) A for annual and M for monthly
                     flowCode="M",
                     cmdCode="AG2",
                     reporterCode=m49_china,
                     partnerCode=m49_plp_list,
                     period=years 
                     )

pco = df.sort_values(['partnerDesc','refYear','primaryValue'], ascending=[True,True,False])
pco['rank'] = pco.groupby(['partnerDesc','refYear'])["primaryValue"].rank(method="dense", ascending=False)
pco_top5 = pco[pco['rank'] <= rank_filter]
# get the countries
countries = pco_top5.partnerDesc.unique()
country_cmd_top5_codes = dict()
for country in countries:
    l2_codes = pco_top5[pco_top5.partnerDesc == country]['cmdCode'].unique()
    print(country,l2_codes)
    hs_details = []
    for l2_code in l2_codes:
        l2_sub_codes = list(hs_codes_df[hs_codes_df.hscode.str.startswith(l2_code)]['hscode'])
        hs_details = hs_details + l2_sub_codes
    # print(hs_details)
    country_cmd_top5_codes[country] = hs_details.copy()


Exemplo dos códigos relevantes para o detalhe de Angola

In [None]:
country_cmd_top5_codes['Angola']