In [1]:
import pandas as pd
from tradefile import TradeFile
from customsgrabber import CustomsGrabber

As of 2021, the Japanese trade data are open to consult, but the official website of the Japanese Customs does not offer row-level access to them: the database cannot be queried for specific conditions (e.g. a given country and commodity). Instead, the website provides either a closed query interface or a set of files split by month and by a subset of the commodity codes, either the Principal Commodity codes or the HS (Harmonized System) codes.

The class CustomsGrabber provides a downloader for the trade data from the Japanese Customs website. The data are saved as a zip archive containing one or more csv files. Each csv file is kept exactly as published on the website.
Currently, CustomsGrabber can download one or more years of data along these two dimensions:
- *direction*: 'import' (goods to Japan) or 'export';
- *kind*: 'HS' (the international Harmonized System coding) or 'PC' (Principal Commodity, a summarization of HS codes by categories that the Japanese Government deems useful). The kind is inferred from the columns of the file.

In [None]:
grabber = CustomsGrabber()
grabber.grabRange(from_year=2021, to_year=2021, direction='import', kind='PC') # data from 2021 only
grabber.grabRange(from_year=2020, to_year=2020, direction='import', kind='PC') # data from 2020 only
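
Once an archive has been downloaded, its contents can be inspected without TradeFile. The following is a minimal sketch using only the standard library and pandas. The archive is built in memory so the example is self-contained, and the member names and column headers are hypothetical, not the real Customs ones.

```python
import io
import zipfile

import pandas as pd

# Build a tiny in-memory archive standing in for a downloaded zip.
# File names and headers are made up for illustration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("part1.csv", "code,country,2020-01\n303,134,1121000\n")
    archive.writestr("part2.csv", "code,country,2020-01\n901,120,0\n")

# Read every csv member into a DataFrame, keeping the codes as strings
# so that leading zeros, if any, are preserved.
buffer.seek(0)
with zipfile.ZipFile(buffer) as archive:
    frames = [
        pd.read_csv(archive.open(name), dtype={"code": str})
        for name in archive.namelist()
        if name.endswith(".csv")
    ]

# Stack the per-file tables into a single one.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

For a real download, the in-memory buffer would be replaced by the path of the zip file on disk.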

The class TradeFile provides the tools to open and transform a wide-format csv file from the Japanese Customs website.
In these files each month is a column, and the month columns may be repeated once per unit of measure (e.g. kilograms, number of units, and thousands of JPY).

TradeFile can open these files from an archive, then merge and normalize them so that each row of the resulting data contains:
- The commodity code
- The target country
- The date (month and year) of acquisition
- The unit of measure
- The value, i.e. the figure recorded for that unit of measure
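
The wide-to-long normalization described above can be illustrated with `pandas.DataFrame.melt`. This is only a sketch on a toy table: the column names are assumptions chosen to match the output shown further down, not the real Customs headers, and it is not necessarily how TradeFile is implemented internally.

```python
import pandas as pd

# Toy wide-format table: one column per month (illustrative headers).
wide = pd.DataFrame({
    "code": ["303", "901"],
    "country": [134, 120],
    "unit": ["JPY", "MT"],
    "2020-01": [1121000, 0],
    "2020-02": [980000, 12],
})

# Unpivot the monthly columns: each (code, country, unit, month)
# combination becomes one row with a 'date' and a 'value'.
long = wide.melt(
    id_vars=["code", "country", "unit"],
    var_name="date",
    value_name="value",
)
print(long)
```

Each row of the wide table yields one row per month column, so two commodities over two months give four rows.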

In [2]:
path = "../data/import_PC_2020-2020.zip"
tool = TradeFile(path, merge_file="../data/import_PC_2021-2021.zip")
tool.data.sample(10)

Loading the file...
Unpivoting the monthly columns. This might take a minute...
Unpivoting the metrics...
Loading the file...
Unpivoting the monthly columns. This might take a minute...
Unpivoting the metrics...


Unnamed: 0,code,country,value,date,unit
18936,7012703,519,0,2020-01,NO
347759,6111701,207,0,2021-10,JPY
169443,7010505,134,0,2020-03,JPY
491994,901,120,0,2020-09,MT
113344,7030903,516,0,2020-02,JPY
383318,61107011,118,0,2021-11,JPY
191977,70313,218,0,2021-06,KG
557293,507,538,0,2020-10,KG
252058,60301,331,0,2020-05,MT
524848,303,134,1121000,2020-09,JPY


In [None]:
tool.data.sample(20)

In [None]:
# The normalized data keep the figures in the 'value' column (see the sample above)
tool.data[tool.data['date'] == "2020-07"]['value'].mean()
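
Since the sample output shows several units (NO, KG, MT, JPY) sharing the same `value` column, an average over all rows of a month mixes quantities with monetary amounts. A possible refinement, sketched here on a made-up frame with the same column names as the sample output, is to average within each unit.

```python
import pandas as pd

# Made-up rows mimicking the normalized layout (code, country, value, date, unit).
data = pd.DataFrame({
    "code": ["303", "303", "901", "901"],
    "country": [134, 134, 120, 120],
    "value": [1121000, 2000, 10, 30],
    "date": ["2020-07", "2020-07", "2020-07", "2020-07"],
    "unit": ["JPY", "KG", "MT", "MT"],
})

# Restrict to one month, then average separately for each unit of measure.
july = data[data["date"] == "2020-07"]
mean_by_unit = july.groupby("unit")["value"].mean()
print(mean_by_unit)
```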

In [None]:
tool.data.to_csv("2020sep_PC_code_all_countries.csv", index=False)

In [None]:
timeseries = pd.read_csv("../../data/2020sep_PC_code_all_countries.csv",
                         parse_dates=['date'], index_col=['date'])

In [None]:
# The files loaded above cover 2020 and 2021, so select a year that is present
timeseries.loc['2020']
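
Selecting rows with a year string relies on pandas partial string indexing, which requires a `DatetimeIndex`. The sketch below shows the mechanism on a small inline csv with the same columns as the sample output. The rows are invented, and the date format `%Y-%m` is an assumption based on that sample.

```python
import io

import pandas as pd

# Invented rows in the normalized layout, with 'date' as year-month strings.
csv_text = """code,country,value,date,unit
303,134,1121000,2020-09,JPY
901,120,0,2020-09,MT
70313,218,0,2021-06,KG
"""

ts = pd.read_csv(io.StringIO(csv_text))
# Convert the year-month strings explicitly, then use them as a sorted index.
ts["date"] = pd.to_datetime(ts["date"], format="%Y-%m")
ts = ts.set_index("date").sort_index()

# Partial string indexing: a year or a year-month selects all matching rows.
rows_2020 = ts.loc["2020"]
rows_june_2021 = ts.loc["2021-06"]
print(rows_2020)
```

Asking for a year that is absent from the index raises a `KeyError`, so the string passed to `.loc` has to match a period actually covered by the data.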