## The IMF JSON RESTful API: methods for data extraction

The IMF's [JSON RESTful Web Service API](https://datahelp.imf.org/knowledgebase/articles/667681-using-json-restful-web-service) allows access to macroeconomic data covering more than 180 countries.

In this notebook, we will explore the API's data extraction methods.

First, we load the libraries needed for data extraction and data manipulation.

In [1]:
import requests, re
import pandas as pd
import time as tm
import json
import numpy as np

The entry point to the service is located at the following URL:


```
'http://dataservices.imf.org/REST/SDMX_JSON.svc/'
```



The [JSON RESTful Web Service](https://datahelp.imf.org/knowledgebase/articles/667681-using-json-restful-web-service) exposes the following methods:
* Dataflow
* DataStructure
* CompactData
* MetadataStructure
* GenericMetadata
* CodeList
* MaxSeriesInResult
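
Each method is a path segment appended to the base URL, optionally followed by further parts such as a database ID. A minimal sketch of a URL helper is given here; `build_url` and `BASE_URL` are hypothetical names introduced for illustration, not part of the IMF service or of any library.

```python
BASE_URL = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/'

def build_url(method, *parts, base=BASE_URL):
    """Join a method name and optional path parts into a request URL.

    `build_url` is an illustrative helper, not part of the IMF API.
    """
    segments = [method, *parts]
    path = '/'.join(str(segment).strip('/') for segment in segments)
    return base.rstrip('/') + '/' + path

print(build_url('Dataflow'))
print(build_url('DataStructure', 'IFS'))
```

The resulting string can be passed directly to `requests.get`.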

The **Dataflow** method returns the list of datasets registered for the Data Service. It offers JSON-formatted information on which series are available through the API. So far, we have looked at the IFS series.
To obtain the list, use the following request:


```
http://dataservices.imf.org/REST/SDMX_JSON.svc/Dataflow
```

The full list of series available through the IMF can be found [here](https://data.imf.org/?sk=388DFA60-1D26-4ADE-B505-A05A558D9A42&sId=1479329132316).
We can also search through the series by name.

In [4]:
search_term = '' # 'Financial Statistics' 

In [6]:
url = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/'
key = 'Dataflow'

series_list = requests.get(f'{url}{key}').json()['Structure']['Dataflows']['Dataflow']

for series in series_list:
    name = series['Name']['#text'].lower()
    if (search_term.lower() in name) and not any(char.isdigit() for char in name):
        print(f"{series['Name']['#text']}: {series['KeyFamilyRef']['KeyFamilyID']}")

Historical Public Debt (HPDD): HPDD
Gender Equality: GENDER_EQUALITY
Public Sector Balance Sheet (PSBS)(FAD): PSBSFAD
Private and Public Capital Stock Dataset: PGCS
Gender Budgeting: GENDER_BUDGETING
Consumer Price Index (CPI): CPI
International Reserves and Foreign Currency Liquidity (IRFCL): IRFCL
International Financial Statistics (IFS), Discontinued Series: IFS_DISCONTINUED
Export Quality: EQ
Export Diversification: ED
Balance of Payments (BOP), World and Regional Aggregates: BOPAGG
Coordinated Direct Investment Survey (CDIS): CDIS
World Revenue Longitudinal Data (WoRLD): WoRLD
Sustainable Development Goals, IMF Inputs: UNSDG_IMF_INPUTS
Financial Development Index: FDI
Fiscal Decentralization: FISCALDECENTRALIZATION
Commodity Terms of Trade: PCTOT
Coordinated Portfolio Investment Survey (CPIS): CPIS
Currency Composition of Official Foreign Exchange Reserves (COFER): COFER
Primary Commodity Price System (PCPS): PCPS
Fiscal Monitor (FM): FM
Sub-Saharan Africa Regional Economic Outloo
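
The search above can be wrapped in a reusable function that works on an already parsed Dataflow list. This is a sketch only: `find_series` is a hypothetical helper, and the sample entries are hand-written in the shape of the response (names and IDs taken from the output above), not fetched from the service.

```python
def find_series(dataflows, search_term='', skip_names_with_digits=True):
    """Return (name, database ID) pairs whose name contains search_term.

    `dataflows` is the list found under ['Structure']['Dataflows']['Dataflow'].
    """
    term = search_term.lower()
    matches = []
    for flow in dataflows:
        name = flow['Name']['#text']
        if term not in name.lower():
            continue
        # Same filter as the cell above: drop names that contain a digit.
        if skip_names_with_digits and any(ch.isdigit() for ch in name):
            continue
        matches.append((name, flow['KeyFamilyRef']['KeyFamilyID']))
    return matches

# Hand-written sample in the shape of the Dataflow response.
sample_flows = [
    {'Name': {'#text': 'Consumer Price Index (CPI)'}, 'KeyFamilyRef': {'KeyFamilyID': 'CPI'}},
    {'Name': {'#text': 'Fiscal Monitor (FM)'}, 'KeyFamilyRef': {'KeyFamilyID': 'FM'}},
]
print(find_series(sample_flows, 'price'))
```

With live data, `series_list` from the cell above can be passed in place of `sample_flows`.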

The **DataStructure** method returns the structure of the dataset.
In order to obtain the data use the following request:
```
http://dataservices.imf.org/REST/SDMX_JSON.svc/DataStructure/{database ID}
```

The exact format of the key in the API request is determined by the structure of the series. For IFS data, the dimensions are frequency, area, and indicator, which is exactly what we extracted in the IFS data extraction example.

The dimensions of the data are found with the DataStructure method and are series-specific, so for IFS the full key becomes ```DataStructure/IFS```.

In [13]:
series = 'IFS'  # International Financial Statistics (IFS)
key = f'DataStructure/{series}'

dimension_list = requests.get(f'{url}{key}').json()['Structure']['KeyFamilies']['KeyFamily']['Components']['Dimension']

for n, dimension in enumerate(dimension_list):
    print(f"Dimension {n+1}: {dimension['@codelist']}")

Dimension 1: CL_FREQ
Dimension 2: CL_AREA_IFS
Dimension 3: CL_INDICATOR_IFS


In [14]:
series = 'GFSR'  # Government Finance Statistics (GFS), Revenue
key = f'DataStructure/{series}'

dimension_list = requests.get(f'{url}{key}').json()['Structure']['KeyFamilies']['KeyFamily']['Components']['Dimension']

for n, dimension in enumerate(dimension_list):
    print(f"Dimension {n+1}: {dimension['@codelist']}")

Dimension 1: CL_FREQ
Dimension 2: CL_AREA_GFSR
Dimension 3: CL_SECTOR_GFSR
Dimension 4: CL_UNIT_GFSR
Dimension 5: CL_INDICATOR_GFSR
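
Since the same lookup is repeated for each series, it can be factored into a function that takes the parsed DataStructure response. The sketch below uses a hypothetical helper name, `dimension_codelists`, and a hand-written sample mirroring the IFS output above. Treating a lone dimension as a possible dict rather than a list is an assumption about how the service serializes single elements.

```python
def dimension_codelists(structure):
    """Return codelist names, in key order, from a parsed DataStructure response."""
    dims = structure['Structure']['KeyFamilies']['KeyFamily']['Components']['Dimension']
    # Assumption: a dataset with a single dimension may return a dict, not a list.
    if isinstance(dims, dict):
        dims = [dims]
    return [dim['@codelist'] for dim in dims]

# Hand-written sample mirroring the IFS output shown above.
sample_structure = {'Structure': {'KeyFamilies': {'KeyFamily': {'Components': {'Dimension': [
    {'@codelist': 'CL_FREQ'},
    {'@codelist': 'CL_AREA_IFS'},
    {'@codelist': 'CL_INDICATOR_IFS'},
]}}}}}
print(dimension_codelists(sample_structure))
```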


The **CodeList** method returns the description of CodeLists.
In order to obtain the data use the following request:
```
http://dataservices.imf.org/REST/SDMX_JSON.svc/CodeList/{codelist code}_{database ID}
```

To find the list of possible codes for each dimension, we can use the CodeList method with the codelist names found above. The cells below reuse the `dimension_list` from the GFSR example, so they query ```CL_AREA_GFSR``` and ```CL_INDICATOR_GFSR``` and print the first five codes of each.

In [37]:
series = 'GFSR'

In [39]:
dimension = dimension_list[1]['@codelist']
key = f"CodeList/{dimension}"
print(dimension)
print('---------------------------------------------')
code_list = requests.get(f'{url}{key}').json()['Structure']['CodeLists']['CodeList']['Code']

for i in range(5) :
    print(f"{code_list[i]['Description']['#text']}: {code_list[i]['@value']}")

CL_AREA_GFSR
---------------------------------------------
Afghanistan: AF
Albania: AL
Algeria: DZ
Angola: AO
Anguilla: AI


In [43]:
dimension = dimension_list[4]['@codelist']
key = f"CodeList/{dimension}"
print(dimension)
print('---------------------------------------------')
code_list = requests.get(f'{url}{key}').json()['Structure']['CodeLists']['CodeList']['Code']

for i in range(5) :
    print(f"{code_list[i]['Description']['#text']}: {code_list[i]['@value']}")

CL_INDICATOR_GFSR
---------------------------------------------
Customs & other import duties: W0_S1_G1151
Dividend revenue: W0_S1_G1412
Excise taxes: W0_S1_G1142
General taxes on goods & services: W0_S1_G1141
Grants in cash: W0_S1_G1M13A
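
Code lists can be long, so it is often more practical to search the descriptions than to print the first few entries. A minimal sketch, assuming the `Code` entries have the `@value` and `Description` fields used above; `find_codes` is a hypothetical helper, and the sample entries are copied from the output above.

```python
def find_codes(code_list, text):
    """Return {code: description} for descriptions containing text (case-insensitive)."""
    needle = text.lower()
    return {
        code['@value']: code['Description']['#text']
        for code in code_list
        if needle in code['Description']['#text'].lower()
    }

# Sample entries copied from the CL_INDICATOR_GFSR output above.
sample_codes = [
    {'@value': 'W0_S1_G1151', 'Description': {'#text': 'Customs & other import duties'}},
    {'@value': 'W0_S1_G1142', 'Description': {'#text': 'Excise taxes'}},
    {'@value': 'W0_S1_G1141', 'Description': {'#text': 'General taxes on goods & services'}},
]
print(find_codes(sample_codes, 'taxes'))
```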


The **CompactData** method returns the compact data message. In order to obtain the data use the following request:
```
http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/{database ID}/{frequency}.{item1 from
dimension1}+{item2 from dimension1}+{item N from dimension1}.{item1 from
dimension2}+{item2 from dimension2}+{item M from dimension2}?startPeriod={start
date}&endPeriod={end date}
```

In the request above the different components mean the following: 

Database ID (Series): The broad group of indicators, in this case International Financial Statistics IFS;

Frequency: monthly M, quarterly Q, or annually A;

Dimension 1 (Area): The country, region, or set of countries, for example ```GB``` for the U.K., or ```GB+US``` for the U.K. and the U.S.;

Dimension 2 (Indicator): The code for the indicator of interest. IFS includes more than 2,500. In the example below, the code of interest is ```NGDP_SA_XDC```;

Date Range (*Optional*): Use this to limit the data range returned, for example ```?startPeriod=2010&endPeriod=2017``` otherwise the full set of data is returned.

The order in which codes are combined is referred to as the dimensions of the data, in the IFS case: 
```
{Method}/{Series}/{Frequency}.{Area}.{Indicator}.{Date Range}
```
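
The key format can be assembled programmatically: codes within a dimension are joined with `+`, dimensions are joined with `.`, and the optional date range is appended as query parameters. The sketch below follows the request template above; `compact_data_key` is a hypothetical helper name.

```python
def compact_data_key(series, dimensions, start=None, end=None):
    """Build a CompactData key.

    `dimensions` is a list of code lists, one per dimension, in the order
    reported by DataStructure (for IFS: frequency, area, indicator).
    """
    path = '.'.join('+'.join(codes) for codes in dimensions)
    key = f'CompactData/{series}/{path}'
    params = []
    if start is not None:
        params.append(f'startPeriod={start}')
    if end is not None:
        params.append(f'endPeriod={end}')
    if params:
        key += '?' + '&'.join(params)
    return key

print(compact_data_key('IFS', [['Q'], ['GB', 'US'], ['NGDP_SA_XDC']], 2020, 2022))
# CompactData/IFS/Q.GB+US.NGDP_SA_XDC?startPeriod=2020&endPeriod=2022
```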

In [70]:
series = 'IFS'
frequency = 'Q'
area = 'GB'
code = 'NGDP_SA_XDC'
time_start = 2020
time_end = 2022

In [71]:
key = f'CompactData/{series}/{frequency}.{area}.{code}.?startPeriod={time_start}&endPeriod={time_end}'
print(f'The data access link: {url}{key}\n')

The data access link: http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/IFS/Q.GB.NGDP_SA_XDC.?startPeriod=2020&endPeriod=2022



The `data` variable below contains all the data that match the parameters defined above. We will peek at the latest observation.

In [73]:
data = (requests.get(f'{url}{key}').json()['CompactData']['DataSet']['Series'])
print(f"Latest observation: {data['Obs'][-1]}")

Latest observation: {'@TIME_PERIOD': '2022-Q2', '@OBS_VALUE': '619570'}
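
Observation values arrive as strings, so they usually need converting before analysis. A minimal sketch of that step, using only the fields visible in the output above; `observations` is a hypothetical helper, and the handling of a lone observation returned as a dict is an assumption about how the service serializes single elements.

```python
def observations(series):
    """Return (period, value) pairs from one CompactData series."""
    obs = series.get('Obs', [])
    # Assumption: a series with one observation may hold a dict, not a list.
    if isinstance(obs, dict):
        obs = [obs]
    # Skip entries without a value rather than failing on a missing key.
    return [(o['@TIME_PERIOD'], float(o['@OBS_VALUE'])) for o in obs if '@OBS_VALUE' in o]

# Sample built from the observation printed above.
sample_series = {'Obs': [{'@TIME_PERIOD': '2022-Q2', '@OBS_VALUE': '619570'}]}
print(observations(sample_series))
# [('2022-Q2', 619570.0)]
```

The resulting pairs can be passed to `pd.DataFrame(..., columns=['period', 'value'])` for further work.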


The **MetadataStructure** method returns the metadata structure of the dataset.
In order to obtain the data use the following request:
```
http://dataservices.imf.org/REST/SDMX_JSON.svc/MetadataStructure/{database ID}
```

The metadata structure for IFS is the following:

In [67]:
series = 'IFS'

In [68]:
key = f'MetadataStructure/{series}'

metadata_list = requests.get(f'{url}{key}').json()['Structure']['Concepts']["ConceptScheme"][0]["Concept"]
for metadata in metadata_list:
  print(metadata["@id"])

OBS_VALUE
UNIT_MULT
TIME_FORMAT
FREQ
REF_AREA
INDICATOR
BASE_YEAR
TIME_PERIOD
OBS_STATUS


The **GenericMetadata** method returns the generic metadata message.
In order to obtain the data use the following request:
```
http://dataservices.imf.org/REST/SDMX_JSON.svc/GenericMetadata/{database ID}/{item1 from dimension1}+{item2 from dimension1}+{item N from dimension1}.{item1 from dimension2}+{item2 from dimension2}+{item M from dimension2}?startPeriod={start date}&endPeriod={end date}
```

Here is the metadata for the dimensions we specified above.

In [74]:
key = f'GenericMetadata/{series}/{frequency}.{area}.{code}'
metadata = requests.get(f'{url}{key}').json()

In [75]:
country = metadata['GenericMetadata']['MetadataSet']['AttributeValueSet'][1]['ReportedAttribute'][1]['ReportedAttribute'][3]['Value']['#text']
indicator = metadata['GenericMetadata']['MetadataSet']['AttributeValueSet'][2]['ReportedAttribute'][1]['ReportedAttribute'][4]['Value']['#text']

print(f'Country: {country}; Indicator: {indicator}')

Country: United Kingdom; Indicator: Gross Domestic Product, Nominal, Seasonally Adjusted


We can also retrieve all the metadata related to one of the dimensions, for example, the frequency.
Dimensions: ```'REF_AREA', 'INDICATOR', 'FREQ'```

In [76]:
def get_metadata(series = 'IFS'): 
  url = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/'
  key = f'GenericMetadata/{series}'
  metadata = requests.get(f'{url}{key}').json()['GenericMetadata']['MetadataSet']['AttributeValueSet']
  return metadata

In [77]:
def print_metadata(metadata, indicator = 'FREQ'):
  # Print "description : code" for every entry whose concept matches `indicator`.
  for entry in metadata:
    attribute = entry['ReportedAttribute'][1]
    if attribute['@conceptID'] == indicator:
      output = attribute['ReportedAttribute']
      print( output[0]['Value']['#text'],": ", output[2]['Value']['#text'] )

In [78]:
metadata = get_metadata(series = 'IFS')

In [81]:
print_metadata(metadata = metadata, indicator = 'FREQ') 

Annual :  A
Quarterly :  Q
Monthly :  M


The **MaxSeriesInResult** method returns the maximum number of time series that can be returned by CompactData.
In order to obtain the data use the following request:
```
http://dataservices.imf.org/REST/SDMX_JSON.svc/GetMaxSeriesInResult
```

In [None]:
max_series_url = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/GetMaxSeriesInResult'

# Use a separate name so the base `url` defined earlier is not overwritten.
num = requests.get(max_series_url).json()
print(num)

3000


The result shows how many series can be extracted with a single CompactData request, where multiple values of a dimension are joined with '+' as a separator:
```
http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/{database ID}/{frequency}.{item1 from
dimension1}+{item2 from dimension1}+{item N from dimension1}.{item1 from
dimension2}+{item2 from dimension2}+{item M from dimension2}?startPeriod={start
date}&endPeriod={end date}
```
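
When a request would cover more series than the limit allows, it can be split into several smaller requests. The sketch below splits the area codes into chunks; it assumes each area and indicator pair at a single frequency yields at most one series, which the source does not confirm. `chunk_areas` is a hypothetical helper, and the default limit of 3000 is the value printed above.

```python
def chunk_areas(areas, n_indicators, max_series=3000):
    """Split areas so that len(chunk) * n_indicators stays within max_series.

    Assumes one series per (area, indicator) pair at a single frequency.
    """
    if not 1 <= n_indicators <= max_series:
        raise ValueError('n_indicators must be between 1 and max_series')
    per_request = max_series // n_indicators
    return [areas[i:i + per_request] for i in range(0, len(areas), per_request)]

# Small limit chosen only to make the splitting visible.
for chunk in chunk_areas(['GB', 'US', 'AF', 'AL', 'DZ'], n_indicators=2, max_series=4):
    print('+'.join(chunk))
```

Each chunk can then be joined with `+` and placed in the area position of a separate CompactData request.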

In this notebook, we have explored the methods available in the IMF's JSON RESTful Web Service API for data and metadata extraction, using the IFS and GFSR series as examples.