In [1]:
import requests
import json
import pandas as pd

# IMF API
The purpose of this notebook is to investigate the [IMF's JSON RESTful Web Service](http://datahelp.imf.org/knowledgebase/articles/667681-using-json-restful-web-service).
According to the IMF's documentation the following methods are available:
* Dataflow 
* DataStructure 
* CompactData
* MetadataStructure
* GenericMetadata 
* CodeList
* MaxSeriesInResult

Since I did not find their documentation very helpful, I decided to write a notebook to get a clearer picture of the data available through those methods.

I will go method by method, calling each one, inspecting the response, and cleaning the data into a DataFrame along the way.
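Every call in this notebook uses the same base URL followed by a method name and, for some methods, a dataset ID. A small helper can keep that in one place. This is only a sketch: `BASE_URL` is copied from the calls made in this notebook, while `build_url` is a hypothetical helper name, not part of any IMF client library.

```python
# Base URL copied from the requests made in this notebook.
BASE_URL = 'http://dataservices.imf.org/REST/SDMX_JSON.svc'


def build_url(method, *parts):
    """Join the base URL, a method name and optional path parts (e.g. a dataset ID)."""
    return '/'.join([BASE_URL, method] + [str(part) for part in parts])


print(build_url('Dataflow'))
print(build_url('DataStructure', 'IFS'))
```

The resulting strings can be passed straight to `requests.get`.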

## Dataflow

In [2]:
url = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/Dataflow'
r = requests.get(url)
data = r.json()
data

{'Structure': {'@xmlns:xsd': 'http://www.w3.org/2001/XMLSchema',
  '@xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance',
  '@xmlns': 'http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message',
  '@xsi:schemaLocation': 'http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message https://registry.sdmx.org/schemas/v2_0/SDMXMessage.xsd',
  'Header': {'ID': '7ffcb7c5-904a-4f65-a444-2313f6475410',
   'Test': 'false',
   'Prepared': '2019-02-25T13:18:33',
   'Sender': {'@id': '1C0',
    'Name': {'@xml:lang': 'en', '#text': 'IMF'},
    'Contact': {'URI': 'http://www.imf.org',
     'Telephone': '+ 1 (202) 623-6220'}},
   'Receiver': {'@id': 'ZZZ'}},
  'Dataflows': {'Dataflow': [{'@id': 'DS-FAS',
     '@version': '1.0',
     '@agencyID': 'IMF',
     '@isFinal': 'true',
     '@xmlns': 'http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure',
     'Name': {'@xml:lang': 'en', '#text': 'Financial Access Survey (FAS)'},
     'KeyFamilyRef': {'KeyFamilyID': 'FAS', 'KeyFamilyAgencyID': 'IMF'}}

In [3]:
data.keys()  # The response has a single top-level key

dict_keys(['Structure'])

In [4]:
data['Structure'].keys()  # This key holds 6 keys; 'Dataflows' wraps 'Dataflow',
                          # which is a list of dictionaries

dict_keys(['@xmlns:xsd', '@xmlns:xsi', '@xmlns', '@xsi:schemaLocation', 'Header', 'Dataflows'])

In [5]:
dataflows = pd.DataFrame.from_dict(data['Structure']['Dataflows']['Dataflow'])
# Keep only the fields I thought useful, pulling them out of the nested dicts.
dataflows['Description'] = dataflows.Name.apply(lambda d: d['#text'])  # Access subdict. 
dataflows['KeyFamilyID'] = dataflows.KeyFamilyRef.apply(lambda d: d['KeyFamilyID'])  # Access subdict. 
dataflows['KeyFamilyAgencyID'] = dataflows.KeyFamilyRef.apply(lambda d: d['KeyFamilyAgencyID'])  # Access subdict. 
dataflows = dataflows[['Description', 'KeyFamilyID', 'KeyFamilyAgencyID']]
dataflows.sample(5)

Unnamed: 0,Description,KeyFamilyID,KeyFamilyAgencyID
148,Monetary and Financial Statistics (MFS),MFS,IMF
161,"Balance of Payments (BOP), 2018 M12",BOP_2018M12,IMF
160,"Balance of Payments (BOP), World and Regional ...",BOPAGG_2018,IMF
146,"International Financial Statistics (IFS), 2018...",IFS_2018M09,IMF
49,Government Finance Statistics Yearbook (GFSY 2...,GFSYMAB2014,IMF


In [6]:
dataflows.describe()

Unnamed: 0,Description,KeyFamilyID,KeyFamilyAgencyID
count,173,173,173
unique,173,173,1
top,"Balance of Payments (BOP), 2017 M06",BOPAGG_2018,IMF
freq,1,1,173


The Dataflow method returns the available data sources, for example the _Financial Access Survey_, the _Fiscal Monitor_, the _Direction of Trade Statistics_ or the _International Financial Statistics_.
At the time of writing this notebook there are 173 data sources, all belonging to a single "Family Agency": the IMF.
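As an alternative to one `apply` call per nested field, `pandas.json_normalize` can flatten the nested dictionaries in a single step. The sketch below runs on one record copied from the Dataflow response shown earlier, so it needs no network access; the column renaming is my own choice to match the column names used in this notebook.

```python
import pandas as pd

# One record copied from the Dataflow response shown in this notebook.
sample = [{
    '@id': 'DS-FAS',
    '@version': '1.0',
    '@agencyID': 'IMF',
    '@isFinal': 'true',
    '@xmlns': 'http://www.SDMX.org/resources/SDMXML/schemas/v2_0/structure',
    'Name': {'@xml:lang': 'en', '#text': 'Financial Access Survey (FAS)'},
    'KeyFamilyRef': {'KeyFamilyID': 'FAS', 'KeyFamilyAgencyID': 'IMF'},
}]

# Nested keys become dotted column names such as 'Name.#text'.
flat = pd.json_normalize(sample)

# Rename the dotted columns and keep the same three fields as before.
flat = flat.rename(columns={
    'Name.#text': 'Description',
    'KeyFamilyRef.KeyFamilyID': 'KeyFamilyID',
    'KeyFamilyRef.KeyFamilyAgencyID': 'KeyFamilyAgencyID',
})[['Description', 'KeyFamilyID', 'KeyFamilyAgencyID']]

print(flat)
```

On the real response, `sample` would be replaced by `data['Structure']['Dataflows']['Dataflow']`.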

## Data Structure
As the documentation tells us, the DataStructure method returns "the structure of the dataset". So let's see what that looks like, using the International Financial Statistics (IFS) as an example.

In [7]:
dataset = 'IFS'
url = "http://dataservices.imf.org/REST/SDMX_JSON.svc/DataStructure/{}".format(dataset)
r = requests.get(url)
ds_struct = r.json()
ds_struct.keys()

dict_keys(['Structure'])

In [8]:
ds_struct['Structure'].keys()   # Seems to follow a similar format as the dataflows.

dict_keys(['@xmlns:xsd', '@xmlns:xsi', '@xmlns', '@xsi:schemaLocation', 'Header', 'CodeLists', 'Concepts', 'KeyFamilies'])

In [9]:
# The information sits inside this dictionary as a list of dictionaries
ds_struct = pd.DataFrame.from_dict(ds_struct['Structure']['CodeLists']['CodeList'])

In [10]:
ds_struct

Unnamed: 0,@agencyID,@id,@isFinal,@version,@xmlns,Code,Description,Name
0,IMF,CL_UNIT_MULT,True,1.0,http://www.SDMX.org/resources/SDMXML/schemas/v...,"[{'@value': '0', 'Description': {'@xml:lang': ...",,"{'@xml:lang': 'en', '#text': 'Scale'}"
1,IMF,CL_FREQ,True,1.0,http://www.SDMX.org/resources/SDMXML/schemas/v...,"[{'@value': 'A', 'Description': {'@xml:lang': ...","{'@xml:lang': 'en', '#text': 'Frequency'}","{'@xml:lang': 'en', '#text': 'Frequency'}"
2,IMF,CL_AREA_IFS,True,1.0,http://www.SDMX.org/resources/SDMXML/schemas/v...,"[{'@value': 'AF', 'Description': {'@xml:lang':...",,"{'@xml:lang': 'en', '#text': 'Geographical Are..."
3,IMF,CL_INDICATOR_IFS,True,1.0,http://www.SDMX.org/resources/SDMXML/schemas/v...,"[{'@value': 'IAFR_BP6_USD', 'Description': {'@...",,"{'@xml:lang': 'en', '#text': 'Indicator'}"
4,IMF,CL_TIME_FORMAT,True,1.0,http://www.SDMX.org/resources/SDMXML/schemas/v...,"[{'@value': 'P1Y', 'Description': {'@xml:lang'...","{'@xml:lang': 'en', '#text': 'Time formats bas...","{'@xml:lang': 'en', '#text': 'Time format'}"


At this level there is one row per code list. Each code list is metadata describing how the data is encoded. For example, CL_UNIT_MULT maps a value such as 2 to the description "Hundreds". The same applies to the rest.
The most useful ones seem to be CL_FREQ and CL_INDICATOR_IFS, since they will be needed (as we will see) to obtain the data for a specific indicator.
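Selecting a code list by row position (`loc[1]`, `loc[3]`) depends on the order in which the service returns the lists. A lookup by `@id` is a little more robust. The sketch below assumes the raw list has the shape seen in this notebook (each entry has an `@id` and a `Code` list of `{'@value', 'Description'}` dicts); `get_code_list` is a hypothetical helper, and the sample data is a trimmed stand-in containing only values that appear in this notebook.

```python
import pandas as pd

# Trimmed stand-in for the list under ['Structure']['CodeLists']['CodeList'].
code_lists = [
    {'@id': 'CL_UNIT_MULT',
     'Code': [{'@value': '2',
               'Description': {'@xml:lang': 'en', '#text': 'Hundreds'}}]},
    {'@id': 'CL_FREQ',
     'Code': [
         {'@value': 'A', 'Description': {'@xml:lang': 'en', '#text': 'Annual'}},
         {'@value': 'B', 'Description': {'@xml:lang': 'en', '#text': 'Bi-annual'}},
         {'@value': 'Q', 'Description': {'@xml:lang': 'en', '#text': 'Quarterly'}},
         {'@value': 'M', 'Description': {'@xml:lang': 'en', '#text': 'Monthly'}},
         {'@value': 'D', 'Description': {'@xml:lang': 'en', '#text': 'Daily'}},
     ]},
]


def get_code_list(code_lists, list_id):
    """Return a Series mapping code value -> description for the list with this @id."""
    for code_list in code_lists:
        if code_list['@id'] == list_id:
            return pd.Series(
                {entry['@value']: entry['Description']['#text']
                 for entry in code_list['Code']},
                name=list_id,
            )
    raise KeyError(list_id)


freq = get_code_list(code_lists, 'CL_FREQ')
print(freq)
```

The same call with `'CL_INDICATOR_IFS'` would return the indicator codes when run against the full response.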

In [11]:
pd.DataFrame.from_dict(ds_struct.loc[1]['Code']).head()

Unnamed: 0,@value,Description
0,A,"{'@xml:lang': 'en', '#text': 'Annual'}"
1,B,"{'@xml:lang': 'en', '#text': 'Bi-annual'}"
2,Q,"{'@xml:lang': 'en', '#text': 'Quarterly'}"
3,M,"{'@xml:lang': 'en', '#text': 'Monthly'}"
4,D,"{'@xml:lang': 'en', '#text': 'Daily'}"


In [12]:
pd.DataFrame.from_dict(ds_struct.loc[3]['Code']).head()

Unnamed: 0,@value,Description
0,IAFR_BP6_USD,"{'@xml:lang': 'en', '#text': 'International In..."
1,IADD_BP6_USD,"{'@xml:lang': 'en', '#text': 'International In..."
2,IADE_BP6_USD,"{'@xml:lang': 'en', '#text': 'International In..."
3,IAD_BP6_USD,"{'@xml:lang': 'en', '#text': 'International In..."
4,IADF_BP6_USD,"{'@xml:lang': 'en', '#text': 'International In..."


In [13]:
codes = pd.DataFrame.from_dict(ds_struct.loc[3]['Code'])
codes['Description'] = codes.Description.apply(lambda dct: dct['#text'])
codes.rename(columns={'@value': 'code'}, inplace=True)
codes.set_index('code', inplace=True)

In [14]:
codes.head()

Unnamed: 0_level_0,Description
code,Unnamed: 1_level_1
IAFR_BP6_USD,"International Investment Positions, Net acquis..."
IADD_BP6_USD,"International Investment Positions, Assets, Di..."
IADE_BP6_USD,"International Investment Positions, Assets, Di..."
IAD_BP6_USD,"International Investment Positions, Assets, Di..."
IADF_BP6_USD,"International Investment Positions, Financial ..."
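
With the descriptions indexed by code, a keyword search is a convenient way to find the indicator you need. The following is a minimal sketch: `find_indicators` is a hypothetical helper, and the sample table reuses the five rows shown above, with the descriptions kept truncated exactly as they appear in the output.

```python
import pandas as pd

# The five rows shown above; descriptions are left truncated as displayed.
codes = pd.DataFrame(
    {'Description': [
        'International Investment Positions, Net acquis...',
        'International Investment Positions, Assets, Di...',
        'International Investment Positions, Assets, Di...',
        'International Investment Positions, Assets, Di...',
        'International Investment Positions, Financial ...',
    ]},
    index=pd.Index(
        ['IAFR_BP6_USD', 'IADD_BP6_USD', 'IADE_BP6_USD',
         'IAD_BP6_USD', 'IADF_BP6_USD'],
        name='code',
    ),
)


def find_indicators(codes, keyword):
    """Return the rows whose description contains the keyword (case-insensitive)."""
    mask = codes['Description'].str.contains(keyword, case=False, regex=False)
    return codes[mask]


print(find_indicators(codes, 'assets'))
```

Passing `regex=False` treats the keyword as plain text, so characters such as parentheses in a search term are matched literally.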
