## Imports and setup

First, let's make the standard imports.

In [1]:
import requests 
import pandas as pd
import json
import matplotlib.pyplot as plt
import matplotlib.font_manager as fm
import matplotlib.ticker as mtick
import matplotlib.dates as mdates
from matplotlib import cm

token = "c4a583a30d7b08d569921c820673531b6019eee6" # Your TOKEN goes here
url = 'http://api.tukanmx.com/v1/retrieve/'

headers = {
"Content-Type": "application/json",
"Authorization": "Token " + token
}
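
Hard-coding a token in a notebook makes it easy to leak when the notebook is shared. One alternative, sketched below, is to read the token from an environment variable. The variable name `TUKAN_TOKEN` and the helper `build_headers` are illustrative choices for this sketch, not part of the TUKAN API.

```python
import os


def build_headers(token=None):
    """Build the request headers, falling back to the environment for the token."""
    # TUKAN_TOKEN is a hypothetical variable name chosen for this sketch.
    token = token or os.environ.get("TUKAN_TOKEN")
    if not token:
        raise ValueError("No API token found; set TUKAN_TOKEN or pass token=...")
    return {
        "Content-Type": "application/json",
        "Authorization": "Token " + token,
    }


# Passing the token explicitly; call build_headers() instead once TUKAN_TOKEN is set.
example_headers = build_headers(token="YOUR_TOKEN")
```

The resulting dictionary has the same shape as the `headers` object defined above, so it can be passed to `requests` in the same way.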

## Structure of TUKAN's data model

Our data repository works as follows:

*   **Institutions** - the source of the information, and the highest level in our repository.
      *   **Tables** - the datasets that contain the information. Each table is associated with an institution or source.
          * **Variables** - the indicators contained in each table.

### Institutions

Institutions are the source of the information. These include government and non-government entities that publish the original raw data, such as INEGI, Banco de México, and CONSAR.

You can list the institutions available in our data catalog with the following request:


In [2]:
response = requests.request("GET", url = "http://api.tukanmx.com/v1/institutions/", headers=headers)
institutions = pd.DataFrame(response.json())
institutions

Unnamed: 0,id,name,acronym,description,description_en,website,country
0,mex_banxico,Banco de México,Banxico,Banco central mexicano. Las finalidades sustan...,"Mexico's central bank, monetary authority and ...",https://www.banxico.org.mx/,mex
1,mex_grupo_bmv,Bolsa Mexicana de Valores,BMV,La Bolsa de Valores de México es una entidad f...,The BMV is a private financial entity that ope...,https://www.bmv.com.mx/,mex
2,mex_cnbv,Comisión Nacional Bancaria y de Valores,CNBV,Un órgano desconcentrado de la Secretaría de H...,A decentralized body of the Ministry of Financ...,https://www.gob.mx/cnbv,mex
3,mex_cnsf,Comisión Nacional de Seguros y Fianzas,CNSF,La Comisión Nacional de Seguros y Fianzas es u...,The National Insurance and Surety Commission i...,https://www.gob.mx/cnsf,mex
4,mex_consar,Comisión Nacional del Sistema de Ahorro para e...,CONSAR,CONSAR es la Comisión Nacional del Sistema de ...,CONSAR is the regulator in charge of managing ...,https://www.gob.mx/consar/,mex
5,mex_condusef,Comisión Nacional para la Protección y Defensa...,CONDUSEF,Es la encargada de promover y difundir la educ...,Is in charge of promoting transparency and res...,https://www.gob.mx/condusef,mex
6,mex_inegi,Instituto Nacional de Estadística y Geografía,INEGI,Organismo público autónomo responsable de norm...,The National Statistical and Geographic Inform...,https://www.inegi.org.mx/default.html,mex
7,mex_sct,Secretaría de Comunicaciones y Transportes,SCT,Una de las secretarías de Estado que integran ...,One of the state secretariats that make up the...,https://www.gob.mx/sct,mex
8,mex_segob,Secretaría de Gobernación,SEGOB,La Secretaría de Gobernación atiende el desarr...,SEGOB attends to the political devlopment of t...,https://www.gob.mx/segob,mex
9,mex_shcp,Secretaría de Hacienda y Crédito Público,SHCP,La Secretaría de Hacienda y Crédito Público ti...,The SHCP's (Ministry of Finance) mission is to...,https://www.gob.mx/shcp,mex


The columns represent:

|column| description|
|--|--|
|id| the TUKAN institution id|
|name| the official name of the institution|
|acronym| the common acronym associated with the institution|
|description| a brief overview of the institution's main functions in Spanish|
|description_en| a brief overview of the institution's main functions in English|
|website| the official website of the institution|
|country*| the 3-letter ISO code of the country where the institution is based|

*If `country == wd` then the institution is an international organization.
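
As a quick illustration of how these columns can be used, the sketch below looks up an institution's `id` from its acronym and filters for international organizations. To stay runnable offline it uses a small sample copied from the output above rather than the live `institutions` DataFrame; the helper name `institution_id` is hypothetical.

```python
import pandas as pd

# A few rows copied from the catalog output above, so the sketch runs offline.
sample = pd.DataFrame(
    {
        "id": ["mex_banxico", "mex_inegi", "mex_shcp"],
        "acronym": ["Banxico", "INEGI", "SHCP"],
        "country": ["mex", "mex", "mex"],
    }
)


def institution_id(catalog, acronym):
    """Return the TUKAN id for an acronym (case-insensitive), or None if absent."""
    match = catalog.loc[catalog["acronym"].str.lower() == acronym.lower(), "id"]
    return match.iloc[0] if not match.empty else None


inegi_id = institution_id(sample, "inegi")

# Institutions flagged as international organizations (country == "wd").
# The sample only contains Mexican institutions, so this frame is empty here.
international = sample[sample["country"] == "wd"]
```

With the real catalog, you would pass `institutions` instead of `sample`.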

### Tables

These are the datasets that contain the information. Each table is associated with an institution or source, and has a unique structure depending on the data it contains (don't worry, we'll explain this in more detail later).

First, we need to understand which tables are associated with each source. There are two ways to do this: 1) use the Explore component of the [web application](https://dashboard.tukanmx.com/), or 2) query the tables associated with a particular institution directly.

For example, if you wanted to know which tables are associated with INEGI, **you will need the institution's id** and then run the following code:

In [4]:
# Define a function for future use
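def get_institution_tables(inst_id):
    """Return a DataFrame with the metadata of every table published by `inst_id`."""
    # `url` and `headers` are the module-level values defined in the setup cell;
    # reading globals does not require a `global` declaration.
    payload = {
        "type": "institution",
        "institution": inst_id,
        "operation": "data_tables_info",
    }

    response = requests.request("POST", url, headers=headers, data=json.dumps(payload))
    tables = pd.DataFrame(response.json()["data_tables"])

    return tables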

get_institution_tables("mex_inegi").head(5)

Unnamed: 0,id,name,name_en,description,description_en,mode,website,last_updated,data_updated,institution_id,frequency_id,tag_id,categories
0,mex_inegi_api_employment,Estadísticas de Ocupación y Empleo,Employment Statistics,Con base en la Encuesta Nacional de Ocupación ...,Based on the National Occupation and Employmen...,standard,https://www.inegi.org.mx/temas/empleo/#Tabulados,2021-01-22,2021-08-19T06:45:05Z,mex_inegi,quarterly,,[adjustment_type]
1,mex_inegi_api_unemployment,Tasa de Desocupación,Unemployment Rate,Tasa de desocupación en series desestacionaliz...,"Unemployment rate, seasonally adjusted and tre...",standard,https://www.inegi.org.mx/temas/empleo/,2021-01-22,2021-09-28T06:19:06Z,mex_inegi,monthly,,[adjustment_type]
2,mex_inegi_census_households,Censo Población y Vivienda - Indicadores de Vi...,Census - Household Indicators,Proporciona la cuenta y características princi...,Provides information on the main characteristi...,standard,https://www.inegi.org.mx/programas/ccpv/2020/#...,2021-03-25,1977-06-08T05:20:00Z,mex_inegi,decennially,,[geography]
3,mex_inegi_census_people,Censo Población y Vivienda - Indicadores Pobla...,Census - Population Indicators,Proporciona la cuenta y características princi...,Provides information on the main characteristi...,standard,https://www.inegi.org.mx/programas/ccpv/2020/#...,2021-03-25,1977-06-08T05:20:00Z,mex_inegi,decennially,,"[geography, sex]"
4,mex_inegi_econ_census,Censo Económico,Economic Census,Los censos económicos contienen información ec...,The economic census contains economic informat...,standard,https://www.inegi.org.mx/programas/ce/2019/#In...,2021-08-27,2021-08-27T15:23:29Z,mex_inegi,quinquennial,,"[company_size, economic_activity, geography]"


This DataFrame lets us see each table's metadata, such as its name, description, and frequency.

In essence, the columns represent:

|column| description|
|--|--|
|id| the TUKAN table id|
|name| the name of the table in Spanish|
|name_en| the name of the table in English|
|description| a brief description of the table in Spanish|
|description_en| a brief description of the table in English|
|mode| internal TUKAN metadata|
|website| the url from where the table was obtained|
|last_updated| the date when the data was last updated|
|institution_id| the TUKAN institution id|
|frequency_id| the frequency of the data|
|categories| a list of TUKAN categories associated to the table|

There are four main metadata items that you need to be aware of: `id`, `website`, `frequency_id` and `categories`. Together, these four attributes tell you how the table is structured and how to extract it properly. In summary:

*   The `id` is required to query the table's data and variable dictionary.
*   The `website` allows you to validate the data with the original source.
*   The `frequency_id` gives you information regarding the table's periodicity, e.g., monthly, quarterly, or daily data.
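
A minimal sketch of how these attributes might be inspected together is shown below. It uses a few rows copied from the INEGI output above so that it runs offline; the `website` column is left out because some URLs are truncated in that output. The variable names are illustrative.

```python
import pandas as pd

# Rows copied from the INEGI tables output above.
sample_tables = pd.DataFrame(
    {
        "id": [
            "mex_inegi_api_employment",
            "mex_inegi_api_unemployment",
            "mex_inegi_census_people",
            "mex_inegi_econ_census",
        ],
        "frequency_id": ["quarterly", "monthly", "decennially", "quinquennial"],
        "categories": [
            ["adjustment_type"],
            ["adjustment_type"],
            ["geography", "sex"],
            ["company_size", "economic_activity", "geography"],
        ],
    }
)

# Keep only the attributes discussed above.
overview = sample_tables[["id", "frequency_id", "categories"]]

# Table ids published at a monthly frequency.
monthly_ids = overview.loc[overview["frequency_id"] == "monthly", "id"].tolist()

# Each row holds a list of categories; explode() gives one row per category
# so we can count how many tables use each one.
category_counts = overview.explode("categories")["categories"].value_counts()
```

With the live API, the same steps can be applied to the DataFrame returned by `get_institution_tables("mex_inegi")`.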

**Categories**, on the other hand, deserve a notebook of their own. We explore them in depth in the next notebook.