# ISTAC playground

**NOTE: You must select the istac kernel to run this notebook!**

This notebook uses a virtualenv that must be set up beforehand. This is a one-time process, recorded here for future reference.
To create the virtualenv for this project and make it available as a Jupyter kernel:

```bash
# Move to the proper folder
cd /home/jovyan/work/istac
# Install pipenv
pip install pipenv
# Create the venv and install dependencies
pipenv install
# Activate the shell
pipenv shell
# Register the venv as a Jupyter kernel
python -m ipykernel install --user --name=istac
```

Now you can select the "istac" kernel when running this notebook.
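To check that the registration step worked, you can look for the kernel spec file on disk. The snippet below is a minimal sketch, not part of the istac library: `kernel_spec_path` is a hypothetical helper, and the default directory is an assumption based on where `ipykernel install --user` normally writes on Linux (`~/.local/share/jupyter/kernels`). Your setup may use a different location, for example if `JUPYTER_DATA_DIR` is set.

```python
from pathlib import Path


def kernel_spec_path(name, kernels_dir=None):
    """Return the path to <kernels_dir>/<name>/kernel.json, or None if it is missing."""
    if kernels_dir is None:
        # Assumption: default per-user kernel directory on Linux
        kernels_dir = Path.home() / ".local" / "share" / "jupyter" / "kernels"
    spec = Path(kernels_dir) / name / "kernel.json"
    return spec if spec.is_file() else None


# Prints the spec path when the kernel is registered, None otherwise
print(kernel_spec_path("istac"))
```

Running `jupyter kernelspec list` from a terminal gives the same information without assuming a directory.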

In [1]:
# Import istac lib into your app
import istac

In [2]:
# Collect all indicators
import aiohttp

async with aiohttp.ClientSession() as session:
    indicators = [ind async for ind in istac.indicators(session)]

In [3]:
# Now you can list the indicators, e.g.
from pprint import pprint

pprint([ind.code for ind in indicators[:10]])

['AFILIACIONES',
 'TURISTAS',
 'EMPLEO_REGISTRADO_AGRICULTURA',
 'EMPLEO_REGISTRADO_HOSTELERIA',
 'EMPLEO_REGISTRADO_INDUSTRIA',
 'EMPLEO_REGISTRADO_SERVICIOS',
 'POBLACION_INACTIVA',
 'POBLACION_INACTIVA_HOMBRES',
 'POBLACION_INACTIVA_MUJERES',
 'PARO_REGISTRADO']


In [4]:
# And optionally, turn the list into a dataframe
import pandas as pd

fields = istac.Indicator.fields()
# One dict per indicator, keyed by field name
rows = ({field: getattr(ind, field) for field in fields} for ind in indicators)
ind_frame = pd.DataFrame(rows, columns=fields).set_index('id')
# Drop link and bookkeeping columns that are not needed here
ind_frame = ind_frame.drop(columns=['selfLink', 'systemSurveyLinks', 'kind'])
ind_frame.head()

id,code,version,title,subjectCode,subjectTitle,conceptDescription,notes
AFILIACIONES,AFILIACIONES,1.13,"{'es': 'Afiliaciones a la Seguridad Social', '...",51,"{'es': '051 Empleo', '__default__': '051 Empleo'}",{'es': 'Puestos de trabajo registrados en la S...,{'en': 'Affiliations registered on data collec...
TURISTAS,TURISTAS,1.19,"{'es': 'Turistas recibidos', 'en': 'Tourists a...",82,"{'es': '082 Hostelería y turismo', '__default_...",{'es': 'Número de turistas recibidos por vía a...,{'en': 'Tourists are visitors who overnight in...
EMPLEO_REGISTRADO_AGRICULTURA,EMPLEO_REGISTRADO_AGRICULTURA,1.23,"{'en': 'Registered employment. Agriculture', '...",51,"{'es': '051 Empleo', '__default__': '051 Empleo'}","{'en': 'Jobs registered in the primary sector,...",{'es': 'En el sector primario se contabiliza c...
EMPLEO_REGISTRADO_HOSTELERIA,EMPLEO_REGISTRADO_HOSTELERIA,1.23,"{'es': 'Empleo registrado. Hostelería', 'en': ...",51,"{'es': '051 Empleo', '__default__': '051 Empleo'}",{'es': 'Puestos de trabajo registrados en la s...,{'es': 'Se entiende por empleo registrado a la...
EMPLEO_REGISTRADO_INDUSTRIA,EMPLEO_REGISTRADO_INDUSTRIA,1.24,"{'es': 'Empleo registrado. Industria', 'en': '...",51,"{'es': '051 Empleo', '__default__': '051 Empleo'}",{'en': 'Jobs registered in the industry and en...,{'es': 'Se entiende por empleo registrado a la...
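The `title`, `subjectTitle`, `conceptDescription` and `notes` columns hold dicts keyed by language code, sometimes with a `__default__` entry, as the output above shows. A minimal sketch of flattening such a value to a single string follows; `localized` is a hypothetical helper, not part of the istac library, and the fallback order is an assumption.

```python
def localized(text, lang="en"):
    """Pick one language out of a localized-text dict such as the `title` column."""
    if not isinstance(text, dict):
        return text  # already a plain value (or missing): pass it through
    if lang in text:
        return text[lang]
    if "__default__" in text:
        return text["__default__"]
    # Last resort: any available translation, or None for an empty dict
    return next(iter(text.values()), None)


subject = {'es': '051 Empleo', '__default__': '051 Empleo'}
print(localized(subject, "en"))  # no 'en' entry, so this falls back to '__default__'
print(localized(subject, "es"))
```

Applied to the frame, this could look like `ind_frame['title'].map(localized)`.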


In [5]:
# Let's save the list (without the index column)
ind_frame.to_csv('indicadores.csv', index=False, header=True)

In [6]:
# And get data for some indicator, e.g.
async with aiohttp.ClientSession() as session:
    df = await istac.indicator_data(session, 'TURISTAS', {
        #'granularity': 'TIME[MONTHLY]',
        'representation': 'MEASURE[ABSOLUTE]',
        'fields': '-observationsMetadata'
    })

In [7]:
df.head(10)

_offset,_meta,F,GEOGRAPHICAL,TIME,MEASURE
0,,1325924,ES70,2019M12,ABSOLUTE
1,,1289081,ES70,2019M11,ABSOLUTE
2,,1291653,ES70,2019M10,ABSOLUTE
3,,1128384,ES70,2019M09,ABSOLUTE
4,,1289910,ES70,2019M08,ABSOLUTE
5,,1272383,ES70,2019M07,ABSOLUTE
6,,1127367,ES70,2019M06,ABSOLUTE
7,,1066023,ES70,2019M05,ABSOLUTE
8,,1240146,ES70,2019M04,ABSOLUTE
9,,1477483,ES70,2019M03,ABSOLUTE
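The `TIME` column holds codes such as `2019M12` rather than dates. To sort or plot by time, it helps to convert them. The sketch below handles only the monthly `YYYYMmm` form seen in this output. Other granularities may use other code formats, which this hypothetical `parse_monthly` helper simply reports as `None`.

```python
import re
from datetime import date

# Monthly codes as they appear above: four-digit year, 'M', two-digit month
_MONTHLY = re.compile(r"(\d{4})M(\d{2})")


def parse_monthly(code):
    """Convert a code like '2019M12' to the first day of that month, else None."""
    match = _MONTHLY.fullmatch(code)
    if match is None:
        return None
    year, month = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        return None
    return date(year, month, 1)


print(parse_monthly("2019M12"))  # 2019-12-01
```

On the frame above, `df['TIME'].map(parse_monthly)` would give a column of dates.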


In [8]:
# Get dimensions for this same indicator
async with aiohttp.ClientSession() as session:
    dims = await istac.dimension_data(session, 'TURISTAS')

for dim_name, dim_data in dims.items():
    print(dim_name)
    print(dim_data.points.head())

GEOGRAPHICAL
                                                   title granularityCode  \
code                                                                       
ES70   {'en': 'Canarias', 'es': 'Canarias', '__defaul...         REGIONS   
ES708  {'en': 'Lanzarote', 'es': 'Lanzarote', '__defa...         ISLANDS   
ES704  {'es': 'Fuerteventura', 'en': 'Fuerteventura',...         ISLANDS   
ES705  {'es': 'Gran Canaria', 'en': 'Gran Canaria', '...         ISLANDS   
ES709  {'es': 'Tenerife', 'en': 'Tenerife', '__defaul...         ISLANDS   

        latitude  longitude  
code                         
ES70   28.286993 -15.833524  
ES708  28.958019 -13.563176  
ES704  28.498631 -13.860549  
ES705  28.107860 -15.419980  
ES709  28.466125 -16.247069  
TIME
                                                     title granularityCode
code                                                                      
2019M12  {'es': '2019 Diciembre', 'en': '2019 December'...         MONTHLY
2019M11  {'en'

In [9]:
# Join data with dimensions; rsuffix disambiguates columns (e.g. title)
# that more than one dimension provides
joined = df
for dim_name, dim_data in dims.items():
    joined = joined.join(dim_data.points, on=dim_name, rsuffix=f'_{dim_name}')

joined = joined.dropna(axis=1, how='all')
joined.head()

_offset,F,GEOGRAPHICAL,TIME,MEASURE,title,granularityCode,latitude,longitude,title_TIME,granularityCode_TIME,title_MEASURE,decimalPlaces,type,unit,unitMultiplier
0,1325924,ES70,2019M12,ABSOLUTE,"{'en': 'Canarias', 'es': 'Canarias', '__defaul...",REGIONS,28.286993,-15.833524,"{'es': '2019 Diciembre', 'en': '2019 December'...",MONTHLY,"{'en': 'Data', 'es': 'Dato', '__default__': 'D...",0,AMOUNT,"{'de': 'Personen', 'fr': 'Personnes', 'es': 'P...","{'en': 'Units', 'es': 'Unidades', '__default__..."
1,1289081,ES70,2019M11,ABSOLUTE,"{'en': 'Canarias', 'es': 'Canarias', '__defaul...",REGIONS,28.286993,-15.833524,"{'en': '2019 November', 'es': '2019 Noviembre'...",MONTHLY,"{'en': 'Data', 'es': 'Dato', '__default__': 'D...",0,AMOUNT,"{'de': 'Personen', 'fr': 'Personnes', 'es': 'P...","{'en': 'Units', 'es': 'Unidades', '__default__..."
2,1291653,ES70,2019M10,ABSOLUTE,"{'en': 'Canarias', 'es': 'Canarias', '__defaul...",REGIONS,28.286993,-15.833524,"{'es': '2019 Octubre', 'en': '2019 October', '...",MONTHLY,"{'en': 'Data', 'es': 'Dato', '__default__': 'D...",0,AMOUNT,"{'de': 'Personen', 'fr': 'Personnes', 'es': 'P...","{'en': 'Units', 'es': 'Unidades', '__default__..."
3,1128384,ES70,2019M09,ABSOLUTE,"{'en': 'Canarias', 'es': 'Canarias', '__defaul...",REGIONS,28.286993,-15.833524,"{'es': '2019 Septiembre', 'en': '2019 Septembe...",MONTHLY,"{'en': 'Data', 'es': 'Dato', '__default__': 'D...",0,AMOUNT,"{'de': 'Personen', 'fr': 'Personnes', 'es': 'P...","{'en': 'Units', 'es': 'Unidades', '__default__..."
4,1289910,ES70,2019M08,ABSOLUTE,"{'en': 'Canarias', 'es': 'Canarias', '__defaul...",REGIONS,28.286993,-15.833524,"{'es': '2019 Agosto', 'en': '2019 August', '__...",MONTHLY,"{'en': 'Data', 'es': 'Dato', '__default__': 'D...",0,AMOUNT,"{'de': 'Personen', 'fr': 'Personnes', 'es': 'P...","{'en': 'Units', 'es': 'Unidades', '__default__..."


In [10]:
# Further manipulate the data with pandas' DataFrame tools, e.g. export it to CSV
joined.to_csv('istac.csv', index=False, header=True)
pprint(joined.head().to_csv())

('_offset,F,GEOGRAPHICAL,TIME,MEASURE,title,granularityCode,latitude,longitude,title_TIME,granularityCode_TIME,title_MEASURE,decimalPlaces,type,unit,unitMultiplier\n'
 '0,1325924,ES70,2019M12,ABSOLUTE,"{\'en\': \'Canarias\', \'es\': '
 "'Canarias', '__default__': "
 '\'Canarias\'}",REGIONS,28.2869925,-15.8335245,"{\'es\': \'2019 Diciembre\', '
 "'en': '2019 December', '__default__': '2019 "
 'Diciembre\'}",MONTHLY,"{\'en\': \'Data\', \'es\': \'Dato\', \'__default__\': '
 '\'Dato\'}",0,AMOUNT,Personas,Unidades\n'
 '1,1289081,ES70,2019M11,ABSOLUTE,"{\'en\': \'Canarias\', \'es\': '
 "'Canarias', '__default__': "
 '\'Canarias\'}",REGIONS,28.2869925,-15.8335245,"{\'en\': \'2019 November\', '
 "'es': '2019 Noviembre', '__default__': '2019 "
 'Noviembre\'}",MONTHLY,"{\'en\': \'Data\', \'es\': \'Dato\', \'__default__\': '
 '\'Dato\'}",0,AMOUNT,Personas,Unidades\n'
 '2,1291653,ES70,2019M10,ABSOLUTE,"{\'en\': \'Canarias\', \'es\': '
 "'Canarias', '__default__': "
 '\'Canarias\'}",REGIONS,28.2869