This notebook shows examples of how to use the functions in the `epigraphhub_db.py` module.

### The function `get_agg_data()`

This function queries a table stored in the epigraphhub database and returns the values of one column aggregated by date and location.

Besides the `schema` and `table_name`, it is necessary to provide a list with the names of three columns. The first column should contain dates, which will be used as the index. The second should contain the locations to be considered in the aggregation. The third should contain the values that will be aggregated.
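
As a rough mental model, the role of the three columns resembles a pandas group-by on the date and location columns. The sketch below uses made-up records and a hypothetical helper, `count_by_date_and_location`, which is not part of `epigraphhub_db.py`. It only illustrates the shape of the result and is not how the library performs the aggregation.

```python
import pandas as pd

# Made-up individual records; the column names are placeholders.
records = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2020-04-11", "2020-04-11", "2020-04-11", "2020-04-12"]
        ),
        "location": ["META", "META", "CAUCA", "META"],
        "id": [1, 2, 3, 4],
    }
)


def count_by_date_and_location(df, columns):
    """Hypothetical helper: count the third column per (date, location) pair."""
    date_col, loc_col, value_col = columns
    return (
        df.groupby([date_col, loc_col])[value_col]
        .count()
        .rename("count")
        .reset_index(level=loc_col)  # keep the dates as the index
    )


agg = count_by_date_and_location(records, ["date", "location", "id"])
print(agg)
```

The result has the dates as the index, one row per date and location, and a `count` column, which is the same layout as the output of the Colombian example.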

With `get_agg_data()`, we can transform the individual-level COVID-19 records for Colombia into a time series of the daily number of cases by `departamento`.

In [1]:
from epigraphhub.data.epigraphhub_db import get_agg_data

# Count `id_` values per notification date and departamento.
df = get_agg_data(
    schema='colombia', table_name='positive_cases_covid_d',
    columns=['fecha_de_notificaci_n', 'departamento_nom', 'id_'],
    method='COUNT', ini_date='2020-01-01',
)

df

{'dbname': 'epigraphhub', 'host': 'localhost', 'password': 'epigraph', 'port': 5432, 'username': 'epigraph'}


fecha_de_notificaci_n,departamento_nom,count
2021-12-20,STA MARTA D.E.,102
2021-07-30,NORTE SANTANDER,169
2020-09-29,AMAZONAS,1
2020-04-19,STA MARTA D.E.,2
2020-04-11,META,17
...,...,...
2021-11-29,CAUCA,27
2020-03-30,CAUCA,2
2022-04-11,RISARALDA,6
2020-06-20,NARIÑO,103
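
The result is in long format (one row per date and location) and the rows are not in chronological order. If one column per `departamento` is more convenient, a pivot is one option. The sketch below uses a small toy frame shaped like the output above, with illustrative values. Filling missing date and location pairs with zero is an assumption that may not suit every analysis.

```python
import pandas as pd

# Toy frame shaped like the output above: dates in the index,
# one row per (date, location) pair. The values are illustrative.
df_long = pd.DataFrame(
    {
        "departamento_nom": ["CAUCA", "META", "CAUCA"],
        "count": [2, 17, 5],
    },
    index=pd.to_datetime(["2020-03-30", "2020-04-11", "2020-04-11"]),
)
df_long.index.name = "fecha_de_notificaci_n"

# One column per location, one row per date, sorted chronologically.
df_wide = (
    df_long.reset_index()
    .pivot(
        index="fecha_de_notificaci_n",
        columns="departamento_nom",
        values="count",
    )
    .sort_index()
    .fillna(0)  # assumption: a missing pair means zero reported cases
    .astype(int)
)
print(df_wide)
```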


### The function `get_data_by_location()`

This function queries a table stored in the epigraphhub database and can filter the output by a list of locations, given the name of the column that holds them.

For example, the `foph_cases_d` table holds the number of COVID-19 cases by canton in Switzerland. With this function, we can retrieve only the rows for the cantons `GE` and `BE`.

In [2]:
from epigraphhub.data.epigraphhub_db import get_data_by_location

df = get_data_by_location(
    schema='switzerland', table_name='foph_cases_d',
    loc=['GE', 'BE'], loc_column='georegion',
    columns=['datum', 'georegion', 'entries'],
)

df

,datum,georegion,entries
0,2022-05-14,GE,68
1,2022-05-15,GE,48
2,2022-05-16,GE,146
3,2022-05-17,GE,118
4,2022-05-18,GE,118
...,...,...,...
1833,2022-05-09,GE,210
1834,2022-05-10,GE,146
1835,2022-05-11,GE,124
1836,2022-05-12,GE,146
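
The effect of `loc` and `loc_column` can be pictured as a membership filter on one column. The sketch below reproduces that idea in pandas on a made-up table. `filter_by_location` is a hypothetical helper written for illustration, not the library's implementation.

```python
import pandas as pd

# Toy table; the rows are made up for illustration.
cases = pd.DataFrame(
    {
        "datum": pd.to_datetime(
            ["2022-05-14", "2022-05-14", "2022-05-14", "2022-05-15"]
        ),
        "georegion": ["GE", "BE", "ZH", "BE"],
        "entries": [68, 120, 200, 95],
    }
)


def filter_by_location(df, loc, loc_column, columns):
    """Hypothetical helper: keep rows whose loc_column value is in loc."""
    mask = df[loc_column].isin(loc)
    return df.loc[mask, columns].reset_index(drop=True)


subset = filter_by_location(
    cases, ["GE", "BE"], "georegion", ["datum", "georegion", "entries"]
)
# Count the rows per location to confirm which ones were kept.
print(subset["georegion"].value_counts())
```

The preview above shows only `GE` in its first and last rows, so calling `df['georegion'].value_counts()` on the real result is one way to check that both requested cantons are present.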
