<a href="https://colab.research.google.com/github/coa-project/pycoa/blob/main/coabook/dataCheck.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/coa-project/pycoa/zenodo-backup)

# Data Check for Pycoa [ⓒpycoa.fr](http://pycoa.fr) <img src="https://www.pycoa.fr/fig/logo-anime.gif" alt="Pycoa" width="35">

## Importing packages

The following commands ensure that Pycoa is correctly installed and configured: 
* for local installation
* for use in Colab
* for use in Binder

**Execute these lines without asking too many questions...**

In [1]:
import sys
import subprocess
import importlib.util  # find_spec lives in the importlib.util submodule

# Use the local Pycoa environment when it is available
coa_module_info = importlib.util.find_spec("coaenv")
if coa_module_info is not None:
  import coaenv as pycoa
else:
  # Otherwise (e.g. when running on Google Colab), install Pycoa from GitHub
  subprocess.check_call(['pip3', 'install', '--quiet', 'git+https://github.com/coa-project/pycoa.git@ReadingDataModification'])
  sys.path.insert(1, 'pycoa')
  !pip install pandas==1.4.2
  import coa.front as pycoa

Defaulting to user installation because normal site-packages is not writeable
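
The detection step in the cell above can be tried on its own. The following is a minimal sketch, not part of Pycoa: `has_module` is a hypothetical helper name, and it only relies on the standard `importlib.util.find_spec`, which returns `None` when a top-level module cannot be found.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be located.

    Hypothetical helper for illustration; it does not import the module.
    """
    return importlib.util.find_spec(name) is not None


# `json` ships with Python, so it is always found.
print(has_module("json"))
# `coaenv` is only found when the local Pycoa environment is installed.
print(has_module("coaenv"))
```

Because `find_spec` only locates the module, the check has no side effects; the actual import happens afterwards, in whichever branch applies.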


## Select the database

First select the database that provides the data. To do so, call `pycoa.setwhom("DataBase")`, replacing `DataBase` with any of the databases listed by `pycoa.listwhom(True)`.

>If no epidemiological database is selected with `pycoa.setwhom`, the `JHU` database from Johns Hopkins University is loaded by default. JHU was one of the first institutions to aggregate epidemiological data from countries and to offer a web-based dashboard to view the data: https://coronavirus.jhu.edu/map.html.

PyCoA was designed to extend the possibilities of these *dashboards*. On the one hand, it allows you to retrieve epidemiological data directly; on the other, it allows you to display the data as you wish.

In [2]:
pycoa.setwhom("jhu",reload=False)

Info of jhu stored 
last update: Thu Nov 16 16:09:55 2023
Few information concernant the selected database :  jhu
Available key-words, which ∈ ['tot_confirmed', 'tot_deaths', 'tot_recovered']
Example of where :  Turkey, Bhutan, Congo, Zambia, Egypt  ...
Last date data  2023-03-09


## Pycoa instructions

* `pycoa.plot`: for time series
* `pycoa.map`: for a map representation
* `pycoa.hist`: for histograms, with the following options
  * `typeofhist='bycountry'` (default), for a histogram with horizontal bars, location by location
  * `typeofhist='byvalue'`, for a histogram with vertical bars, by value
  * `typeofhist='pie'`, for a pie chart
* `pycoa.get`: to retrieve data for further processing

### Documentation

A more complete documentation of Pycoa instructions and options can be found on the [Pycoa wiki](https://github.com/coa-project/pycoa/wiki/Accueil). In particular, see the menu at the bottom right: 
* [Data retrieval and processing](https://github.com/coa-project/pycoa/wiki/Donn%C3%A9es%2Ctraitements)
* [Curve graphs](https://github.com/coa-project/pycoa/wiki/Courbes)
* [Histograms, pie charts and maps](https://github.com/coa-project/pycoa/wiki/Diagrammes%2CCartes)
* [Advanced](https://github.com/coa-project/pycoa/wiki/Avance)
* [Case studies](https://github.com/coa-project/pycoa/wiki/Etudes%2Cde%2Ccas)

### *Keywords*

Different keywords are possible for the various instructions:

* `which`: data selected from the current database. 
* `what`: cumulative, daily or weekly information.
* `where`: selection of the location, depending on the database used.
  * country, list of countries, regions, continents, etc.
  * department, list of departments, regions.
* `option`: "nonneg", "nofillnan", "smooth7", "sumall".
* `when`: date window for selected data.

The following functions list all the values available for the current database: `pycoa.listwhich()`, `pycoa.listwhat()`, `pycoa.listoption()` and `pycoa.listwhere()`.
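
As an illustration of how these keywords fit together, here is a minimal sketch that assembles them into a dictionary before calling a Pycoa instruction. `build_keywords` is a hypothetical helper, not a Pycoa function. The allowed values for `what` and `option` are copied from what `pycoa.listwhat()` and `pycoa.listoption()` print in this notebook. Using `"standard"` as the default and passing a single option string are assumptions made for the example.

```python
# Values copied from the listwhat() and listoption() outputs in this notebook.
ALLOWED_WHAT = ("standard", "daily", "weekly")
ALLOWED_OPTIONS = ("nonneg", "nofillnan", "smooth7", "sumall")


def build_keywords(which, where, what="standard", option=None):
    """Check `what` and `option`, then return a keyword dictionary.

    Hypothetical helper: `which` and `where` depend on the selected
    database, so they are passed through without any check.
    """
    if what not in ALLOWED_WHAT:
        raise ValueError(f"what must be one of {ALLOWED_WHAT}, got {what!r}")
    keywords = {"which": which, "where": where, "what": what}
    if option is not None:
        if option not in ALLOWED_OPTIONS:
            raise ValueError(f"option must be one of {ALLOWED_OPTIONS}, got {option!r}")
        keywords["option"] = option
    return keywords


kw = build_keywords("tot_deaths", ["Turkey", "Egypt"], what="daily", option="smooth7")
print(kw)
```

With Pycoa imported and the `jhu` database selected, the dictionary could then be unpacked into an instruction, for example `pycoa.plot(**kw)`.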

# Use the databases

List available databases

In [3]:
pycoa.listwhom(True)

Pandas has been pimped, use '.data' to get a pandas dataframe


| | Database | WW/iso3 | Granularité | WW/Name |
|---|---|---|---|---|
| 0 | dgs | PRT | region | Portugal |
| 1 | dpc | ITA | region | Italy |
| 2 | escovid19data | ESP | subregion | Spain |
| 3 | europa | WW | nation | Europe |
| 4 | govcy | CYP | nation | Cyprus |
| 5 | imed | GRC | region | Greece |
| 6 | insee | FRA | subregion | France |
| 7 | jhu | WW | nation | World |
| 8 | jhu-usa | USA | subregion | United States of America |
| 9 | jpnmhlw | JPN | subregion | Japan |


Select the 'insee' database

In [4]:
pycoa.setwhom('insee',reload=False)

Info of insee stored 
last update: Thu Nov 16 16:12:01 2023
Few information concernant the selected database :  insee
Available key-words, which ∈ ['tot_deaths_since_2018-01-01']
Example of where :  Lot, Bas-Rhin, Alpes-Maritimes, Drôme, Territoire de Belfort  ...
Last date data  2024-04-15


In [5]:
pycoa.listwhich() 

['tot_deaths_since_2018-01-01']

In [6]:
pycoa.listwhat() 

['standard', 'daily', 'weekly']

In [7]:
pycoa.listoption() 

['nonneg', 'nofillnan', 'smooth7', 'sumall']

In [8]:
pycoa.listwhere()

['Ain',
 'Aisne',
 'Allier',
 'Alpes-de-Haute-Provence',
 'Hautes-Alpes',
 'Alpes-Maritimes',
 'Ardèche',
 'Ardennes',
 'Ariège',
 'Aube',
 'Aude',
 'Aveyron',
 'Bouches-du-Rhône',
 'Calvados',
 'Cantal',
 'Charente',
 'Charente-Maritime',
 'Cher',
 'Corrèze',
 "Côte-d'Or",
 "Côtes-d'Armor",
 'Creuse',
 'Dordogne',
 'Doubs',
 'Drôme',
 'Eure',
 'Eure-et-Loir',
 'Finistère',
 'Corse-du-Sud',
 'Haute-Corse',
 'Gard',
 'Haute-Garonne',
 'Gers',
 'Gironde',
 'Hérault',
 'Ille-et-Vilaine',
 'Indre',
 'Indre-et-Loire',
 'Isère',
 'Jura',
 'Landes',
 'Loir-et-Cher',
 'Loire',
 'Haute-Loire',
 'Loire-Atlantique',
 'Loiret',
 'Lot',
 'Lot-et-Garonne',
 'Lozère',
 'Maine-et-Loire',
 'Manche',
 'Marne',
 'Haute-Marne',
 'Mayenne',
 'Meurthe-et-Moselle',
 'Meuse',
 'Morbihan',
 'Moselle',
 'Nièvre',
 'Nord',
 'Oise',
 'Orne',
 'Pas-de-Calais',
 'Puy-de-Dôme',
 'Pyrénées-Atlantiques',
 'Hautes-Pyrénées',
 'Pyrénées-Orientales',
 'Bas-Rhin',
 'Haut-Rhin',
 'Rhône',
 'Haute-Saône',
 'Saône-et-Loire',

## Store databases with Pycoa

Save with the functions `pycoa.getrawdb()` and `pycoa.saveoutput()`

In [17]:
#for database in pycoa.listwhom():
    #pycoa.setwhom(database,reload=False)
    #pycoa.saveoutput(pandas=pycoa.getrawdb(), saveformat='csv', savename=database+"-PyCoa-DF")
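
The commented-out loop above can also be wrapped in a function. This is a sketch only: `save_all_databases` and its `suffix` parameter are hypothetical names, and the Pycoa calls are exactly those of the loop above (`listwhom`, `setwhom`, `getrawdb`, `saveoutput`). The module is passed in as an argument so that the function does nothing until it is called explicitly.

```python
def save_all_databases(pycoa, suffix="-PyCoa-DF", saveformat="csv"):
    """Save the raw data of every available database, one file per database.

    Hypothetical helper: `pycoa` is the module imported at the top of the
    notebook. Returns the list of save names that were used.
    """
    saved = []
    for database in pycoa.listwhom():
        # Select the database without forcing a fresh download.
        pycoa.setwhom(database, reload=False)
        savename = database + suffix
        pycoa.saveoutput(pandas=pycoa.getrawdb(), saveformat=saveformat, savename=savename)
        saved.append(savename)
    return saved
```

In the notebook it would be called as `save_all_databases(pycoa)`. Note that the loop selects every database in turn, so the last one in the list stays selected afterwards.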

## List elements

In [11]:
whichAll = []
for database in pycoa.listwhom():
    pycoa.setwhom(database,reload=False)
    print(database)
    #print(pycoa.listwhat()) 
    for which in pycoa.listwhich():
        whichAll.append(which)
    #print("From: ", pycoa.getrawdb().iloc[0][0])
# Remove duplicates
whichAll = list(dict.fromkeys(whichAll))

Info of dgs stored 
last update: Thu Nov 16 16:12:04 2023
Few information concernant the selected database :  dgs
Available key-words, which ∈ ['tot_cases']
Example of where :  Coimbra, Santarém, Vila Real, Évora, Vila Real  ...
Last date data  2022-10-02
/////////////////////////////////////////////
Info of dpc stored 
last update: Thu Nov 16 16:12:09 2023
Few information concernant the selected database :  dpc
Available key-words, which ∈ ['tot_cases', 'tot_deaths']
Example of where :  Friuli Venezia Giulia, Piemonte, Liguria, Molise, Basilicata  ...
Last date data  2023-11-08
/////////////////////////////////////////////
Info of escovid19data stored 
last update: Thu Nov 16 16:12:13 2023
Few information concernant the selected database :  escovid19data
Available key-words, which ∈ ['cur_hosp', 'cur_hosp_per100k', 'cur_icu', 'cur_icu_per1M', 'incidence', 'population', 'tot_cases', 'tot_cases_per100k', 'tot_deaths', 'tot_deaths_per100k', 'tot_hosp', 'tot_recovered']
Example of where :
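
The last line of the cell above removes duplicates with `dict.fromkeys`. The following small sketch shows why this idiom is used rather than `set`: dictionaries keep insertion order, so each keyword stays at the position where it was first seen. The sample list is made up for the example from keywords printed above.

```python
# Sample list with repeated keywords, as would be collected across several databases.
which_all = ["tot_cases", "tot_cases", "tot_deaths", "cur_hosp", "tot_cases", "tot_deaths"]

# Dictionary keys are unique and keep insertion order,
# so this drops repeats while preserving first-seen order.
unique = list(dict.fromkeys(which_all))
print(unique)  # ['tot_cases', 'tot_deaths', 'cur_hosp']
```

Using `list(set(which_all))` would also remove the duplicates, but without any guarantee on the resulting order.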

## Metadata

In [12]:
for database in pycoa.listwhom():
    pycoa.setwhom(database,reload=False)
    print(database)
    print(pycoa.getkeywordinfo())
    #pycoa.saveoutput(pandas=pycoa.getrawdb(), saveformat='csv', savename=database+"-PyCoa-DF")

Info of dgs stored 
last update: Thu Nov 16 16:12:04 2023
Few information concernant the selected database :  dgs
Available key-words, which ∈ ['tot_cases']
Example of where :  Beja, Vila Real, Beja, Bragança, Guarda  ...
Last date data  2022-10-02
                                                       tot_cases
Original name                                      confirmados_1
Description                                               FILLIT
URL            https://raw.githubusercontent.com/dssg-pt/covi...
Homepage               https://github.com/dssg-pt/covid19pt-data
/////////////////////////////////////////////
Info of dpc stored 
last update: Thu Nov 16 16:12:09 2023
Few information concernant the selected database :  dpc
Available key-words, which ∈ ['tot_cases', 'tot_deaths']
Example of where :  Campania, Molise, Friuli Venezia Giulia, Veneto, Umbria  ...
Last date data  2023-11-08
                                                      tot_deaths  \
Original name                    