# Introduction

Our project combines data from several sources (NASA satellite measurements, EU social and economic data, GitHub COVID-19 data, and Kaggle UN country economic and social data) to understand whether there are correlations between energy sector development, pollution, transportation, various development indicators, unemployment, and the incidence and dynamics of COVID-19 cases.  

We focused on European countries.   

Data is ingested, processed, analyzed, filtered, and finally displayed in an easy-to-use form for people interested in understanding the specifics of the economic impact of COVID-19 in the context of each country.




# Objectives

For this challenge, we have a few objectives:  

* Understand whether there are correlations between NASA satellite data on pollution and various economic and social data, including transportation, energy, oil and coal production, unemployment, health, and COVID-19 incidence.

* Use these correlations to establish a risk score for each country with respect to unemployment, the evolution of various economic sectors, and COVID-19 evolution and risks.

* Build a user-friendly application to display these data.

# Method

Data is analyzed and processed using dedicated analysis Jupyter Notebooks.

Here we present the highlights of our analyses.

For each detailed analysis, follow the links to the respective Notebook.

## NASA satellite measurements ingestion, analysis and visualization

For each of these datasets, we first collected the satellite data, extracted it from the netCDF4 format, and registered it at country level (using country coordinates and/or boundary limits).  
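
As a rough illustration of the country-level registration step, the sketch below averages the grid cells whose centres fall inside a rectangular bounding box. It is a minimal sketch using only the Python standard library, not the code from the notebooks: the function name `country_mean`, the nested-list grid layout, and the bounding box are all assumptions made for the example, and the values are made up.

```python
from statistics import mean


def country_mean(lats, lons, grid, bbox):
    """Average the grid cells whose centre lies inside a bounding box.

    lats, lons: 1-D sequences of cell-centre coordinates in degrees.
    grid: nested list indexed as grid[i_lat][i_lon]; None marks a missing cell.
    bbox: (lat_min, lat_max, lon_min, lon_max), a hypothetical country box.
    Returns None when no valid cell falls inside the box.
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    values = [
        grid[i][j]
        for i, lat in enumerate(lats)
        if lat_min <= lat <= lat_max
        for j, lon in enumerate(lons)
        if lon_min <= lon <= lon_max and grid[i][j] is not None
    ]
    return mean(values) if values else None


# Toy 3x4 grid standing in for one satellite variable (values are made up).
lats = [44.0, 46.0, 48.0]
lons = [20.0, 22.0, 24.0, 26.0]
grid = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, None, 8.0],
    [9.0, 10.0, 11.0, 12.0],
]
example_bbox = (45.0, 49.0, 21.0, 25.0)  # illustrative box, not a real border
print(country_mean(lats, lons, grid, example_bbox))
```

A bounding box is only a coarse stand-in for real country boundaries; a polygon-based mask would be more precise where neighbouring countries overlap the same box.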

We also look at correlations between pollution data (CO2, SO2) and energy data (coal and crude oil consumption) at country level for the European countries.

* [CO2 MERRA 2 Data extraction and preliminary analysis](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/earthdata-merra-2-eda.ipynb)
* [CO2 MERRA 2 Data extraction (bulk) and time evolution analysis](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/earthdata-merra-2-co-time-evolution.ipynb)
* [IASI METOP C CO Data extraction and analysis](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/iasi-metop-c-co-eda.ipynb)
* [SO2 & CO2 data ingestion and processing](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/nc4_ingest.py)
* Processed data for MERRA2 (CO2, SO2) can be found [here](https://github.com/gabrielpreda/nasa_2020/tree/master/data/processed).
* [Copernicus air quality - NO2, O3, PM10, PM2.5 data ingestion and processing](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/copernicus-air-quality.ipynb)

## COVID-19 data analysis

We ingest, validate, and correct the Johns Hopkins data. We also analyze the COVID-19 data for Europe, focusing on the evolution of the epidemic in each country as well as on correlations with social and economic factors extracted from UN data.   

We are also investigating the correlation of COVID-19 metrics with the pollution data from NASA satellite measurements.

* [Johns Hopkins data corrections and validations](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/covid-19-make-db.py)  
* [Covid-19 evolution in Europe (country level analysis) - initial exploration](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/covid-19-europe-eda.ipynb)  

* [Covid-19 correlation with UN social-economic data](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/covid-19-europe-un-data.ipynb)  

## Transport data analysis

We ingest, process, and enrich data on air and maritime transport extracted from UN open source data.
Starting from the initial data sets, we added information using the lookup files and, where needed, transformed the quarterly data into monthly data by using the quarterly value as the average value for the 3 months.
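
The quarterly-to-monthly step can be sketched as follows. This is one possible reading of the transformation, in which each month of a quarter simply receives the quarterly value; the function name `quarterly_to_monthly` and the period labels (`2020Q1`, `2020M01`) are assumptions for the example, and the values are made up.

```python
def quarterly_to_monthly(quarterly):
    """Expand {'2020Q1': v} into {'2020M01': v, '2020M02': v, '2020M03': v}.

    Each month receives the quarterly value unchanged, i.e. the quarterly
    figure is treated as the monthly average of its quarter (an assumption).
    """
    monthly = {}
    for period, value in quarterly.items():
        year, quarter = period.split("Q")
        first_month = 3 * (int(quarter) - 1) + 1
        for month in range(first_month, first_month + 3):
            monthly[f"{year}M{month:02d}"] = value
    return monthly


sample = {"2019Q4": 120.0, "2020Q1": 90.0}  # made-up values
for period, value in quarterly_to_monthly(sample).items():
    print(period, value)
```

If the quarterly figure were a total rather than an average, each month would instead receive one third of it; which case applies depends on the indicator.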

We investigated the correlation of transport data with the pollution data from NASA satellite measurements.
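
To make the idea of correlating two monthly series concrete, here is a minimal sketch of a Pearson correlation coefficient in plain Python. The text does not state which correlation measure the notebooks use, so Pearson is an assumption here, as are the function name `pearson` and the sample series, which are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Returns None when either series is constant, because the coefficient
    is undefined in that case.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly values for one country (not real measurements).
passengers = [100.0, 95.0, 40.0, 5.0, 10.0, 30.0]
pollutant = [8.1, 7.9, 6.0, 5.2, 5.5, 6.1]
print(pearson(passengers, pollutant))
```

A coefficient near 1 or -1 indicates a strong linear relationship between the two series, but it says nothing about causation, and with only a few months of data it is sensitive to single outliers.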


* International intra-EU freight and mail air transport by main airports in each reporting country and EU partner country (https://github.com/gabrielpreda/nasa_2020/data/transport-data/International_air_transport_avia_goinac.tsv.gz)  
  Original source: https://data.europa.eu/euodp/en/data/dataset/LzQnng3eGm3hbz8HfjcIQ
* Air passenger transport by main airports in each reporting country (https://github.com/gabrielpreda/nasa_2020/data/transport-data/Air_passenger_transport_avia_paoa.tsv.gz)  
  Original source: https://data.europa.eu/euodp/en/data/dataset/Ez3kc1cjABsZ83uw1uw
* Airport traffic data by reporting airport and airlines (https://github.com/gabrielpreda/nasa_2020/data/transport-data/Airport_traffic_data_avia_tf_apal.tsv.gz)  
  Original source: https://ec.europa.eu/eurostat/web/products-datasets/product?code=avia_tf_apal
* Gross weight of goods transported to/from main ports by direction and type of traffic (national and international) (https://github.com/gabrielpreda/nasa_2020/data/transport-data/Gross_weight_of_ports_mar_go_qm.tsv.gz)  
  Original source: https://data.europa.eu/euodp/en/data/dataset/5hX8WYrIVY6u08WnJ8Xrg






## EU social & economic data  

We investigate various factors related to the impact of COVID-19, either directly (such as the correlation between morbidity and mortality and the development of the healthcare sector in each country) or indirectly, through the measures imposed by each country. The indirect factors include recent changes in unemployment (for example, differences in youth unemployment levels between sexes across EU countries) and the prevalence of internet access in various countries (important because of the massive shift to working from home (WFH) and online education in recent months).

We also looked at sectors that are now seeing major shifts, such as transportation, or that might be related to pollution: energy and industrial sectors that rely heavily on large energy consumption. 

We also looked at factors that might have an impact on COVID-19 dynamics, such as education, literacy, healthcare, population age, forest area as a percentage of total land, GDP per capita, and population density.

Here are the analysis Notebooks:

* [EU Energy consumption (coal, oil, per country and sector)](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/energy%20-%20eda.ipynb)   
* [EU Unemployment, age, sex, 1983-2020, with a focus on recent changes](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/Unemployment%20EU.ipynb)  
* [Euro area international trade - monthly analysis](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/Euro%20area%2019%20international%20trade%20-%20monthly%20data%20-%20eda.ipynb)  
* [Household internet penetration in Europe - analysis per country and household type](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/Households%20-%20type%20of%20connection%20to%20the%20internet.ipynb)  
* [European countries UN Data - study correlation with Covid-19 and NASA pollution data](https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/covid-19-europe-un-data.ipynb)  
* [European Transportation Data - ingestion, processing, analysis](https://github.com/gabrielpreda/nasa_2020/tree/master/analysis), processed data [here](https://github.com/gabrielpreda/nasa_2020/tree/master/data/transport%20data)    



## Lookup files
These files contain codes, labels, and other nomenclature information.

### Airport Codes 
This lookup contains airport code information associated with the municipality, country and region information. 
Original sources:
* https://pkgstore.datahub.io/core/airport-codes/831/datapackage.json
* https://pkgstore.datahub.io/core/airport-codes/airport-codes/archive/edda13b18a6832d040c1ff19fbd4a8fd/airport-codes.csv


### Maritime entity
The data in this lookup is useful for mapping port abbreviations to country names.
Original source: https://ec.europa.eu/eurostat/cache/metadata/en/mar_esms.htm

### Transport coverage
This lookup is needed for maritime port data.
Original source: http://dd.eionet.europa.eu/vocabulary/eurostat/tra_cov/view?page=1#vocabularyConceptResults

### Traffic_and_transport_measurement
This lookup contains information about the transport measurement abbreviation and labelling.
Original source: http://dd.eionet.europa.eu/vocabulary/eurostat/tra_meas

### UE Countries
The data available here is useful for mapping country code information and for using latitude and longitude coordinates.
Original sources:
* https://developers.google.com/public-data/docs/canonical/countries_csv
* https://latitudelongitude.org/
* https://raw.githubusercontent.com/lukes/ISO-3166-Countries-with-Regional-Codes/master/all/all.csv


### UE_Airports
The data available here contains ICAO and IATA codes associated with municipality and country information.
Original source: https://airportcodes.io/en/continent/europe/




# Results

Here we show a few insights from the data processing and analyses we performed. 

For a complete review of each analysis, navigate to the respective Notebook.


<img src="img/iasi.png" width="600">
<font size="2">IASI CO distribution across Europe, April 2020, for more details follow this <a href="https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/iasi-metop-c-co-eda.ipynb">link</a></font>


<img src="img/merra-CO.png" width="800">
<font size="2">Earthdata MERRA2 CO Column Burden (COCL) distribution, April 2020, for more details follow this <a href="https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/iasi-metop-c-co-eda.ipynb">link</a></font>


<img src="img/gross_inland_delivery_coal.png" width="800">
<font size="2">Europe gross inland deliveries of coal (thousand tonnes), 2019-2020, for more details follow this <a href="https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/energy%20-%20eda.ipynb">link</a></font>

<img src="img/internet_access.png" width="800">
<font size="2">Europe PC / household with 2 adults and dependent children - 2019, for more details follow this <a href="https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/Households%20-%20type%20of%20connection%20to%20the%20internet.ipynb">link</a></font>


<img src="img/covid19_un_data_corr.png" width="800">
<font size="2">Europe: correlation between covid-19 aggregate indicators and UN economic and social indicators, for more details follow this <a href="https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/pollution-energy-industrial-health-society.ipynb">link</a></font>

<img src="img/total_unemployment_SE.png" width="600">
<font size="2">Evolution of unemployment in Sweden, by sex and age group, 2019-2020, for more details follow this <a href="https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/Unemployment%20EU.ipynb">link</a></font>

<img src="img/covid19.png" width="600">
<font size="2">Evolution of Covid-19 in European countries, for more details follow this <a href="https://github.com/gabrielpreda/nasa_2020/blob/master/analysis/covid-19-europe-eda.ipynb">link</a></font>
