# TIL6022 Project Main Notebook

Group 1: Cynthia Cai, Chenghua Yang, Daniel Auerbach, Runsheng Liu

# 1 Research Objective

The research objective is to study how the influence of COVID-19 on mobility, and on several mobility-related factors, differs between the Netherlands and Poland.

The sub-domains are:

- mobility and economy
    - vehicle volumes
    - GDP
- travelling behaviour
    - changes in visits and length of stay at different places, such as retail, grocery and workplaces
- transport modes and hospitalization rate
    - transportation modes include bus, subway and private cars
- level of digitalization
    - digitalization is expected to reduce the need to travel

# 2 Methodology

## 2.1 Data Source

Details about the data used are in the separate process notebooks. The main data sources of the project are listed here.

- Eurostat: https://www.statista.com/statistics/1171183/ghg-emissions-sector-european-union-eu/
- Google mobility data: https://www.google.com/covid19/mobility/
- CBS Open data Statline: https://opendata.cbs.nl/statline/portal.html?_la=en&_catalog=CBS&tableId=84755ENG&_theme=1159
- European Commission data: https://ec.europa.eu/info/statistics_en
- Hospitalization (LCPS): https://lcps.nu/datafeed/

## 2.2 Pipeline
1. Select and download datasets based on the research objective
2. Data preprocessing, such as removing outliers and filling in null values (in separate notebooks)
3. Data analysis and data visualization (in the main notebook)
4. Results
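
The preprocessing in step 2 can be sketched as follows. This is a minimal illustration, not the code from the process notebooks: the function name `clean_series`, the IQR rule with factor 1.5 and the linear interpolation are assumptions, and the values are made up.

```python
import numpy as np
import pandas as pd


def clean_series(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Mask IQR outliers as NaN, then fill all gaps by linear interpolation."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    inside = s.between(q1 - k * iqr, q3 + k * iqr)
    # Values outside the fences become NaN; existing NaNs stay NaN.
    masked = s.where(inside)
    # Interpolate interior gaps, then fill any gap at the start or end.
    return masked.interpolate(method="linear").ffill().bfill()


# Hypothetical yearly values, for illustration only (not project data).
raw = pd.Series(
    [100.0, 102.0, np.nan, 106.0, 500.0, 110.0],
    index=[2015, 2016, 2017, 2018, 2019, 2020],
)
cleaned = clean_series(raw)
print(cleaned)
```

Whether an extreme value is a real outlier or a real effect (such as the 2020 drop) has to be judged per dataset, so a rule like this should be applied with care.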

# 3 Analysis and Conclusion

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px

## 3.1 Mobility and economy

In [3]:
# Import processed data
df_GDP_Traffic_3countries = pd.read_csv('.\\data\\traffic volume\\processed\\GDP_Traffic_3countries.csv' , delimiter=',')
df_COVID_final = pd.read_csv('.\\data\\traffic volume\\processed\\COVID_final.csv' , delimiter=',')
df_GDP_Traffic_3countries_COVID = pd.read_csv('.\\data\\traffic volume\\processed\\GDP_Traffic_3countries_COVID.csv' , delimiter=',')

In [4]:
# Make graph showing the Traffic volume relative to GDP over time for the three countries
Traffic_volume = ["Netherlands traffic volume relative to GDP", "Poland traffic volume relative to GDP", "Romania traffic relative to GDP"]
fig = px.line(df_GDP_Traffic_3countries, x=df_GDP_Traffic_3countries.index, y=Traffic_volume, title="Traffic volume relative to GDP for three countries")
fig.show()

# Make graph showing the GDP of the three countries
GDP = ["GDP Netherlands", "GDP Poland", "GDP Romania"]
fig = px.line(df_GDP_Traffic_3countries, x=df_GDP_Traffic_3countries.index, y=df_GDP_Traffic_3countries.columns[0:3], title="GDP of three countries")
fig.show()

# Make bar plot showing the percentage change in traffic volume
fig = px.bar(df_GDP_Traffic_3countries_COVID, x=df_GDP_Traffic_3countries_COVID.index, y=Traffic_volume, title="Traffic volume percentage change for three countries", barmode='group', labels={
                     "value": "Percentage change from year before"})           
fig.show()

# Make bar plot showing the percentage change in GDP
fig = px.bar(df_GDP_Traffic_3countries_COVID, x=df_GDP_Traffic_3countries_COVID.index, y=GDP, title="GDP percentage change for three countries", barmode='group', labels={
                     "value": "Percentage change from year before"})           
fig.show()

# Make bar plot showing the number of COVID cases 2019 and 2020
Cases = ["COVID_cases_Netherlands", 'COVID_cases_Poland', 'COVID_cases_Romania']
fig = px.bar(df_COVID_final, x="year", y=Cases, title="COVID cases 2019 and 2020", barmode='group', labels={
                     "value": "COVID cases"})           
fig.show()

# Make bar plot showing the number of COVID deaths 2019 and 2020
Deaths = ["COVID_deaths_Netherlands", 'COVID_deaths_Poland', 'COVID_deaths_Romania']
fig = px.bar(df_COVID_final, x="year", y=Deaths, title="COVID deaths 2019 and 2020", barmode='group', labels={
                     "value": "COVID deaths"})           
fig.show()

### Summary of the traffic volume relative to GDP

The first graph shows the traffic volume relative to GDP of the three countries over a 10-year period. It is quite stable until 2019. From 2019 to 2020, however, a large change occurs, probably due to COVID. The second graph shows the GDP of the three countries over the same period. GDP is quite stable as well, so the sharp decline in 2020 is not caused by changes in GDP. The third graph shows the percentage change in traffic volume relative to the previous year for the three countries. The Netherlands has the largest decline in traffic volume, followed by Poland and then Romania. The fourth graph shows the corresponding percentage change in GDP, and graphs 5 and 6 show the COVID cases and deaths for these countries.

Together, these figures indicate that there is indeed a relation between COVID and traffic volume. Furthermore, wealthier countries, like the Netherlands, seem to have a larger decline in traffic volume than less wealthy countries, like Poland and Romania. A probable reason is that people in wealthier countries more often have service-oriented jobs, which are easier to perform from home. Less wealthy countries have fewer of these jobs and are therefore less suited to working from home. Also, people in wealthier countries generally have more room to absorb a loss of income and can therefore more easily work less or stop working due to COVID. In less wealthy countries, this is not the case.
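
The percentage change relative to the previous year can be sketched as below. The column names and numbers are hypothetical, and reading "relative to GDP" as traffic volume divided by GDP is an assumption; the actual computation is in the process notebooks.

```python
import pandas as pd

# Hypothetical numbers and column names, for illustration only.
df = pd.DataFrame(
    {
        "GDP": [100.0, 104.0, 100.0],
        "Traffic volume": [50.0, 52.0, 40.0],
    },
    index=pd.Index([2018, 2019, 2020], name="year"),
)

# Assumption: "relative to GDP" means traffic volume divided by GDP.
df["Traffic relative to GDP"] = df["Traffic volume"] / df["GDP"]

# Percentage change from the year before; the first year has no predecessor.
pct = df.pct_change().mul(100).dropna()
print(pct.round(2))
```

Dividing by GDP before taking the change helps separate a drop in traffic from a drop in economic activity as a whole.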

This research gives a good indication of the impact of COVID on mobility in countries of different wealth categories. However, firm conclusions cannot yet be drawn from it. The research could be improved by looking at more countries and at monthly data, when it becomes available. Statistical tests can then be conducted to compare the mean traffic decline in a group of wealthy countries and a group of less wealthy countries. More research into the reasons behind this difference is also necessary. A few possible explanations have been mentioned above.
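
One possible form of such a test is Welch's t-test, which compares two group means without assuming equal variances. The sketch below uses only the standard library. The helper `welch_t` and both lists of declines are hypothetical and serve only to show the calculation.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof


# Hypothetical percentage declines in traffic volume, for illustration only.
wealthy = [-20.0, -18.0, -22.0, -19.0]
less_wealthy = [-10.0, -12.0, -9.0, -11.0]

t_stat, dof = welch_t(wealthy, less_wealthy)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof:.1f}")
```

A p-value additionally requires the t-distribution. If SciPy is available, `scipy.stats.ttest_ind(wealthy, less_wealthy, equal_var=False)` performs the same test and reports it.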

## 3.2 Travelling behaviour

In [None]:
# to be added by Runsheng

## 3.3 Travelling mode

In [None]:
# to be added by Chenghua

## 3.4 Digitalization

In [5]:
# import processed data
df_digi = pd.read_csv('.\\data\\digitalization\\processed\\digi.csv')
df_digi = df_digi.set_index('Year')
#display(df_digi)

In [6]:
# make line chart
# human capital
human_col = ['EU Human Capital', 'NL Human Capital', 'PL Human Capital', 'RO Human Capital']
fig = px.line(df_digi, x=df_digi.index, y=human_col, title="Human capital of three European countries")
fig.show()

# Internet connectivity
conn_col = ['EU Connectivity', 'NL Connectivity', 'PL Connectivity', 'RO Connectivity']
fig = px.line(df_digi, x=df_digi.index, y=conn_col, title="Connectivity of three European countries")
fig.show()

# integration of digital technology by businesses
tech_col = ['EU Technology', 'NL Technology', 'PL Technology', 'RO Technology']
fig = px.line(df_digi, x=df_digi.index, y=tech_col, title="Integration of digital technology by businesses of three European countries")
fig.show()

# public service
pub_col = ['EU Public Service', 'NL Public Service', 'PL Public Service', 'RO Public Service']
fig = px.line(df_digi, x=df_digi.index, y=pub_col, title="E-Public service of three European countries")
fig.show()

# overall digitalization index
index_col = ['EU Digi Index', 'NL Digi Index', 'PL Digi Index', 'RO Digi Index']
fig = px.line(df_digi, x=df_digi.index, y=index_col, title="Digitalization index of three European countries")
fig.show()

text to be added (by Cynthia)

# 4 Conclusion

text to be added (by whom?)

# Work Allocation
The initial idea was to let everyone carry out one analysis task. The eventual work allocation is:

- Daniel Auerbach: traffic volume
- Runsheng Liu: travelling behaviour
- Chenghua Yang: travelling mode
- Cynthia Cai: digitalization