# Project Group

**Members & student numbers:** <br>
Emma van den Brink - 5136008 <br>
Caroline Aalders - 5230578 <br>
Panagiotis Papadopoulos - 5054443 <br>
Vedankur Kedar - 5693136 <br>
Thieme Brandwagt - 5232910 <br>


# Research Objective

*Requires data modeling and quantitative research in Transport, Infrastructure & Logistics*

Our research question: <br>
What is the effect of **EU sustainable transport policy** on the modal shift from air travel to high-speed rail within Europe?



# Contribution Statement

*Be specific. Some of the tasks can be coding (expect everyone to do this), background research, conceptualisation, visualisation, data analysis, data modelling*

**Caroline Aalders**: Manager of background research.

**Emma van den Brink**: Manager of data analysis.

**Vedankur Kedar**: Manager of data visualisation in graphs (and possibly maps).

**Panagiotis Papadopoulos**: Manager of data modelling and coding logic.

**Thieme Brandwagt**: Manager of conceptualisation of theory and relevant policy.

# Data Used

1. **Data on W.I.P. EU transport policy:** https://www.consilium.europa.eu/en/policies/?filters=1650 (Relevant: Clean and sustainable mobility, European Green Deal, Rail Transport Policy)
2. **Data on historical EU rail policy directives:** https://www.europarl.europa.eu/factsheets/en/sheet/130/rail-transport#:~:text=The%20directive%20was%20amended%20by%20Directive%202001%2F13%2FEC%20of,Directive%202012%2F34%2FEU%20of%2021%20November%202012%20establishing%20
3. **Emission Trading Scheme and aviation:** https://climate.ec.europa.eu/eu-action/aviation-and-eu-ets_en
4. **Total number of aviation passengers, excluding international flights:** https://ec.europa.eu/eurostat/databrowser/view/TTR00012/default/line?lang=en
5. **International flights:** https://ec.europa.eu/eurostat/statistics-explained/index.php?title=Air_transport_statistics#Intra-EU_passenger_transport_was_almost_at_the_same_level_as_extra-EU_traffic_in_2021 
6. **Passenger flights per region:** https://ec.europa.eu/eurostat/databrowser/view/TRAN_R_AVPA_NM/default/line?lang=en
7. **Specific airports:** https://www.eurocontrol.int/Economics/DailyOperatedSchedules-Airports.html
8. **Rail fares:** https://data.europa.eu/data/datasets/fares?locale=nl
9. **Plane fares:** https://old.datahub.io/dataset/european-flight-price-data
10. **Other EU airline data:** https://ec.europa.eu/eurostat/web/main/data/database 
11. **Passengers Transported:** https://ec.europa.eu/eurostat/databrowser/view/RAIL_PA_TOTAL/default/table?lang=en
12. **Passengers transported (detailed reporting only) - (quarterly data):** https://ec.europa.eu/eurostat/databrowser/view/RAIL_PA_QUARTAL/default/table?lang=en
13. **Passenger transport by type of transport (detailed reporting only):** https://ec.europa.eu/eurostat/databrowser/view/RAIL_PA_TYPEPAS/default/table?lang=en
14. **International transport of passengers from the reporting country to the country of disembarkation:** https://ec.europa.eu/eurostat/databrowser/view/RAIL_PA_INTGONG/default/table?lang=en
15. **International transport of passengers from the country of embarkation to the reporting country:** https://ec.europa.eu/eurostat/databrowser/view/RAIL_PA_INTCMNG/default/table?lang=en
16. **Railway transport - national and international railway passengers transport by loading/unloading NUTS 2 region:** https://ec.europa.eu/eurostat/databrowser/view/TRAN_R_RAPA/default/table?lang=en
17. **Passengers by speed of train:** https://ec.europa.eu/eurostat/databrowser/view/RAIL_PA_SPEED/default/table?lang=en
18. **Passenger cars in accompanied passenger car railway transport, by type of transport:** https://ec.europa.eu/eurostat/databrowser/view/RAIL_PA_NBCAR/default/table?lang=en
19. **Passengers in accompanied passenger car railway transport, by type of transport:** https://ec.europa.eu/eurostat/databrowser/view/RAIL_PA_NBPASS/default/table?lang=en

# Data Pipeline

All datasets need to be combined into a single graph over time, with markers at the dates on which policies changed. The difference in passenger numbers between air and rail is the main indicator for our research question. We could also look for other influencing factors, such as ticket prices and the COVID-19 pandemic.
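
As a rough illustration of the graph described above, the sketch below plots a passenger series over time and marks a policy date with a vertical line. The function name `plot_with_policy_markers` and the `policy_events` dictionary are hypothetical names introduced here. The passenger values are the Austrian air passenger figures shown in the country comparison section, and the marked year is taken from the directive named in the rail policy link under Data Used. A rail series would be added as a second column once that data is loaded.

```python
import pandas as pd
import matplotlib.pyplot as plt


def plot_with_policy_markers(passengers, policy_events):
    """Plot one line per column and mark each policy year with a dashed line.

    passengers: DataFrame indexed by year, one column per transport mode.
    policy_events: dict mapping a year to a short label (hypothetical structure).
    """
    fig, ax = plt.subplots(figsize=(9, 4))
    passengers.plot(ax=ax, marker="o")
    for year, label in policy_events.items():
        ax.axvline(year, color="grey", linestyle="--", linewidth=1)
        # x is in data coordinates, y is a fraction of the axes height
        ax.text(year, 0.98, label, transform=ax.get_xaxis_transform(),
                rotation=90, va="top", ha="right", fontsize=8)
    ax.set_xlabel("Year")
    ax.set_ylabel("Passengers carried")
    return ax


# Air passengers for Austria (AT), 2011-2014, as listed in the dataset preview
air_at = pd.DataFrame(
    {"air (AT)": [25137612, 25965977, 25749724, 26378676]},
    index=pd.Index([2011, 2012, 2013, 2014], name="year"),
)
# Assumption: one example policy marker; the real list comes from the policy sources
policy_events = {2012: "Directive 2012/34/EU"}

ax = plot_with_policy_markers(air_at, policy_events)
```

In a notebook the figure is displayed automatically. Keeping the policy dates in a separate dictionary means new policies can be added without touching the plotting code.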

Steps:
1. Data collection
2. Data cleaning and transformation
3. Data integration into a single dataset
4. Data aggregation for the most interesting time periods
5. Data analysis
6. Data enrichment with external factors, such as COVID events
7. Data presentation in graphs
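
Steps 2, 3 and 6 above could look roughly like the following sketch. It assumes that both the air and the rail tables use the Eurostat column names `geo`, `TIME_PERIOD` and `OBS_VALUE` (as in the air passenger file); the helper names `clean` and `integrate`, the `COVID_YEARS` set and the `rail_share` column are hypothetical. The input values are toy numbers for a made-up country code `XX`, not real statistics.

```python
import pandas as pd

# Assumption: years treated as affected by the COVID-19 pandemic
COVID_YEARS = {2020, 2021}


def clean(raw, mode):
    """Keep country, year and value, and name the value column after the mode."""
    out = raw[["geo", "TIME_PERIOD", "OBS_VALUE"]].rename(
        columns={"geo": "country", "TIME_PERIOD": "year", "OBS_VALUE": mode}
    )
    out[mode] = pd.to_numeric(out[mode], errors="coerce")
    return out.dropna(subset=[mode])


def integrate(air_raw, rail_raw):
    """Merge air and rail passengers into one dataset with a rail share."""
    air = clean(air_raw, "air")
    rail = clean(rail_raw, "rail")
    merged = air.merge(rail, on=["country", "year"], how="inner")
    # Share of rail in the combined air + rail passenger total
    merged["rail_share"] = merged["rail"] / (merged["air"] + merged["rail"])
    # Enrichment with an external factor: flag pandemic years
    merged["covid"] = merged["year"].isin(COVID_YEARS)
    return merged.sort_values(["country", "year"]).reset_index(drop=True)


# Toy input, NOT real data
air_raw = pd.DataFrame(
    {"geo": ["XX"] * 3, "TIME_PERIOD": [2018, 2019, 2020], "OBS_VALUE": [300, 300, 100]}
)
rail_raw = pd.DataFrame(
    {"geo": ["XX"] * 3, "TIME_PERIOD": [2018, 2019, 2020], "OBS_VALUE": [100, 100, 100]}
)

result = integrate(air_raw, rail_raw)
print(result)
```

An inner merge keeps only country and year combinations present in both tables. This avoids computing a share from incomplete data but may drop years that one source does not report.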

# Country comparison (Thieme)

In [1]:
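# import relevant libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Air passengers per country (Eurostat dataset TTR00012)
plane_passengers = pd.read_csv('datasets/Air_transport_of_passengers_by_country.csv')

# TODO: the path to the rail passenger file is not specified yet; pd.read_csv() fails without one
# train_passengers = pd.read_csv(...)

# Preview the first ten rows (must be the last expression to be displayed)
plane_passengers.head(10)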

| Unnamed: 0 | DATAFLOW | LAST UPDATE | freq | unit | tra_meas | tra_cov | schedule | geo | TIME_PERIOD | OBS_VALUE | OBS_FLAG |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ESTAT:TTR00012(1.0) | 05/10/23 23:00:00 | A | PAS | PAS_CRD | TOTAL | TOT | AT | 2011 | 25137612 | |
| 1 | ESTAT:TTR00012(1.0) | 05/10/23 23:00:00 | A | PAS | PAS_CRD | TOTAL | TOT | AT | 2012 | 25965977 | |
| 2 | ESTAT:TTR00012(1.0) | 05/10/23 23:00:00 | A | PAS | PAS_CRD | TOTAL | TOT | AT | 2013 | 25749724 | |
| 3 | ESTAT:TTR00012(1.0) | 05/10/23 23:00:00 | A | PAS | PAS_CRD | TOTAL | TOT | AT | 2014 | 26378676 | |
| 4 | ESTAT:TTR00012(1.0) | 05/10/23 23:00:00 | A | PAS | PAS_CRD | TOTAL | TOT | AT | 2015 | 26754007 | |
| 5 | ESTAT:TTR00012(1.0) | 05/10/23 23:00:00 | A | PAS | PAS_CRD | TOTAL | TOT | AT | 2016 | 27181511 | |
| 6 | ESTAT:TTR00012(1.0) | 05/10/23 23:00:00 | A | PAS | PAS_CRD | TOTAL | TOT | AT | 2017 | 28327279 | |
| 7 | ESTAT:TTR00012(1.0) | 05/10/23 23:00:00 | A | PAS | PAS_CRD | TOTAL | TOT | AT | 2018 | 31138417 | |
| 8 | ESTAT:TTR00012(1.0) | 05/10/23 23:00:00 | A | PAS | PAS_CRD | TOTAL | TOT | AT | 2019 | 35644188 | |
| 9 | ESTAT:TTR00012(1.0) | 05/10/23 23:00:00 | A | PAS | PAS_CRD | TOTAL | TOT | AT | 2020 | 9168431 | |
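
To compare countries, one possible next step is the year-over-year change in passengers per country. The sketch below uses only the Austrian rows for 2018 to 2020 from the preview above; the helper name `yearly_change` and the `pct_change` column are hypothetical, and the full analysis would run on `plane_passengers` instead of this small sample.

```python
import pandas as pd

# Rows copied from the preview above (Austria, 2018-2020)
sample = pd.DataFrame(
    {
        "geo": ["AT", "AT", "AT"],
        "TIME_PERIOD": [2018, 2019, 2020],
        "OBS_VALUE": [31138417, 35644188, 9168431],
    }
)


def yearly_change(df):
    """Add the percentage change in passengers versus the previous year, per country."""
    out = df.sort_values(["geo", "TIME_PERIOD"]).copy()
    out["pct_change"] = out.groupby("geo")["OBS_VALUE"].pct_change() * 100
    return out


changes = yearly_change(sample)
print(changes[["geo", "TIME_PERIOD", "pct_change"]].round(1))
```

The first year of each country has no previous value and therefore shows `NaN`. The sharp drop in 2020 illustrates why pandemic years need separate treatment when looking for policy effects.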
