# The effect of COVID19 on flight patterns at Dutch airports

## TIL6022 Python Programming

## Group 7: 
1. Charlotte van Rijsoort (5400546)
2. Fleur van Steekelenburg (5313066)
3. Romy Lambregts (4881036)
4. Jasper van den Broek (5262887)
5. Konstantina Mountouri (6074170)

### Introduction
<p>Flying is one of the most popular and fastest modes of travel. An estimated 4.5 billion passengers flew in 2019, and this number had been rising (1). However, in 2019 the COVID19 virus started to spread in the first countries. In 2020, it reached the Netherlands and everything changed. People were more or less confined to their homes. We stopped meeting friends, going to work, etc. and limited our movements to strictly necessary ones. This was, of course, also the case for travelling. Going on holiday was no longer really an option, and neither was flying to the other side of the world to meet family or friends.</p>

<p>The pandemic lasted about three years in total, but the impact of COVID19 on travel patterns was not the same throughout those years. In this project, we aim to find out how it affected air travel patterns at the major Dutch airports. We consider not only passengers, but also the amount of cargo [tons] and mail [tons], and compare them. This brings us to the research question:</p>

>*What is the impact of COVID19 on travel patterns for passenger, cargo and mail flights between Dutch airports and other regions of the world, in both directions, between 2019 and 2023?*

#### Scope
<p>To make this research question more specific, the scope is clarified as follows:

1. Regions: EU countries (European Union), Other Europe, North Africa, West Africa, Central Africa, East Africa, South Africa, North America, Central America, South America, West Asia, South East Asia, North East Asia and Oceania.
2. Dutch airports: Schiphol, Rotterdam The Hague Airport, Eindhoven Airport, Maastricht Aachen Airport and Groningen Airport.
3. Baseline (pre-COVID) data: 2019
4. Timespan of COVID19 in the Netherlands: March 2020 until January 2023 (2)
</p>

#### Subquestions
To answer the research question, it is necessary to look at the three types of movements (passengers, cargo and mail) individually. For each type, the data will be analysed and each subquestion will be answered. For each subquestion, the movements to and from all regions and/or the arrival/departure rates of the movement types at the airports will be considered.

>1. *What did flight patterns look like before the COVID19 pandemic started?*
>
>2. *How did flight patterns change during the COVID19 pandemic?*
>
>3. *What do flight patterns look like nowadays?*

Sources:
(1) https://www.icao.int/annual-report-2019/Pages/the-world-of-air-transport-in-2019.aspx#:~:text=According%20to%20ICAO's%20preliminary%20compilation,a%201.7%20per%20cent%20increase.

(2) COVID19 dataset


### Chapter 1: Overview of airports
<p>In this first chapter, the total number of flights is considered for every airport. This gives a quick and clear overview of the basic situation from 2019 until now. For this, two datasets have been used:

1. 37478eng_UntypedDataSet_09102023_105711.csv (3)
2. COVID_19_aantallen_gemeente_cumulatief.csv (4)
</p>


(3) https://opendata.cbs.nl/statline/#/CBS/en/dataset/37478eng/table?ts=1696840376267

(4) https://data.rivm.nl/covid-19/

In [92]:
# Import libraries
import pandas as pd
import plotly.express as px

# Import flight-data from folder
file_path = '37478eng_UntypedDataSet_09102023_105711.csv'
data = pd.read_csv(file_path, delimiter=';')

# Replace the airport codes with airport names for better readability
# ('ID' stays a regular column, because a later cell selects columns by position)

data.loc[ data['Airports'] == 'A045844', 'Airports'] = 'TOTAL'
data.loc[ data['Airports'] == 'A043590', 'Airports'] = 'Schiphol'
data.loc[ data['Airports'] == 'A043596', 'Airports'] = 'Rotterdam The Hague'
data.loc[ data['Airports'] == 'A043591', 'Airports'] = 'Eindhoven'
data.loc[ data['Airports'] == 'A043595', 'Airports'] = 'Maastricht Aachen'
data.loc[ data['Airports'] == 'A043593', 'Airports'] = 'Groningen'

# Change name of particular columns for better readability
data.rename(columns={"TotalFlights_3": "Total Flights", 
                    "TotalPassengers_12": "Total Passengers [pax]",
                    "TotalCargo_43": "Total Cargo [tons]",
                    "TotalMail_74": "Total Mail [tons]",
                    "TotalArrivalsPassengers_15": "Total Arrived Passengers",
                    "TotalDeparturesPassengers_18": "Total Departed Passengers",
                    "TotalUnloadedCargo_46": "Total Unloaded Cargo [tons]",
                    "TotalLoadedCargo_49": "Total Loaded Cargo [tons]",
                    "TotalUnloadedMail_77": "Total Unloaded Mail [tons]",
                    "TotalLoadedMail_80": "Total Loaded Mail [tons]",
},
            inplace=True
            )

# Show table
data



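| Unnamed: 0 | ID | Airports | Periods | CrossCountryFlights_1 | LocalFlights_2 | Total Flights | Scheduled_4 | NonScheduled_5 | TotalArrivalsFlights_6 | Scheduled_7 | ... | SouthAfrica_93 | America_94 | NorthAmerica_95 | CentralAmerica_96 | SouthAmerica_97 | Asia_98 | WestAsia_99 | SouthEastAsia_100 | NorthEastAsia_101 | Oceania_102 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | TOTAL | 1997JJ00 | 467579 | 206214 | 400118 | 364095 | 36023 | 200040 | 182043 | ... | . | . | . | . | . | . | . | . | . | . |
| 1 | 1 | TOTAL | 1998JJ00 | 485852 | 201265 | 425608 | 387560 | 38048 | 212751 | 193792 | ... | . | . | . | . | . | . | . | . | . | . |
| 2 | 2 | TOTAL | 1999MM01 | 36810 | 14868 | 33145 | 31222 | 1923 | 16565 | 15616 | ... | . | . | . | . | . | . | . | . | . | . |
| 3 | 3 | TOTAL | 1999MM02 | 34356 | 11314 | 30912 | 29015 | 1897 | 15444 | 14513 | ... | . | . | . | . | . | . | . | . | . | . |
| 4 | 4 | TOTAL | 1999MM03 | 41290 | 18152 | 35591 | 33278 | 2313 | 17802 | 16644 | ... | . | . | . | . | . | . | . | . | . | . |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2515 | 2515 | Groningen | 2023MM05 | 1100 | 4939 | 126 | 0 | 126 | 56 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2516 | 2516 | Groningen | 2023MM06 | 1194 | 5484 | 126 | 0 | 126 | 64 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2517 | 2517 | Groningen | 2023KW02 | 3324 | 15193 | 313 | 0 | 313 | 150 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2518 | 2518 | Groningen | 2023MM07 | 1016 | 4229 | 126 | 0 | 126 | 62 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |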
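
The airport codes above are renamed one line at a time. As an alternative, a minimal sketch with a dictionary and `Series.replace` could look like this. The codes and names are taken from the cell above; the `toy` table and its values are made up for illustration only.

```python
import pandas as pd

# Code-to-name mapping, copied from the renaming cell above
AIRPORT_NAMES = {
    'A045844': 'TOTAL',
    'A043590': 'Schiphol',
    'A043596': 'Rotterdam The Hague',
    'A043591': 'Eindhoven',
    'A043595': 'Maastricht Aachen',
    'A043593': 'Groningen',
}

# Toy stand-in for the CBS table (hypothetical values)
toy = pd.DataFrame({
    'Airports': ['A045844', 'A043590', 'A043593'],
    'TotalFlights_3': [120, 100, 5],
})

# Replace every known code in one step; unknown codes are left untouched
toy['Airports'] = toy['Airports'].replace(AIRPORT_NAMES)
print(toy)
```

Keeping the mapping in one dictionary makes it easier to check the codes against the CBS documentation in one place.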


In [93]:
# Only select monthly data from 2019 till 2023
data_total = data.loc[data['Periods'].str.contains('2019MM|2020MM|2021MM|2022MM|2023MM', case=False, regex=True)]

# Make a graph about the total amount of flights
figtotal = px.line(data_total, 
                x = "Periods", 
                y = "Total Flights",
                color = "Airports",
                symbol = "Airports",
                )

# Adjust the layout of the graph
figtotal.update_layout(
                title={'text': 'Number of total flights per Dutch airport between 2019 and 2023'},
                height=600,
                )

# Show graph
figtotal.show()
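
The monthly filter above keeps only period codes that contain `MM`. The table also contains codes with `JJ` and `KW`, which we assume denote yearly and quarterly totals. The sketch below shows one possible way to turn the monthly codes into real dates and to flag the COVID window from the scope. The series `toy_periods` is hypothetical.

```python
import pandas as pd

# Hypothetical sample of period codes in the CBS notation
toy_periods = pd.Series(['2019JJ00', '2019MM12', '2020MM03', '2020KW01', '2023MM01', '2023MM02'])

# Keep only monthly codes such as '2020MM03'
monthly = toy_periods[toy_periods.str.fullmatch(r'\d{4}MM\d{2}')]

# '2020MM03' -> '2020-03' -> Timestamp('2020-03-01')
dates = pd.to_datetime(monthly.str.replace('MM', '-', regex=False), format='%Y-%m')

# Flag the months inside the COVID window from the scope (March 2020 until January 2023)
in_covid = dates.between(pd.Timestamp('2020-03-01'), pd.Timestamp('2023-01-31'))

print(pd.DataFrame({'Periods': monthly, 'Date': dates, 'In COVID window': in_covid}))
```

Real dates on the x-axis would also let plotly space the months correctly, which is not guaranteed when the codes are treated as plain text.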

<p>This graph shows the total number of flights, including passenger, cargo and mail flights. Some interesting points in the graph, for example the big drop in April 2020, will be discussed in later chapters. However, it is very clear that by far most flights depart from and arrive at Schiphol (red line). This share is so large that the other main airports in the Netherlands are almost insignificant in comparison. Also, the trend line of the total number of flights at all airports together (blue line) corresponds neatly with Schiphol's trend line. Therefore, the rest of this research focuses only on Schiphol Airport and leaves the other airports out.</p>
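
The dominance of Schiphol could also be expressed as a number. The sketch below computes each airport's share of the `TOTAL` row per period. It uses a small made-up table with the same column names as the flight data, so the resulting shares are illustrative only.

```python
import pandas as pd

# Toy data with the same structure as data_total (values are made up)
toy = pd.DataFrame({
    'Airports': ['TOTAL', 'Schiphol', 'Eindhoven', 'TOTAL', 'Schiphol', 'Eindhoven'],
    'Periods': ['2019MM01'] * 3 + ['2020MM04'] * 3,
    'Total Flights': [1000, 900, 100, 200, 150, 50],
})

# One row per period, one column per airport
wide = toy.pivot(index='Periods', columns='Airports', values='Total Flights')

# Divide every airport column by the TOTAL column of the same period
share = wide.drop(columns='TOTAL').div(wide['TOTAL'], axis=0)
print(share.round(2))
```

Applied to the real data, this would require the flight columns to be numeric first, for example with `pd.to_numeric`.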

<p>Next, the COVID19 data will be analysed and discussed.</p>

In [94]:
# Import COVID19-data
covidfile_path = 'COVID_19_aantallen_gemeente_cumulatief.csv'
coviddata = pd.read_csv(covidfile_path, delimiter=';')

# The data needs to be filtered. Only data at the last day of each month is needed.
# Adjacent string literals are joined without extra whitespace. With a backslash
# continuation, the indentation becomes part of the pattern and the first date of
# every continued line would never match.
end_of_months = (
    '2020-03-31|2020-04-30|2020-05-31|2020-06-30|2020-07-31|2020-08-31|'
    '2020-09-30|2020-10-31|2020-11-30|2020-12-31|2021-01-31|2021-02-28|'
    '2021-03-31|2021-04-30|2021-05-31|2021-06-30|2021-07-31|2021-08-31|'
    '2021-09-30|2021-10-31|2021-11-30|2021-12-31|2022-01-31|2022-02-28|'
    '2022-03-31|2022-04-30|2022-05-31|2022-06-30|2022-07-31|2022-08-31|'
    '2022-09-30|2022-10-31|2022-11-30|2022-12-31|2023-01-31|2023-02-28|'
    '2023-03-31|2023-04-30|2023-05-31|2023-06-30|2023-07-31|2023-08-31'
)

coviddf = coviddata[coviddata['Date_of_publication'].str.contains(end_of_months, na=False)]

# Unnecessary columns can be removed
coviddf = coviddf.drop(['Version', 'Date_of_report', 'Municipality_code',
                        'Municipality_name', 'Hospital_admission', 'Deceased']
                        , axis=1
                        )

# Reset index
coviddf = coviddf.reset_index()

# Notation of the dates needs to be changed into the notation of the flight data file
# (e.g. '2020-03-31 10:00:00' -> '2020MM03') to compare them later on
coviddf['Date_of_publication'] = (coviddf['Date_of_publication'].str[:4] + 'MM'
                                  + coviddf['Date_of_publication'].str[5:7])

# The dataset is cumulative per municipality, so the sum over all municipalities
# gives the national cumulative number of registered cases at the end of each month
cumulative_totals = coviddf.groupby('Date_of_publication')['Total_reported'].sum()

# New cases per month = difference between consecutive end-of-month totals
# (the first month keeps its cumulative total, as there is no earlier month to subtract)
monthly_cases = cumulative_totals.diff().fillna(cumulative_totals)
cumulative_per_date = monthly_cases.to_dict()

# Make a graph about the number of registered COVID19 cases
figcovid = px.bar(x = cumulative_per_date.keys(),
              y = cumulative_per_date.values(),
              title = 'Amount of reported COVID19 cases per month in the Netherlands',
              color_discrete_sequence = ['orange']*len(cumulative_per_date),
              opacity = 0.8
            )

# Adjust the axes of the graph
figcovid.update_xaxes(tickangle=90, title = 'Period')
figcovid.update_yaxes(title = 'Reported COVID19 cases')

# Show graph
figcovid.show()

TODO: check whether these numbers are actually correct. The peaks seem reasonable, but almost 14,000 cases in the very first month seems too high; it should be around or below 4,000.

Check with: https://coronadashboard.rijksoverheid.nl/landelijk/positieve-testen
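
As a sanity check on the calculation, the sketch below works through the step from cumulative totals per municipality to new cases per month on a tiny made-up table. The column names follow the RIVM file; the two "municipalities" and all numbers are hypothetical.

```python
import pandas as pd

# Two hypothetical municipalities, three end-of-month snapshots (cumulative counts)
toy = pd.DataFrame({
    'Date_of_publication': ['2020MM03', '2020MM03', '2020MM04', '2020MM04', '2020MM05', '2020MM05'],
    'Total_reported': [10, 5, 30, 20, 45, 25],
})

# National cumulative total per month: 15, 50, 70
cumulative = toy.groupby('Date_of_publication')['Total_reported'].sum()

# New cases per month: the first month keeps its total, later months subtract the previous total
monthly = cumulative.diff().fillna(cumulative)
print(monthly)
```

If the first bar in the real graph is much higher than expected, comparing `cumulative` for that month with the dashboard value would show whether the difference comes from the data or from the calculation.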

<p>This graph shows the number of reported COVID19 cases per month in the Netherlands. To improve readability, the two graphs will now be combined. At the same time, the first graph is narrowed down from all airports to just Schiphol, and the three different types of movements are shown separately.</p>
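
Besides overlaying the two graphs, the flight data and the COVID19 data could also be joined into one table on their shared period code. The following is a minimal sketch under the assumption that both sources use the `2020MM03` notation; the names `flights` and `cases` and all values are made up.

```python
import pandas as pd

# Hypothetical monthly flights at one airport
flights = pd.DataFrame({
    'Periods': ['2020MM02', '2020MM03', '2020MM04'],
    'Total Flights': [40000, 30000, 5000],
})

# Hypothetical monthly reported cases, keyed by the same period codes
cases = pd.Series({'2020MM03': 12000, '2020MM04': 27000}, name='Reported cases')

# A left join keeps every flight month; months without COVID data get NaN
combined = flights.merge(cases.rename_axis('Periods').reset_index(), on='Periods', how='left')
print(combined)
```

A joined table like this makes it possible to compute, for example, a correlation between reported cases and flights, in addition to the visual comparison.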

In [95]:
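# Select the right columns and rows from the flight data
# (by name, which is more robust than selecting by column position)
columns_schiphol = ['Airports', 'Periods', 'Total Flights', 'Total Passengers [pax]',
                    'Total Cargo [tons]', 'Total Mail [tons]']
data_schiphol = data_total.loc[data_total['Airports'] == 'Schiphol', columns_schiphol].copy()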

# Data has to be transformed into right format
data_schiphol['Total Flights'] = pd.to_numeric(data_schiphol['Total Flights'], errors='coerce')
data_schiphol['Total Passengers [pax]'] = pd.to_numeric(data_schiphol['Total Passengers [pax]'], errors='coerce')
data_schiphol['Total Cargo [tons]'] = pd.to_numeric(data_schiphol['Total Cargo [tons]'], errors='coerce')
data_schiphol['Total Mail [tons]'] = pd.to_numeric(data_schiphol['Total Mail [tons]'], errors='coerce')

# Make a graph about the total number of flights and amount of movements of every type
figschiphol = px.line(data_schiphol, 
                x = 'Periods', 
                y = ["Total Flights",
                     "Total Passengers [pax]",
                     "Total Cargo [tons]",
                     "Total Mail [tons]"],
                color = "variable",
                symbol = "variable",
                log_y = True,   # Set to logarithmic scale to improve readability of the graph
                labels = {
                    "Periods": "Period",
                    "value": "Amount in [flights], [pax] or [tons]",
                    "variable": "Type of movement"
                    }
                )

# Add the covidgraph into the schipholgraph
for trace in figcovid.data:
    figschiphol.add_trace(trace)

# Adjust the layout of the graph
figschiphol.update_layout(
                    title = {'text': 'Total number of flights and movements per type at Schiphol, combined with registered COVID19 cases'}
                    )

# Show graph
figschiphol.show()

--PUT THE MAP GRAPH HERE--

<p>ANALYSE, include:

1. Big drop during the first lockdown
2. At first, 10 cases were considered a lot, while later 5,000 cases were not considered many, so the impact on flying became smaller
3. Cargo remains more or less stable, much less COVID impact
4. The total number of flights and the number of passengers drop sharply, but cargo and mail are both less affected
5. Flights and passengers are still not back at pre-COVID levels
6. Mail is still slowly decreasing
7. Something about the map graph </p>
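
Several of these points compare a month with the pre-COVID situation. One possible way to quantify this is to index every month against the same month in the 2019 baseline from the scope. The sketch below does this for a made-up table; the column name follows the flight data, the values are hypothetical.

```python
import pandas as pd

# Toy data: the same calendar month in three different years (values are made up)
toy = pd.DataFrame({
    'Periods': ['2019MM04', '2020MM04', '2021MM04'],
    'Total Passengers [pax]': [100, 5, 40],
})

# Split the period code into year and month
toy['Year'] = toy['Periods'].str[:4].astype(int)
toy['Month'] = toy['Periods'].str[-2:].astype(int)

# Baseline: the 2019 value of each calendar month
baseline = toy[toy['Year'] == 2019].set_index('Month')['Total Passengers [pax]']

# Index every month against the same month in 2019 (2019 = 100)
toy['Index (2019 = 100)'] = toy['Total Passengers [pax]'] * 100 / toy['Month'].map(baseline)
print(toy[['Periods', 'Index (2019 = 100)']])
```

Comparing with the same month of 2019 rather than with the 2019 average avoids mixing the COVID effect with the normal seasonal pattern.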

<p>In the next chapters, the three different types of movements (passengers, cargo and mail) will be analysed more thoroughly.</p>

### Chapter 2: Passenger flight patterns

<p>In this chapter, the passenger data from Schiphol will be analysed and the subquestions will be answered.</p>

In [96]:
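# Select the right data from the dataset
data_passengers = data.loc[data['Periods'].str.contains('2019MM|2020MM|2021MM|2022MM|2023MM', case=False, regex=True)]
data_passengers = data_passengers[data_passengers['Airports'] == 'Schiphol'].copy()

# Convert the passenger columns to numbers, in case they were read as text
# (the raw file uses '.' for missing values); otherwise the y-axis is treated as categories
passenger_columns = ["Total Arrived Passengers", "Total Departed Passengers"]
data_passengers[passenger_columns] = data_passengers[passenger_columns].apply(pd.to_numeric, errors='coerce')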

# Make a graph about the total number and type of passengers at Schiphol between 2019 and 2023
figpassengers = px.line(data_passengers, 
                x = "Periods", 
                y = ["Total Arrived Passengers", "Total Departed Passengers"],
                labels = {
                    "Periods": "Period",
                    "value": "Amount of passengers",
                    "variable": "Arrivals/Departures"
                    },
                )

# Adjust the layout and axes of the graph
figpassengers.update_layout(
                title={'text': 'Total amount of arrived and departed passengers at Schiphol between 2019 and 2023'},
                )

figpassengers.update_yaxes(categoryorder = 'category ascending')

# Show graph
figpassengers.show()
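
The `categoryorder` setting above only has an effect when the y-values are treated as categories, which happens when a column holds text instead of numbers. The short sketch below shows, on made-up values, why sorting text does not give a numeric order and how `pd.to_numeric` avoids the problem. It assumes the columns contain text because the raw file uses `.` for missing values.

```python
import pandas as pd

# Made-up values as they might appear when a column is read as text
raw = pd.Series(['900', '1000', '.'])

# Text is sorted character by character, so '1000' comes before '900'
print(sorted(raw))

# Converting to numbers gives a real numeric order; '.' becomes NaN
numeric = pd.to_numeric(raw, errors='coerce')
print(numeric.sort_values().tolist())
```

With numeric columns, plotly uses a continuous y-axis and no category ordering is needed.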

--PUT THE MAP GRAPH HERE--

#### Analysis

##### *What did flight patterns look like before the COVID19 pandemic started?*
<p>write here</p>

##### *How did flight patterns change during the COVID19 pandemic?*
<p>write here</p>

##### *What do flight patterns look like nowadays?*
<p>write here</p>

### Chapter 3: Cargo flight patterns

<p>In this chapter, the cargo data from Schiphol will be analysed and the subquestions will be answered.</p>

In [133]:
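# Select the right data from the dataset
data_cargo = data.loc[data['Periods'].str.contains('2019MM|2020MM|2021MM|2022MM|2023MM', case=False, regex=True)]
data_cargo = data_cargo[data_cargo['Airports'] == 'Schiphol'].copy()

# Convert the cargo columns to numbers, in case they were read as text
# (the raw file uses '.' for missing values); otherwise the y-axis is treated as categories
cargo_columns = ["Total Unloaded Cargo [tons]", "Total Loaded Cargo [tons]"]
data_cargo[cargo_columns] = data_cargo[cargo_columns].apply(pd.to_numeric, errors='coerce')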

# Make a graph about the total number and type of cargo [tons] at Schiphol between 2019 and 2023
figcargo = px.line(data_cargo, 
                x = "Periods", 
                y = ["Total Unloaded Cargo [tons]", "Total Loaded Cargo [tons]"],
                labels = {
                    "Periods": "Period",
                    "value": "Amount of cargo in [tons]",
                    "variable": "Type of cargo"
                    },
                )

# Adjust the layout and axes of the graph
figcargo.update_layout(
                title={'text': 'Total amount of (un)loaded cargo [tons] at Schiphol between 2019 and 2023'},
                )

figcargo.update_yaxes(categoryorder = 'category ascending')

# Show graph
figcargo.show()
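
To compare whole years, the monthly values could be summed per year. The sketch below shows one way to do this on a made-up table and also counts the months per year, so that incomplete years can be recognised. The column name follows the flight data; the values are hypothetical.

```python
import pandas as pd

# Toy monthly data (values are made up); the last year has only one month
toy = pd.DataFrame({
    'Periods': ['2019MM01', '2019MM02', '2020MM01', '2020MM02', '2023MM01'],
    'Total Loaded Cargo [tons]': [10, 12, 9, 11, 8],
})

# The first four characters of the period code are the year
year = toy['Periods'].str[:4]

# Sum per year, plus the number of months that went into each sum
yearly = toy.groupby(year)['Total Loaded Cargo [tons]'].agg(['sum', 'count'])
print(yearly)
```

A year with fewer than 12 months in the `count` column should not be compared directly with the full 2019 baseline.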

--PUT THE MAP GRAPH HERE--

#### Analysis

##### *What did flight patterns look like before the COVID19 pandemic started?*
<p>write here</p>

##### *How did flight patterns change during the COVID19 pandemic?*
<p>write here</p>

##### *What do flight patterns look like nowadays?*
<p>write here</p>

### Chapter 4: Mail flight patterns

<p>In this chapter, the mail data from Schiphol will be analysed and the subquestions will be answered.</p>

In [131]:
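# Select the right data from the dataset
data_mail = data.loc[data['Periods'].str.contains('2019MM|2020MM|2021MM|2022MM|2023MM', case=False, regex=True)]
data_mail = data_mail[data_mail['Airports'] == 'Schiphol'].copy()

# Convert the mail columns to numbers, in case they were read as text
# (the raw file uses '.' for missing values); otherwise the y-axis is treated as categories
mail_columns = ["Total Unloaded Mail [tons]", "Total Loaded Mail [tons]"]
data_mail[mail_columns] = data_mail[mail_columns].apply(pd.to_numeric, errors='coerce')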

# Make a graph about the total number and type of mail [tons] at Schiphol between 2019 and 2023
figmail = px.line(data_mail, 
                x = "Periods", 
                y = ["Total Unloaded Mail [tons]", "Total Loaded Mail [tons]"],
                labels = {
                    "Periods": "Period",
                    "value": "Amount of mail in [tons]",
                    "variable": "Type of mail"
                    },
                # log_y = True,
                )

# Adjust the layout and axes of the graph
figmail.update_layout(
                title={'text': 'Total amount of (un)loaded mail [tons] at Schiphol between 2019 and 2023'},
                )

figmail.update_yaxes(categoryorder = 'category ascending'
                     )

# Show graph
figmail.show()
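
Monthly mail volumes can fluctuate, which makes a slow trend hard to see. A rolling average is one possible way to smooth the line. The sketch below applies a three-month rolling mean to a made-up series; the window size and all values are assumptions for illustration.

```python
import pandas as pd

# Toy monthly series (values are made up)
toy = pd.Series(
    [10, 8, 9, 7, 8, 6],
    index=['2023MM01', '2023MM02', '2023MM03', '2023MM04', '2023MM05', '2023MM06'],
    name='Total Loaded Mail [tons]',
)

# Average of the current month and the two months before it;
# the first two months have no complete window and become NaN
smoothed = toy.rolling(window=3).mean()
print(smoothed)
```

A longer window, such as 12 months, would also remove the seasonal pattern, at the cost of losing more months at the start of the series.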

--PUT THE MAP GRAPH HERE--

#### Analysis

##### *What did flight patterns look like before the COVID19 pandemic started?*
<p>write here</p>

##### *How did flight patterns change during the COVID19 pandemic?*
<p>write here</p>

##### *What do flight patterns look like nowadays?*
<p>write here</p>

### Chapter 5: Conclusion

<p>WRITE OVERALL CONCLUSION</p>

### Chapter 6: Recommendations

<p>WRITE RECOMMENDATIONS</p>