# Project Group - 5

Members: Ahmad Nabil Maulana, Daan Michel, Gijs Aben, Philine Cremers

Student numbers: 5943442, 4684559, 4713656, 5036534

# Research Objective

*Requires data modeling and quantitative research in Transport, Infrastructure & Logistics*

The research objective is to analyse the impact of Gross Domestic Product (GDP) per capita on the number of air traffic movements within a country.
**(We would like feedback on whether our research objective is defined clearly and specifically enough. We would also like to know whether you foresee any problems with the data sets we plan to use, especially the air traffic data. Finally, we would welcome suggestions on data visualization: the project seems fairly straightforward to us, so advice on how to approach it would help.)**

# Contribution Statement

*Be specific. Some of the tasks can be coding (expect everyone to do this), background research, conceptualisation, visualisation, data analysis, data modelling*

**Author 1**: Ahmad Nabil Maulana: creating the repository

**Author 2**: Daan Michel

**Author 3**: Philine Cremers

**Author 4**: Gijs Aben

# Background and Context

For this project, we will analyze the relationship between a country's GDP per capita and its air traffic mobility. GDP measures the monetary value of final goods and services produced within a country's borders during a specific period, such as a quarter or a year; GDP per capita divides that value by the country's population. We aim to determine whether countries with a higher GDP per capita exhibit greater air traffic mobility than those with a lower GDP per capita, or whether there is no discernible correlation.
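As a rough illustration of how such a correlation could be checked, the sketch below compares GDP per capita with departures on a log scale and also computes a rank-based (Spearman) correlation. The values in `toy` are hypothetical placeholders, not real statistics, and the column names are assumptions made for this example only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy values, NOT real statistics: one row per country.
toy = pd.DataFrame(
    {
        "country": ["A", "B", "C", "D", "E"],
        "gdp_per_capita": [1_000.0, 5_000.0, 12_000.0, 40_000.0, 65_000.0],
        "departures": [2_000.0, 15_000.0, 9_000.0, 300_000.0, 450_000.0],
    }
)

# Both variables span several orders of magnitude, so compare them on a log scale.
log_gdp = np.log10(toy["gdp_per_capita"])
log_dep = np.log10(toy["departures"])

# Pearson correlation of the logged values (linear association).
pearson_r = log_gdp.corr(log_dep)

# Spearman correlation, computed as Pearson on ranks (monotonic association).
spearman_rho = toy["gdp_per_capita"].rank().corr(toy["departures"].rank())

print(f"Pearson r (log-log): {pearson_r:.2f}")
print(f"Spearman rho: {spearman_rho:.2f}")
```

A rank-based measure is included because it does not assume a linear relationship, which may matter if a few very large countries dominate the data.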

For the flight data, we found two data sets, each with its own pros and cons. \
The data sets from OpenSky provide information about all aircraft movements, so we can use all air traffic movements to and from every airport in a country. We will filter the data to keep only commercial and cargo aircraft; private aircraft will not be considered. We will also use the origin country of each departing flight, which gives an overview of the number of takeoffs per country. This should be a good indicator of the relative number of air traffic movements per country when compared with GDP. However, the amount of data to process is very large and could cause problems.\
The second data set is from The World Bank: 'Air transport, registered carrier departures worldwide'. It describes the number of takeoffs on scheduled services (i.e. planned commercial flights, both passenger and cargo) by the country in which the aircraft is registered. It gives takeoffs per country per year, which should keep processing simple and make an overview across multiple years feasible. However, this data might not fully match our research objective, since it does not describe the number of aircraft movements per country: an aircraft can be registered in a country where it neither departs nor arrives.
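One way to keep the OpenSky volume manageable could be to read the file in chunks and accumulate takeoff counts per origin country, so the full data never has to sit in memory. The sketch below is a minimal illustration: the in-memory CSV and the column names `origin_country` and `category` are assumptions for this example, not the actual OpenSky schema.

```python
import io
from collections import Counter

import pandas as pd

# Hypothetical stand-in for a large flight file; a real export would be read from disk.
# The column names are assumptions, not the actual OpenSky schema.
raw = io.StringIO(
    "flight_id,origin_country,category\n"
    "1,NLD,commercial\n"
    "2,NLD,private\n"
    "3,DEU,cargo\n"
    "4,NLD,cargo\n"
    "5,DEU,commercial\n"
    "6,FRA,private\n"
)

KEEP = {"commercial", "cargo"}  # private flights are excluded
totals = Counter()

# Read a few rows at a time so the full file never has to fit in memory.
for chunk in pd.read_csv(raw, chunksize=2):
    kept = chunk[chunk["category"].isin(KEEP)]
    totals.update(kept["origin_country"].value_counts().to_dict())

# Takeoffs per origin country, sorted by country code.
takeoffs = pd.Series(totals, dtype="int64").sort_index()
print(takeoffs)
```

With real data, `chunksize` would be set much larger (for example hundreds of thousands of rows); the tiny value here only demonstrates that counts accumulate correctly across chunks.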

As the first step, we need to import the necessary libraries.

In [3]:
# Imports for this notebook; the list can be adjusted as the project develops.
# Standard library
import datetime
import math
import os
from pathlib import Path

# Third-party
import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import scipy
from plotly.subplots import make_subplots
from scipy.signal import find_peaks

# Render matplotlib figures inline in the notebook
%matplotlib inline

# Part I - Data Import

First, we import and combine the dataframes for the two data sets we will use:

* Countries' GDP data from The World Bank.
* Countries' air traffic mobility data from OpenSky or The World Bank.

In [4]:
# Import the GDP data file (quarterly GDP per US state, in millions of chained 2012 dollars)
file_path_GDP = 'https://raw.githubusercontent.com/anmaulana1/big-project/main/Data/GDP-US/GDP_per_state_all_states.csv'
df_GDP = pd.read_csv(file_path_GDP)

df_GDP.head()

| Unnamed: 0 | GeoFIPS | GeoName | Region | TableName | LineCode | IndustryClassification | Description | Unit | 2005:Q1 | 2005:Q2 | ... | 2020:Q4 | 2021:Q1 | 2021:Q2 | 2021:Q3 | 2021:Q4 | 2022:Q1 | 2022:Q2 | 2022:Q3 | 2022:Q4 | 2023:Q1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | "01000" | Alabama | 5 | SQGDP9 | 1 | ... | All industry total | Millions of chained 2012 dollars | 182600.4 | 184625.7 | ... | 204400.7 | 207001.8 | 209857.1 | 210029.1 | 213029.2 | 212789.5 | 212311.6 | 212946.3 | 215011.6 | 215084.4 |
| 1 | "02000" | Alaska | 8 | SQGDP9 | 1 | ... | All industry total | Millions of chained 2012 dollars | 45176.1 | 45776.9 | ... | 51069.4 | 50690.3 | 50707.9 | 50935.9 | 51143.5 | 49070.4 | 48963.4 | 49999.1 | 50501.7 | 50700.8 |
| 2 | "04000" | Arizona | 6 | SQGDP9 | 1 | ... | All industry total | Millions of chained 2012 dollars | 256521.4 | 260884.6 | ... | 338182.1 | 339659.7 | 344936.2 | 348706.0 | 357322.0 | 355309.3 | 353565.6 | 356966.0 | 359827.3 | 362191.9 |
| 3 | "05000" | Arkansas | 5 | SQGDP9 | 1 | ... | All industry total | Millions of chained 2012 dollars | 104693.3 | 105271.5 | ... | 119689.9 | 121860.2 | 123015.2 | 123676.9 | 124837.0 | 126803.5 | 125830.9 | 126248.3 | 127245.8 | 127312.2 |
| 4 | "06000" | California | 8 | SQGDP9 | 1 | ... | All industry total | Millions of chained 2012 dollars | 1896451.4 | 1913819.0 | ... | 2746311.9 | 2802569.5 | 2858504.7 | 2894880.5 | 2942968.5 | 2870410.5 | 2866766.2 | 2893948.3 | 2911384.3 | 2919913.3 |


In [8]:
# import the file for the air mobility data
file_path_registered_carrier = 'https://raw.githubusercontent.com/anmaulana1/big-project/main/air_mobility_worldbank.csv'
df_air_mobility = pd.read_csv(file_path_registered_carrier, on_bad_lines='skip')
df_air_mobility.head(15)

| Unnamed: 0 | Country Name | Country Code | Indicator Name | Indicator Code | 1960 | 1961 | 1962 | 1963 | 1964 | 1965 | ... | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Aruba | ABW | Air transport, registered carrier departures w... | IS.AIR.DPRT |  |  |  |  |  |  | ... |  |  |  |  | 2132.0 | 2276.0 |  |  |  |  |
| 1 | Africa Eastern and Southern | AFE | Air transport, registered carrier departures w... | IS.AIR.DPRT |  |  |  |  |  |  | ... | 548834.8 | 534810.3 | 556341.0 | 562927.0 | 630147.0 | 705127.5 | 717795.3 | 286064.1974 | 399895.848 |  |
| 2 | Afghanistan | AFG | Air transport, registered carrier departures w... | IS.AIR.DPRT |  |  |  |  |  |  | ... | 21696.0 | 25920.0 | 23532.0 | 22770.0 | 24207.0 | 10454.0 | 7334.0 | 4635.714 | 2865.737 |  |
| 3 | Africa Western and Central | AFW | Air transport, registered carrier departures w... | IS.AIR.DPRT |  |  |  |  |  |  | ... | 155038.0 | 145526.8 | 164614.0 | 157788.0 | 151203.0 | 157126.7 | 158874.8 | 92611.349 | 134532.013 |  |
| 4 | Angola | AGO | Air transport, registered carrier departures w... | IS.AIR.DPRT |  |  |  |  |  |  | ... | 14496.0 | 13716.0 | 13116.0 | 15482.0 | 13494.0 | 13978.0 | 13647.0 | 3792.0 | 3805.0 |  |
| 5 | Albania | ALB | Air transport, registered carrier departures w... | IS.AIR.DPRT |  |  |  |  |  |  | ... | 11196.0 | 1992.0 |  | 306.0 | 1904.0 | 2935.0 | 2558.0 | 1274.0 | 1471.0 |  |
| 6 | Andorra | AND | Air transport, registered carrier departures w... | IS.AIR.DPRT |  |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| 7 | Arab World | ARB | Air transport, registered carrier departures w... | IS.AIR.DPRT |  |  |  |  |  |  | ... | 1222184.0 | 1289639.0 | 1375862.0 | 1476812.0 | 1503553.38 | 1539593.0 | 1599362.0 | 664879.575 | 920600.238 |  |
| 8 | United Arab Emirates | ARE | Air transport, registered carrier departures w... | IS.AIR.DPRT |  |  |  |  |  |  | ... | 327076.0 | 352224.0 | 437638.0 | 463947.0 | 459137.0 | 455956.0 | 426157.3 | 185260.0 | 234612.0 |  |
| 9 | Argentina | ARG | Air transport, registered carrier departures w... | IS.AIR.DPRT |  |  |  |  |  |  | ... | 145136.0 | 134521.0 | 145585.0 | 149334.0 | 146631.0 | 161862.0 | 163106.0 | 27447.0 | 54218.0 |  |
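The preview mixes individual countries with regional aggregates (for example `AFE`, `AFW` and `ARB`) and stores one column per year. A possible preparation step, sketched below on a small hand-copied excerpt of the preview, is to drop the aggregates and reshape the table to one row per country and year. The `AGGREGATE_CODES` set is an illustrative assumption and would need to be completed for the full file.

```python
import pandas as pd

# Small excerpt copied from the preview (three rows, three year columns).
wide = pd.DataFrame(
    {
        "Country Name": ["Afghanistan", "Angola", "Arab World"],
        "Country Code": ["AFG", "AGO", "ARB"],
        "2019": [7334.0, 13647.0, 1599362.0],
        "2020": [4635.714, 3792.0, 664879.575],
        "2022": [float("nan")] * 3,  # no values reported yet in the preview
    }
)

# Assumption: codes of regional aggregates to exclude; this set is illustrative
# and incomplete for the full World Bank file.
AGGREGATE_CODES = {"AFE", "AFW", "ARB"}

countries = wide[~wide["Country Code"].isin(AGGREGATE_CODES)]

# Reshape from one column per year to one row per country-year.
long = countries.melt(
    id_vars=["Country Name", "Country Code"],
    var_name="year",
    value_name="departures",
)
long["year"] = long["year"].astype(int)

# Drop country-years without a reported value.
long = long.dropna(subset=["departures"]).reset_index(drop=True)

print(long)
```

The long format makes it easier to join the departures with GDP values on country code and year later on.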


# Part II - Data Processing