<h1>Preprocessing - Design of the Viseu Aerodrome Runway</h1>
<br> In this file you will find information about databases, data preparation, and data processing.

<h2>Basic info</h2>

© Tiago Tamagusko (tamagusko@gmail.com) - <a href="https://tamagusko.github.io">https://tamagusko.github.io</a>
<br> Version: 0.6 (2020/03/13)
<br> Using: Jupyter Notebook 6.0.2, Python 3.8, Linux 4.19.88-1-MANJARO x64, UTF-8.
<br> Required libraries: pandas 0.25.3, sys 3.8.0, geopy 1.20.0.
<br> Project Page: <a href="https://github.com/tamagusko/ViseuAirportStudy">github.com/tamagusko/ViseuAirportStudy</a>
<br> Licence: Apache-2.0, see <a href="https://github.com/tamagusko/ViseuAirportStudy/blob/master/LICENSE.md">LICENSE.md</a> for more details.


<h2>Problem</h2>

Viseu Aerodrome is interested in serving larger aircraft, which offer longer range and carry more passengers [1]. This requires strengthening the pavement structure and lengthening the runway.

<h2>Proposal</h2>

Designing an airport pavement requires two main characteristics: the Pavement Classification Number¹ (PCN) and the length of the runway.

Thus, the proposed challenge is to design a pavement with sufficient capacity (PCN) and length to serve a desired group of routes.

<h2>Project data structure</h2>
    
    ├── 1-preprocessing.ipynb    # Database preprocessing
    ├── 2-analysis.ipynb               # Data analysis
    ├── data                  
    │          ├── raw                       # Raw data
    │          ├── processed           # Processed data
    ├── reports                              # Outputs
    
<h2>Input data</h2>

Location: <a href="https://github.com/tamagusko/ViseuAirportStudy/tree/master/data/processed">github.com/tamagusko/ViseuAirportStudy/tree/master/data/processed</a>

1. Aircraft database
   <br>a. **Aircraft**: Aircraft identification;
   <br>b. **ACN**²: Aircraft Classification Number;
   <br>c. **RequiredExtension**: Runway length required for landing and takeoff operations;
   <br>d. **Autonomy**: Aircraft range (autonomy);
   <br>e. **Passagers**³: Number of passengers carried.
   <br> **Updated: 25/12/2019**

2. Airport database [2] 
   <br>a. **Airport ID**: 	Unique identifier for this airport;
   <br>b. **Airport**: Name of airport;
   <br>c. **City**:  Main city served by airport;
   <br>d. **Country**: 	Country or territory where airport is located;
   <br>e. **IATA**:  3-letter IATA code - **Unused data**;
   <br>f. **ICAO**:  4-letter ICAO code;
   <br>g. **Latitude**: 	Latitude in decimal degrees. Negative is South, positive is North;
   <br>h. **Longitude** :	Longitude in decimal degrees. Negative is West, positive is East;
   <br>i. **Altitude**: 	Altitude in feet - **Unused data**;
   <br>j. **Timezone**: 	Hours offset from UTC  - **Unused data**;
   <br>k. **DST**: 	Daylight savings time. One of E (Europe), A (US/Canada), S (South America), O (Australia), Z (New Zealand), N (None) or U (Unknown)  - **Unused data**;
   <br>l. **Tz**: Timezone in "tz" (Olson) format, e.g. "America/Los_Angeles" - **Unused data**;
   <br>m. **Type**: 	Type of the airport - **Unused data**;
   <br>n. **Source**: 	Source of this data - **Unused data**.
   <br> **Updated: 23/12/2019**
3. Airport Database⁴ with extra data [3] 
   <br>a. **ICAO**:  4-letter ICAO code;
   <br>b. **RWYLenght**:  Runway length in meters;
   <br>c. **PCN**:  Runway structural strength number.
   <br> **Updated: 28/12/2019**
   
The data is UTF-8 encoded.

If you want to change the list of **aircraft** under study, edit the file: **Aircraft database (AircraftData.csv)**
<br>If you would like to change the list of **airports** under study, edit the file: **Airport extra database (AirportExtraData.csv)**
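
After editing the aircraft file by hand, a quick structural check can catch a broken header or a non-numeric value before the notebook is rerun. The sketch below is only an illustration: the function name `check_aircraft_csv` is hypothetical, and the expected column order is assumed from the dataframe printed later in this notebook.

```python
import csv
import io

# Column names assumed from the aircrafts dataframe shown in this notebook
EXPECTED_COLUMNS = ["Aircraft", "ACN", "RequiredExtension", "Autonomy", "Passagers"]


def check_aircraft_csv(text):
    """Return a list of problems found in the CSV text (empty list = no problems)."""
    reader = csv.DictReader(io.StringIO(text))
    problems = []
    if reader.fieldnames != EXPECTED_COLUMNS:
        problems.append(f"unexpected header: {reader.fieldnames}")
        return problems
    # Data rows start on line 2, right after the header
    for line_no, row in enumerate(reader, start=2):
        for column in EXPECTED_COLUMNS[1:]:
            try:
                float(row[column])
            except (TypeError, ValueError):
                problems.append(f"line {line_no}: {column} is not numeric")
    return problems


# Two rows copied from the aircraft table in this notebook
sample = (
    "Aircraft,ACN,RequiredExtension,Autonomy,Passagers\n"
    "Dornier_228,6.4,792,1037,19\n"
    "ATR42-600,18.6,1165,1326,48\n"
)
print(check_aircraft_csv(sample))  # → []
```

To check the real file, its content could be read with `open("data/processed/AircraftData.csv", encoding="utf-8").read()` and passed to the function.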

Notes: 

**¹** Indicates, among other elements, the structural strength of the pavement. <br>
**²** It is related to the impact of the aircraft on the pavement, and must be less than or equal to the runway PCN. <br>
**³** Information not required. <br>
**⁴** Schengen area airports only (lower airport security requirements).
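
Note ² can be read as a simple compatibility rule between an aircraft and a runway. The sketch below is a minimal illustration, not the method used in the analysis notebook: the function name `can_operate` and the runway values (PCN 30, length 1500) are hypothetical, and the required runway length is assumed to be in the same unit as the runway length.

```python
def can_operate(acn, required_length, pcn, runway_length):
    """Return True if the aircraft fits the runway.

    Two conditions, both taken from the column definitions above:
    - the aircraft ACN must be less than or equal to the runway PCN;
    - the runway must be at least as long as the aircraft requires.
    """
    return acn <= pcn and required_length <= runway_length


# Hypothetical runway used only for illustration
runway_pcn = 30
runway_length = 1500

# Aircraft values (ACN, RequiredExtension) copied from the aircraft table
print(can_operate(23.0, 1175, runway_pcn, runway_length))  # ATR72-600 → True
print(can_operate(40.4, 2244, runway_pcn, runway_length))  # E175 → False
```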


<h1>1. Data Preparation</h1>

In [1]:
# Import needed libraries

import pandas as pd
import sys
!{sys.executable} -m pip install geopy
from geopy import distance

# Reference airport, change this value (RefAirportICAO) to choose another airport
RefAirportICAO = "LPVZ"



In [2]:
# Reading data

# Database developed by the author
aircrafts = pd.read_csv("data/processed/AircraftData.csv", sep=",", index_col ="Aircraft")
# Database adapted from: https://openflights.org/data.html
airports = pd.read_csv("data/processed/AirportData.csv", sep=",", index_col ="ICAO")
# Database developed with data from https://worldaerodata.com
airportsExtra = pd.read_csv("data/processed/AirportExtraData.csv", sep=",", index_col="ICAO")

In [3]:
# Resulting dataframe - aircrafts: Aircraft,ACN,RequiredExtension,Autonomy,Passagers
aircrafts  # Printing the data to test

Aircraft,ACN,RequiredExtension,Autonomy,Passagers
Dornier_228,6.4,792,1037,19
DCH-8_Q200,16.5,1000,2084,40
ATR42-600,18.6,1165,1326,48
ERJ140,21.1,1850,3058,44
ATR72-600,23.0,1175,1528,72
CRJ200,23.0,1768,2500,50
DCH-8_Q400,30.5,1425,2040,82
CRJ700,34.0,1605,2553,70
E170,38.6,1644,3982,72
E175,40.4,2244,4074,80


In [4]:
# Importing values from AirportExtraData.csv and loading them into the airports dataframe
for index, row in airportsExtra.iterrows():
    # Runway length and PCN of the current airport, read directly from the row
    toImport = (float(row["RWYLenght"]), float(row["PCN"]))
    # Loading data in airports dataframe (only rows with a matching ICAO code)
    airports.loc[airports.index == index, ["RWYLenght", "PCN"]] = toImport

# Dropping the unwanted data in airports dataframe
airports = airports.drop(
    columns=["ID", "IATA", "Timezone", "DST", "TZ", "Type", "Source"])
# if RWYLenght or PCN are 0: delete row
airports = airports.drop(
    airports[(airports.RWYLenght == 0) | (airports.PCN == 0)].index)

# NOTE: Airports with incomplete or questionable data:
# LIRF/Roma - no values for PCN (29/12/2019)
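
The row-by-row import above can also be expressed as an index join. The sketch below is an alternative under stated assumptions, not the notebook's own code: the function name `merge_runway_data` is hypothetical, the miniature tables are made up for illustration (the "ZZZZ" row does not exist), and the airports table is assumed not to contain the `RWYLenght` and `PCN` columns yet.

```python
import pandas as pd


def merge_runway_data(airports, airports_extra, unused=("IATA",)):
    """Join runway length and PCN onto airports by ICAO index, then clean up."""
    # Drop the columns that are not needed for the study
    merged = airports.drop(columns=list(unused))
    # Left join keeps every airport; unmatched ones get NaN
    merged = merged.join(airports_extra[["RWYLenght", "PCN"]], how="left")
    # Remove rows where runway length or PCN is 0 (treated as missing data)
    keep = (merged["RWYLenght"] != 0) & (merged["PCN"] != 0)
    return merged[keep]


# Miniature tables: EBBR and EDDK values come from this notebook, ZZZZ is invented
airports_demo = pd.DataFrame(
    {"Airport": ["Brussels Airport", "Cologne Bonn Airport", "Hypothetical Airport"],
     "IATA": ["BRU", "CGN", "ZZZ"]},
    index=pd.Index(["EBBR", "EDDK", "ZZZZ"], name="ICAO"),
)
extra_demo = pd.DataFrame(
    {"RWYLenght": [3638.0, 3815.0, 2000.0], "PCN": [80.0, 75.0, 0.0]},
    index=pd.Index(["EBBR", "EDDK", "ZZZZ"], name="ICAO"),
)

result = merge_runway_data(airports_demo, extra_demo)
print(result)
```

The join avoids the explicit Python loop, which tends to matter more as the number of airports grows. The ZZZZ row is removed because its PCN is 0.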

In [5]:
airports.shape  # Printing the number of rows and columns in data

(118, 9)

In [6]:
# Resulting dataframe - airports:  ICAO,Airport,City,Country,Latitude,Longitude,Altitude,DistanceToRef(Null),RWYLenght,PCN
airports.head(5)  # Printing data to test

ICAO,Airport,City,Country,Latitude,Longitude,Altitude,DistanceToRef,RWYLenght,PCN
BIKF,Keflavik International Airport,Keflavik,Iceland,63.985000610352,-22.6056,171.0,0,3065.0,80.0
EBBR,Brussels Airport,Brussels,Belgium,50.9014015198,4.48444,184.0,0,3638.0,80.0
EBCI,Brussels South Charleroi Airport,Charleroi,Belgium,50.459202,4.45382,614.0,0,2550.0,64.0
EDDB,Berlin-Schönefeld Airport,Berlin,Germany,52.380001,13.5225,157.0,0,3000.0,140.0
EDDK,Cologne Bonn Airport,Cologne,Germany,50.8658981323,7.14274,302.0,0,3815.0,75.0


<h1>2. Data processing</h1>

The distance (**DistanceToRef**), in kilometers, from each airport to the reference airport (LPVZ) is calculated in this step.
<br><br>Current Dataframe (airports): ICAO, Airport, City, Country, Latitude, Longitude, Altitude, RWYLenght, PCN.
<br>Required dataframe (airports): ICAO, Airport, City, Country, Latitude, Longitude, Altitude, **DistanceToRef**, RWYLenght, PCN.
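
As a rough cross-check of the distances, the great-circle (haversine) formula can be computed with the standard library alone. This is a sketch of a different, simpler technique than the geodesic calculation used in this notebook: it treats the Earth as a sphere with an assumed mean radius of 6371 km, so results may differ slightly from the ellipsoidal values. The function name `haversine_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Coordinates copied from the airports table (BIKF and EBBR rows)
keflavik = (63.985000610352, -22.6056)
brussels = (50.9014015198, 4.48444)

print(round(haversine_km(*keflavik, *brussels), 1))
```

Swapping in the coordinates of the reference airport would give values that can be compared with the **DistanceToRef** column.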

In [7]:
# Saving the coordinates (latitude, longitude) of the reference airport
RefAirport = (float(airports.loc[RefAirportICAO, "Latitude"]),
              float(airports.loc[RefAirportICAO, "Longitude"]))

# Calculating distance from the reference airport (LPVZ by default)
for index, row in airports.iterrows():
    # Altitude is left out: geodesic distance is measured on the ellipsoid surface,
    # and newer geopy releases reject points whose altitudes differ
    toAirport = (float(row["Latitude"]), float(row["Longitude"]))
    # Calculating geodesic distance in kilometers
    geodesicDistance = distance.geodesic(RefAirport, toAirport).km
    # Loading results into dataframe
    airports.loc[index, "DistanceToRef"] = round(geodesicDistance, 1)

# Resulting dataframe ICAO, Airport, City, Country, Latitude, Longitude, Altitude, DistanceToRef, RWYLenght, PCN.

In [8]:
airports.head(5)  # Printing data to test

ICAO,Airport,City,Country,Latitude,Longitude,Altitude,DistanceToRef,RWYLenght,PCN
BIKF,Keflavik International Airport,Keflavik,Iceland,63.985000610352,-22.6056,171.0,2759.7,3065.0,80.0
EBBR,Brussels Airport,Brussels,Belgium,50.9014015198,4.48444,184.0,1480.6,3638.0,80.0
EBCI,Brussels South Charleroi Airport,Charleroi,Belgium,50.459202,4.45382,614.0,1444.6,2550.0,64.0
EDDB,Berlin-Schönefeld Airport,Berlin,Germany,52.380001,13.5225,157.0,2077.9,3000.0,140.0
EDDK,Cologne Bonn Airport,Cologne,Germany,50.8658981323,7.14274,302.0,1617.8,3815.0,75.0


In [9]:
# Saving the results to a csv file (AirportProcessedData.csv) in the reports folder
airports.to_csv('reports/AirportProcessedData.csv')

<h3>Go to <a href="https://github.com/tamagusko/ViseuAirportStudy/blob/master/2-analysis.ipynb">2-analysis.ipynb</a> to view the data analysis.</h3>

<h1> References</h1>

[1] Martins, J. P. F. (2018). Reflexão sobre a viabilidade e localização de uma infraestrutura aeroportuária na região Centro de Portugal (Universidade do Porto). https://doi.org/10.13140/RG.2.2.34944.69124
<br>[2] OpenFlights (2019). Airport, airline and route data. Retrieved December 23, 2019, from https://openflights.org/data.html
<br>[3] World Aero Data (2019). World Aeronautical Database. Retrieved December 28, 2019, from https://worldaerodata.com<br><br>