# Air Data 2024
This notebook explores and analyses data from the UK's Civil Aviation Authority (CAA) on UK-registered airlines. The analysis aims to answer the following research question:
> **What can explain patterns in air travel data during 2024?**
### Features and their context
The CAA dataset contains the following statistics. They are also described in the 'Dataset overview.pdf' file.
| Feature | Description | Unit |
| --- | --- | --- |
| type_of_operations | Whether the airline carries passengers or cargo. | |
| airline_name | The trading name of the airline. | |
| aircraft_km_x1000 | The number of flights performed on each stage, multiplied by the stage distance, summed over all stages. | 1000 km |
| no_flights | The number of flights that the airline performed in the time period. | |
| aircraft_hours | The total number of hours spent operating aircraft, 'block-to-block', i.e. from the moment of pushback to the moment of parking. | hours |
| total_passengers_uplifted | The total number of passengers carried by the airline in the time period. Passengers are only counted once per flight number, not duplicated on separate stages of the same journey. | |
| seat_km_available_x1000 | The number of seats available to be booked on a flight, multiplied by its stage distance, summed over all flights. | 1000 seat km |
| seat_km_x1000 | The number of seats purchased on a flight, multiplied by its stage distance, summed over all flights. | 1000 seat km |
| cargo_tonnes_uplifted | The total amount of cargo carried by the airline in the time period. | tonnes (1000kg) |
| total_tonne_km_x1000 | The total amount of revenue cargo carried on a flight, multiplied by the stage distance, summed over all flights. | 1000 tonne km |
| tot_mail_tonne_km_used_x1000 | The total tonne-kilometres of mail cargo. | 1000 tonne km |
| total_freight_tonne_km_used_x1000 | The total tonne-kilometres of freight cargo. | 1000 tonne km |
| total_passenger_tonne_km_used_x1000 | The total tonne-kilometres of passenger cargo. | 1000 tonne km |

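Several of these features are products summed over flights, so ratios between them give per-flight or per-seat quantities. The sketch below illustrates two such ratios on a hypothetical two-airline sample: the share of available seat-kilometres that were sold, and the average stage distance per flight. The sample values, the function name `add_derived_metrics`, and the derived column names `load_factor` and `avg_stage_km` are assumptions for illustration and do not come from the CAA dataset.

```python
import pandas as pd

# Hypothetical sample rows; the figures are invented for illustration only.
sample = pd.DataFrame({
    "airline_name": ["Airline A", "Airline B"],
    "seat_km_available_x1000": [1000.0, 2000.0],
    "seat_km_x1000": [800.0, 1500.0],
    "aircraft_km_x1000": [10.0, 16.0],
    "no_flights": [5, 4],
})


def add_derived_metrics(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `frame` with two illustrative ratio columns added."""
    out = frame.copy()
    # Fraction of offered seat-km that were purchased (the x1000 scale cancels).
    out["load_factor"] = out["seat_km_x1000"] / out["seat_km_available_x1000"]
    # Average stage distance in km: undo the x1000 scale, then divide by flights.
    out["avg_stage_km"] = out["aircraft_km_x1000"] * 1000 / out["no_flights"]
    return out


derived = add_derived_metrics(sample)
print(derived[["airline_name", "load_factor", "avg_stage_km"]])
```

Rows with zero flights or zero available seat-kilometres (for example, cargo-only operators) would produce `inf` or `NaN` here, so they would need filtering or masking before any comparison.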
### Initial imports & cleaning

In [12]:
# package imports
import numpy as np
import pandas as pd

In [13]:
# import dataset
FILEPATH = 'annual_data.csv'
df = pd.read_csv(FILEPATH)

# Drop unnecessary 'rundate' and 'reporting_period' columns
df = df.drop(columns=['rundate', 'reporting_period'])
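
The load-and-drop step above can also be wrapped in a small helper so it can be re-run safely. The following is a minimal sketch, not the notebook's own method: the helper name `load_and_clean` and the in-memory CSV are hypothetical stand-ins, with invented values in place of `annual_data.csv`. It uses `errors="ignore"` so the drop does not fail if one of the listed columns is absent.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'annual_data.csv'; the rows are invented.
csv_text = io.StringIO(
    "rundate,reporting_period,airline_name,no_flights\n"
    "2025-01-01,2024,Airline A,5\n"
    "2025-01-01,2024,Airline B,4\n"
)


def load_and_clean(source, drop_cols=("rundate", "reporting_period")) -> pd.DataFrame:
    """Read a CSV and drop bookkeeping columns that are not needed for analysis."""
    frame = pd.read_csv(source)
    # errors="ignore" skips any listed column that is not present in the file.
    return frame.drop(columns=list(drop_cols), errors="ignore")


cleaned = load_and_clean(csv_text)
print(cleaned.columns.tolist())  # ['airline_name', 'no_flights']
```

Passing a file path such as `FILEPATH` instead of the `StringIO` object works the same way, since `pd.read_csv` accepts both.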