# Airline Operational Performance Dashboard — Research Proposal
**Course:** TIL6022 Python Programming Q1 2025/26

**Group:** 6

**Authors:** [Toni Manjani: 5337100],  


---

## 0. Abstract
This project proposes the development of an airline operational performance dashboard using open flight data retrieved from OpenSky Network. The aim of this dashboard is to allow stakeholders to monitor historical and near-real-time flight performance, including relevant performance metrics such as flight duration, variability, outlier rates, and reactionary delays. Users will be able to view all airline routes or focus on individual routes to understand operational performance and identify areas for improvement.

---

## 1. Motivation & Relevance

Airline operational performance is a critical component of the transport system, as it directly influences efficiency, reliability, and passenger satisfaction. Managing airline operations is a complex optimization problem, mostly because a large number of resources must work together efficiently while withstanding external pressures on daily performance, chiefly the weather. Monitoring key performance metrics such as route-level flight durations, deviations from scheduled times, and reactionary delays enables airlines and airports to allocate resources effectively, optimize crew assignments, manage gates and turnaround operations, and identify bottlenecks in fleet rotations. This project is particularly relevant to the TIL domain because it applies quantitative data analysis to real-world airline operations, using publicly available data to provide actionable insights for stakeholders.


---

## 2. Research Question & Objectives

Given the problem described in the previous section, the following research question is proposed:

**Primary Research Question**  

How can open ADS-B data be transformed into a dashboard providing **operational performance insights** for an airline, allowing stakeholders to monitor historical trends and route-level reliability?


Several sub-questions will be addressed to support the primary research question. The main sub-questions at this stage are:

**Sub-questions**

1. Which routes consistently have the largest deviations from expected flight durations, and what operational patterns contribute to these deviations?  
2. How do delays propagate across consecutive flights operated by the same aircraft, and which routes or aircraft are most affected by reactionary delays?  
3. What is the relationship between flight frequency and operational reliability for high-demand routes in an airline's network?  
4. How do turnaround times influence the likelihood and magnitude of reactionary delays within the fleet?  
5. Can historical trends in flight duration deviations be used to identify routes that require operational adjustments or additional monitoring?  
6. How can aggregated flight metrics be visualized effectively in a dashboard to support operational decision-making and enhance stakeholder insights?

**Objectives**

The objectives of this project are to build a Python-based data pipeline to ingest, clean, and aggregate flight data from OpenSky and Eurocontrol, calculate operational metrics including flight duration deviations, outliers, and reactionary delays, and develop an interactive dashboard that allows stakeholders to explore both route-level and fleet-level performance. The project aims to provide actionable insights on operational efficiency and reliability while demonstrating the application of quantitative analysis within the TIL domain.

---

## 3. Scope & Constraints

**Scope**

This project focuses on a single European airline with a large fleet and destination network. This condition is typically met by the so-called 'legacy carriers': the more traditional, established airlines such as Lufthansa, KLM, British Airways, and Air France. Such airlines operate at different scales, from regional and domestic flights to intercontinental routes, which sets them apart from newer airlines that tend to focus on low-cost services (Wizz Air, Ryanair, easyJet). In addition, legacy carriers tend to include a variety of aircraft types in their fleets and mostly operate out of major hubs, which makes their operations management considerably more complex.

Secondly, the number of routes and the time period covered by the research need to be specified. The current plan is to build a dashboard for a single month of operations of the chosen airline. It is not yet known how much data is available for a single airline over a full year, so it is sensible to start with a shorter period. If feasible, the dashboard will be extended to cover a longer historical record for the chosen airline. The historical data should not go back further than 2023-24, because the COVID period brought restrictions and irregularities in the data that do not represent an airline's regular operations.

**Constraints & Simplifying Assumptions**


As noted in the motivation, airline operations are difficult to model realistically, because far more is involved than can be covered in this proposal. An airline does not operate on its own but as part of a much larger system.

For this project, scheduled departure and arrival times will be approximated using **historical median flight durations per route**. Weather conditions and other external factors are **ignored** to maintain feasibility and simplicity. Since ADS-B data may contain **coverage gaps**, incomplete flights will be excluded from the analysis. Reactionary delays will be estimated by following individual aircraft via their `icao24` transponder address*, recognizing that actual operational connections may differ from the observed sequences.
  


*The `icao24` identifier is a unique 24-bit address, usually written as six hexadecimal digits, assigned to an aircraft's transponder under the ICAO addressing scheme. Although it is sometimes loosely called a tail number in ADS-B contexts, it is distinct from the aircraft's registration. It acts as a digital fingerprint for the aircraft and stays the same regardless of route, airline callsign, or flight number. Using the `icao24` code, individual aircraft can be tracked across multiple flights, which is essential for estimating reactionary delays and understanding how a delay on one flight propagates to subsequent flights operated by the same aircraft.
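
To illustrate how flights could be chained per aircraft, the following is a minimal sketch using only the Python standard library. All records, field names (`departure`, `arrival`, `delay_min`), callsigns, and `icao24` values are made up for illustration and do not reflect OpenSky's actual schema. The 6-hour cut-off `max_gap_s` and the rule that two legs connect only when the first leg's destination equals the second leg's origin are assumptions, not settled design decisions.

```python
from collections import defaultdict

# Hypothetical flight records. Times are Unix seconds; `delay_min` is an
# assumed, precomputed arrival deviation from the route median (minutes).
flights = [
    {"icao24": "484161", "callsign": "KLM1009", "origin": "EHAM", "destination": "EGLL",
     "departure": 0, "arrival": 4200, "delay_min": 10},
    {"icao24": "484161", "callsign": "KLM1010", "origin": "EGLL", "destination": "EHAM",
     "departure": 7200, "arrival": 11000, "delay_min": 8},
    {"icao24": "484161", "callsign": "KLM1223", "origin": "EHAM", "destination": "LFPG",
     "departure": 50000, "arrival": 54000, "delay_min": 0},
    {"icao24": "4841aa", "callsign": "KLM1775", "origin": "EHAM", "destination": "EDDF",
     "departure": 3000, "arrival": 7000, "delay_min": 3},
]


def link_consecutive_legs(flights, max_gap_s=6 * 3600):
    """Pair each flight with the next flight of the same aircraft.

    Two legs are linked only if the aircraft departs from the airport it
    last arrived at and the ground time is between 0 and `max_gap_s`.
    """
    by_aircraft = defaultdict(list)
    for f in flights:
        by_aircraft[f["icao24"]].append(f)

    links = []
    for legs in by_aircraft.values():
        legs.sort(key=lambda f: f["departure"])
        for prev, nxt in zip(legs, legs[1:]):
            gap_s = nxt["departure"] - prev["arrival"]
            connects = prev["destination"] == nxt["origin"]
            if connects and 0 <= gap_s <= max_gap_s:
                links.append({
                    "icao24": prev["icao24"],
                    "from_callsign": prev["callsign"],
                    "to_callsign": nxt["callsign"],
                    "turnaround_min": gap_s / 60,
                    # Only late arrivals can feed a reactionary delay.
                    "inbound_delay_min": max(prev["delay_min"], 0),
                })
    return links


links = link_consecutive_legs(flights)
for link in links:
    print(link)
```

In this toy data only the first two legs of aircraft `484161` are linked; the third leg departs after a gap longer than the cut-off and is treated as the start of a new rotation. How much of the inbound delay actually carries over to the next leg depends on the scheduled turnaround buffer, which this sketch does not model.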

---


## 4. Data Sources & Usage

### OpenSky Network (ADS-B Data)
[OpenSky Network](https://opensky-network.org/data) provides open access to ADS-B surveillance data collected from aircraft transponders. Each aircraft is identified by the unique `icao24` code. The dataset includes key flight information such as callsign, origin and destination airports, timestamps, positions, velocity, and altitude. Within this project, OpenSky data will be used to determine actual departure and arrival times, calculate actual flight durations, and identify outlier flights that significantly exceed expected duration thresholds.
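
As a rough sketch of the duration and outlier calculation described above, the example below uses only the Python standard library. The records and field names are hypothetical and are not OpenSky's schema. The outlier rule (a flight longer than 1.25 times the route median) is an assumed placeholder threshold; the actual threshold still has to be chosen.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records; times are Unix seconds, None marks a coverage gap.
flights = [
    {"callsign": "KLM1001", "origin": "EHAM", "destination": "EGLL", "departure": 0, "arrival": 3600},
    {"callsign": "KLM1003", "origin": "EHAM", "destination": "EGLL", "departure": 10000, "arrival": 13720},
    {"callsign": "KLM1005", "origin": "EHAM", "destination": "EGLL", "departure": 20000, "arrival": 23480},
    {"callsign": "KLM1009", "origin": "EHAM", "destination": "EGLL", "departure": 30000, "arrival": 35700},
    {"callsign": "KLM1011", "origin": "EHAM", "destination": "EGLL", "departure": 40000, "arrival": None},
]


def add_durations(flights):
    """Drop incomplete flights and add the flight duration in minutes."""
    complete = []
    for f in flights:
        if f["departure"] is None or f["arrival"] is None:
            continue  # incomplete record, excluded from the analysis
        duration_min = (f["arrival"] - f["departure"]) / 60
        complete.append({**f, "duration_min": duration_min})
    return complete


def route_medians(flights):
    """Median duration in minutes per (origin, destination) pair."""
    by_route = defaultdict(list)
    for f in flights:
        by_route[(f["origin"], f["destination"])].append(f["duration_min"])
    return {route: median(durations) for route, durations in by_route.items()}


def flag_outliers(flights, medians, factor=1.25):
    """Return flights whose duration exceeds `factor` times the route median."""
    return [
        f for f in flights
        if f["duration_min"] > factor * medians[(f["origin"], f["destination"])]
    ]


complete = add_durations(flights)
medians = route_medians(complete)
outliers = flag_outliers(complete, medians)
print(medians)
print([f["callsign"] for f in outliers])
```

The median is used as the baseline because it is less sensitive than the mean to the very outliers being searched for.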

### Eurocontrol – Central Flow Management Unit (CFMU) / DDR2
[Eurocontrol](https://www.eurocontrol.int/ddr) provides flight plan and network management data through its Central Flow Management Unit (CFMU) and Demand Data Repository (DDR2). These datasets include scheduled flight plans, airport slot times, delay indices, and en-route restrictions. In this project, Eurocontrol data will serve as the reference layer against which actual flight data from OpenSky is benchmarked. It will be used to calculate expected route-level median durations, identify deviations between planned and actual flight times, and support the delay propagation analysis by incorporating network-level constraints such as airport slot assignments and en-route flow management measures.


#### Note on Data

While ADS-B data from OpenSky provides detailed records of actual flight trajectories and timings, it does not include scheduled departure or arrival times. Eurocontrol data complements this by supplying planned flight schedules, route durations, and delay benchmarks. Combining the two allows for meaningful delay analysis by comparing actual operations against planned expectations, calculating deviations from schedule, and assessing reactionary delays. 



---


## 5. Methods — Data Pipeline


The following workflow is suggested to turn raw ADS-B and scheduling data into operational insights:

1. **Data Collection**: Retrieve actual flight data from **OpenSky** and scheduled plans from **Eurocontrol** for the same period.
2. **Preprocessing**: Clean and filter flights by route and time, remove incomplete records, and standardize timestamps and airport codes.
3. **Flight Matching**: Align actual flights with scheduled plans using the callsign, origin/destination, and date to estimate planned durations.
4. **Analysis & Visualization**: Quantify delay patterns (primary and reactionary) and present them in an interactive dashboard showing route-level performance and delay propagation.
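
The flight-matching step in the workflow above could look roughly like the sketch below, written with the Python standard library only. Both record layouts are hypothetical: the field names (`planned_departure`, `planned_duration_min`, and so on) are illustrative and do not correspond to the actual OpenSky or Eurocontrol formats. Callsigns are stripped and upper-cased on the assumption that they may carry padding or inconsistent casing.

```python
from datetime import datetime, timezone


def ts(year, month, day, hour, minute):
    """Helper: build a Unix timestamp (seconds) from a UTC date and time."""
    return int(datetime(year, month, day, hour, minute, tzinfo=timezone.utc).timestamp())


def match_key(callsign, origin, destination, departure_ts):
    """Matching key: normalized callsign, route, and UTC departure date."""
    day = datetime.fromtimestamp(departure_ts, tz=timezone.utc).date()
    return (callsign.strip().upper(), origin, destination, day)


def match_flights(actual, planned):
    """Join actual flights to planned flights and compute duration deviations."""
    planned_by_key = {
        match_key(p["callsign"], p["origin"], p["destination"], p["planned_departure"]): p
        for p in planned
    }
    matched, unmatched = [], []
    for a in actual:
        key = match_key(a["callsign"], a["origin"], a["destination"], a["departure"])
        plan = planned_by_key.get(key)
        if plan is None:
            unmatched.append(a)
            continue
        actual_min = (a["arrival"] - a["departure"]) / 60
        matched.append({
            "callsign": key[0],
            "deviation_min": actual_min - plan["planned_duration_min"],
        })
    return matched, unmatched


# Hypothetical example data.
actual = [
    {"callsign": "KLM1009 ", "origin": "EHAM", "destination": "EGLL",
     "departure": ts(2024, 3, 1, 8, 10), "arrival": ts(2024, 3, 1, 9, 15)},
    {"callsign": "KLM1234", "origin": "EHAM", "destination": "LFPG",
     "departure": ts(2024, 3, 1, 10, 0), "arrival": ts(2024, 3, 1, 11, 5)},
]
planned = [
    {"callsign": "KLM1009", "origin": "EHAM", "destination": "EGLL",
     "planned_departure": ts(2024, 3, 1, 8, 0), "planned_duration_min": 60},
]

matched, unmatched = match_flights(actual, planned)
print(matched)
print(len(unmatched))
```

One known limitation of keying on the departure date is that a flight planned shortly before midnight UTC but departing after it would not match; a tolerance window around the planned departure time may be needed in practice.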



