There is a difference between the trajectory an aircraft actually flies and the trajectory generated by Air Traffic Control. Reducing this difference can improve the safety and efficiency of air traffic operations. The Base of Aircraft Data (BADA) is used to calculate the trajectories; however, it makes generalisations about the aircraft specification, such as using an estimate of the mass. This work uses flight track data and meteorological data to calculate the actual flight parameters for an Airbus A320, namely the mass, Calibrated Air Speed (CAS) and Mach speed for the flight, and uses machine learning to predict these parameters from the flight plan. Only the climb phase of the flight profile is considered.
This work requires the following datasets:
- Flight Track data: This work uses ADS-B flight track data from the OpenSky Network. Junzi Sun provides a Python library called pyopensky to access this data. A login is required, which needs to be requested from OpenSky.
- Flight Plan data: EUROCONTROL DDR2 was used; however, OpenSky also provides a flights data table, which could be a potential feature set for prediction.
- Meteorological data: This comes in the form of Mode-S messages, which can also be retrieved from OpenSky. Junzi Sun also provides a Python library called pyModeS. Note that only messages from the past year can be retrieved.
- BADA files: These contain the aircraft data used to generate the trajectory of the aircraft. This code uses the aircraft data to calculate the flight parameters. This data requires a licence from EUROCONTROL.
The code also requires a flight phase identifier. At the time of writing, two were openly available: Junzi Sun provides fuzzy flight phase identification, and Emy Arts uses k-means clustering to identify the phases of the flight track. The method by Arts was used, as it gave better-quality results. Arts also provides radar preprocessing to remove outliers from the flight track data, which is important for the parameter calculation later in this work.
The Python files are numbered and are processed in that order. Flights are lost during processing for the following reasons: only flight track data with climb profiles are processed, there might be no meteorological data for a flight, and calculating the flight parameters may fail for some flights. An identification number is given in each script to help you keep track of each set of data with respect to the flight plan you have chosen, e.g. flight data processed from the May 2022 plans are given the number 2, and thus the name of any flight data file from that set starts with the number 2.
This code uses the Demand Data Repository 2 from EUROCONTROL to obtain Flight Plan data.
Junzi Sun's pyopensky library is used to retrieve the flight track data. Retrieval is limited to the last 12 months because Mode-S messages can only be retrieved from OpenSky for that period. It is advisable to retrieve the data in one-month blocks for a given set of ICAO aircraft numbers; otherwise, retrieving the ADS-B data will take a long time.
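The one-month blocking suggested above can be sketched as follows. This is a minimal illustration using only the standard library: `month_blocks` and `retrieve_in_blocks` are hypothetical helper names, and `fetch` is a placeholder for whatever function wraps the actual pyopensky query (its real signature is not shown here).

```python
from datetime import datetime, timezone


def month_blocks(start, stop):
    """Yield (block_start, block_stop) pairs covering [start, stop),
    split at calendar-month boundaries."""
    current = start
    while current < stop:
        # Midnight on the first day of the following month.
        if current.month == 12:
            next_month = current.replace(
                year=current.year + 1, month=1, day=1,
                hour=0, minute=0, second=0, microsecond=0)
        else:
            next_month = current.replace(
                month=current.month + 1, day=1,
                hour=0, minute=0, second=0, microsecond=0)
        block_stop = min(next_month, stop)
        yield current, block_stop
        current = block_stop


def retrieve_in_blocks(fetch, icao_numbers, start, stop):
    """Call the hypothetical `fetch(block_start, block_stop, icao_numbers)`
    once per month block and collect the results in a list."""
    return [fetch(b0, b1, icao_numbers) for b0, b1 in month_blocks(start, stop)]


if __name__ == "__main__":
    utc = timezone.utc
    for b0, b1 in month_blocks(datetime(2022, 5, 1, tzinfo=utc),
                               datetime(2022, 8, 1, tzinfo=utc)):
        print(b0.date(), "->", b1.date())
```

Keeping the query function as a parameter means the blocking logic can be checked without an OpenSky login.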
After this step, the flight phases need to be identified in the flight data. This can be done with publicly available work: the LSTM-based flight phase identification by Emy Arts, or Junzi Sun's Flight Data Processor.
A preprocessing file is taken from Arts's flight phase LSTM work and can be used beforehand to clean the flight data. It can be found in the 'src' folder and is called 'radar_preprocessing.py'.
The meteorological data comes in the form of Mode-S messages, which are retrieved from OpenSky. Junzi Sun has a pyModeS library which can obtain these messages within Python. Only messages from the past year can be obtained from OpenSky.
The parameters calculated are:
- Mass - This is calculated for every available meteorological data point.
- Calibrated Air Speed (V1 and V2) - V1 is the CAS up to 10,000 ft; the formula used is only effective for calculating speeds up to this altitude. V2 is the CAS between 10,000 ft and the Mach Transition Altitude.
- Mach Speed - The speed when operating above the Mach Transition Altitude.
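To illustrate how the Mach Transition Altitude separates the CAS and Mach regimes, here is a minimal sketch based on standard International Standard Atmosphere (ISA) and compressible-flow relations. It is an assumption-laden illustration rather than the formula used by the scripts or by pyBADA: it assumes ISA conditions, is valid only below the tropopause (about 36,089 ft), and the function names are hypothetical.

```python
import math

GAMMA = 1.4          # ratio of specific heats for air
R = 287.05287        # specific gas constant for air [J/(kg K)]
G0 = 9.80665         # gravitational acceleration [m/s^2]
T0 = 288.15          # ISA sea-level temperature [K]
A0 = 340.294         # ISA sea-level speed of sound [m/s]
LAPSE = 0.0065       # ISA tropospheric temperature lapse rate [K/m]
KT = 0.514444        # knots to m/s
FT = 0.3048          # feet to metres


def _impact_ratio(speed_ratio_sq):
    """Impact pressure divided by static pressure for (speed / sound speed)^2."""
    return (1 + (GAMMA - 1) / 2 * speed_ratio_sq) ** (GAMMA / (GAMMA - 1)) - 1


def mach_transition_altitude_ft(cas_kt, mach):
    """Altitude [ft] at which the given CAS and Mach number coincide (ISA, troposphere)."""
    delta = _impact_ratio((cas_kt * KT / A0) ** 2) / _impact_ratio(mach ** 2)
    theta = delta ** (LAPSE * R / G0)        # temperature ratio at that pressure ratio
    return T0 / LAPSE * (1 - theta) / FT


def cas_to_mach(cas_kt, altitude_ft):
    """Mach number for a given CAS [kt] and pressure altitude [ft] (ISA, troposphere)."""
    delta = (1 - LAPSE * altitude_ft * FT / T0) ** (G0 / (LAPSE * R))
    qc_over_p = _impact_ratio((cas_kt * KT / A0) ** 2) / delta
    return math.sqrt(2 / (GAMMA - 1) * ((qc_over_p + 1) ** ((GAMMA - 1) / GAMMA) - 1))


if __name__ == "__main__":
    h = mach_transition_altitude_ft(300, 0.78)
    print(f"Transition altitude: {h:.0f} ft, Mach there: {cas_to_mach(300, h):.3f}")
```

The 300 kt and Mach 0.78 inputs are example values only. Below the returned altitude the aircraft would hold the CAS, and above it the Mach number.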
This file makes a number of assumptions to simplify the calculation; these will need to be addressed in the future:
- V2 is not calculated.
- Bank Angle is assumed to be zero.
- CAS is assumed to be constant for the flight.
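As an illustration of why the zero bank angle assumption simplifies the mass calculation, the sketch below solves a BADA-style total-energy balance, (T - D)·V = m·g·(dh/dt) + m·V·(dV/dt), for the mass. This is one common approach and not necessarily what the scripts do. With zero bank angle, lift equals weight, so the drag depends on mass only through the lift coefficient and the balance becomes a quadratic in m. The function name is hypothetical, and the drag coefficients and flight condition in the example are illustrative placeholders, not BADA values.

```python
import math

G0 = 9.80665  # gravitational acceleration [m/s^2]


def estimate_mass(thrust, tas, rocd, accel, rho, wing_area, cd0, cd2):
    """Solve the total-energy balance for mass [kg], assuming zero bank angle.

    thrust [N], tas [m/s], rocd = dh/dt [m/s], accel = dV/dt [m/s^2],
    rho = air density [kg/m^3], wing_area [m^2],
    cd0 / cd2 = parabolic drag polar coefficients (CD = cd0 + cd2 * CL^2).
    """
    qs = 0.5 * rho * tas ** 2 * wing_area      # dynamic pressure times wing area
    # With lift = weight: D = qs*cd0 + cd2*(m*g)^2/qs, which gives a*m^2 + b*m + c = 0.
    a = cd2 * G0 ** 2 * tas / qs
    b = G0 * rocd + tas * accel
    c = (qs * cd0 - thrust) * tas
    disc = b ** 2 - 4 * a * c
    if disc < 0:
        raise ValueError("no real solution for mass with these inputs")
    return (-b + math.sqrt(disc)) / (2 * a)    # positive root when thrust > qs*cd0


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    mass = estimate_mass(thrust=86_000.0, tas=150.0, rocd=10.0, accel=0.0,
                         rho=0.9, wing_area=122.6, cd0=0.025, cd2=0.04)
    print(f"Estimated mass: {mass:.0f} kg")
```

In practice the thrust and drag coefficients would come from the licensed BADA data, and the air density from the meteorological data.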
The calculations use the pyBADA library, which is not publicly available: it can only be obtained with a BADA licence from EUROCONTROL. Future improvements will remove this dependency.
These flight parameter values are written to a flight parameters file. The flight records file is used later to match the flight parameters with the processed flight track data.
This step processes the flight plan data and the flight parameter response variables. The response variables are label encoded.
The notebook contains the machine learning models used to predict the parameters above. Principal Component Analysis (PCA) is carried out to reduce the dimensionality of the dataset. The following models are tested:
- Linear Regression
- Random Forest Regressor
- ElasticNet
- Support Vector Regressor
- K-Nearest Neighbors Regressor
- Lasso Regressor
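A comparison of the models listed above could be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the notebook's code: the feature matrix, the response, the 95% explained-variance threshold for PCA, the hyperparameters and the mean-absolute-error scoring are all assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for flight plan features and one response (e.g. mass in kg).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = 60_000 + 3_000 * X[:, 0] - 1_500 * X[:, 1] + rng.normal(scale=500, size=200)

MODELS = {
    "Linear Regression": LinearRegression(),
    "Random Forest Regressor": RandomForestRegressor(n_estimators=50, random_state=0),
    "ElasticNet": ElasticNet(),
    "Support Vector Regressor": SVR(),
    "K-Nearest Neighbors Regressor": KNeighborsRegressor(),
    "Lasso Regressor": Lasso(),
}


def compare_models(X, y, cv=3):
    """Return the mean cross-validated MAE per model, with scaling and PCA refit in each fold."""
    results = {}
    for name, model in MODELS.items():
        # Keep enough principal components to explain 95% of the variance.
        pipeline = make_pipeline(StandardScaler(), PCA(n_components=0.95), model)
        neg_mae = cross_val_score(pipeline, X, y, cv=cv,
                                  scoring="neg_mean_absolute_error")
        results[name] = -neg_mae.mean()
    return results


scores = compare_models(X, y)
for name, mae in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: MAE = {mae:.1f}")
```

Placing the scaler and PCA inside the pipeline means they are fitted on the training folds only, which avoids leaking information from the validation fold.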