
# Traffic and Airprox Correlations
> Author: A.Pilko@soton.ac.uk

2019 air traffic data and 2000-2021 airprox data are used to investigate correlations between the two datasets.

## Hypotheses:
- Airprox locations will have less ordered traffic flow; concretely, the variance of traffic direction will positively correlate with airprox locations
- Airprox locations will positively correlate with traffic density
- Airprox locations will positively correlate with mean traffic flow speed
- Airprox locations will positively correlate with the variance of the flow speed


Import required libraries and pre-cleaned data

In [2]:
import geopandas as gpd
import pandas as pd
import seaborn as sns
import traffic
import numpy as np
import pyproj
from scipy.spatial.distance import cdist
import matplotlib.pyplot as plt
import joblib as jl

from cartopy.crs import Projection
from traffic.drawing import countries, lakes, ocean
from traffic.data import airports

%matplotlib notebook

In [2]:
airprox_gdf = gpd.GeoDataFrame(pd.read_pickle('../data/airprox_asp_2000_2021.pkl'))
# tfc_clean = traffic.core.Traffic.from_file('../data/cornwall/cornwall_tfc_clean_30s_lt3000ft_2019_f16.pkl.bz2')
tfc_clean = traffic.core.Traffic.from_file('../data/southeng/southeng_tfc_clean_lt5000ft_2019.pkl.bz2')

In [3]:
tfc_df = tfc_clean.data
tfc_df = tfc_df.dropna(axis=0)
tfc_df = tfc_df[(tfc_df['altitude'] > 0) & (tfc_df['altitude'] <= 1524)]
tfc_clean = traffic.core.Traffic(tfc_df)

In [4]:
# tfc_clean_data = pd.read_pickle('../data/cornwall/cornwall_tfc_clean_30s_lt3000ft_2019_f16.pkl.bz2')

In [7]:
tfc_clean.data.describe()

(6356586, 17)

## Airspace

There is little point analysing traffic patterns in controlled airspace, where ATC are issuing instructions or aircraft are (usually) following standard routes (SIDs, STARs). UK airspace geometries are used to identify the traffic state vectors located in controlled airspace. Only the remaining traffic, in uncontrolled airspace, is used for the actual analysis.

In [5]:
import requests

req = requests.get('https://storage.googleapis.com/29f98e10-a489-4c82-ae5e-489dbcd4912f/gb_asp.geojson')
req.raise_for_status()  # fail early instead of writing an error page to disk
with open('gb_asp.geojson', 'w', encoding='utf-8') as f:
    f.write(req.text)

In [6]:
ASP_TYPES = {
    0: "Other",
    1: "Restricted",
    2: "Danger",
    3: "Prohibited",
    4: "Controlled Tower Region (CTR)",
    5: "Transponder Mandatory Zone (TMZ)",
    6: "Radio Mandatory Zone (RMZ)",
    7: "Terminal Maneuvering Area (TMA)",
    8: "Temporary Reserved Area (TRA)",
    9: "Temporary Segregated Area (TSA)",
    10: "Flight Information Region (FIR)",
    11: "Upper Flight Information Region (UIR)",
    12: "Air Defense Identification Zone (ADIZ)",
    13: "Airport Traffic Zone (ATZ)",
    14: "Military Airport Traffic Zone (MATZ)",
    15: "Airway",
    16: "Military Training Route (MTR)",
    17: "Alert Area",
    18: "Warning Area",
    19: "Protected Area",
    20: "Helicopter Traffic Zone (HTZ)",
    21: "Gliding Sector",
    22: "Transponder Setting (TRP)",
    23: "Traffic Information Zone (TIZ)",
    24: "Traffic Information Area (TIA)",
    25: "Military Training Area (MTA)",
    26: "Controlled Area (CTA)",
    27: "ACC Sector (ACC)",
    28: "Aerial Sporting Or Recreational Activity",
    29: "Low Altitude Overflight Restriction"
}

ASP_CLASS = {
    0: "A",
    1: "B",
    2: "C",
    3: "D",
    4: "E",
    5: "F",
    6: "G",
    7: "Special Use Airspace (SUA)",
    8: "Unclassified"
}

ASP_ACTIVITIES = {
    0: "None - No specific activity (default)",
    1: "Parachuting Activity",
    2: "Aerobatics Activity",
    3: "Aeroclub And Arial Work Area",
    4: "Ultra Light Machine (ULM) Activity",
    5: "Hang Gliding/Paragliding"
}

ASP_ALT_UNIT = {
    0: "Meter",
    1: "Feet",
    6: "Flight Level",
}

ASP_ALT_DATUM = {
    0: "GND",
    1: "MSL",
    2: "STD",
}

In [7]:
asp_gdf = gpd.read_file('gb_asp.geojson')
asp_gdf = asp_gdf[(asp_gdf['approved'] == True) & (asp_gdf['onDemand'] == False) & (asp_gdf['onRequest'] == False) & (
        asp_gdf['byNotam'] == False) & (asp_gdf['specialAgreement'] == False) & (asp_gdf['icaoClass'] < 4)]
asp_gdf = asp_gdf.cx[
          tfc_clean.data.longitude.min():tfc_clean.data.longitude.max(),
          tfc_clean.data.latitude.min(): tfc_clean.data.latitude.max()
          ].reset_index()
asp_upper_lims = pd.DataFrame(pd.json_normalize(asp_gdf.upperLimit))
asp_lower_lims = pd.DataFrame(pd.json_normalize(asp_gdf.lowerLimit))
asp_upper_lims.columns = ['upperLimit_value', 'upperLimit_unit', 'upperLimit_ref']
asp_lower_lims.columns = ['lowerLimit_value', 'lowerLimit_unit', 'lowerLimit_ref']
asp_lim_df = pd.concat([asp_lower_lims, asp_upper_lims], axis=1)
asp_gdf = pd.concat([asp_gdf, asp_lim_df], axis=1)
asp_gdf = asp_gdf.drop(
    labels=['_id', 'approved', 'specialAgreement', 'onDemand', 'onRequest', 'byNotam', 'createdAt', 'createdBy',
            'updatedAt', 'updatedBy', 'upperLimit', 'lowerLimit'], axis=1)
for col in ['type', 'icaoClass', 'activity']:
    asp_gdf[col] = pd.Categorical(asp_gdf[col])
asp_gdf['type'] = asp_gdf['type'].cat.rename_categories(ASP_TYPES)
asp_gdf['icaoClass'] = asp_gdf['icaoClass'].cat.rename_categories(ASP_CLASS)
asp_gdf['activity'] = asp_gdf['activity'].cat.rename_categories(ASP_ACTIVITIES)
asp_gdf

Unnamed: 0,index,name,type,icaoClass,activity,country,geometry,lowerLimit_value,lowerLimit_unit,lowerLimit_ref,upperLimit_value,upperLimit_unit,upperLimit_ref
0,7,BERRY HEAD CTA,Other,A,None - No specific activity (default),GB,"POLYGON ((-2.89306 51.25139, -2.99333 50.70444...",105,6,2,195,6,2
1,35,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,"POLYGON ((-2.52833 51.47333, -2.52833 51.47333...",1500,1,1,105,6,2
2,36,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,"POLYGON ((-2.95883 51.37879, -2.95878 51.38819...",1500,1,1,105,6,2
3,37,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,"POLYGON ((-2.43722 51.47667, -2.43722 51.47667...",2000,1,1,105,6,2
4,39,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,"POLYGON ((-2.51250 51.30722, -2.51250 51.30722...",3000,1,1,105,6,2
...,...,...,...,...,...,...,...,...,...,...,...,...,...
109,908,LONDON STANSTED CTR,Other,D,None - No specific activity (default),GB,"POLYGON ((0.44806 51.90444, 0.21917 51.75222, ...",0,1,0,3500,1,1
110,941,SOUTHAMPTON CTR 120.230,Other,D,None - No specific activity (default),GB,"POLYGON ((-1.33806 51.08306, -1.33806 51.08306...",0,1,0,2000,1,1
111,942,SOUTHEND CTR 130.780,Other,D,None - No specific activity (default),GB,"POLYGON ((0.48417 51.57917, 0.75583 51.70167, ...",0,1,0,3500,1,1
112,943,SOUTHEND CTR 130.780,Other,D,None - No specific activity (default),GB,"POLYGON ((0.75583 51.70167, 0.79667 51.72000, ...",0,1,0,4500,1,1


In [8]:
def alt_std(row):
    cr = row.copy()
    if cr['upperLimit_unit'] == 1:
        cr['upperLimit_value'] /= 3.28084
    elif cr['upperLimit_unit'] == 6:
        cr['upperLimit_value'] *= 100/3.28084

    if cr['lowerLimit_unit'] == 1:
        cr['lowerLimit_value'] /= 3.28084
    elif cr['lowerLimit_unit'] == 6:
        cr['lowerLimit_value'] *= 100/3.28084

    return cr


asp_gdf = asp_gdf.apply(alt_std, axis=1).dropna()
asp_gdf = asp_gdf[asp_gdf['lowerLimit_value'] <= tfc_clean.data['altitude'].max()]
asp_gdf = asp_gdf.drop(labels=['upperLimit_unit', 'upperLimit_ref', 'lowerLimit_unit', 'lowerLimit_ref', 'index'], axis=1)

asp_gdf

Unnamed: 0,name,type,icaoClass,activity,country,geometry,lowerLimit_value,upperLimit_value
1,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,"POLYGON ((-2.52833 51.47333, -2.52833 51.47333...",457.199985,3200.399898
2,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,"POLYGON ((-2.95883 51.37879, -2.95878 51.38819...",457.199985,3200.399898
3,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,"POLYGON ((-2.43722 51.47667, -2.43722 51.47667...",609.599980,3200.399898
4,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,"POLYGON ((-2.51250 51.30722, -2.51250 51.30722...",914.399971,3200.399898
5,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,"POLYGON ((-2.35111 51.47972, -2.35111 51.47972...",1066.799966,3200.399898
...,...,...,...,...,...,...,...,...
109,LONDON STANSTED CTR,Other,D,None - No specific activity (default),GB,"POLYGON ((0.44806 51.90444, 0.21917 51.75222, ...",0.000000,1066.799966
110,SOUTHAMPTON CTR 120.230,Other,D,None - No specific activity (default),GB,"POLYGON ((-1.33806 51.08306, -1.33806 51.08306...",0.000000,609.599980
111,SOUTHEND CTR 130.780,Other,D,None - No specific activity (default),GB,"POLYGON ((0.48417 51.57917, 0.75583 51.70167, ...",0.000000,1066.799966
112,SOUTHEND CTR 130.780,Other,D,None - No specific activity (default),GB,"POLYGON ((0.75583 51.70167, 0.79667 51.72000, ...",0.000000,1371.599956
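The conversion in `alt_std` relies on the unit codes listed in `ASP_ALT_UNIT` (0 = metres, 1 = feet, 6 = flight level, i.e. hundreds of feet). A minimal standalone sketch of the same arithmetic is given here; the helper name `to_metres` is hypothetical and not part of the notebook. Like `alt_std`, it ignores the altitude datum (GND, MSL, STD).

```python
FT_PER_M = 3.28084  # same constant as in alt_std


def to_metres(value, unit):
    """Convert an airspace limit to metres (hypothetical helper)."""
    if unit == 1:          # feet
        return value / FT_PER_M
    if unit == 6:          # flight level = hundreds of feet
        return value * 100 / FT_PER_M
    return value           # unit 0: already in metres


# 1500 ft floor and FL105 ceiling, as in the BRISTOL CTA rows
print(round(to_metres(1500, 1), 1))  # 457.2
print(round(to_metres(105, 6), 1))   # 3200.4
```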


In [9]:
tfc_gdf = gpd.GeoDataFrame(tfc_clean.data,
                           geometry=gpd.points_from_xy(tfc_clean.data['longitude'], tfc_clean.data['latitude']), crs='epsg:4326')

In [10]:
# del tfc_clean, airprox_gdf, asp_lower_lims, asp_upper_lims, tfc_df, asp_lim_df

Shapely's spatial predicates only consider the 2D (x, y) coordinates, so we need to get creative to filter on 3D airspace volumes. We iterate over the airspaces and, for each one, select all traffic between its floor and ceiling. A 2D point-in-polygon test is then run on that subset as usual.

This takes a decent chunk of time...

In [12]:
def tfc_within(lim_asp):
    lim_tfc = tfc_gdf[
        (tfc_gdf['altitude'] >= lim_asp['lowerLimit_value']) & (tfc_gdf['altitude'] <= lim_asp['upperLimit_value'])]
    print(lim_asp['name'])
    return lim_tfc.sjoin(gpd.GeoDataFrame(lim_asp.to_frame().T).set_crs(asp_gdf.crs), predicate='within')


# joined_dfs = jl.Parallel(n_jobs=-1, verbose=20)(jl.delayed(tfc_within)(lim_asp) for _, lim_asp in list(asp_gdf.iterrows()))

joined_dfs = [tfc_within(lim_asp) for _, lim_asp in asp_gdf.iterrows()]

con_asp_tfc_gdf = pd.concat(joined_dfs, axis=0)

BRISTOL CTA 125.650
BRISTOL CTA 125.650
BRISTOL CTA 125.650
BRISTOL CTA 125.650
BRISTOL CTA 125.650
BRISTOL CTA 125.650
BRISTOL CTA 125.650
CARDIFF CTA 119.155
FARNBOROUGH CTA 133.440
FARNBOROUGH CTA 133.440
FARNBOROUGH CTA 133.440
FARNBOROUGH CTA 133.440
FARNBOROUGH CTA 133.440
FARNBOROUGH CTA 133.440
FARNBOROUGH CTA 133.440
LONDON GATWICK CTA
LONDON LUTON CTA
LONDON LUTON CTA
LONDON LUTON CTA
LONDON LUTON CTA
LONDON LUTON CTA
LONDON LUTON CTA
LONDON LUTON CTA
LONDON LUTON CTA
LONDON STANSTED CTA
LONDON STANSTED CTA
LONDON STANSTED CTA
LONDON STANSTED CTA
LONDON CITY CTA
SOLENT CTA 120.230
SOLENT CTA 120.230
SOLENT CTA 120.230
SOLENT CTA 120.230
SOLENT CTA 120.230
SOLENT CTA 120.230
SOLENT CTA 120.230
SOLENT CTA 120.230
SOUTHEND CTA 130.780
SOUTHEND CTA 130.780
SOUTHEND CTA 130.780
SOUTHEND CTA 130.780
SOUTHEND CTA 130.780
SOUTHEND CTA 130.780
SOUTHEND CTA 130.780
SOUTHEND CTA 130.780
SOUTHEND CTA 130.780
SOUTHEND CTA 130.780
LONDON TMA
LONDON TMA
LONDON TMA
LONDON TMA
LONDON TMA
LOND
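The two-stage filter used here (altitude band first, then a 2D footprint test) can be sketched on toy data as follows. This is only an illustration under stated assumptions: the ray-casting function is a pure-Python stand-in for the `within` predicate that geopandas provides, and the airspace values are made up.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Count the edges crossed by a horizontal ray from (x, y) towards +x
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def in_airspace(point, airspace):
    lon, lat, alt = point
    # Cheap altitude-band check first, then the 2D footprint test
    if not (airspace["floor"] <= alt <= airspace["ceiling"]):
        return False
    return point_in_polygon(lon, lat, airspace["ring"])


# Toy airspace: unit-square footprint between 450 m and 3200 m (made-up values)
toy = {"floor": 450.0, "ceiling": 3200.0,
       "ring": [(0, 0), (1, 0), (1, 1), (0, 1)]}
points = [(0.5, 0.5, 1000.0),   # inside footprint and altitude band
          (0.5, 0.5, 100.0),    # inside footprint, below the floor
          (2.0, 0.5, 1000.0)]   # right altitude, outside footprint
flags = [in_airspace(p, toy) for p in points]
print(flags)  # [True, False, False]
```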

In [13]:
con_asp_tfc_gdf.to_pickle('../data/southeng/southeng_con_asp_tfc_2019.pkl.bz2', compression='bz2')
print(con_asp_tfc_gdf.shape)
con_asp_tfc_gdf.head()

(5735641, 26)


Unnamed: 0,timestamp,alert,altitude,callsign,geoaltitude,groundspeed,hour,icao24,latitude,longitude,...,track_unwrapped,geometry,index_right,name,type,icaoClass,activity,country,lowerLimit_value,upperLimit_value
17885,2019-01-01 00:04:00+00:00,False,775.0,HLE10,1525.0,103.0,2019-01-01 00:00:00+00:00,407152,51.3125,-2.511719,...,-14.039062,POINT (-2.51172 51.31250),1,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,457.199985,3200.399898
17886,2019-01-01 00:04:30+00:00,False,800.0,HLE10,1550.0,103.0,2019-01-01 00:00:00+00:00,407152,51.3125,-2.515625,...,-13.625,POINT (-2.51562 51.31250),1,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,457.199985,3200.399898
17887,2019-01-01 00:05:00+00:00,False,825.0,HLE10,1575.0,103.0,2019-01-01 00:00:00+00:00,407152,51.3125,-2.521484,...,-14.039062,POINT (-2.52148 51.31250),1,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,457.199985,3200.399898
17888,2019-01-01 00:05:30+00:00,False,825.0,HLE10,1575.0,104.0,2019-01-01 00:00:00+00:00,407152,51.34375,-2.527344,...,-13.367188,POINT (-2.52734 51.34375),1,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,457.199985,3200.399898
17889,2019-01-01 00:06:00+00:00,False,825.0,HLE10,1600.0,105.0,2019-01-01 00:00:00+00:00,407152,51.34375,-2.533203,...,-17.21875,POINT (-2.53320 51.34375),1,BRISTOL CTA 125.650,Other,D,None - No specific activity (default),GB,457.199985,3200.399898


In [25]:
unc_asp_tfc_gdf = pd.merge(tfc_gdf, con_asp_tfc_gdf, how="outer", indicator=True
                           ).query('_merge=="left_only"').drop(labels=['_merge'], axis=1)
unc_asp_tfc_gdf

Unnamed: 0,timestamp,alert,altitude,callsign,geoaltitude,groundspeed,hour,icao24,latitude,longitude,...,track_unwrapped,geometry,index_right,name,type,icaoClass,activity,country,lowerLimit_value,upperLimit_value
552,2019-01-01 13:26:30+00:00,False,500.0,FGITZ,1300.0,89.0,2019-01-01 13:00:00+00:00,392279,51.90625,-2.082031,...,259.75,POINT (-2.08203 51.90625),,,,,,,,
553,2019-01-01 13:27:00+00:00,False,500.0,FGITZ,1250.0,75.0,2019-01-01 13:00:00+00:00,392279,51.90625,-2.095703,...,266.25,POINT (-2.09570 51.90625),,,,,,,,
554,2019-01-01 13:27:30+00:00,False,200.0,FGITZ,950.0,78.0,2019-01-01 13:00:00+00:00,392279,51.90625,-2.113281,...,263.25,POINT (-2.11328 51.90625),,,,,,,,
794,2019-01-01 13:30:00+00:00,False,700.0,DIDWC,1375.0,138.0,2019-01-01 13:00:00+00:00,3e2172,51.37500,0.060791,...,205.25,POINT (0.06079 51.37500),,,,,,,,
795,2019-01-01 13:30:30+00:00,False,275.0,DIDWC,1050.0,112.0,2019-01-01 13:00:00+00:00,3e2172,51.34375,0.048187,...,207.75,POINT (0.04819 51.34375),,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6645874,2019-12-31 13:44:00+00:00,False,200.0,N936CT,875.0,131.0,2019-12-31 13:00:00+00:00,acfc37,51.68750,0.169678,...,413.25,POINT (0.16968 51.68750),,,,,,,,
6645875,2019-12-31 13:44:30+00:00,False,225.0,N936CT,900.0,137.0,2019-12-31 13:00:00+00:00,acfc37,51.68750,0.179443,...,588.00,POINT (0.17944 51.68750),,,,,,,,
6645876,2019-12-31 13:45:00+00:00,False,250.0,N936CT,875.0,127.0,2019-12-31 13:00:00+00:00,acfc37,51.68750,0.153564,...,671.00,POINT (0.15356 51.68750),,,,,,,,
6645877,2019-12-31 13:45:30+00:00,False,175.0,N936CT,825.0,122.0,2019-12-31 13:00:00+00:00,acfc37,51.68750,0.132812,...,548.00,POINT (0.13281 51.68750),,,,,,,,
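The outer merge with `indicator=True` followed by the `left_only` query acts as an anti-join: it keeps the rows of the left frame that have no match in the right frame. A minimal sketch on made-up data (the frame names here are illustrative, not the notebook's):

```python
import pandas as pd

all_tfc = pd.DataFrame({"flight_id": ["A", "A", "B", "C"],
                        "timestamp": [0, 30, 0, 0]})
controlled = pd.DataFrame({"flight_id": ["A", "C"],
                           "timestamp": [30, 0]})

# With no `on=` argument, merge joins on all columns the frames share
merged = all_tfc.merge(controlled, how="outer", indicator=True)

# Rows present only in the left frame are those not in controlled airspace
uncontrolled = (merged.query('_merge == "left_only"')
                      .drop(columns="_merge")
                      .reset_index(drop=True))
print(uncontrolled)
```

Only the state vectors `("A", 0)` and `("B", 0)` remain, since the other two also appear in the right-hand frame.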


In [3]:
# unc_asp_tfc_gdf.to_pickle('../data/southeng/southeng_unc_asp_tfc_2019.pkl.bz2', compression='bz2')
unc_asp_tfc_gdf = pd.read_pickle('../data/southeng/southeng_unc_asp_tfc_2019.pkl.bz2', compression='bz2')

In [32]:
unc_asp_tfc_gdf['type'] = 0
unc_asp_tfc_gdf['icaoClass'] = 6
unc_asp_tfc_gdf['name'] = 'UNCONTROLLED AIRSPACE'
# unc_asp_tfc_gdf = unc_asp_tfc_gdf.dropna(axis=0)
unc_asp_tfc_gdf = unc_asp_tfc_gdf[(unc_asp_tfc_gdf['altitude'] > 0) & (unc_asp_tfc_gdf['altitude'] <= 304.8*4)]

unc_asp_tfc_gdf = unc_asp_tfc_gdf.drop(
    labels=['index_right', 'country', 'lowerLimit_value', 'upperLimit_value', 'activity'], axis=1)

KeyError: "['index_right', 'country', 'lowerLimit_value', 'upperLimit_value', 'activity'] not found in axis"

In [4]:
unc_asp_tfc_gdf

Unnamed: 0,timestamp,alert,altitude,callsign,geoaltitude,groundspeed,hour,icao24,latitude,longitude,...,spi,squawk,track,vertical_rate,flight_id,track_unwrapped,geometry,name,type,icaoClass
552,2019-01-01 13:26:30+00:00,False,500.0,FGITZ,1300.0,89.0,2019-01-01 13:00:00+00:00,392279,51.90625,-2.082031,...,False,7000,259.75000,-704.0,FGITZ_096,259.75,POINT (-2.08203 51.90625),UNCONTROLLED AIRSPACE,0,6
553,2019-01-01 13:27:00+00:00,False,500.0,FGITZ,1250.0,75.0,2019-01-01 13:00:00+00:00,392279,51.90625,-2.095703,...,False,7000,266.25000,-320.0,FGITZ_096,266.25,POINT (-2.09570 51.90625),UNCONTROLLED AIRSPACE,0,6
554,2019-01-01 13:27:30+00:00,False,200.0,FGITZ,950.0,78.0,2019-01-01 13:00:00+00:00,392279,51.90625,-2.113281,...,False,7000,263.25000,-512.0,FGITZ_096,263.25,POINT (-2.11328 51.90625),UNCONTROLLED AIRSPACE,0,6
794,2019-01-01 13:30:00+00:00,False,700.0,DIDWC,1375.0,138.0,2019-01-01 13:00:00+00:00,3e2172,51.37500,0.060791,...,False,4102,205.25000,-768.0,DIDWC_168,205.25,POINT (0.06079 51.37500),UNCONTROLLED AIRSPACE,0,6
795,2019-01-01 13:30:30+00:00,False,275.0,DIDWC,1050.0,112.0,2019-01-01 13:00:00+00:00,3e2172,51.34375,0.048187,...,False,4102,207.75000,-704.0,DIDWC_168,207.75,POINT (0.04819 51.34375),UNCONTROLLED AIRSPACE,0,6
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6645874,2019-12-31 13:44:00+00:00,False,200.0,N936CT,875.0,131.0,2019-12-31 13:00:00+00:00,acfc37,51.68750,0.169678,...,False,7000,53.21875,192.0,N936CT_2733,413.25,POINT (0.16968 51.68750),UNCONTROLLED AIRSPACE,0,6
6645875,2019-12-31 13:44:30+00:00,False,225.0,N936CT,900.0,137.0,2019-12-31 13:00:00+00:00,acfc37,51.68750,0.179443,...,False,7000,228.00000,256.0,N936CT_2733,588.00,POINT (0.17944 51.68750),UNCONTROLLED AIRSPACE,0,6
6645876,2019-12-31 13:45:00+00:00,False,250.0,N936CT,875.0,127.0,2019-12-31 13:00:00+00:00,acfc37,51.68750,0.153564,...,False,7000,310.75000,256.0,N936CT_2733,671.00,POINT (0.15356 51.68750),UNCONTROLLED AIRSPACE,0,6
6645877,2019-12-31 13:45:30+00:00,False,175.0,N936CT,825.0,122.0,2019-12-31 13:00:00+00:00,acfc37,51.68750,0.132812,...,False,7000,188.00000,-192.0,N936CT_2733,548.00,POINT (0.13281 51.68750),UNCONTROLLED AIRSPACE,0,6


In [5]:
unc_asp_tfc = traffic.core.Traffic(unc_asp_tfc_gdf)

In [None]:
unc_asp_tfc_alt_gdf =  unc_asp_tfc_gdf.groupby(['track', pd.cut(unc_asp_tfc_gdf['altitude'], [x for x in range(0,int(unc_asp_tfc_gdf.altitude.max()+1), 500)], right=True)])

Aggregate traffic data by projected XY and collect statistics for each cell.

In [96]:
res = 7000
tfc_unc_xy_gdf = unc_asp_tfc.compute_xy('epsg:3857')
tfc_agg = tfc_unc_xy_gdf.assign(
    x=lambda elt: (elt.x // res) * res,
    y=lambda elt: (elt.y // res) * res,
).groupby(["x", "y"]).agg(altitude_mean=pd.NamedAgg('altitude', np.nanmean),
                          altitude_std=pd.NamedAgg('altitude', np.std), track_mean=pd.NamedAgg('track', np.nanmean),
                          track_std=pd.NamedAgg('track', np.std),
                          groundspeed_mean=pd.NamedAgg('groundspeed', np.nanmean),
                          groundspeed_std=pd.NamedAgg('groundspeed', np.std),
                          flight_id_nunique=('flight_id', 'nunique'))
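The gridding step works by floor-dividing the projected coordinates by the cell size, which snaps every point to the lower-left corner of its cell before grouping. A small sketch on made-up points, using the same `res` value:

```python
import pandas as pd

res = 7000  # cell size in metres, as in the notebook
pts = pd.DataFrame({
    "x": [100.0, 6999.0, 7000.0, -1.0],
    "y": [0.0, 10.0, 10.0, 10.0],
    "groundspeed": [100.0, 120.0, 90.0, 80.0],
    "flight_id": ["F1", "F2", "F2", "F3"],
})

cells = (pts.assign(x=lambda d: (d.x // res) * res,   # snap to cell origin
                    y=lambda d: (d.y // res) * res)
            .groupby(["x", "y"])
            .agg(groundspeed_mean=("groundspeed", "mean"),
                 flight_id_nunique=("flight_id", "nunique")))
print(cells)
```

The first two points share the cell with origin (0, 0), while `x = -1.0` falls into the cell with origin (-7000, 0) because floor division rounds towards negative infinity.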

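Only use cells with more than 30 unique flights, a common rule of thumb for the Central Limit Theorem to apply. This makes it reasonable to treat the per-cell sample means as approximately Gaussian. Note that the filter is commented out in the cell below, so the table shown still includes sparsely sampled cells.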

In [97]:
tfc_magg = tfc_agg#[tfc_agg['flight_id_nunique'] > 30]
tfc_gdf = tfc_agg.reset_index()
tfc_mgdf = tfc_magg.reset_index()
tfc_magg.head(10)

Unnamed: 0_level_0,Unnamed: 1_level_0,altitude_mean,altitude_std,track_mean,track_std,groundspeed_mean,groundspeed_std,flight_id_nunique
x,y,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
-329000.0,6566000.0,650.0,494.974747,73.0,4.861359,131.0,11.313708,2
-329000.0,6573000.0,303.25,7.572838,182.625,47.35188,33.71875,19.015898,1
-329000.0,6580000.0,975.0,0.0,60.65625,0.0,120.0,0.0,1
-329000.0,6587000.0,850.0,0.0,16.1875,0.044194,113.0,2.828427,1
-329000.0,6608000.0,850.0,,233.875,,113.0,,1
-329000.0,6615000.0,937.5,175.254916,92.875,69.88871,118.625,18.047061,5
-329000.0,6629000.0,1100.0,,240.0,,96.0,,1
-329000.0,6657000.0,1050.0,79.056942,111.1875,98.110935,119.8125,17.935997,4
-329000.0,6664000.0,669.0,21.650635,192.125,47.135237,121.1875,2.886751,2
-329000.0,6671000.0,819.0,192.469023,175.125,25.491175,68.8125,35.329874,2


In [111]:
airprox_gdf

Unnamed: 0,AirproxID,Latitude,Longitude,Altitude,Risk,Aircraft1_Classification,Aircraft1_Category,Aircraft1_Type,Aircraft1_FlightRules,Aircraft2_Classification,Aircraft2_Category,Aircraft2_Type,Aircraft2_FlightRules,Combined_Rules,x,y,geometry,name,type,icaoClass
919,2011013,51.016667,-2.633333,10.0,e,military,rotorcraft_-_helicopter,OTHER - Military (Lynx),vfr,military,fixed_wing_-_aeroplane,OTHER - Military (Hawk),vfr,vfr,-293141.325756,6.624242e+06,POINT Z (-2.63333 51.01667 10.00000),YEOVILTON MATZ 127.350,Military Airport Traffic Zone (MATZ),G
2328,2015181,50.850000,0.850000,74.0,e,commercial_air_transport,fixed_wing_-_aeroplane,ATR - ATR42,ifr,general_aviation,fixed_wing_-_aeroplane,CESSNA - 525,ifr,ifr,94621.567174,6.594803e+06,POINT Z (0.85000 50.85000 74.00000),LYDD ILS,Other,G
1569,2006080,51.133333,-1.766667,100.0,c,military,military_fixed_wing,"GROB 115, TUTOR",vfr,military,military_fixed_wing,"GROB 115, TUTOR",vfr,vfr,-196664.433735,6.644913e+06,POINT Z (-1.76667 51.13333 100.00000),BOSCOMBE DOWN/MIDDLE WALLOP MATZ 126.700,Military Airport Traffic Zone (MATZ),G
737,2021015,51.200000,-1.833333,100.0,c,ua/other,rpas,OTHER - Military (RPAS),vfr,military,rotorcraft_-_helicopter,OTHER - Military (Chinook),vfr,vfr,-204085.733121,6.656748e+06,POINT Z (-1.83333 51.20000 100.00000),BOSCOMBE DOWN/MIDDLE WALLOP MATZ 126.700,Military Airport Traffic Zone (MATZ),G
1569,2006080,51.133333,-1.766667,100.0,c,military,military_fixed_wing,"GROB 115, TUTOR",vfr,military,military_fixed_wing,"GROB 115, TUTOR",vfr,vfr,-196664.433735,6.644913e+06,POINT Z (-1.76667 51.13333 100.00000),D120 BOSCOMBE DOWN (NOTAM),Other,G
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4584,2021230,51.500000,-1.633333,2300.0,c,general_aviation,fixed_wing_-_aeroplane,PIPER - PA28,vfr,general_aviation,fixed_wing_-_aeroplane,PIPER - PA28,vfr,vfr,-181821.834962,6.710219e+06,POINT Z (-1.63333 51.50000 2300.00000),UNCONTROLLED AIRSPACE,0,6
4586,2021233,50.966667,-0.900000,900.0,c,military,rotorcraft_-_helicopter,OTHER - Military (Chinook),vfr,general_aviation,fixed_wing_-_sailplane_(glider),OTHER (HPH Shark),vfr,vfr,-100187.541714,6.615400e+06,POINT Z (-0.90000 50.96667 900.00000),UNCONTROLLED AIRSPACE,0,6
4587,2021235,51.416667,-0.100000,1350.0,c,emergency_services,rotorcraft_-_helicopter,MBB - BK117 (EC145),vfr,general_aviation,rotorcraft_-_helicopter,AGUSTA - A109,vfr,vfr,-11131.949079,6.695331e+06,POINT Z (-0.10000 51.41667 1350.00000),UNCONTROLLED AIRSPACE,0,6
4593,2021242,50.933333,-2.883333,14900.0,e,emergency_services,fixed_wing_-_aeroplane,BAE - AVRO146RJ - 100 - 70,vfr,military,fixed_wing_-_aeroplane,OTHER - Military (Hawk T1),vfr,vfr,-320971.198454,6.609510e+06,POINT Z (-2.88333 50.93333 14900.00000),UNCONTROLLED AIRSPACE,0,6


In [112]:
x_idx = np.array(tfc_agg.index.levels[0])
y_idx = np.array(tfc_agg.index.levels[1])

In [113]:
airprox_gdf = airprox_gdf[
    (airprox_gdf.Latitude >= tfc_clean.data.latitude.min()) &
    (airprox_gdf.Latitude <= tfc_clean.data.latitude.max()) &
    (airprox_gdf.Longitude >= tfc_clean.data.longitude.min()) &
    (airprox_gdf.Longitude <= tfc_clean.data.longitude.max()) &
    ((airprox_gdf.icaoClass == 6) | (airprox_gdf.icaoClass == 'G') | (
            airprox_gdf.type == 'Radio Mandatory Zone (RMZ)') | (airprox_gdf.type == 'Gliding Sector'))
    ]

In [114]:
transformer = pyproj.Transformer.from_proj(pyproj.Proj("epsg:4326"), pyproj.Proj("epsg:3857"), always_xy=True)
x, y = transformer.transform(
    airprox_gdf.Longitude.values,
    airprox_gdf.Latitude.values,
)
airprox_gdf = airprox_gdf.assign(x=x, y=y)
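For intuition about what the EPSG:4326 to EPSG:3857 transform does, the spherical Web Mercator formula can be written out by hand. This is a sketch for illustration only (the function name is hypothetical); pyproj remains the appropriate tool for the actual analysis. The check uses the coordinates of the first airprox row shown in the table above.

```python
import math

R = 6378137.0  # WGS 84 semi-major axis in metres, used as the sphere radius


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Spherical Mercator projection of a lon/lat pair given in degrees."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Airprox 2011013: the notebook reports x = -293141.325756, y = 6.624242e+06
x, y = lonlat_to_web_mercator(-2.633333, 51.016667)
print(x, y)
```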

Match each airprox location with the traffic statistics of its nearest grid cell

In [115]:
tfc_grid = np.array(tfc_magg.reset_index()[['x', 'y']])
airprox_locs = np.array(airprox_gdf[['x', 'y']])

In [116]:
tfc_idxs = cdist(tfc_grid, airprox_locs).argmin(axis=0)
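The nearest-cell lookup can be sketched on toy coordinates as follows. `cdist` builds the full distance matrix between grid points and events, and `argmin` over the grid axis picks the closest grid point for each event. The coordinates are made up for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Toy grid cell origins and event locations (made-up coordinates, metres)
grid = np.array([[0.0, 0.0], [7000.0, 0.0], [0.0, 7000.0]])
events = np.array([[6000.0, 500.0], [100.0, 6900.0]])

# Distance matrix has shape (n_cells, n_events);
# argmin over axis 0 gives the closest cell index for every event
nearest = cdist(grid, events).argmin(axis=0)
print(nearest)  # [1 2]

# For comparison: the cell origin each event gets under floor-division binning
same_cell = (events // 7000) * 7000
print(same_cell)
```

One thing this toy case suggests: when the grid coordinates are cell origins produced by floor division, the nearest origin is not always the cell that contains the event. Here both events lie in the cell with origin (0, 0), yet their nearest origins belong to neighbouring cells. Whether that offset matters at a 7 km resolution is a judgment call; flooring the event coordinates in the same way is one possible alternative.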

In [117]:
tfc_cells = tfc_magg.iloc[tfc_idxs].reset_index()
airproxes_with_tfc = pd.concat([airprox_gdf.reset_index(), tfc_cells], axis=1)
airproxes_with_tfc = airproxes_with_tfc.drop(labels=['index', 'x', 'y'], axis=1)
airproxes_with_tfc

Unnamed: 0,AirproxID,Latitude,Longitude,Altitude,Risk,Aircraft1_Classification,Aircraft1_Category,Aircraft1_Type,Aircraft1_FlightRules,Aircraft2_Classification,...,name,type,icaoClass,altitude_mean,altitude_std,track_mean,track_std,groundspeed_mean,groundspeed_std,flight_id_nunique
0,2011013,51.016667,-2.633333,10.0,e,military,rotorcraft_-_helicopter,OTHER - Military (Lynx),vfr,military,...,YEOVILTON MATZ 127.350,Military Airport Traffic Zone (MATZ),G,972.0,212.210307,174.125,43.896037,111.1250,20.865659,48
1,2015181,50.850000,0.850000,74.0,e,commercial_air_transport,fixed_wing_-_aeroplane,ATR - ATR42,ifr,general_aviation,...,LYDD ILS,Other,G,582.0,346.802659,150.375,90.518608,108.2500,31.903754,48
2,2006080,51.133333,-1.766667,100.0,c,military,military_fixed_wing,"GROB 115, TUTOR",vfr,military,...,BOSCOMBE DOWN/MIDDLE WALLOP MATZ 126.700,Military Airport Traffic Zone (MATZ),G,931.5,198.321348,144.875,112.386535,103.6875,38.780889,47
3,2021015,51.200000,-1.833333,100.0,c,ua/other,rpas,OTHER - Military (RPAS),vfr,military,...,BOSCOMBE DOWN/MIDDLE WALLOP MATZ 126.700,Military Airport Traffic Zone (MATZ),G,1041.0,121.564468,189.125,66.740740,136.5000,24.125334,37
4,2006080,51.133333,-1.766667,100.0,c,military,military_fixed_wing,"GROB 115, TUTOR",vfr,military,...,D120 BOSCOMBE DOWN (NOTAM),Other,G,931.5,198.321348,144.875,112.386535,103.6875,38.780889,47
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
829,2021230,51.500000,-1.633333,2300.0,c,general_aviation,fixed_wing_-_aeroplane,PIPER - PA28,vfr,general_aviation,...,UNCONTROLLED AIRSPACE,0,6,1043.0,130.672089,251.625,68.538020,94.3750,18.402008,31
830,2021233,50.966667,-0.900000,900.0,c,military,rotorcraft_-_helicopter,OTHER - Military (Chinook),vfr,general_aviation,...,UNCONTROLLED AIRSPACE,0,6,901.0,238.741897,163.875,75.731276,96.2500,22.038467,33
831,2021235,51.416667,-0.100000,1350.0,c,emergency_services,rotorcraft_-_helicopter,MBB - BK117 (EC145),vfr,general_aviation,...,UNCONTROLLED AIRSPACE,0,6,613.5,110.534275,205.500,5.085466,120.0625,16.139121,567
832,2021242,50.933333,-2.883333,14900.0,e,emergency_services,fixed_wing_-_aeroplane,BAE - AVRO146RJ - 100 - 70,vfr,military,...,UNCONTROLLED AIRSPACE,0,6,683.5,250.397433,213.500,131.898139,91.6875,17.752320,36


In [118]:
non_airprox_tfc = pd.merge(tfc_magg.reset_index(), tfc_cells.reset_index(), how="outer", indicator=True
                           ).query('_merge=="left_only"')
non_airprox_tfc

Unnamed: 0,x,y,altitude_mean,altitude_std,track_mean,track_std,groundspeed_mean,groundspeed_std,flight_id_nunique,index,_merge
10,-291000.0,6710000.0,788.50,236.967106,179.000,103.936179,88.937500,25.750512,42,,left_only
15,-290000.0,6710000.0,693.50,262.084432,177.000,103.650206,71.250000,27.441272,76,,left_only
16,-289000.0,6704000.0,276.25,114.703908,217.000,85.064000,24.265625,18.048948,73,,left_only
17,-289000.0,6710000.0,444.25,269.827126,204.125,91.093887,16.015625,21.545005,171,,left_only
38,-264000.0,6615000.0,859.00,260.821741,184.875,109.625151,88.187500,16.429800,88,,left_only
...,...,...,...,...,...,...,...,...,...,...,...
2250,156000.0,6660000.0,686.00,316.793629,182.000,107.294609,83.375000,19.260690,32,,left_only
2251,159000.0,6682000.0,535.00,405.841401,138.375,94.204842,70.750000,33.629425,35,,left_only
2252,160000.0,6682000.0,562.50,333.767434,145.000,105.788956,81.312500,41.836407,40,,left_only
2253,161000.0,6682000.0,663.50,361.120062,138.750,105.869766,90.062500,24.853361,34,,left_only


Sanity check the data at this point by plotting

In [119]:
airproxes_with_tfc.explore('altitude_mean', cmap='inferno')



Examine the spatial coverage of the data. This is the area in which cells exceed the 30-sample threshold, so the CLT can be applied and valid distributions extracted.

In [120]:
fig, ax = plt.subplots(
    1, 1, figsize=(11, 11), subplot_kw=dict(projection=Projection('epsg:3857')),
)

ax.add_feature(countries())
ax.add_feature(lakes())
ax.add_feature(ocean())

flow = ax.tricontourf(
    tfc_gdf[tfc_gdf['flight_id_nunique'] > 30]['x'],
    tfc_gdf[tfc_gdf['flight_id_nunique'] > 30]['y'],
    np.log(tfc_gdf[tfc_gdf['flight_id_nunique'] > 30]['flight_id_nunique']),
    alpha=0.5,
    cmap='inferno')

# aps = ax.scatter(airprox_gdf['x'], airprox_gdf['y'], c='r', marker='x')

ax.set_title('Sample Count')
cb = fig.colorbar(flow)
cb.set_label('Samples')
# ax.legend([aps], ['Airprox'])


In [92]:
from cartes.crs import LambertConformal, EPSG_27700, PlateCarree, EuroPP, Mercator, Projection
from traffic.drawing import countries, lakes, ocean
bounds = (-2.9, 1.5, 50.5, 51.9)

fig, ax = plt.subplots(
    1, 1, figsize=(12,6), subplot_kw=dict(projection=Projection('epsg:3857')),
)

ax.add_feature(countries())
ax.add_feature(lakes())
ax.add_feature(ocean())
# ax.set_extent(bounds)
# ax.set_global()

xs = np.sort(tfc_magg['flight_id_nunique'].reset_index()['x'].unique().astype(int))
ys = np.sort(tfc_magg['flight_id_nunique'].reset_index()['y'].unique().astype(int))

counts = tfc_magg['flight_id_nunique'].reset_index().pivot_table('flight_id_nunique', 'y', 'x', fill_value=np.nan)
pcm = ax.pcolormesh(xs, ys, np.log(counts), cmap='inferno', alpha=0.5)

airports['EGHL'].point.plot(ax, alpha=0.2)
airports['EGTK'].point.plot(ax, alpha=0.2)
airports['EGKA'].point.plot(ax, alpha=0.2)
airports['EGMD'].point.plot(ax, alpha=0.2)
airports['EGTB'].point.plot(ax, alpha=0.2)
airports['EGHO'].point.plot(ax, alpha=0.2)
airports['EGBP'].point.plot(ax, alpha=0.2)
airports['EGKH'].point.plot(ax, alpha=0.2)

cb = fig.colorbar(pcm)
cb.set_label('ln Traffic Counts')
# tfc_magg['flight_id_nunique'].to_xarray().sortby('x').plot.pcolormesh(
#     ax=ax,
#     alpha=0.4,
#     cmap="inferno",
# )
# fig.show()


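Plot a correlation matrix between all variables using the Pearson correlation coefficient. Every column is first integer-encoded with pd.factorize so that the categorical variables can be included.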

In [None]:
# corr = airproxes_with_tfc.corr(method='spearman')
corr = airproxes_with_tfc.apply(lambda x: pd.factorize(x)[0]).corr(method='pearson', min_periods=1)
fig, ax = plt.subplots(figsize=(20, 20))
sns.heatmap(corr, square=True, cmap=sns.color_palette('icefire', as_cmap=True), annot=True, ax=ax)
plt.savefig('corr.svg')
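One caveat with the encoding step: `pd.factorize` assigns integer codes in order of first appearance, so applying it to numeric columns discards their magnitudes. The sketch below demonstrates this on made-up values. It also shows a hedged alternative, which is not what the notebook does: restricting the Pearson correlation to the numeric columns.

```python
import pandas as pd

df = pd.DataFrame({
    "track_std": [10.0, 20.0, 30.0, 40.0],
    "groundspeed_std": [5.0, 10.0, 15.0, 20.0],
    "Risk": ["a", "b", "c", "e"],
})

# Codes follow order of appearance, not numeric size
codes = pd.factorize(pd.Series([30.0, 10.0, 20.0]))[0]
print(codes)  # [0 1 2]

# Alternative: correlate only the columns that are already numeric
corr = df.select_dtypes("number").corr(method="pearson")
print(corr)
```

Categorical columns such as `Risk` would then need a separate treatment, for example a rank-based measure on an explicitly ordered encoding.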

Compute vectors for the quiver plot

In [123]:
# Track is a compass bearing (clockwise from north), so east (u) uses sin and north (v) uses cos
tfc_mgdf['track_scale'] = 1 - (tfc_mgdf['track_std'] / tfc_mgdf['track_std'].max())
tfc_mgdf['track_u'] = np.sin(np.radians(tfc_mgdf['track_mean'])) * tfc_mgdf['track_scale']
tfc_mgdf['track_v'] = np.cos(np.radians(tfc_mgdf['track_mean'])) * tfc_mgdf['track_scale']
tfc_mgdf.head()

Unnamed: 0,x,y,altitude_mean,altitude_std,track_mean,track_std,groundspeed_mean,groundspeed_std,flight_id_nunique,track_scale,track_u,track_v
0,-298000.0,6738000.0,643.5,241.913223,137.875,114.594545,69.3125,37.967881,34,0.251386,0.168654,-0.186452
1,-297000.0,6710000.0,787.5,268.826816,155.875,103.243168,109.0,34.299093,31,0.325541,0.133046,-0.297088
2,-297000.0,6732000.0,792.0,309.244177,225.0,118.786662,100.5,20.969857,42,0.224,-0.158484,-0.158265
3,-291000.0,6710000.0,788.5,236.967106,179.0,103.936179,88.9375,25.750512,42,0.321014,0.005324,-0.321014
4,-290000.0,6704000.0,276.25,111.391024,145.75,87.917795,38.53125,18.920367,31,0.425657,0.239848,-0.351666
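The `track_mean` and `track_std` columns are arithmetic statistics of an angle, which can mislead near the 0°/360° wrap. A possible alternative, offered as a sketch rather than the notebook's method, is to use circular statistics based on the mean resultant vector. The function name is hypothetical.

```python
import numpy as np


def circular_mean_std(track_deg):
    """Circular mean and standard deviation of bearings, both in degrees."""
    rad = np.radians(np.asarray(track_deg, dtype=float))
    s, c = np.sin(rad).mean(), np.cos(rad).mean()
    mean = np.degrees(np.arctan2(s, c)) % 360
    r = np.hypot(s, c)                          # mean resultant length in [0, 1]
    std = np.degrees(np.sqrt(-2 * np.log(r)))   # circular standard deviation
    return mean, std


tracks = [350.0, 10.0]                    # two flights heading roughly north
print(np.mean(tracks), np.std(tracks))    # arithmetic: 180.0 170.0
circ_mean, circ_std = circular_mean_std(tracks)
print(circ_mean, circ_std)
```

For these two northbound tracks the arithmetic mean points south with a very large spread, whereas the circular mean is north (0°, possibly reported as a value just below 360°) with a spread of about 10°.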


Plot the mean traffic flow direction for cells with sufficient samples. The length of each vector decreases linearly with the standard deviation of the track directions in that cell: it is scaled by 1 - (track_std / maximum track_std). In practice, the longer the arrow, the more unidirectional and organised the traffic flow is.

Vector colour encodes the direction of the vector and serves only as a visual aid.

Airprox locations are superimposed for information only.

Both a quiver plot and a contour plot are made from the same data.

In [126]:
fig, ax = plt.subplots(
    1, 1, figsize=(11, 11), subplot_kw=dict(projection=Projection('epsg:3857')),
)

ax.add_feature(countries())
ax.add_feature(lakes())
ax.add_feature(ocean())

flow = ax.quiver(tfc_mgdf['x'],
                 tfc_mgdf['y'],
                 tfc_mgdf['track_u'],
                 tfc_mgdf['track_v'],
                 tfc_mgdf['track_mean'],
                 scale_units=None,
                 cmap='cool')

# Airprox markers; `aps` is needed by the legend call below
aps = ax.scatter(airprox_gdf['x'], airprox_gdf['y'], c='r', marker='x')

ax.set_title('Mean traffic flow')
cb = fig.colorbar(flow)
cb.set_label('Mean traffic bearing (degrees)')
ax.legend([aps], ['Airprox'])

# airports['EGHQ'].point.plot(ax)
# airports['EGHE'].point.plot(ax)
# airports['EGHC'].point.plot(ax)


In [125]:
alt_tfc_magg = []
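alt_bins = range(0, 5000 + 1, 500)  # bin edges in feet
ft_to_m = 0.3048  # altitude appears to be stored in metres (filtered to <= 1524 on load)
for floor, ceil in zip(alt_bins, alt_bins[1:]):
    alt = tfc_unc_xy_gdf.data['altitude']
    tfc_alt_gdf = tfc_unc_xy_gdf.data[(alt >= floor * ft_to_m) & (alt < ceil * ft_to_m)]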
    tfc_alt_agg = tfc_alt_gdf.assign(
        x=lambda elt: (elt.x // res) * res,
        y=lambda elt: (elt.y // res) * res,
    ).groupby(["x", "y"]).agg(altitude_mean=pd.NamedAgg('altitude', np.nanmean),
                              altitude_std=pd.NamedAgg('altitude', np.std), track_mean=pd.NamedAgg('track', np.nanmean),
                              track_std=pd.NamedAgg('track', np.std),
                              groundspeed_mean=pd.NamedAgg('groundspeed', np.nanmean),
                              groundspeed_std=pd.NamedAgg('groundspeed', np.std),
                              flight_id_nunique=('flight_id', 'nunique'))
    tfc_alt_mgdf = tfc_alt_agg[tfc_alt_agg['flight_id_nunique'] > 30].reset_index()
    alt_tfc_magg.append(tfc_alt_mgdf)
    
    tfc_alt_mgdf['track_scale'] = 1 - (tfc_alt_mgdf['track_std'] / tfc_alt_mgdf['track_std'].max())
    tfc_alt_mgdf['track_u'] = np.cos(np.radians(tfc_alt_mgdf['track_mean'])) * tfc_alt_mgdf['track_scale']
    tfc_alt_mgdf['track_v'] = np.sin(np.radians(tfc_alt_mgdf['track_mean'])) * tfc_alt_mgdf['track_scale']

    fig, ax = plt.subplots(
        1, 1, figsize=(11, 11), subplot_kw=dict(projection=Projection('epsg:3857')),
    )

    ax.add_feature(countries())
    ax.add_feature(lakes())
    ax.add_feature(ocean())

    flow = ax.quiver(tfc_alt_mgdf['x'],
                     tfc_alt_mgdf['y'],
                     tfc_alt_mgdf['track_u'],
                     tfc_alt_mgdf['track_v'],
                     tfc_alt_mgdf['track_mean'],
                     scale_units=None,
                     cmap='cool')

    aps = ax.scatter(airprox_gdf['x'], airprox_gdf['y'], c='r', marker='x')

    ax.set_title(f'Mean traffic flow: {floor}ft - {ceil}ft AGL')
    # cb = fig.colorbar(flow)
    # cb.set_label('Mean traffic flow')
    ax.legend([aps], ['Airprox'])
    # `tightlayout` is not a savefig argument; bbox_inches='tight' trims the margins
    fig.savefig(f'southeng_quiver_{floor}.png', bbox_inches='tight')



In [127]:
from cartopy.crs import Projection
from traffic.drawing import countries, lakes, ocean
from traffic.data import airports

fig, ax = plt.subplots(
    1, 1, figsize=(11, 11), subplot_kw=dict(projection=Projection('epsg:3857')),
)

ax.add_feature(countries())
ax.add_feature(lakes())
ax.add_feature(ocean())

flow = ax.tricontourf(tfc_mgdf['x'],
                      tfc_mgdf['y'],
                      # tfc_gdf['track_u'],
                      # tfc_gdf['track_v'],
                      tfc_mgdf['track_mean'],
                      alpha=0.5,
                      cmap='inferno')

aps = ax.scatter(airprox_gdf['x'], airprox_gdf['y'], c='r', marker='x')

ax.set_title('Mean traffic flow')
cb = fig.colorbar(flow)
cb.set_label('Mean traffic bearing')
ax.legend([aps], ['Airprox'])

# airports['EGHQ'].point.plot(ax)
# airports['EGHE'].point.plot(ax)
# airports['EGHC'].point.plot(ax)


In [128]:
fig, ax = plt.subplots(
    1, 1, figsize=(11, 11), subplot_kw=dict(projection=Projection('epsg:3857')),
)

ax.add_feature(countries())
ax.add_feature(lakes())
ax.add_feature(ocean())

flow = ax.tricontourf(
    tfc_mgdf['x'],
    tfc_mgdf['y'],
    tfc_mgdf['altitude_mean'],
    alpha=0.5,
    cmap='inferno')

# aps = ax.scatter(airprox_gdf['x'], airprox_gdf['y'], c='r', marker='x')

ax.set_title('Mean altitude')
cb = fig.colorbar(flow)
cb.set_label('Mean altitude')
# ax.legend([aps], ['Airprox'])



# Aggregate Stats

In [2]:
import pyproj
import shapely.geometry as sg

print('World space stats (uncontrolled volumes):')
ceiling_alt = 304.8 * 4  # 4000 ft expressed in metres
bounds = (-2.9, 50.5, 1.5, 51.9)  # lon/lat: west, south, east, north
transformer = pyproj.Transformer.from_crs("epsg:4326", "epsg:3857", always_xy=True)
trans_bounds = transformer.transform_bounds(*bounds)
bound_poly = sg.box(*trans_bounds)
total_vol = bound_poly.area * ceiling_alt
print(f"Total area: {bound_poly.area} m^2")
print(f"Total volume: {total_vol} m^3")
# Side lengths and diagonal of the projected bounding box
min_x, min_y, max_x, max_y = trans_bounds
x_len, y_len = max_x - min_x, max_y - min_y
diag = (x_len ** 2 + y_len ** 2) ** 0.5
print(f'Total x,y,z dimensions are {x_len}m, {y_len}m, {ceiling_alt}m with xy diagonal {diag}m')

In [99]:
total_tfc = unc_asp_tfc_gdf['flight_id'].nunique()  # number of distinct flights
hours_per_year = 8766  # 365.25 days * 24 h
print(f'Total Traffic Density in uncontrolled airspace over year: {total_tfc/total_vol/hours_per_year} aircraft/m^3/hr')

Total Traffic Density in uncontrolled airspace over year: 2.716924742893255e-14 aircraft/m^3/hr
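The volume behind this density comes from an area measured in EPSG:3857 (Web Mercator), which stretches distances by roughly 1/cos(latitude), so areas at these latitudes are overstated. As a rough sketch, treating the Earth as a sphere and using only the lon/lat bounding box from the cell above, the projected and spherical areas can be compared. The radii and helper names below are assumptions made for illustration.

```python
import math

MERCATOR_RADIUS = 6378137.0    # sphere radius used by EPSG:3857, metres
MEAN_EARTH_RADIUS = 6371008.8  # mean Earth radius for the spherical approximation, metres


def mercator_box_area(west, south, east, north):
    """Area of a lon/lat box as measured in Web Mercator coordinates (m^2)."""
    def merc_y(lat_deg):
        return MERCATOR_RADIUS * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))

    width = MERCATOR_RADIUS * math.radians(east - west)
    return width * (merc_y(north) - merc_y(south))


def spherical_box_area(west, south, east, north):
    """Area of the same lon/lat box on a sphere (m^2)."""
    return (MEAN_EARTH_RADIUS ** 2 * math.radians(east - west)
            * (math.sin(math.radians(north)) - math.sin(math.radians(south))))


bounds = (-2.9, 50.5, 1.5, 51.9)
projected = mercator_box_area(*bounds)
spherical = spherical_box_area(*bounds)
print(f'Web Mercator area: {projected:.3e} m^2')
print(f'Spherical area:    {spherical:.3e} m^2')
print(f'Inflation factor:  {projected / spherical:.2f}')
```

At around 51°N the inflation factor is roughly 2.5, so a density computed from the projected volume would be understated by about the same factor. An equal-area projection or a geodesic area calculation would avoid this.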


If we discard cells without any traffic recorded for the year, we get the traffic density over only the areas where aircraft actually fly.

In [1]:
cell_vol = res * res * ceiling_alt
data_vol = len(tfc_magg['flight_id_nunique']) * cell_vol
print(f'Data Volume: {data_vol}')
print(f'Mean Traffic Density in uncontrolled airspace over year for data area: {total_tfc/data_vol/8766} aircraft/m^3/hour')
# print(f'Equivalent to an aircraft per {np.sqrt((1/cell_traffic_densities.mean())/152.4)}m x 500ft cuboid per hour')

NameError: name 'res' is not defined

## Testing hypotheses
All tests use a 5% significance level unless otherwise specified.

In [56]:
from scipy import stats as ss

sig_lvl = 0.05

### Track correlation
First, the correlation of direction variance with airprox location is tested. The mean standard deviation for directions in the entire area is found and compared to that of just where airproxes occurred:

In [57]:
print('Overall mean of stddev: ', non_airprox_tfc['track_std'].mean(), ' for ', len(non_airprox_tfc['track_std']),
      ' samples')
print('Airprox location mean of stddev: ', airproxes_with_tfc['track_std'].mean(), 'for ',
      len(airproxes_with_tfc['track_std']), ' samples')

NameError: name 'non_airprox_tfc' is not defined

In [133]:
F, p = ss.bartlett(non_airprox_tfc['track_std'], airproxes_with_tfc['track_std'])
print(f'Bartlett equal variance test gives score of {F} at a p-significance of {p}')
if p <= sig_lvl:
    print(f'The hypothesis is accepted (F={F}, p={p})')
else:
    print('Null hypothesis is accepted.')

Bartlett equal variance test gives score of 89.60671222555828 at a p-significance of 2.9053676860724746e-21
The hypothesis is accepted (F=89.60671222555828, p=2.9053676860724746e-21)
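Bartlett's test compares the variances of the two samples, so a significant result indicates that the spread of per-cell `track_std` differs between the groups. It does not by itself show which group has the larger typical value. As a complementary sketch (not the test used above), a one-sided permutation test on the difference in means addresses the directional question. The function name and the toy inputs are hypothetical.

```python
import random


def permutation_test_greater(sample_a, sample_b, n_resamples=5000, seed=0):
    """One-sided permutation test of mean(sample_a) > mean(sample_b).

    Returns the observed difference in means and an approximate p-value.
    """
    rng = random.Random(seed)
    n_a, n_b = len(sample_a), len(sample_b)
    observed = sum(sample_a) / n_a - sum(sample_b) / n_b
    pooled = list(sample_a) + list(sample_b)
    hits = 0
    for _ in range(n_resamples):
        # Reassign group labels at random and recompute the difference
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            hits += 1
    # The add-one correction keeps the estimated p-value strictly positive
    return observed, (hits + 1) / (n_resamples + 1)


# Toy data standing in for per-cell track_std values
airprox_cells = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
other_cells = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
diff, p = permutation_test_greater(airprox_cells, other_cells)
print(f'difference in means = {diff}, p = {p}')
```

With the real data, the two inputs would be `airproxes_with_tfc['track_std']` and `non_airprox_tfc['track_std']`. A small p-value would then support the directional claim that direction variability is higher where airproxes occur.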


### Density Correlation

The count of unique flights within a cell is used as a measure of traffic density.

The procedure is otherwise the same as above, with a one-way ANOVA comparing the group means in place of Bartlett's test.
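As a sketch of what "count of unique flights within a cell" means in practice: each position report is snapped to the lower-left corner of its grid cell by floor division, mirroring the `(x // res) * res` step used earlier, and distinct flight IDs are then counted per cell. The 1000 m cell size, the helper name, and the sample points are assumptions for illustration.

```python
from collections import defaultdict


def flights_per_cell(points, res):
    """Count distinct flight IDs per grid cell.

    points: iterable of (x, y, flight_id) in projected metres
    res: grid cell size in metres
    """
    cells = defaultdict(set)
    for x, y, flight_id in points:
        # Snap each point to the lower-left corner of its cell
        cell = ((x // res) * res, (y // res) * res)
        cells[cell].add(flight_id)
    return {cell: len(ids) for cell, ids in cells.items()}


sample_points = [
    (-297400.0, 6710200.0, 'A'),
    (-297900.0, 6710900.0, 'A'),  # same flight, same cell: counted once
    (-297100.0, 6710500.0, 'B'),
    (-291000.0, 6710000.0, 'C'),
]
counts = flights_per_cell(sample_points, res=1000)
print(counts)
```

Counting flights rather than position reports means a slow aircraft that leaves many reports in one cell does not inflate the density.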

In [134]:
print('Overall mean: ', non_airprox_tfc['flight_id_nunique'].mean(), ' for ', len(non_airprox_tfc['flight_id_nunique']),
      ' samples')
print('Airprox location mean: ', airproxes_with_tfc['flight_id_nunique'].mean(), 'for ',
      len(airproxes_with_tfc['flight_id_nunique']), ' samples')

Overall mean:  95.29415904292752  for  1421  samples
Airprox location mean:  88.20143884892086 for  834  samples


In [135]:
F, p = ss.f_oneway(non_airprox_tfc['flight_id_nunique'], airproxes_with_tfc['flight_id_nunique'])
print(f'One-Way ANOVA test gives F-score of {F} at a p-significance of {p}')
if p <= sig_lvl:
    print(f'The hypothesis is accepted (F={F}, p={p})')
else:
    print('Null hypothesis is accepted.')

One-Way ANOVA test gives F-score of 0.38711582771521547 at a p-significance of 0.5338823134084051
Null hypothesis is accepted.


### Speed correlation

First, the overall traffic flow speed is compared between airprox and non-airprox traffic.

In [136]:
print('Overall mean: ', np.array(non_airprox_tfc['groundspeed_mean']).mean(), ' for ',
      len(non_airprox_tfc['groundspeed_mean']),
      ' samples')
print('Airprox location mean: ', np.array(airproxes_with_tfc['groundspeed_mean']).mean(), 'for ',
      len(airproxes_with_tfc['groundspeed_mean']), ' samples')

Overall mean:  93.06  for  1421  samples
Airprox location mean:  91.94 for  834  samples


In [137]:
F, p = ss.f_oneway(non_airprox_tfc['groundspeed_mean'], airproxes_with_tfc['groundspeed_mean'])
print(f'One-Way ANOVA test gives F-score of {F} at a p-significance of {p}')
if p <= sig_lvl:
    print(f'The hypothesis is accepted (F={F}, p={p})')
else:
    print('Null hypothesis is accepted.')

One-Way ANOVA test gives F-score of 0.6072564657753389 at a p-significance of 0.4359045170991731
Null hypothesis is accepted.


Now the *spread* of traffic flow speeds is compared between airprox and non-airprox traffic.

In [138]:
print('Overall mean of stddev: ', np.array(non_airprox_tfc['groundspeed_std']).mean(), ' for ',
      len(non_airprox_tfc['groundspeed_std']),
      ' samples')
print('Airprox location mean of stddev: ', np.array(airproxes_with_tfc['groundspeed_std']).mean(), 'for ',
      len(airproxes_with_tfc['groundspeed_std']), ' samples')

Overall mean of stddev:  49.29393177869329  for  1421  samples
Airprox location mean of stddev:  38.56751673591979 for  834  samples


In [139]:
F, p = ss.bartlett(non_airprox_tfc['groundspeed_std'], airproxes_with_tfc['groundspeed_std'])
print(f'Bartlett equal variance test gives score of {F} at a p-significance of {p}')
if p <= sig_lvl:
    print(f'The hypothesis is accepted (F={F}, p={p})')
else:
    print('Null hypothesis is accepted.')

Bartlett equal variance test gives score of 9.099903400156471 at a p-significance of 0.0025562299494020224
The hypothesis is accepted (F=9.099903400156471, p=0.0025562299494020224)


# Adherence to Semicircular Rule
Check to see if traffic tends to follow the [Semicircular rule](https://en.wikipedia.org/wiki/Flight_level#Semicircular/hemispheric_rule)
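One way this check could be framed, as a sketch only: under the semicircular rule as commonly stated, VFR cruising levels are odd thousands of feet plus 500 ft for tracks of 000° to 179°, and even thousands plus 500 ft for tracks of 180° to 359°. The helper name `vfr_semicircular_ok`, the 200 ft tolerance, and the use of feet are all assumptions. The rule is defined on magnetic track and applies to level cruise above roughly 3000 ft, so much of the low-level data here would fall outside it, and altitudes stored in metres would need converting first.

```python
def vfr_semicircular_ok(track_deg, altitude_ft, tolerance_ft=200):
    """True if the altitude is near a VFR cruising level that matches the track.

    Assumes VFR levels of odd thousands + 500 ft for tracks 000-179
    and even thousands + 500 ft for tracks 180-359.
    """
    eastbound = (track_deg % 360) < 180
    # Nearest level of the form (thousands * 1000 + 500) ft
    thousands = round((altitude_ft - 500) / 1000)
    nearest_level = thousands * 1000 + 500
    if abs(altitude_ft - nearest_level) > tolerance_ft:
        return False  # not close to any VFR cruising level
    return (thousands % 2 == 1) == eastbound


for track, alt in [(90, 3500), (270, 4500), (90, 4500), (90, 4000)]:
    print(track, alt, vfr_semicircular_ok(track, alt))
```

Applied per position report, the share of `True` values above 3000 ft would give a rough adherence rate. Aggregated means such as `track_mean` are less suitable, because averaging mixes eastbound and westbound traffic.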

In [None]:
tfc_magg['track_mean']

In [None]:
tfc_alt_agg