# Visualizing Flight Test Data Interactively With Open Source Tools
---
## Society of Flight Test Engineers 49th Annual International Symposium
### 9 October 2018, Savannah GA
### Luke Starnes (GTRI)

# Agenda
* OSS Value Proposition
* ADS-B Background
* Tooling Overview
* Examples

# OSS Value Proposition


* Proprietary data analysis tools are expensive and create “vendor lock”
* Walled Garden

 ![](images/walled_garden.jpg)

# OSS Value Proposition

* open source tools are a superior choice for today’s flight test analysis problems
* open interfaces
* widespread compatibility (community of interoperable tools)
* seamless migration between tools (no “vendor lock”)
* flexibility and agility

# Open Flight Data as a Lens for Discussing OSS Tooling


<div align="center"><table><tr><td><img src='images/adsb.png'></td><td><img src='images/lens.png'></td><td><img src='images/osi_logo.png'></td></tr></table></div>


# ADS-B Background
* Automatic Dependent Surveillance-Broadcast
* Aircraft system for broadcasting identification and position data
* Facilitated by ubiquity of GPS
* Driven by cost of maintaining ATC radars
* ADS-B mandated in the US starting Jan 1, 2020
 * required for aircraft operating above 10,000 ft, around airports, or over the Gulf of Mexico
* European mandate starts Jan 1, 2019

* ADS-B is line of sight - requires a network of ground stations to receive reports (min ~100 NM)

<div align="center"><img src="images/adsb_ground_stations.png"></div>
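
To put a rough number on the line-of-sight limit, here is a minimal sketch using the common radio-horizon rule of thumb, d ≈ 1.23·√h (d in nautical miles, h in feet). The formula and the function names are illustrative assumptions, not something taken from the talk, and the estimate ignores terrain, antenna gain, and receiver sensitivity.

```python
from math import sqrt


def radio_horizon_nm(height_ft):
    """Approximate radio horizon in nautical miles for an antenna at height_ft.

    Uses the 1.23 * sqrt(h) rule of thumb (4/3-earth refraction model).
    """
    return 1.23 * sqrt(height_ft)


def line_of_sight_range_nm(aircraft_alt_ft, antenna_height_ft=0.0):
    """Approximate maximum reception range between an aircraft and a ground antenna."""
    return radio_horizon_nm(aircraft_alt_ft) + radio_horizon_nm(antenna_height_ft)


# Compare a few altitudes for a hypothetical ground antenna on a 50 ft mast.
for altitude_ft in (1_000, 10_000, 35_000):
    range_nm = line_of_sight_range_nm(altitude_ft, antenna_height_ft=50)
    print(f"{altitude_ft:>6} ft -> about {range_nm:.0f} NM")
```

Under this approximation, an aircraft at 10,000 ft is visible out to roughly 120-130 NM from a low antenna, which is why coverage depends on a dense network of ground stations.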

* Transmissions are unencrypted
Thus a preponderance of...

<div align="center"><img src="images/prostick.jpg" height=30% width=30%></div>

<div align="center"><img src="images/planefinder.png"></div>
<div align="center"><sup>Source: [planefinder.net](https://planefinder.net/)</sup></div>

Other similar sites include [flightradar24.com](https://www.flightradar24.com/), [flightaware.com](https://flightaware.com/), and [adsbexchange.com](https://www.adsbexchange.com/).

* ADS-B Exchange ([adsbexchange.com](https://www.adsbexchange.com/)) provides public access to its worldwide dataset (begins June 9, 2016)
<div align="center"><img src="images/adsbexchange_logo_full.png"></div>
* Data made available as JSON
* Each day is a single zip file with 1,440 JSON files (1 file per minute)
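
The layout described above (one zip per day, one JSON file per minute) can be walked with the standard library alone. This is a minimal sketch under stated assumptions: the `acList` key for the aircraft list and the member file names are assumptions about the archive layout, and the demo archive is built in memory so nothing is downloaded.

```python
import io
import json
import zipfile


def iter_aircraft_records(zip_source, list_key="acList"):
    """Yield (member_name, record) for every aircraft record in a daily archive.

    list_key is an assumption about the JSON layout; adjust it if the
    snapshots store the aircraft list under a different key.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in sorted(archive.namelist()):
            if not name.endswith(".json"):
                continue
            with archive.open(name) as member:
                snapshot = json.load(member)
            for record in snapshot.get(list_key, []):
                yield name, record


def build_demo_archive():
    """Build a tiny in-memory zip with two hypothetical one-minute snapshots."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "2018-06-16-0000Z.json",
            json.dumps({"acList": [{"Icao": "AE5C5A", "Alt": 24025}]}),
        )
        archive.writestr(
            "2018-06-16-0001Z.json",
            json.dumps({"acList": [{"Icao": "AB1FFE"}, {"Icao": "A56D30"}]}),
        )
    buffer.seek(0)
    return buffer


records = list(iter_aircraft_records(build_demo_archive()))
print(f"{len(records)} records from {len({name for name, _ in records})} files")
```

A generator keeps memory use flat, which matters when a real day holds 1,440 files; the records can then be batched into a DataFrame and appended to an HDF5 store.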

# OSS Tool Stack
* __Hierarchical Data Format 5 (HDF5)__ - multiplatform, efficient, fast data storage format, metadata support
* __Pandas__ - robust tool for accessing, transforming, and analyzing tabular data
* __Luigi__ - pipelining tool for managing complex pipelines with inter-dependent steps
* __Jupyter__ - (*this*) web tool for integrating code, documentation, and visualization into a narrative notebook
* __Bokeh__ - browser-based interactive visualization tool
* __Datashader__ - plotting tool for visualizing large datasets (points >> pixels)

# Pandas

In [1]:
import pandas as pd
import os

In [2]:
%%time
row_len = 500_000
h5_dir = r'c:\adsb'
h5_file = os.path.join(h5_dir, '2018-06-16.h5')
pickle_name = f'{os.path.basename(h5_file)}-{row_len}.p'
pickle_path = os.path.join(os.getcwd(), 'data', pickle_name)
if os.path.exists(pickle_path):
    # Fast path: reuse the cached subset from a previous run.
    df = pd.read_pickle(pickle_path)
else:
    try:
        # Otherwise try to fetch the cached pickle via the download_files helper.
        from download_files import DownloadFile
        DownloadFile(pickle_path)
        df = pd.read_pickle(pickle_path)
    except Exception:
        # Last resort: read the first row_len rows from the local HDF5 store.
        with pd.HDFStore(h5_file, mode='r') as store:
            df = store.select('data', stop=row_len,
                              columns=['Man', 'Icao', 'Type', 'Op'])
        main_ops = ['Southwest', 'American', 'Delta', 'SkyWest', 
                    'Air Canada', 'Alaska', 'Virgin', 'United',
                    'JetBlue', 'Spirit', 'Frontier', 'Wells Fargo']
        # Collapse operator-name variants to a single short label.
        for o in main_ops:
            df.loc[df.Op.fillna('-').str.lower().str.contains(o.lower()), 'Op'] = o
        df.to_pickle(pickle_path)

Wall time: 303 ms


In [4]:
%%time
row_len = 50
pickle_name = f'{os.path.basename(h5_file)}-{row_len}.p'
pickle_path = os.path.join(os.getcwd(), 'data', pickle_name)
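if os.path.exists(pickle_path):
    # Fast path: reuse the cached 50-row sample (all columns).
    df2 = pd.read_pickle(pickle_path)
else:
    try:
        # Otherwise try to fetch the cached pickle via the download_files helper.
        from download_files import DownloadFile
        DownloadFile(pickle_path)
        df2 = pd.read_pickle(pickle_path)
    except Exception:
        # Last resort: read the first row_len rows from the local HDF5 store.
        print('Download failed; reading from HDF5 instead')
        with pd.HDFStore(h5_file, mode='r') as store:
            df2 = store.select('data', stop=row_len)
        df2.to_pickle(pickle_path)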

Wall time: 2.99 ms


| Field        | Description|
| ------------- |:-------------|
| Id | The unique identifier of the aircraft.|
| TSecs | The number of seconds that the aircraft has been tracked for.|
| Rcvr | The ID of the feed that last supplied information about the aircraft. Will be different to srcFeed if the source is a merged feed.|
| Icao | The ICAO of the aircraft.|
| Bad | True if the ICAO is known to be invalid. This information comes from the local BaseStation.sqb database.|
| Reg | The registration.|
| Alt | The altitude in feet at standard pressure.|
| GAlt | The altitude adjusted for local air pressure, should be roughly the height above mean sea level.|
| InHg | The air pressure in inches of mercury that was used to calculate the AMSL altitude from the standard pressure altitude.|
| AltT | The type of altitude transmitted by the aircraft: 0 = standard pressure altitude, 1 = indicated altitude (above mean sea level). Default to standard pressure altitude until told otherwise.|
| TAlt | The target altitude, in feet, set on the autopilot / FMS etc.|
| Call | The callsign.|
| CallSus | True if the callsign may not be correct.|
| Lat | The aircraft's latitude over the ground.|
| Long | The aircraft's longitude over the ground.|
| PosTime | The time (at UTC in JavaScript ticks) that the position was last reported by the aircraft.|
| Mlat | True if the latitude and longitude appear to have been calculated by an MLAT server and were not transmitted by the aircraft.|
| PosStale | True if the last position update is older than the display timeout value - usually only seen on MLAT aircraft in merged feeds.|
| IsTisb | True if the last message received for the aircraft was from a TIS-B source.|
| Spd | The ground speed in knots.|
| SpdTyp | The type of speed that Spd represents. Only used with raw feeds. 0/missing = ground speed, 1 = ground speed reversing, 2 = indicated air speed, 3 = true air speed.|
| Vsi | Vertical speed in feet per minute.|
| VsiT | 0 = vertical speed is barometric, 1 = vertical speed is geometric. Default to barometric until told otherwise.|
| Trak | Aircraft's track angle across the ground clockwise from 0° north.|
| TrkH | True if Trak is the aircraft's heading, false if it's the ground track. Default to ground track until told otherwise.|
| TTrk | The track or heading currently set on the aircraft's autopilot or FMS.|
| Type | The aircraft model's ICAO type code.|
| Mdl | A description of the aircraft's model. Usually also includes the manufacturer's name.|
| Man | The manufacturer's name.|
| CNum | The aircraft's construction or serial number.|
| From | The code and name of the departure airport.|
| To | The code and name of the arrival airport.|
| Stops | An array of strings, each being a stopover on the route.|
| Op | The name of the aircraft's operator.|
| OpCode | The operator's ICAO code.|
| Sqk | The squawk as a decimal number (e.g. a squawk of 7654 is passed as 7654, not 4012).|
| Help | True if the aircraft is transmitting an emergency squawk.|
| Dst | The distance to the aircraft in kilometres.|
| Brng | The bearing from the browser to the aircraft clockwise from 0° north.|
| WTC | The wake turbulence category of the aircraft - see enums.js for values.|
| Engines | The number of engines the aircraft has. Usually '1', '2' etc. but can also be a string - see ICAO documentation.|
| EngType | The type of engine the aircraft uses - see enums.js for values.|
| EngMount | The placement of engines on the aircraft - see enums.js for values.|
| Species | The species of the aircraft (helicopter, jet etc.) - see enums.js for values.|
| Mil | True if the aircraft appears to be operated by the military.|
| Cou | The country that the aircraft is registered to.|
| HasPic | True if the aircraft has a picture associated with it.|
| PicX | The width of the picture in pixels.|
| PicY | The height of the picture in pixels.|
| FlightsCount | The number of Flights records the aircraft has in the database.|
| CMsgs | The count of messages received for the aircraft.|
| Gnd | True if the aircraft is on the ground.|
| Tag | The user tag found for the aircraft in the BaseStation.sqb local database.|
| Interested | True if the aircraft is flagged as interesting in the BaseStation.sqb local database.|
| TT | Trail type - empty for plain trails, 'a' for trails that include altitude, 's' for trails that include speed.|
| Trt | Transponder type - 0=Unknown, 1=Mode-S, 2=ADS-B (unknown version), 3=ADS-B 0, 4=ADS-B 1, 5=ADS-B 2.|
| Year | The year that the aircraft was manufactured.|
| Sat | True if the aircraft has been seen on a SatCom ACARS feed (e.g. a JAERO feed).|
| Cos | Short trails - see note 1.|
| Cot | Full trails - see note 2.|
| ResetTrail | True if the entire trail has been sent and the JavaScript should discard any existing trail history it's built up for the aircraft.|
| HasSig | True if the aircraft has a signal level associated with it.|
| Sig | The signal level for the last message received from the aircraft, as reported by the receiver. Not all receivers pass signal levels. The value's units are receiver-dependent.|
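
Two of these encodings are easy to misread, so here is a small sketch of how they might be decoded. It assumes "JavaScript ticks" means milliseconds since the Unix epoch, and that `FSeen` in the sample record uses the same unit as `PosTime`; the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ticks_to_utc(ticks_ms):
    """Convert JavaScript ticks (assumed: ms since the Unix epoch) to an aware UTC datetime."""
    return UNIX_EPOCH + timedelta(milliseconds=ticks_ms)


def squawk_octal_value(sqk):
    """Interpret the decimal-looking squawk digits as the octal number they represent."""
    return int(str(sqk), 8)


# FSeen value taken from the sample record in this notebook.
first_seen = ticks_to_utc(1529107221934)
print(first_seen.isoformat())

# The table's own example: squawk 7654 is passed as 7654, not its octal value 4012.
print(squawk_octal_value(7654))
```

Integer `timedelta` arithmetic avoids the floating-point rounding that dividing the tick count by 1000 could introduce.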

In [4]:
df2.loc[25].dropna()

Alt                            24025
AltT                               0
Bad                            False
CMsgs                             12
CNum                           62294
Call                         VVLL877
CallSus                        False
Cos                                 
Cou                    United States
EngMount                           0
EngType                            3
Engines                            2
FSeen                  1529107221934
FlightsCount                       0
GAlt                           24034
Gnd                            False
HasPic                         False
HasSig                         False
Help                           False
Icao                          AE5C5A
Id                          11426906
InHg                         29.9291
Interested                     False
Lat                          32.8567
Long                         -80.555
Man                           Boeing
Mdl             Boeing P-8A Poseidon
M

In [5]:
print(df.shape)
df.dropna(how='any').head()

(500000, 4)


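|  | Man | Icao | Type | Op |
| ------------- |:-------------|:-------------|:-------------|:-------------|
| 1 | Raytheon Aircraft Company | A3286B | BE40 | MOSER AVIATION LLC - ENGLEWOOD, CO |
| 3 | Boeing | AB1FFE | B739 | Delta |
| 6 | Robinson | A56D30 | R44 | Robinson Helicopter Company |
| 10 | Airbus | 424356 | A320 | Aeroflot Russian Airlines |
| 13 | McDonnell Douglas | AD8563 | MD83 | Wells Fargo |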


In [6]:
df['Man'].value_counts()[:10]

Boeing                          120932
Airbus                           92420
Embraer                          27713
Bombardier                       26085
Cessna                            9963
McDonnell Douglas                 5777
Gulfstream Aerospace              2578
Beech                             2491
Piper                             2365
Avions de Transport Regional      2350
Name: Man, dtype: int64

In [7]:
df.groupby('Op').agg({'Icao': pd.Series.nunique}).sort_values('Icao', ascending=False)[:15]

| Op | Icao |
| ------------- |:-------------|
| American | 649 |
| Delta | 573 |
| United | 537 |
| Southwest | 483 |
| Wells Fargo | 327 |
| Private | 234 |
| Air Canada | 226 |
| JetBlue | 150 |
| Virgin | 132 |
| SkyWest | 126 |


In [8]:
airlines_filter = df['Op'].isin(df.Op.value_counts().index[:10])
table = df[airlines_filter].groupby(['Op','Type']).agg({'Icao': pd.Series.nunique}).unstack().T
table['Total'] = table.sum(skipna=True, axis=1).map(int)
table.sort_values('Total', ascending=False).fillna('')[:10]

| Type | Air Canada | American | Delta | JetBlue | Private | SkyWest | Southwest | United | Virgin | Wells Fargo | Total |
| ------------- |:-------------|:-------------|:-------------|:-------------|:-------------|:-------------|:-------------|:-------------|:-------------|:-------------|:-------------|
| B738 |  | 172.0 | 52.0 |  |  |  | 142.0 | 48.0 | 53.0 | 21 | 488 |
| B737 |  |  | 4.0 |  |  |  | 317.0 | 17.0 | 2.0 | 18 | 358 |
| A320 | 36.0 | 24.0 | 37.0 | 83.0 |  |  |  | 57.0 | 35.0 | 10 | 282 |
| A321 | 12.0 | 141.0 | 40.0 | 37.0 |  |  |  |  |  | 4 | 234 |
| A319 | 22.0 | 55.0 | 33.0 |  |  |  |  | 45.0 | 7.0 | 13 | 175 |
| B739 |  |  | 64.0 |  |  |  |  | 80.0 |  | 1 | 145 |
| CRJ9 | 11.0 | 7.0 | 60.0 |  |  | 10.0 |  |  |  | 25 | 113 |
| E145 |  | 50.0 |  |  |  |  |  | 7.0 |  | 36 | 93 |
| B752 |  | 9.0 | 50.0 |  |  |  |  | 29.0 |  | 4 | 92 |
| CRJ7 |  | 13.0 | 10.0 |  |  | 38.0 |  | 9.0 |  | 21 | 91 |
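
The group-then-unstack pattern behind this table is easier to follow on a toy frame. The sketch below uses made-up rows (hypothetical operators, types, and ICAO addresses), counts unique aircraft per operator and type, and pivots operators into columns.

```python
import pandas as pd

# Hypothetical position reports: the same aircraft (Icao) can appear many times.
reports = pd.DataFrame(
    [
        ("Delta", "B739", "A1"),
        ("Delta", "B739", "A1"),  # repeat report from the same airframe
        ("Delta", "B739", "A2"),
        ("Delta", "A320", "A3"),
        ("United", "B739", "B1"),
        ("United", "A320", "B2"),
        ("United", "A320", "B3"),
        ("JetBlue", "A320", "C1"),
    ],
    columns=["Op", "Type", "Icao"],
)

# Count unique airframes per (operator, type), then pivot operators into columns.
table = reports.groupby(["Op", "Type"])["Icao"].nunique().unstack("Op")

# Row totals skip the NaN cells left where an operator has no aircraft of that type.
table["Total"] = table.sum(axis=1).astype(int)
table = table.sort_values("Total", ascending=False)
print(table.fillna(""))
```

Using `nunique` rather than `count` matters here because each aircraft reports many times per day; counting rows would measure message volume, not fleet size.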


# Bokeh

# Why Visualization is important
<div align="center"><img src="images/anscombe's_quartet.png" height=50% width=50%></div>
<div align="center">Anscombe's quartet - dataset consisting of four sets of points which are all statistically similar, but visually varied.</div>
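
The "statistically similar" claim can be checked directly. This sketch computes the mean of y and the Pearson correlation for each of the four sets, using the standard published values of the quartet and only the standard library; each set should come out near a mean of 7.50 and a correlation of about 0.82.

```python
from math import sqrt
from statistics import mean

# Anscombe's quartet: sets I-III share the same x values.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from first principles."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


summary = {name: (mean(ys), pearson_r(xs, ys)) for name, (xs, ys) in QUARTET.items()}
for name, (mean_y, r) in summary.items():
    print(f"Set {name}: mean(y) = {mean_y:.2f}, r = {r:.2f}")
```

The summary numbers agree to two decimals even though the scatter plots look nothing alike, which is the case for plotting before trusting statistics.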


# Datashader - The Why

<div align="center"><img src="images/datashader-plotting-pitfalls.png"></div>
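
When points far outnumber pixels, drawing each point individually overplots and hides density. The sketch below illustrates the underlying idea only: bin the points into a pixel grid, then map counts to brightness. It uses plain NumPy on synthetic coordinates, not Datashader's API, and the bounding box and image size are arbitrary assumptions.

```python
import numpy as np

# Synthetic positions standing in for ADS-B reports (not real data).
rng = np.random.default_rng(0)
n_points = 500_000
lon_range = (-125.0, -66.0)
lat_range = (24.0, 50.0)
lon = rng.uniform(*lon_range, n_points)
lat = rng.uniform(*lat_range, n_points)

# Aggregate: count how many points fall in each pixel of a 400 x 200 image.
width, height = 400, 200
counts, _, _ = np.histogram2d(
    lat, lon, bins=(height, width), range=(lat_range, lon_range)
)

# Shade: log-scale the counts so sparse and dense pixels are both visible,
# then rescale to 0-255 grayscale. Row 0 is the southernmost latitude bin,
# so flip vertically before displaying with north at the top.
shaded = np.log1p(counts)
image = (255 * shaded / shaded.max()).astype(np.uint8)
print(image.shape, int(counts.sum()))
```

The image size stays fixed no matter how many points are aggregated, so rendering cost is decoupled from dataset size.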


# Datashader Examples
* [Worldwide Viz with Datashader](/notebooks/GitHub/sfte2018-adsb/Worldwide Viz with Datashader.ipynb)
* [Interactive Datashader](/notebooks/GitHub/sfte2018-adsb/Interactive Datashader.ipynb)

# Conclusion

* open source tools are a superior choice for today’s flight test analysis problems
* open interfaces
* widespread compatibility (community of interoperable tools)
* seamless migration between tools (no “vendor lock”)
* flexibility and agility

### Slides / Notebooks available here:

* https://github.com/slstarnes/sfte2018-adsb