## Objective
1. Visualize the launches in an informative way
2. Identify any outlier launches that should be reviewed
3. Identify any interesting patterns in the data (seasonality, poorly performing parts)

## 1. Visualize the launches in an informative way
- The data covers only the window from 5 seconds before launch to 15 seconds after launch
- I assume the presumption is that something strange might be happening after the zip is launched
- We need to detect outlier launches: launches where the distance or speed after launch is lower than expected once the wind speed is accounted for
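
One way to make "lower than expected once the wind speed is accounted for" concrete is to fit a simple trend of speed against wind and flag launches that fall far from it. The sketch below is only an illustration on synthetic data: the column names `wind_speed` and `final_speed`, the linear fit, and the threshold of 3 are all assumptions, not something taken from the dataset.

```python
import numpy as np
import pandas as pd


def flag_outliers(summary, x_col="wind_speed", y_col="final_speed", z_thresh=3.0):
    """Flag rows whose y value sits far from a straight-line fit on x.

    Column names are hypothetical placeholders for this sketch.
    """
    # Least-squares line: y ~ slope * x + intercept
    slope, intercept = np.polyfit(summary[x_col], summary[y_col], deg=1)
    residual = summary[y_col] - (slope * summary[x_col] + intercept)

    # Robust spread estimate: median absolute deviation (MAD),
    # scaled by 1.4826 so it is comparable to a standard deviation.
    # Note: this divides by zero if more than half the residuals are identical.
    centered = residual - residual.median()
    mad = centered.abs().median()
    robust_z = centered / (1.4826 * mad)

    return summary.assign(residual=residual,
                          robust_z=robust_z,
                          is_outlier=robust_z.abs() > z_thresh)


# Synthetic example: speed falls slowly with wind, plus one slow launch.
wind = np.arange(20, dtype=float)
speed = 30.0 - 0.5 * wind + 0.1 * (-1.0) ** np.arange(20)
speed[7] = 22.0  # deliberately slow launch
demo = pd.DataFrame({"wind_speed": wind, "final_speed": speed})

flagged = flag_outliers(demo)
print(flagged[flagged["is_outlier"]])
```

The MAD is used instead of the standard deviation so that the outliers themselves do not inflate the spread estimate they are judged against.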

### a. Load the data into a single dataframe

In [18]:
import pandas as pd
import re
from glob import glob
import yaml

import plotly.plotly as py
import plotly.graph_objs as go

In [16]:
with open('creds.yaml', 'r') as f:
    # safe_load avoids executing arbitrary YAML tags; yaml.load needs an explicit Loader
    creds = yaml.safe_load(f)

In [17]:
import plotly
plotly.tools.set_credentials_file(username=creds['plotly']['username'], 
                                  api_key=creds['plotly']['apikey'])

In [2]:
hires_flight_csv = glob('../data/flight*.csv')

In [3]:
# Raw string so that \d is passed to the regex engine rather than treated as a string escape
reg = r'../data/flight_(\d+)'

In [4]:
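# Collect one frame per file and concatenate once at the end,
# which avoids re-copying the growing dataframe on every iteration
frames = []
for csv_path in hires_flight_csv:
    csv_data = pd.read_csv(csv_path)
    # The flight id is the number captured from the file name
    flight_number = re.match(reg, csv_path).group(1)
    csv_data['flight_id'] = int(flight_number)
    frames.append(csv_data)
hires_flight_data = pd.concat(frames)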

### b. Check that the range of values for the time after launch is the same
- This is important because if the distance traveled is the metric, then the time after the launch needs to be the same

In [5]:
max_time_after_launch = hires_flight_data.groupby('flight_id').\
                                          agg({'seconds_since_launch':'max'})

In [6]:
max_time_after_launch.describe()

| | seconds_since_launch |
|---|---|
| count | 447.0 |
| mean | 14.995431 |
| std | 0.0001 |
| min | 14.99502 |
| 25% | 14.99538 |
| 50% | 14.99542 |
| 75% | 14.99547 |
| max | 14.99605 |


Every flight has a final data point within one hundredth of a second of the 15-second mark (min 14.99502, max 14.99605). This makes it easy to compare one flight with another.

One metric that could be important is the velocity 15 seconds after launch. This metric is only meaningful if every flight is relatively straight, so plot the positions for a random sample of flights to check.
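
As a rough sketch of how that velocity metric could be computed, the function below takes the last sample of each flight and returns the magnitude of its velocity vector. It uses the `velocity_ned_mps[...]` column names from the high-resolution files; the function name `final_speed_per_flight` and the tiny demo frame are made up for illustration.

```python
import numpy as np
import pandas as pd

VELOCITY_COLS = ['velocity_ned_mps[0]', 'velocity_ned_mps[1]', 'velocity_ned_mps[2]']


def final_speed_per_flight(flight_data):
    """Return the speed (m/s) at the last recorded sample of each flight."""
    # Keep only the latest row per flight, ordered by time since launch
    last_rows = (flight_data.sort_values('seconds_since_launch')
                            .groupby('flight_id')
                            .tail(1)
                            .set_index('flight_id'))
    # Speed is the Euclidean norm of the three velocity components
    speed = np.sqrt((last_rows[VELOCITY_COLS] ** 2).sum(axis=1))
    return speed.rename('final_speed_mps')


# Tiny made-up example with two flights (rows deliberately unsorted)
demo = pd.DataFrame({
    'flight_id': [1, 1, 2, 2],
    'seconds_since_launch': [0.0, 15.0, 15.0, 0.0],
    'velocity_ned_mps[0]': [0.0, 3.0, 6.0, 1.0],
    'velocity_ned_mps[1]': [0.0, 4.0, 8.0, 1.0],
    'velocity_ned_mps[2]': [0.0, 0.0, 0.0, 1.0],
})
final_speed = final_speed_per_flight(demo)
print(final_speed)
```

On the real data this would be called as `final_speed_per_flight(hires_flight_data)`, giving one value per flight that can be joined to the summary table on `flight_id`.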

### c. Plot the path of a sample of planes

In [7]:
flight_summaries = pd.read_csv('../data/summary_data.csv')

In [8]:
flight_sample = flight_summaries.sample(n=10, random_state=42)

In [9]:
hires_flight_data.columns

Index(['seconds_since_launch', 'position_ned_m[0]', 'position_ned_m[1]',
       'position_ned_m[2]', 'velocity_ned_mps[0]', 'velocity_ned_mps[1]',
       'velocity_ned_mps[2]', 'accel_body_mps2[0]', 'accel_body_mps2[1]',
       'accel_body_mps2[2]', 'orientation_rad[0]', 'orientation_rad[1]',
       'orientation_rad[2]', 'angular_rate_body_radps[0]',
       'angular_rate_body_radps[1]', 'angular_rate_body_radps[2]',
       'position_sigma_ned_m[0]', 'position_sigma_ned_m[1]',
       'position_sigma_ned_m[2]', 'flight_id'],
      dtype='object')

In [10]:
set(flight_summaries.columns).intersection(set(hires_flight_data.columns))

{'flight_id'}

In [21]:
select_flights = flight_sample[['flight_id']].merge(hires_flight_data, 
                                    how='inner',
                                    on='flight_id')
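# reset_index must be called and its result assigned to have any effect
select_flights = select_flights.reset_index(drop=True)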
select_flights.head(2)

| | flight_id | seconds_since_launch | position_ned_m[0] | position_ned_m[1] | position_ned_m[2] | velocity_ned_mps[0] | velocity_ned_mps[1] | velocity_ned_mps[2] | accel_body_mps2[0] | accel_body_mps2[1] | accel_body_mps2[2] | orientation_rad[0] | orientation_rad[1] | orientation_rad[2] | angular_rate_body_radps[0] | angular_rate_body_radps[1] | angular_rate_body_radps[2] | position_sigma_ned_m[0] | position_sigma_ned_m[1] | position_sigma_ned_m[2] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 17508 | -4.99813 | 4.511611 | 7.207816 | -3.410138 | 0.0 | 0.0 | 0.0 | 2.224452 | 0.016884 | -9.503209 | 0.006961 | 0.216883 | 2.740921 | -0.000893 | 0.000388 | 0.00237 | 0.144418 | 0.168294 | 0.394049 |
| 1 | 17508 | -4.97848 | 4.50889 | 7.205891 | -3.415132 | 0.0 | 0.0 | 0.0 | 2.177743 | -0.018896 | -9.506433 | 0.006963 | 0.216905 | 2.740921 | 0.002406 | -0.004695 | 0.000523 | 0.14442 | 0.168297 | 0.394055 |


In [41]:
coordinates = ['flight_id', 'seconds_since_launch', 'position_ned_m[0]', 'position_ned_m[1]',
       'position_ned_m[2]']

In [43]:
select_flight_groups = select_flights[coordinates].groupby('flight_id')

In [45]:
traces = []
for flight, flight_data in select_flight_groups:
    # NED frame: index 0 is north, 1 is east, 2 is down.
    # Plot east on x, north on y, and negate "down" so that z is altitude.
    temp_trace = go.Scatter3d(x=flight_data['position_ned_m[1]'],
                              y=flight_data['position_ned_m[0]'],
                              z=flight_data['position_ned_m[2]'] * (-1),
                              hovertext=flight_data['seconds_since_launch'],
                              mode='lines',
                              name=str(flight))  # legend label: the flight id
    traces.append(temp_trace)
py.iplot(traces, filename='trajectory')

It looks like the flights lie primarily along the same XY plane.
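
The visual impression from the plot could be backed with numbers. The sketch below computes, per flight, the heading of the net horizontal displacement after launch and the altitude gained, so that flights can be compared in a table rather than by eye. It assumes the usual NED convention (index 0 north, 1 east, 2 down) for the `position_ned_m[...]` columns; the function name `post_launch_heading` and the demo data are illustrative only.

```python
import numpy as np
import pandas as pd


def post_launch_heading(flight_data):
    """Per flight: heading (degrees clockwise from north) and climb (m) after launch."""
    # Only consider samples at or after the launch instant, in time order
    after = flight_data[flight_data['seconds_since_launch'] >= 0]
    after = after.sort_values('seconds_since_launch')

    grouped = after.groupby('flight_id')
    first, last = grouped.first(), grouped.last()

    # Net displacement between the first and last post-launch samples
    north = last['position_ned_m[0]'] - first['position_ned_m[0]']
    east = last['position_ned_m[1]'] - first['position_ned_m[1]']
    # "Down" is positive in NED, so negate to get altitude gained
    climb = -(last['position_ned_m[2]'] - first['position_ned_m[2]'])

    heading = np.degrees(np.arctan2(east, north)) % 360
    return pd.DataFrame({'heading_deg': heading, 'climb_m': climb})


# Made-up example: flight 1 heads north-east, flight 2 heads due west
demo = pd.DataFrame({
    'flight_id': [1, 1, 1, 2, 2],
    'seconds_since_launch': [-1.0, 0.0, 15.0, 0.0, 15.0],
    'position_ned_m[0]': [0.0, 0.0, 100.0, 0.0, 0.0],
    'position_ned_m[1]': [0.0, 0.0, 100.0, 0.0, -10.0],
    'position_ned_m[2]': [0.0, 0.0, -50.0, 0.0, -5.0],
})
headings = post_launch_heading(demo)
print(headings)
```

A narrow spread in `heading_deg` across flights would support the observation from the plot; a wide spread would suggest the sample of ten flights is not representative.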