# Mission Command Data Analysis - Part 2 - Data Exploration, Merging, and Analysis
## By: Matthew Jacobsen

In the second installment of the Mission Command Data Analysis series, we will explore our fictional data and demonstrate one way to merge the datasets into "matched" datasets.  Note that this is only one possible method; the right approach will depend upon the system being assessed.  

Recall from the previous part that we are assessing a fictional mission command widget (MCWidget) designed to pass data to keep soldiers and commanders aware of circumstances on the battlefield.  In the previous part of this series, we imported, processed, and normalized data related to this analysis for both technical assessment of behind the scenes translations and assessment of the accuracy of the information displayed to the user.  

### Data Exploration

To gain a better understanding of the data, we first need to see what is available.  Let's import the data and look over what we have, so we can begin to develop the deliverable for such a project.

In [1]:
import pickle
import pandas as pd
import numpy as np

In [2]:
with open('map_df.pkl','rb') as map_in:
    in_data = pickle.load(map_in)
    map_df = pd.DataFrame(in_data, columns=['Latitude','Longitude'])

with open('network_df.pkl','rb') as network_in:
    in_data = pickle.load(network_in)
    network_df = pd.DataFrame(in_data)

with open('mcwidget_df.pkl','rb') as mcwidget_in:
    in_data = pickle.load(mcwidget_in)
    widget_df = pd.DataFrame(in_data)

In [3]:
map_df

Unnamed: 0,Latitude,Longitude
0,38.8899,-77.038616
1,38.889151,-77.035916
2,38.889567,-77.035332
3,38.889567,-77.020374
4,38.889434,-77.033037
5,38.889511,-77.019133


In [4]:
network_df.head()

Unnamed: 0,External Observed Time,To,From,Altitude,Latitude,Longitude,Speed
0,1586214746999,4,5,1500,38.8897,77.0206,25
1,1586551601348,5,8,835,38.8892,77.0481,28
2,1586515475138,1,8,1186,38.8895,77.0353,25
3,1586271176350,7,7,1834,38.8898,77.0353,16
4,1586627112164,6,10,2109,38.8895,77.0191,28


In [5]:
widget_df.head()

Unnamed: 0,Internal Observed Time,altitude,from,latitude,longitude,speed,to
0,1586214751618,1500,5,38.8897,77.0206,25,4
1,1586526747179,1500,2,38.8899,77.0592,31,8
2,1586551614191,835,8,38.8892,77.0481,28,5
3,1586515475433,1186,8,38.8895,77.0353,25,1
4,1586271183767,1834,7,38.8898,77.0353,16,7


Inspecting the instrumented data we have imported (widget_df and network_df), we have observations of messages, including the sender, the addressee, their altitude, their latitude and longitude, and their speed.  From the map data, all we are able to extract without more context is the latitude and longitude. One thing we should consider is what, if anything, needs to be done with the positional information. 
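
Before going further, it can help to read the observed-time columns as dates. The sketch below assumes the values are Unix epoch timestamps in milliseconds; that is an assumption based on their magnitude, not something the data states. The helper name `epoch_ms_to_utc` is hypothetical.

```python
from datetime import datetime, timezone

def epoch_ms_to_utc(ms):
    """Convert an integer millisecond Unix timestamp to a timezone-aware UTC datetime."""
    seconds, millis = divmod(int(ms), 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=millis * 1000)

# First rows of network_df.head() and widget_df.head() shown above.
external = 1586214746999
internal = 1586214751618

print(epoch_ms_to_utc(external).isoformat())
print(internal - external, "ms between the external and internal observation")
```

If the assumption holds, subtracting the two columns gives a delay in milliseconds directly, which is how the times are used later in this part.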

#### Normalizing Coordinates
As can be easily seen, we have positional data in all three datasets (Latitude and Longitude).  This will not always be the case.  For example, military forces routinely use the Military Grid Reference System (MGRS), which consists of progressively smaller grid references.  One of our reference points from Part 1, [38.8899, -77.0091], is the latitude and longitude of the US Capitol Building in Washington, D.C.; converted into MGRS, it is 18SUJ2575106477.  If it is necessary to convert grid references (like MGRS) into latitude/longitude, one way to do this is with the [MGRS package](https://pypi.org/project/mgrs/) in Python.  

In [6]:
import mgrs

In [7]:
mgrs_converter = mgrs.MGRS()
input_coords = '18SUJ2575106477'
print(mgrs_converter.toLatLon(input_coords.encode()))

(38.889897556798665, -77.00910652914726)


#### Inspecting the Data

Beyond normalizing the data, we also need to understand what we can and cannot do with the data we have available.

In [8]:
network_df.describe()

Unnamed: 0,External Observed Time,To,From,Altitude,Latitude,Longitude,Speed
count,31,31,31,31,31.0,31.0,31
unique,31,10,9,20,8.0,21.0,16
top,1586593616496,3,5,1500,38.8895,77.0353,25
freq,1,5,11,12,13.0,11.0,11


In [9]:
widget_df.describe()

Unnamed: 0,Internal Observed Time,altitude,from,latitude,longitude,speed,to
count,37,37,37,37.0,37.0,37,37
unique,37,23,7,8.0,22.0,13,11
top,1586863937247,24,5,38.8895,77.0353,25,24
freq,1,8,13,12.0,9.0,11,8


By looking at the unique values, we have several items that repeat.  Depending upon how these line up, some may or may not be useful for matching.  
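
One way to see whether repeated values will make matching ambiguous is to count how often each full message content occurs. The sketch below uses a small illustrative list: the first two tuples come from `network_df.head()` above, and the third is a hypothetical repeat added only to show the effect.

```python
from collections import Counter

# (to, from, altitude, latitude, longitude, speed)
messages = [
    (4, 5, 1500, 38.8897, 77.0206, 25),
    (5, 8, 835, 38.8892, 77.0481, 28),
    (4, 5, 1500, 38.8897, 77.0206, 25),  # hypothetical repeat, not in the real data
]

content_counts = Counter(messages)
# Any content seen more than once cannot be matched on content alone.
ambiguous = {content: count for content, count in content_counts.items() if count > 1}
print(ambiguous)
```

Messages whose content appears only once can be matched on content alone; repeated content needs a tie-breaker such as the observed time.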

### Combining Instrumented Data

Because we have "objective" measured times, we can start from the straightforward assumption that the system works as advertised. Given that we are testing the MCWidget system, it is safe to assume the developers also believe it is "ready to go".  If we assume that the system works, then we can also assume that matching messages by minimum time difference will let us correlate the results.  In addition, if the translation is working as intended, the message contents should match as well. 

To start, let's take the internal and external observed times and build a matrix indicating which pairs are the best matches. We will do this by finding all messages with *exactly* the same content and then picking the one with the minimum translation time. 

In [10]:
combined_network = list(
    zip(
        [x for x in network_df.index],
        [x for x in network_df['Altitude']],
        [x for x in network_df['Latitude']],
        [x for x in network_df['Longitude']],
        [x for x in network_df['Speed']],
        [x for x in network_df.To],
        [x for x in network_df.From],
        [x for x in network_df['External Observed Time']]
    )
)

combined_widget = list(
    zip(
        [x for x in widget_df.index],
        [x for x in widget_df['altitude']],
        [x for x in widget_df['latitude']],
        [x for x in widget_df['longitude']],
        [x for x in widget_df['speed']],
        [x for x in widget_df['to']],
        [x for x in widget_df['from']],
        [x for x in widget_df['Internal Observed Time']]
    )
)

match_data = {}
for message in combined_network:
    out_message_id = message[0]
    out_message_altitude = int(message[1])
    out_message_latitude = float(message[2])
    out_message_longitude = float(message[3])
    out_message_speed = int(message[4])
    out_message_to = int(message[5])
    out_message_from = int(message[6])
    out_message_time = int(message[7])
    match_data[out_message_id] = {}
    for in_message in combined_widget:
        in_message_id = in_message[0]
        in_message_altitude = int(in_message[1])
        in_message_latitude = float(in_message[2])
        in_message_longitude = float(in_message[3])
        in_message_speed = int(in_message[4])
        in_message_to = int(in_message[5])
        in_message_from = int(in_message[6])
        in_message_time = int(in_message[7])
        time_delta = in_message_time - out_message_time
        if (
            (in_message_from == out_message_from) and 
            (in_message_speed == out_message_speed) and 
            (in_message_longitude == out_message_longitude) and 
            (in_message_latitude == out_message_latitude) and 
            (in_message_to == out_message_to) and 
            (in_message_altitude == out_message_altitude) and 
            (time_delta > 0)
        ):
            match_data[out_message_id][in_message_id] = in_message_time - out_message_time
        else:
            match_data[out_message_id][in_message_id] = np.nan

match_df = pd.DataFrame(match_data,columns = [x for x in network_df.index])

In [11]:
match_df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,21,22,23,24,25,26,27,28,29,30
0,4619.0,,,,,,,,,,...,,,,,,,,,,
1,,,,,,,,,,,...,,,,,,,,,,2179.0
2,,12843.0,,,,,,,,,...,,,,,,,,,,
3,,,295.0,,,,,,,,...,,,,,,,,,,
4,,,,7417.0,,,,,,,...,,,,,,,,,,
5,,,,,10277.0,,,,,,...,,,,,,,,,,
6,,,,,,13794.0,,,,,...,,,,,,,,,,
7,,,,,,,14814.0,,,,...,,,,,,,,,,
8,,,,,,,,,,,...,,,,,,,,,,
9,,,,,,,,,14859.0,,...,,,,,,,,,,
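
The same matching rule can also be sketched with a pandas merge instead of nested loops. This is an alternative under stated assumptions, not the method used above: it assumes the content columns are already numeric, and it uses a few rows copied from the `head()` outputs earlier. The names `network_ids`, `widget_ids`, `candidates`, and `best` are hypothetical.

```python
import pandas as pd

network = pd.DataFrame({
    "External Observed Time": [1586214746999, 1586551601348],
    "To": [4, 5], "From": [5, 8], "Altitude": [1500, 835],
    "Latitude": [38.8897, 38.8892], "Longitude": [77.0206, 77.0481], "Speed": [25, 28],
})
widget = pd.DataFrame({
    "Internal Observed Time": [1586214751618, 1586526747179, 1586551614191],
    "altitude": [1500, 1500, 835], "from": [5, 2, 8],
    "latitude": [38.8897, 38.8899, 38.8892], "longitude": [77.0206, 77.0592, 77.0481],
    "speed": [25, 31, 28], "to": [4, 8, 5],
})

content = ["To", "From", "Altitude", "Latitude", "Longitude", "Speed"]
network_ids = network.reset_index().rename(columns={"index": "network_id"})
widget_ids = (
    widget.rename(columns={name.lower(): name for name in content})
    .reset_index()
    .rename(columns={"index": "widget_id"})
)

# Pair every network message with every widget message that has identical content.
candidates = network_ids.merge(widget_ids, on=content)
candidates["time_delta"] = candidates["Internal Observed Time"] - candidates["External Observed Time"]
candidates = candidates[candidates["time_delta"] > 0]

# Keep the pairing with the smallest positive delay for each network message.
best_rows = candidates.groupby("network_id")["time_delta"].idxmin()
best = candidates.loc[best_rows, ["network_id", "widget_id", "time_delta"]]
print(best)
```

Like the loop, this relies on exact equality of the latitude and longitude values, so any rounding introduced by the translation would show up as a missing match.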


So now we have a reasonable way to match the messages across the translation layer of our MCWidget. Based upon this table alone, we can get an estimate of the translation time. Given that some of the messages didn't match, we can also get an error rate measurement for the translation, as the messages that don't match likely encountered an error in the process.

In [12]:
minimum_times = pd.DataFrame(match_df.dropna(axis=1, how='all').min(),columns=['min_times'])
print('Error Rate: '+str(100*(1-len(minimum_times)/len(match_df))))
minimum_times.describe()

Error Rate: 37.83783783783784


Unnamed: 0,min_times
count,23.0
mean,8919.130435
std,4443.21795
min,295.0
25%,5748.5
50%,10219.0
75%,12756.0
max,14859.0


So our message translation for the MCWidget was not particularly good.  We had a 38% error rate (computed here as one minus the 23 matched messages over the 37 widget messages), and the messages that did translate correctly took, on average, 8919 milliseconds to translate, or just under 9 seconds.  We would want to compare this number to the transit time for a similar message over a similar type of network architecture to see whether the translation is adding a substantial amount of time to the process. 
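
The arithmetic behind these figures is worth spelling out, because the result depends on the denominator. The cell above divides by `len(match_df)`, which is the number of widget messages (rows); dividing by the number of network messages (columns) is another reasonable choice and gives a different rate. The counts below are taken from the `describe()` outputs earlier; the variable names are hypothetical.

```python
matched = 23         # network messages with at least one match (count of min_times)
widget_total = 37    # rows of match_df, one per widget message
network_total = 31   # columns of match_df, one per network message

rate_vs_widget = 100 * (1 - matched / widget_total)    # what the cell above reports
rate_vs_network = 100 * (1 - matched / network_total)  # alternative denominator

mean_translation_ms = 8919.130435
mean_translation_s = mean_translation_ms / 1000

print(round(rate_vs_widget, 1), round(rate_vs_network, 1), round(mean_translation_s, 1))
```

Which denominator is appropriate depends on the question being asked, so it is worth stating the choice alongside the reported rate.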

The final step in this process would be to merge the data, where possible, for comparison in latitude and longitude accuracy.  To do this, we will create a lookup dictionary for each of the messages and then use that to merge the two datasets together. 

In [13]:
index_dict = {}
for column in match_df.dropna(axis=1, how='all').columns:
    index_dict[column] = match_df[column].idxmin()

index_dict

{0: 0,
 1: 2,
 2: 3,
 3: 4,
 4: 5,
 5: 6,
 6: 7,
 8: 9,
 10: 12,
 11: 14,
 12: 15,
 13: 16,
 15: 19,
 16: 21,
 18: 23,
 19: 24,
 21: 26,
 23: 28,
 24: 29,
 25: 30,
 27: 33,
 29: 35,
 30: 1}

With the lookup dictionary set up, we can now reset the index, so that our dataframe row numbers move to a column in the dataframe.  Then, we can use those indices to create a match_index column with which the dataframes will be joined. 

In [14]:
new_network_df = network_df.copy()
new_network_df.reset_index(inplace=True)
new_network_df.head()

Unnamed: 0,index,External Observed Time,To,From,Altitude,Latitude,Longitude,Speed
0,0,1586214746999,4,5,1500,38.8897,77.0206,25
1,1,1586551601348,5,8,835,38.8892,77.0481,28
2,2,1586515475138,1,8,1186,38.8895,77.0353,25
3,3,1586271176350,7,7,1834,38.8898,77.0353,16
4,4,1586627112164,6,10,2109,38.8895,77.0191,28


In [15]:
new_network_df['match_index'] = new_network_df['index'].map(index_dict)
new_network_df.head()

Unnamed: 0,index,External Observed Time,To,From,Altitude,Latitude,Longitude,Speed,match_index
0,0,1586214746999,4,5,1500,38.8897,77.0206,25,0.0
1,1,1586551601348,5,8,835,38.8892,77.0481,28,2.0
2,2,1586515475138,1,8,1186,38.8895,77.0353,25,3.0
3,3,1586271176350,7,7,1834,38.8898,77.0353,16,4.0
4,4,1586627112164,6,10,2109,38.8895,77.0191,28,5.0


We then repeat some of this process with the widget dataframe, in that the index is reset to move the row numbers into the dataframe. With those indexes injected into the widget dataframe, we can then rename the index column to match_index, in order to merge the dataframes in the next step.

In [16]:
new_widget_df = widget_df.copy()
new_widget_df.reset_index(inplace=True)
new_widget_df.rename(columns={'index':'match_index'},inplace=True)
new_widget_df.head()

Unnamed: 0,match_index,Internal Observed Time,altitude,from,latitude,longitude,speed,to
0,0,1586214751618,1500,5,38.8897,77.0206,25,4
1,1,1586526747179,1500,2,38.8899,77.0592,31,8
2,2,1586551614191,835,8,38.8892,77.0481,28,5
3,3,1586515475433,1186,8,38.8895,77.0353,25,1
4,4,1586271183767,1834,7,38.8898,77.0353,16,7


Now that we have consistent information in both dataframes to enable matching, we can merge the two datasets on the 'match_index' columns in both.  

In [17]:
merged_translation_df = new_network_df.merge(new_widget_df, on = 'match_index')
merged_translation_df.head()

Unnamed: 0,index,External Observed Time,To,From,Altitude,Latitude,Longitude,Speed,match_index,Internal Observed Time,altitude,from,latitude,longitude,speed,to
0,0,1586214746999,4,5,1500,38.8897,77.0206,25,0.0,1586214751618,1500,5,38.8897,77.0206,25,4
1,1,1586551601348,5,8,835,38.8892,77.0481,28,2.0,1586551614191,835,8,38.8892,77.0481,28,5
2,2,1586515475138,1,8,1186,38.8895,77.0353,25,3.0,1586515475433,1186,8,38.8895,77.0353,25,1
3,3,1586271176350,7,7,1834,38.8898,77.0353,16,4.0,1586271183767,1834,7,38.8898,77.0353,16,7
4,4,1586627112164,6,10,2109,38.8895,77.0191,28,5.0,1586627122441,2109,10,38.8895,77.0191,28,6
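
The default merge is an inner join, so network messages without a match drop out silently. If it is useful to see which ones were dropped, one option is a left join with pandas' `indicator=True` flag. The sketch below uses three network rows (6, 7, and 8) and their `match_index` values from the lookup dictionary above, where message 7 has no entry; the frame names are hypothetical.

```python
import pandas as pd

network = pd.DataFrame({"index": [6, 7, 8], "match_index": [7.0, float("nan"), 9.0]})
widget = pd.DataFrame({
    "match_index": [7.0, 9.0],
    "Internal Observed Time": [1586329335864, 1586536065615],
})

# how="left" keeps every network row; _merge records whether a widget row was found.
audit = network.merge(widget, on="match_index", how="left", indicator=True)
unmatched = audit.loc[audit["_merge"] == "left_only", "index"].tolist()
print(unmatched)
```

This keeps the unmatched messages available for a separate error analysis while the inner-joined frame is used for the timing math.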


Up to this point, pandas has treated the data we imported from the last step as the object data type.  So, we need to recast it to the appropriate data types to enable some of the math we will have to do next.  Some of these columns will be cast as integers and some as floats.

In [18]:
to_types = [
    ['External Observed Time',1],
    ['To',1],
    ['From',1],
    ['Altitude',1],
    ['Latitude',2],
    ['Longitude',2],
    ['Speed',1],
    ['Internal Observed Time',1],
    ['altitude',1],
    ['from',1],
    ['latitude',2],
    ['longitude',2],
    ['speed',1],
    ['to',1]
]

for group in to_types:
    if group[1] == 1:
        merged_translation_df[group[0]] = merged_translation_df[group[0]].apply(lambda x: int(x))
    elif group[1] == 2:
        merged_translation_df[group[0]] = merged_translation_df[group[0]].apply(lambda x: float(x))
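
As an alternative sketch, pandas can do the same recasting in one call by passing a column-to-dtype mapping to `astype`. The example assumes the object columns hold numeric strings, which is a guess about the pickled data; the tiny frame here is illustrative only.

```python
import pandas as pd

frame = pd.DataFrame(
    {"Altitude": ["1500", "835"], "Latitude": ["38.8897", "38.8892"]},
    dtype=object,
)

# Explicit 64-bit types keep millisecond timestamps from overflowing on any platform.
typed = frame.astype({"Altitude": "int64", "Latitude": "float64"})
print(typed.dtypes)
```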

In the next two cells, we will perform some of the math needed to determine translation times and displacements associated with the matched messages.  The process that we used to merge the dataframes earlier excludes any non-matching items from the result.  One benefit of this is that we do not have to fill values in order to do the math.  

In [19]:
merged_translation_df['translation_time'] = merged_translation_df['Internal Observed Time'] - merged_translation_df['External Observed Time']

In order to determine displacements associated with translation, we will first create a list of tuples containing the coordinates for both internal and external data.  Then, we can use the Haversine library to compute the distance between the two coordinate sets and add it to our dataframe.

In [20]:
network_coords = list(zip([x for x in merged_translation_df['Latitude']], [x for x in merged_translation_df['Longitude']]))
widget_coords = list(zip([x for x in merged_translation_df['latitude']], [x for x in merged_translation_df['longitude']]))

from haversine import haversine
coords_list = list(zip(network_coords,widget_coords))
coords_diff = []
for pair in coords_list:
    coords_diff.append(haversine(pair[0],pair[1])*1000)

merged_translation_df['network_coords'] = network_coords
merged_translation_df['widget_coords'] = widget_coords
merged_translation_df['coords_diff'] = coords_diff

merged_translation_df

Unnamed: 0,index,External Observed Time,To,From,Altitude,Latitude,Longitude,Speed,match_index,Internal Observed Time,altitude,from,latitude,longitude,speed,to,translation_time,network_coords,widget_coords,coords_diff
0,0,1586214746999,4,5,1500,38.8897,77.0206,25,0.0,1586214751618,1500,5,38.8897,77.0206,25,4,4619,"(38.8897, 77.0206)","(38.8897, 77.0206)",0.0
1,1,1586551601348,5,8,835,38.8892,77.0481,28,2.0,1586551614191,835,8,38.8892,77.0481,28,5,12843,"(38.8892, 77.0481)","(38.8892, 77.0481)",0.0
2,2,1586515475138,1,8,1186,38.8895,77.0353,25,3.0,1586515475433,1186,8,38.8895,77.0353,25,1,295,"(38.8895, 77.0353)","(38.8895, 77.0353)",0.0
3,3,1586271176350,7,7,1834,38.8898,77.0353,16,4.0,1586271183767,1834,7,38.8898,77.0353,16,7,7417,"(38.8898, 77.0353)","(38.8898, 77.0353)",0.0
4,4,1586627112164,6,10,2109,38.8895,77.0191,28,5.0,1586627122441,2109,10,38.8895,77.0191,28,6,10277,"(38.8895, 77.0191)","(38.8895, 77.0191)",0.0
5,5,1586753502109,4,5,1500,38.8895,77.0353,28,6.0,1586753515903,1500,5,38.8895,77.0353,28,4,13794,"(38.8895, 77.0353)","(38.8895, 77.0353)",0.0
6,6,1586329321050,3,5,894,38.8892,77.0225,25,7.0,1586329335864,894,5,38.8892,77.0225,25,3,14814,"(38.8892, 77.0225)","(38.8892, 77.0225)",0.0
7,8,1586536050756,6,5,2220,38.8893,77.0268,25,9.0,1586536065615,2220,5,38.8893,77.0268,25,6,14859,"(38.8893, 77.0268)","(38.8893, 77.0268)",0.0
8,10,1586279266735,4,2,1500,38.8896,77.0299,24,12.0,1586279267309,1500,2,38.8896,77.0299,24,4,574,"(38.8896, 77.0299)","(38.8896, 77.0299)",0.0
9,11,1586356315809,3,5,1605,38.8893,77.0125,40,14.0,1586356323379,1605,5,38.8893,77.0125,40,3,7570,"(38.8893, 77.0125)","(38.8893, 77.0125)",0.0
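
For readers curious about what the distance calculation involves, here is a minimal standard-library sketch of the haversine (great-circle) formula, returning meters. It is not the library's code; the radius constant is the commonly used mean Earth radius, and the second point is row 3 of `map_df` with the longitude sign flipped to match the instrumented data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters (assumed constant)

def haversine_m(point_a, point_b):
    """Great-circle distance in meters between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(radians, point_a)
    lat2, lon2 = map(radians, point_b)
    half_chord = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(half_chord))

widget_point = (38.8897, 77.0206)
map_point = (38.889567, 77.020374)
print(round(haversine_m(widget_point, map_point), 1), "meters")
```

Identical coordinates give a distance of exactly zero, which is why every matched pair in the table above shows a `coords_diff` of 0.0.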


At this point, we have the internal and external data combined in a single dataframe.  In order to use this in our next part, let's output it as a pickled dataframe. 

In [21]:
with open('merged_instrumented_data_df.pkl','wb') as merged_out:
    pickle.dump(merged_translation_df, merged_out, protocol=2)

### Combining the Map Data with the Instrumented Data

As we saw with the merging done in the previous section, if we were to merge the map data in a similar manner, we would lose the majority of the data.  Therefore, we will create a separate dataframe that merges in the map data, so that we can assess the messages from network to user.  First, let's remind ourselves of the data that we are using from the map. 

In [22]:
map_df.head()

Unnamed: 0,Latitude,Longitude
0,38.8899,-77.038616
1,38.889151,-77.035916
2,38.889567,-77.035332
3,38.889567,-77.020374
4,38.889434,-77.033037


Next, we will insert the measured time for the map data, which would be the time stamp on the screen capture.  For the purposes of this walkthrough, we use the latest timestamp from the widget dataframe plus a 15-second offset.  The map longitudes are also multiplied by -1 so that they use the same sign convention as the instrumented data.  In order to match the messages to the graphics on the screen, we take a two-step approach.  First, we minimize the positional difference, then we find the minimal time difference item *that has not already been used*.  When a message is matched, it is removed from consideration for the remainder of the messages. This ensures we don't get double matches.  

In [23]:
map_coords = list(zip([x for x in map_df['Latitude']],[-1*x for x in map_df['Longitude']]))
map_measured_time = 1586753515903 + 15000
widget_data = list(zip([x for x in merged_translation_df['widget_coords']],[x for x in merged_translation_df['Internal Observed Time']]))

matches = {}
matched_items = []
map_id = 0
for coord in map_coords:
    match_df = pd.DataFrame(widget_data, columns = ['Coordinates','Internal Observed Times'])
    match_df['Display Time'] = map_measured_time - match_df['Internal Observed Times']
    widget_coords = [x for x in match_df['Coordinates']]
    coord_diff = []
    for wc in widget_coords:
        coord_diff.append(haversine(wc, coord)*1000)
    match_df['Coord Diff'] = coord_diff
    for id_num in matched_items:
        match_df = match_df.drop(id_num)
    min_diff = match_df['Coord Diff'].min()
    sub_match_df = match_df[match_df['Coord Diff'] == min_diff]
    min_time_index = sub_match_df['Display Time'].idxmin()
    matches[map_id] = min_time_index
    matched_items.append(min_time_index)
    map_id += 1

In [24]:
matches

{0: 10, 1: 5, 2: 18, 3: 0, 4: 14, 5: 4}
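
The two-step rule can also be written without building a dataframe on every pass. The sketch below is a simplified restatement, not the cell above: the function name `greedy_match` is hypothetical, the distance function is passed in, and the example uses `math.dist` on raw degrees as a planar stand-in where the article uses haversine. The third widget entry and the third map point are hypothetical, added to show the tie-break and the "already used" rule.

```python
from math import dist

def greedy_match(map_points, widget_points, widget_times, capture_time, distance):
    """Give each map point the closest unused widget message; ties go to the most recent."""
    used = set()
    matches = {}
    for map_id, point in enumerate(map_points):
        free = [i for i in range(len(widget_points)) if i not in used]
        if not free:
            break  # more map graphics than messages; leave the rest unmatched
        best = min(
            free,
            key=lambda i: (distance(widget_points[i], point), capture_time - widget_times[i]),
        )
        matches[map_id] = best
        used.add(best)
    return matches

capture_time = 1586753515903 + 15000
widget_points = [(38.8897, 77.0206), (38.8898, 77.0386), (38.8898, 77.0386)]
widget_times = [1586214751618, 1586666829144, 1586666829144 - 60000]  # last one is hypothetical
map_points = [(38.8899, 77.038616), (38.889567, 77.020374), (38.8899, 77.0386)]

result = greedy_match(map_points, widget_points, widget_times, capture_time, dist)
print(result)
```

Because the assignment is greedy, the outcome depends on the order of the map points; an earlier point can claim a message that would have been a closer fit for a later one.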

As with the instrumented data, we create a match_index column and reset the index on the instrumented data, in order to merge again. 

In [25]:
map_df['match_index'] = map_df.index.map(matches)
map_df

Unnamed: 0,Latitude,Longitude,match_index
0,38.8899,-77.038616,10
1,38.889151,-77.035916,5
2,38.889567,-77.035332,18
3,38.889567,-77.020374,0
4,38.889434,-77.033037,14
5,38.889511,-77.019133,4


In [26]:
merged_translation_df_copy = merged_translation_df.copy()
merged_translation_df_copy.reset_index(inplace=True)
merged_translation_df_copy.head()

Unnamed: 0,level_0,index,External Observed Time,To,From,Altitude,Latitude,Longitude,Speed,match_index,...,altitude,from,latitude,longitude,speed,to,translation_time,network_coords,widget_coords,coords_diff
0,0,0,1586214746999,4,5,1500,38.8897,77.0206,25,0.0,...,1500,5,38.8897,77.0206,25,4,4619,"(38.8897, 77.0206)","(38.8897, 77.0206)",0.0
1,1,1,1586551601348,5,8,835,38.8892,77.0481,28,2.0,...,835,8,38.8892,77.0481,28,5,12843,"(38.8892, 77.0481)","(38.8892, 77.0481)",0.0
2,2,2,1586515475138,1,8,1186,38.8895,77.0353,25,3.0,...,1186,8,38.8895,77.0353,25,1,295,"(38.8895, 77.0353)","(38.8895, 77.0353)",0.0
3,3,3,1586271176350,7,7,1834,38.8898,77.0353,16,4.0,...,1834,7,38.8898,77.0353,16,7,7417,"(38.8898, 77.0353)","(38.8898, 77.0353)",0.0
4,4,4,1586627112164,6,10,2109,38.8895,77.0191,28,5.0,...,2109,10,38.8895,77.0191,28,6,10277,"(38.8895, 77.0191)","(38.8895, 77.0191)",0.0


With the dataframe structures we have, we can merge the map data and the instrumented dataframes, resulting in a single dataframe containing all items that appear on the map. 

In [27]:
merged_display_df = map_df.merge(merged_translation_df_copy, left_on = 'match_index', right_on = 'level_0')
merged_display_df

Unnamed: 0,Latitude_x,Longitude_x,match_index_x,level_0,index,External Observed Time,To,From,Altitude,Latitude_y,...,altitude,from,latitude,longitude,speed,to,translation_time,network_coords,widget_coords,coords_diff
0,38.8899,-77.038616,10,10,12,1586666814762,3,10,1096,38.8898,...,1096,10,38.8898,77.0386,36,3,14382,"(38.8898, 77.0386)","(38.8898, 77.0386)",0.0
1,38.889151,-77.035916,5,5,5,1586753502109,4,5,1500,38.8895,...,1500,5,38.8895,77.0353,28,4,13794,"(38.8895, 77.0353)","(38.8895, 77.0353)",0.0
2,38.889567,-77.035332,18,18,24,1586688832759,9,4,2125,38.8895,...,2125,4,38.8895,77.0353,39,9,6426,"(38.8895, 77.0353)","(38.8895, 77.0353)",0.0
3,38.889567,-77.020374,0,0,0,1586214746999,4,5,1500,38.8897,...,1500,5,38.8897,77.0206,25,4,4619,"(38.8897, 77.0206)","(38.8897, 77.0206)",0.0
4,38.889434,-77.033037,14,14,18,1586520677952,10,7,1500,38.8893,...,1500,7,38.8893,77.0341,27,10,10219,"(38.8893, 77.0341)","(38.8893, 77.0341)",0.0
5,38.889511,-77.019133,4,4,4,1586627112164,6,10,2109,38.8895,...,2109,10,38.8895,77.0191,28,6,10277,"(38.8895, 77.0191)","(38.8895, 77.0191)",0.0


The final step in collecting our data for preparation of a report is to merge in the data regarding actual time required to display graphics and the displacement associated with the graphical representation.  As with the matching process, this is accomplished using the Haversine library and some simple subtraction.

In [28]:
merged_display_df['Display Time'] = map_measured_time - merged_display_df['Internal Observed Time']
widget_coords = [x for x in merged_display_df['widget_coords']]
map_coords = list(zip([x for x in merged_display_df['Latitude_x']],[-1*x for x in merged_display_df['Longitude_x']]))
displacement = []
for widget_coord, map_coord in zip(widget_coords, map_coords):
    displacement.append(haversine(widget_coord, map_coord)*1000)

merged_display_df['Graphical Displacement'] = displacement
merged_display_df.head()

Unnamed: 0,Latitude_x,Longitude_x,match_index_x,level_0,index,External Observed Time,To,From,Altitude,Latitude_y,...,latitude,longitude,speed,to,translation_time,network_coords,widget_coords,coords_diff,Display Time,Graphical Displacement
0,38.8899,-77.038616,10,10,12,1586666814762,3,10,1096,38.8898,...,38.8898,77.0386,36,3,14382,"(38.8898, 77.0386)","(38.8898, 77.0386)",0.0,86701759,11.206605
1,38.889151,-77.035916,5,5,5,1586753502109,4,5,1500,38.8895,...,38.8895,77.0353,28,4,13794,"(38.8895, 77.0353)","(38.8895, 77.0353)",0.0,15000,65.990402
2,38.889567,-77.035332,18,18,24,1586688832759,9,4,2125,38.8895,...,38.8895,77.0353,39,9,6426,"(38.8895, 77.0353)","(38.8895, 77.0353)",0.0,64691718,7.953099
3,38.889567,-77.020374,0,0,0,1586214746999,4,5,1500,38.8897,...,38.8897,77.0206,25,4,4619,"(38.8897, 77.0206)","(38.8897, 77.0206)",0.0,538779285,24.54413
4,38.889434,-77.033037,14,14,18,1586520677952,10,7,1500,38.8893,...,38.8893,77.0341,27,10,10219,"(38.8893, 77.0341)","(38.8893, 77.0341)",0.0,232842732,93.176023


In [29]:
with open('merged_display_data_df.pkl','wb') as display_out:
    pickle.dump(merged_display_df, display_out, protocol=2)

In the final part of this series, we will explore how to display this information to the end user and decision maker, in order to clearly depict how effective this MCWidget is at its main function: informing the user.  See you then!