# Viewing AIS data

This [Jupyter](https://jupyter.org) notebook explores and analyzes the [Automatic Identification System (AIS)](https://en.wikipedia.org/wiki/Automatic_identification_system) vessel-location data provided for the 12/2020 VAULT technical scenario, showing what data is available and how it can be accessed and visualized from Python. The notebook also acts as a runnable application that can be put on a server to allow users to explore the data interactively.

In [None]:
import pandas as pd
import numpy as np
import panel as pn
import datetime as dt
import holoviews as hv
import colorcet as cc
import param
from holoviews.operation.datashader import rasterize, dynspread
hv.extension('bokeh')

Data has been provided for four [UTM zones](https://en.wikipedia.org/wiki/Universal_Transverse_Mercator_coordinate_system#UTM_zone) (1, 2, 3, and 10), with vessels identified by their [Maritime Mobile Service Identity](https://en.wikipedia.org/wiki/Maritime_Mobile_Service_Identity) (MMSI) numbers. Here we will load the January 2017 files for three of those zones (1, 2, and 3) using [pandas](https://pandas.pydata.org):

In [None]:
zone1 = pd.read_csv('./data/AIS_2017_01_Zone01.csv', parse_dates=[1])
zone2 = pd.read_csv('./data/AIS_2017_01_Zone02.csv', parse_dates=[1])
zone3 = pd.read_csv('./data/AIS_2017_01_Zone03.csv', parse_dates=[1])
# ignore_index avoids repeated row labels from the three separate files
zones = pd.concat([zone1, zone2, zone3], ignore_index=True)

The AIS files include a variety of data and metadata values about each reported AIS broadcast. As we can see by looking at the first few rows of data, some of these values indicate the current state (e.g. LAT, LON, SOG, COG, Heading at the given time), while the others indicate metadata about the vessel (VesselName, CallSign, etc.):

In [None]:
zones.head()

Not all columns are present for every record, but some could be filled in using publicly available data or replaced with a placeholder, such as using the MMSI for a missing (NaN) VesselName.
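
The placeholder idea matters because the notebook later groups records by VesselName, and pandas `groupby` skips rows whose key is NaN by default. A minimal sketch of the substitution, using a made-up three-row table (the MMSI values and names here are illustrative, not taken from the AIS files):

```python
import numpy as np
import pandas as pd

# Toy stand-in for the AIS table; values are invented for illustration
sample = pd.DataFrame({
    "MMSI": [367000001, 367000002, 367000003],
    "VesselName": ["EXAMPLE ONE", np.nan, "EXAMPLE THREE"],
})

# Use the MMSI (as text) wherever VesselName is missing
filled = sample.copy()
filled["VesselName"] = filled["VesselName"].fillna(filled["MMSI"].astype(str))
```

The same `fillna` call could be applied to `zones` before grouping, if keeping unnamed vessels is desired.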

Because the eventual task involves matching vessel locations to satellite fields of view, let's visualize the set of locations available using [HoloViews](https://holoviews.org):

In [None]:
# Series.min()/.max() are vectorized, unlike the builtin min()/max()
print(f"Ranges found: latitude {zones.LAT.min():.4} to {zones.LAT.max():.4}, "
      f"longitude {zones.LON.min():.4} to {zones.LON.max():.4}")

In [None]:
points = rasterize(hv.Points(zones, ['LON','LAT'])).opts(cnorm='eq_hist', aspect='equal')
points

Here HoloViews generates an interactive [Bokeh](https://bokeh.org) plot that uses [Datashader](https://datashader.org) to compute a heatmap of the location data, with darker blue colors indicating pixels with more AIS data "pings". With an interactive Python session running, you can zoom into the above heatmap plot to see how the location data is distributed. For example, if we zoom into the densest region of AIS points, we can see a lot of interesting structure:

In [None]:
rasterize(hv.Points(zones, ['LON','LAT']), width=800, height=300, x_range=(-169,-162), y_range=(53.5,56))\
  .opts(width=800, height=300, cnorm='eq_hist', tools=['hover'])
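
Conceptually, rasterizing means counting how many pings fall into each pixel of the current viewport. The following pure-Python sketch shows that idea only; it is not Datashader's implementation, and `bin_points` is a hypothetical helper name:

```python
def bin_points(lons, lats, x_range, y_range, width, height):
    """Count points per cell of a width x height grid covering the given ranges."""
    grid = [[0] * width for _ in range(height)]
    (x0, x1), (y0, y1) = x_range, y_range
    for lon, lat in zip(lons, lats):
        if not (x0 <= lon < x1 and y0 <= lat < y1):
            continue  # point lies outside the viewport
        col = int((lon - x0) / (x1 - x0) * width)
        row = int((lat - y0) / (y1 - y0) * height)
        grid[row][col] += 1
    return grid

# Three illustrative pings binned into a 10 x 10 grid
grid = bin_points([-168.5, -168.25, -162.5], [53.5, 53.75, 55.5],
                  x_range=(-170, -160), y_range=(50, 60), width=10, height=10)
```

Zooming changes `x_range` and `y_range`, so the counts are recomputed for the new viewport, which is why a live Python session is needed for the interactive behavior.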

To put this data in context, let's plot it on a map. We'll use a public map tile source in Web Mercator coordinates, so we'll first project the LON,LAT coordinates to easting,northing values:

In [None]:
%%time
zones.loc[:, 'x'], zones.loc[:, 'y'] = hv.util.transform.lon_lat_to_easting_northing(zones.LON,zones.LAT)
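
For reference, the projection applied here is the standard spherical Web Mercator formula. A scalar sketch using only the standard library (the function name `lon_lat_to_web_mercator` is ours, for illustration; the HoloViews utility above is what the notebook actually uses):

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator

def lon_lat_to_web_mercator(lon, lat):
    """Project degrees of longitude/latitude to easting/northing in meters."""
    x = math.radians(lon) * EARTH_RADIUS_M
    y = math.log(math.tan(math.pi / 4 + math.radians(lat) / 2)) * EARTH_RADIUS_M
    return x, y

west_edge_x, equator_y = lon_lat_to_web_mercator(-180.0, 0.0)
_, north_60_y = lon_lat_to_web_mercator(0.0, 60.0)
```

Easting is linear in longitude, while northing grows faster than latitude toward the poles, which is why high-latitude regions look stretched on the tiled map.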

Let's also create some annotations that show the UTM zone boundaries, to validate that the data provided is indeed in the zones listed in the filenames:

In [None]:
def zone(i):
    """
    Return plottable bounds object for a given UTM zone
    (see https://en.wikipedia.org/wiki/Universal_Transverse_Mercator_coordinate_system#UTM_zone)
    """
    lrbt = ((-180+6*(i-1),-180+6*i),(-80,84))
    m    = hv.util.transform.lon_lat_to_easting_northing(*lrbt)
    bnds = hv.Bounds((m[0][0],m[1][0],m[0][1],m[1][1])).opts(color="white") 
    text = hv.Text(m[0][0]+(m[0][1]-m[0][0])/2, 0, f"{i}").opts(color="white", text_font_size="5pt")
    return bnds * text
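
The longitude arithmetic inside `zone(i)` can also be checked numerically: each UTM zone spans 6 degrees of longitude, starting at -180. A small sketch with hypothetical helper names (`zone_lon_bounds`, `utm_zone_for_lon`), ignoring the special-case zones around Norway and Svalbard:

```python
def zone_lon_bounds(i):
    """Return (west, east) longitude in degrees for UTM zone i (1..60)."""
    west = -180 + 6 * (i - 1)
    return west, west + 6

def utm_zone_for_lon(lon):
    """Return the UTM zone number (1..60) containing a longitude in degrees."""
    return int((lon + 180) // 6) % 60 + 1

bounds_zone_1 = zone_lon_bounds(1)
bounds_zone_10 = zone_lon_bounds(10)
zone_of_example = utm_zone_for_lon(-165.0)
```

Applying `utm_zone_for_lon` to the LON column of each file would be one way to confirm that its points fall in the zone named in the filename.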

And let's set up some defaults, including a suitable lon,lat range and a colormap going from red to yellow to white so that it shows up against the dark ocean:

In [None]:
x_range, y_range = hv.util.transform.lon_lat_to_easting_northing([-180-45,-60], [-5,60])
bounds = dict(x=tuple(x_range), y=tuple(y_range))
opts = hv.opts.Image(cmap=cc.fire[64:], width=900, height=500, cnorm='eq_hist', alpha=1)

We can now plot the data on top of a map, with bounding boxes for the UTM zones, using `*` to overlay each item:

In [None]:
points = rasterize(hv.Points(zones, ['x','y']).redim.range(**bounds)).opts(opts)
tiles  = hv.element.tiles.EsriImagery().opts( alpha=0.5, bgcolor='black')
labels = hv.element.tiles.StamenLabels().opts(alpha=0.7, level='annotation')

tiles * dynspread(points) * zone(1) * zone(2) * zone(3) * zone(10) * labels

Zooming and panning (selecting appropriate tools from the plot toolbar if needed) should reveal that these vessels provide many AIS pings around ports and in shipping lanes, as well as revealing other interesting trajectories and movement patterns.

# Selecting a vessel at a given time

The above plots show the cumulative AIS location data over all times available in the files. If we are given a _particular_ time, we can overlay markers for each vessel on top of the cumulative location data, showing the location of that vessel at the given time.

In [None]:
vessels = {name:df.drop_duplicates().sort_values(by='BaseDateTime').set_index('BaseDateTime') 
           for name,df in zones.groupby('VesselName')}
columns = [el for el in zones.columns if el != 'BaseDateTime']

In [None]:
def vessel_at_time(vessel_name, time, vessels):
    df = vessels[vessel_name].drop_duplicates()
    if time < df.index[0]:
        return None # Query before first value
    if time > df.index[-1]:
        return None # Query after last value
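    try:
        # Index.get_loc(..., method='nearest') was removed in pandas 2.0;
        # get_indexer performs the same nearest-label lookup
        idx = df.index.get_indexer([time], method='nearest')[0]
        return df.iloc[idx]
    except Exception:
        return None  # e.g. the index contains duplicate timestamps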

marked_points = None # TODO: Declare a class and make this an attribute

def mark_vessels(value):
    global marked_points
    records = []
    empty = dict({'x':0., 'y':0., 'time':''}, **{col:'' for col in columns})
    for vessel in vessels.keys():
        match = vessel_at_time(vessel, value, vessels)
        if match is not None:
            x, y = hv.util.transform.lon_lat_to_easting_northing(match['LON'], match['LAT'])
            records.append(dict({'x':x, 'y':y, 'time':match.name}, **{col:match[col] for col in columns}))
    markers = pd.DataFrame(records if len(records) != 0 else [empty]) 
    alpha = 1 if len(records) else 0
    marked_points = hv.Points(markers, ['x', 'y'], columns).opts(
        color='white', size=4, marker='triangle', alpha=alpha)
    return marked_points
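
The core of `vessel_at_time` is a nearest-timestamp lookup over a sorted index, returning nothing when the query falls outside the recorded range. A pandas-free sketch of that logic, using `bisect` (the helper name `nearest_index` and the sample times are illustrative; ties are resolved toward the earlier timestamp, which is a choice made here rather than something the notebook specifies):

```python
from bisect import bisect_left
from datetime import datetime

def nearest_index(sorted_times, query):
    """Return the index of the timestamp closest to query, or None if out of range."""
    if not sorted_times or query < sorted_times[0] or query > sorted_times[-1]:
        return None
    pos = bisect_left(sorted_times, query)  # first timestamp >= query
    if pos == 0:
        return 0
    before, after = sorted_times[pos - 1], sorted_times[pos]
    return pos if (after - query) < (query - before) else pos - 1

pings = [datetime(2017, 1, 1, 0, 0), datetime(2017, 1, 1, 0, 10), datetime(2017, 1, 1, 0, 30)]
hit_early = nearest_index(pings, datetime(2017, 1, 1, 0, 12))
hit_late = nearest_index(pings, datetime(2017, 1, 1, 0, 25))
miss = nearest_index(pings, datetime(2016, 12, 31, 23, 0))
```

Because the lookup is a binary search, each query costs O(log n) per vessel; the loop over every vessel in `mark_vessels` is what dominates the runtime.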

Let's take the midpoint of the times available in the file, as an example time:

In [None]:
times = zones['BaseDateTime']
midpoint = times.min() + (times.max() - times.min())/2

Now we can compute the locations of each vessel near that time, and overlay that on the points and map tiles:

In [None]:
points = rasterize(hv.Points(zones, ['x','y']).redim.range(**bounds)).opts(opts)

tiles * points * mark_vessels(midpoint).opts(tools=['hover'])

Now when you hover over the white markers indicating the position of a vessel, you should see lots of information about it.

# User interface for selecting a given time and vessel

Instead of editing Python code to specify a time, let's add a [Panel](https://panel.holoviz.org) widget to let the user select a specific time interactively:

In [None]:
dt_input = pn.widgets.DatetimeInput(name='Datetime', value=midpoint)
dt_input

Because the data can be unwieldy in hover form, we can also provide it in a separate table, accessed by clicking on one of the ship markers:

In [None]:
table_cols = ['MMSI', 'VesselName', 'VesselType', 'Heading', 'CallSign', 'Length', 'Width', 'Cargo']
empty_df = pd.DataFrame({el:[] for el in table_cols})

class Drilldown(param.Parameterized):
    selection = param.DataFrame(empty_df)
    
    @param.depends('selection')
    def update_table(self, *args, **kwargs):
        return pn.widgets.DataFrame(self.selection, show_index=False)
    
drilldown = Drilldown()

In [None]:
vessel_types = pd.read_csv("AIS_categories.csv")

# Needs work
def vessel_type_from_str(val):
    i = int(str(val)[0:2])
    return vessel_types.iloc[i].desc if i in vessel_types.index else f'{i}'
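
One case the function above does not handle is a missing (NaN) VesselType: `str(val)[0:2]` is then `'na'`, and `int('na')` raises `ValueError`. A defensive sketch of the same two-character lookup, assuming a small hypothetical dictionary `EXAMPLE_TYPES` in place of `AIS_categories.csv` (whose layout is not shown in this notebook):

```python
# Hypothetical lookup table standing in for AIS_categories.csv
EXAMPLE_TYPES = {30: "Fishing", 70: "Cargo", 80: "Tanker"}

def describe_vessel_type(val, table=EXAMPLE_TYPES):
    """Map a raw VesselType value to a description, tolerating missing values."""
    text = str(val)[:2]
    if not text.isdecimal():
        return "Unknown"  # covers NaN ('na'), None ('No'), and empty strings
    code = int(text)
    return table.get(code, str(code))  # fall back to the bare code

cargo = describe_vessel_type(70.0)
missing = describe_vessel_type(float("nan"))
unlisted = describe_vessel_type(99)
```

Whether unknown values should be shown as "Unknown" or left blank in the table is a presentation choice left open here.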

In [None]:
def markerfn(index, table_cols=table_cols):
    if len(index) > 0:
        rows = [marked_points.data.iloc[ind] for ind in index]
        df = pd.DataFrame(rows)[table_cols]
        df.columns = table_cols
        #df['VesselType'] = df['VesselType'].apply(vessel_type_from_str)
        drilldown.selection = df

    return hv.HLine(0).opts(visible=False)

In [None]:
points = rasterize(hv.Points(zones, ['x','y']).redim.range(**bounds)).opts(opts)

title = "# AIS vessel locations"

message = ("Select a time covered by this dataset and press return to see ship locations at that time "
           "(after a few seconds, due to unoptimized time-filtering code). Then click on a ship "
           "to see more information about it.")

dmap = hv.DynamicMap(mark_vessels, streams=[dt_input.param.value])
overlay2 = (tiles * points * dmap)
marker = hv.DynamicMap(markerfn, streams=[hv.streams.Selection1D(source=dmap)]).opts(tools=['tap'])

pn.Column(title, message, dt_input, overlay2 * marker, drilldown.update_table).servable()

This app can now be launched as a separate server using a command like `panel serve <notebookname>.ipynb`, then visiting the URL that is printed as output.