# Analyse AIS information

In this notebook, we show how to retrieve and analyse information from AIS (Automatic Identification System) messages. We use the public AIS history of the Danish Maritime Authority as an example. An overview of the AIS archive is available [here](http://web.ais.dk/aisdata/).

We take the following steps:

- [Download](#Download)
- [Read the data](#Read-the-data)


## Download
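We download the data with `wget` (a command-line download tool) and extract it with the `unzip` command. The `-c` flag lets `wget` continue a partial download, so rerunning the cell on a complete file gives the `416 Requested Range Not Satisfiable` response shown below.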

In [5]:
import pathlib

In [6]:
data_dir = pathlib.Path('~/data/ais/dma').expanduser()

In [7]:
!wget --directory-prefix {data_dir} -c http://web.ais.dk/aisdata/aisdk-2022-04-23.zip 

--2022-04-26 11:49:57--  http://web.ais.dk/aisdata/aisdk-2022-04-23.zip
Resolving web.ais.dk (web.ais.dk)... 185.153.153.66
Connecting to web.ais.dk (web.ais.dk)|185.153.153.66|:80... connected.
HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable

    The file is already fully retrieved; nothing to do.



In [10]:
# This relies on a Unix shell and the unzip tool; on Windows the command may differ
!cd {data_dir} && unzip -o aisdk-2022-04-23.zip 

Archive:  aisdk-2022-04-23.zip
  inflating: aisdk-2022-04-23.csv    
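
Since `wget` and `unzip` are not available on every platform, the same two steps can also be sketched with the Python standard library only. The helper names `download` and `extract` are made up for this example. Unlike `wget -c`, the sketch does not resume partial downloads; it only skips a file that is already complete.

```python
import pathlib
import shutil
import urllib.request
import zipfile


def download(url, target_dir):
    """Download url into target_dir and return the local path (hypothetical helper)."""
    target_dir = pathlib.Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / url.rsplit('/', 1)[-1]
    if not target.exists():
        # Write to a temporary name first, so an existing target is always complete
        partial = target.with_name(target.name + '.part')
        with urllib.request.urlopen(url) as response, open(partial, 'wb') as f:
            shutil.copyfileobj(response, f)
        partial.replace(target)
    return target


def extract(zip_path):
    """Extract all members next to the archive and return their paths."""
    zip_path = pathlib.Path(zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(zip_path.parent)
        return [zip_path.parent / name for name in archive.namelist()]
```

In this notebook it could be called as `extract(download('http://web.ais.dk/aisdata/aisdk-2022-04-23.zip', data_dir))`.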


## Read the data
We use pandas, geopandas and movingpandas together to read the data.
Pandas reads the CSV file as a table, geopandas adds a geometry for each position, and movingpandas groups the timestamped positions into trajectories.
As a result, we can treat the data as a table, as geospatial features, or as trajectories of moving objects.

In [11]:
import pandas as pd
import geopandas as gpd
import movingpandas 

In [46]:
ais_df = pd.read_csv(data_dir / 'aisdk-2022-04-23.csv')


In [48]:
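# Subset for this example; copy so that later assignments do not act on a slice
ais_df = ais_df[:100000].copy()

# Timestamps are written day first, e.g. 23/04/2022 00:00:00
ais_df['t'] = pd.to_datetime(ais_df['# Timestamp'], format='%d/%m/%Y %H:%M:%S')

# Some latitudes are above 90 (for example 91); clamp them to 90
too_high_idx = ais_df['Latitude'] > 90
ais_df.loc[too_high_idx, 'Latitude'] = 90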





In [49]:
ais_df.head()

| | # Timestamp | Type of mobile | MMSI | Latitude | Longitude | Navigational status | ROT | SOG | COG | Heading | ... | Type of position fixing device | Draught | Destination | ETA | Data source type | A | B | C | D | t |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 23/04/2022 00:00:00 | Class A | 219024178 | 54.571818 | 11.928717 | Under way using engine | 0.0 | 0.0 | 187.2 | 143.0 | ... | Undefined | | | | AIS | | | | | 2022-04-23 |
| 1 | 23/04/2022 00:00:00 | Class A | 219076000 | 54.45587 | 12.23491 | Under way using engine | 3.6 | 9.7 | 18.4 | 20.0 | ... | Undefined | | | | AIS | | | | | 2022-04-23 |
| 2 | 23/04/2022 00:00:00 | Class A | 210056000 | 55.511542 | 12.697775 | Under way using engine | -0.4 | 12.4 | 191.2 | 190.0 | ... | Undefined | | | | AIS | | | | | 2022-04-23 |
| 3 | 23/04/2022 00:00:00 | Class A | 265411000 | 54.672188 | 12.40199 | Under way using engine | 0.0 | 14.1 | 236.9 | 236.0 | ... | Undefined | | | | AIS | | | | | 2022-04-23 |
| 4 | 23/04/2022 00:00:00 | Class A | 249616000 | 54.8185 | 12.850367 | Under way using engine | 0.0 | 10.9 | 251.0 | 249.0 | ... | Undefined | | | | AIS | | | | | 2022-04-23 |
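
The `# Timestamp` values in this preview are written day first (`23/04/2022`). A minimal sketch of why an explicit `format` in `pd.to_datetime` is safer than relying on inference: an ambiguous date such as `01/04/2022` is read month first by default. The sample strings are made up for illustration.

```python
import pandas as pd

# Made-up sample in the same layout as the '# Timestamp' column
sample = pd.Series(['01/04/2022 00:00:00', '01/04/2022 00:00:10'])

# An explicit day-first format reads these as 1 April 2022
explicit = pd.to_datetime(sample, format='%d/%m/%Y %H:%M:%S')

# Without a format, a single ambiguous string is read month first (4 January)
default = pd.to_datetime('01/04/2022 00:00:00')

print(explicit.dt.month.tolist())  # [4, 4]
print(default.month)               # 1
```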


In [50]:
# Add a point geometry for each position (x = longitude, y = latitude)
geometry = gpd.points_from_xy(ais_df.Longitude, ais_df.Latitude)
ais_gdf = gpd.GeoDataFrame(ais_df, geometry=geometry)

In [51]:
# TODO: make plots using an hvplot heatmap
import hvplot.pandas  # noqa: F401 (registers the .hvplot accessor)
ais_gdf.hvplot()
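
The TODO above mentions a heatmap. One possible way to prepare the data for it, independent of any plotting library, is to bin the positions on a regular longitude/latitude grid with `numpy.histogram2d`. This is a swapped-in technique, not hvplot's own heatmap; the function name `density_grid`, the bin count and the sample positions are assumptions made for this sketch.

```python
import numpy as np
import pandas as pd


def density_grid(df, bins=50):
    """Count positions per longitude/latitude cell (hypothetical helper)."""
    counts, lon_edges, lat_edges = np.histogram2d(
        df['Longitude'], df['Latitude'], bins=bins
    )
    return counts, lon_edges, lat_edges


# Made-up positions: three messages in one spot, one further away
positions = pd.DataFrame({
    'Longitude': [11.9, 11.9, 11.9, 12.7],
    'Latitude': [54.5, 54.5, 54.5, 55.5],
})
counts, lon_edges, lat_edges = density_grid(positions, bins=2)
print(counts)
```

The returned array is indexed as `counts[lon_bin, lat_bin]`, so it usually needs to be transposed before it is shown as an image with latitude on the vertical axis.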

In [88]:
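# Use the first 3000 messages, with the time as index
ais_ts_df = ais_gdf.head(3000).set_index('t')
# Positions are longitude/latitude in WGS 84
ais_ts_df = ais_ts_df.set_crs('EPSG:4326')
# Drop messages with a longitude of exactly 0
ais_ts_df = ais_ts_df[ais_ts_df.Longitude != 0]

# Group the messages into one trajectory per vessel (MMSI)
trajectories = movingpandas.TrajectoryCollection(ais_ts_df, traj_id_col='MMSI')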
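
Clamping latitudes to 90 and dropping a longitude of 0 both deal with positions that are not real. An alternative is to filter on the valid coordinate ranges in one step. The sketch assumes that out-of-range values mark missing positions (the AIS standard uses latitude 91 and longitude 181 for "not available"). `valid_positions` is a hypothetical helper, not part of pandas or movingpandas.

```python
import pandas as pd


def valid_positions(df):
    """Keep rows whose coordinates fall inside the valid lon/lat ranges."""
    ok = df['Latitude'].between(-90, 90) & df['Longitude'].between(-180, 180)
    return df[ok]


# Made-up rows: one valid position and one 'not available' position
sample = pd.DataFrame({
    'MMSI': [219024178, 111111111],
    'Latitude': [54.571818, 91.0],
    'Longitude': [11.928717, 181.0],
})
print(valid_positions(sample)['MMSI'].tolist())  # [219024178]
```

Filtering drops the affected messages instead of moving them to the pole, which keeps spurious points out of the trajectories at the cost of losing the other fields in those rows.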

In [None]:
import logging
import warnings

import matplotlib.pyplot as plt
import shapely
from shapely.errors import ShapelyDeprecationWarning

# Silence the deprecation warnings that shapely emits
warnings.filterwarnings("ignore", category=ShapelyDeprecationWarning)

trajectories.hvplot()
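
Because only the first 3000 messages are used, many vessels may contribute just a handful of positions. A quick check before plotting is to count the messages and the covered time span per `MMSI` with plain pandas. The sketch uses made-up rows, and `summarise_vessels` is a hypothetical helper. As far as we know, movingpandas needs at least two positions to form a trajectory, so vessels with a single message would likely be missing from the collection.

```python
import pandas as pd


def summarise_vessels(df):
    """Return message count and time span per MMSI (hypothetical helper)."""
    grouped = df.groupby('MMSI')['t']
    summary = pd.DataFrame({
        'n_messages': grouped.size(),
        'duration': grouped.max() - grouped.min(),
    })
    return summary


# Made-up messages: two for the first vessel, one for the second
sample = pd.DataFrame({
    'MMSI': [219024178, 219024178, 219076000],
    't': pd.to_datetime(['2022-04-23 00:00:00', '2022-04-23 00:00:10',
                         '2022-04-23 00:00:00']),
})
summary = summarise_vessels(sample)

# Vessels with at least two messages
print(summary[summary['n_messages'] >= 2].index.tolist())  # [219024178]
```

In this notebook the time is stored in the index, so the call would be `summarise_vessels(ais_ts_df.reset_index())`.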