<h1> Ship Type Prediction </h1>

<p align='justify'>

This Jupyter notebook walks through a classification example built
with the Scikit-Learn library. It performs the following steps:
</p>

<ol align='justify'>
    <li> The data is preprocessed (feature generation, filtering, and
         interpolation) using the NumMobility library.
    </li>
    <li> Several models, such as RandomForestClassifier and KMeans,
         are then trained on the cleaned dataset using the
         Scikit-Learn library.
    </li>
    <li>
        Finally, ship types are predicted on the interpolated dataset
        and the accuracy of the predictions is checked.
    </li>
</ol>

In [1]:
# Import the dataset.

import pandas as pd
from Nummobility.core.TrajectoryDF import NumPandasTraj

pdf = pd.read_csv('./data/ships.csv')
np_ships = NumPandasTraj(data_set=pdf,
                         latitude='lat',
                         longitude='lon',
                         datetime='Timestamp',
                         traj_id='Name')
np_ships.head()

traj_id,DateTime,lat,lon,MMSI,NavStatus,SOG,COG,ShipType
AB RAMANTENN,2017-05-07 00:13:05,11.905735,57.681092,265902200,Moored,0.1,170.7,Undefined
AB RAMANTENN,2017-05-07 00:25:04,11.90574,57.68107,265902200,Moored,0.1,170.7,Undefined
AB RAMANTENN,2017-05-07 00:31:05,11.905792,57.68106,265902200,Moored,0.1,177.4,Undefined
AB RAMANTENN,2017-05-07 01:01:05,11.90565,57.681127,265902200,Moored,0.0,175.6,Undefined
AB RAMANTENN,2017-05-07 01:07:05,11.9057,57.681107,265902200,Moored,0.1,180.8,Undefined


In [2]:
%%time

# Using NumMobility, generate the distance between consecutive points
# of each trajectory. The Hampel filter in the next cell uses this
# column to remove outliers.
from Nummobility.features.spatial_features import SpatialFeatures
from Nummobility.preprocessing.filters import Filters

dist_ships = SpatialFeatures.create_distance_between_consecutive_column(np_ships)
dist_ships.head()

CPU times: user 296 ms, sys: 12 ms, total: 308 ms
Wall time: 306 ms


traj_id,DateTime,lat,lon,MMSI,NavStatus,SOG,COG,ShipType,Distance_prev_to_curr
AB RAMANTENN,2017-05-07 00:13:05,11.905735,57.681092,265902200,Moored,0.1,170.7,Undefined,
AB RAMANTENN,2017-05-07 00:25:04,11.90574,57.68107,265902200,Moored,0.1,170.7,Undefined,2.457384
AB RAMANTENN,2017-05-07 00:31:05,11.905792,57.68106,265902200,Moored,0.1,177.4,Undefined,5.883613
AB RAMANTENN,2017-05-07 01:01:05,11.90565,57.681127,265902200,Moored,0.0,175.6,Undefined,17.391237
AB RAMANTENN,2017-05-07 01:07:05,11.9057,57.681107,265902200,Moored,0.1,180.8,Undefined,5.970428
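
To make the new `Distance_prev_to_curr` column concrete, here is a minimal sketch of a great-circle (haversine) distance between two consecutive points. NumMobility's own implementation is not shown in this notebook, so the formula, the Earth radius, and the helper name `haversine_m` are assumptions made for illustration only. With the first two rows of the table above, the sketch gives a value close to the 2.457384 reported in the second row, which suggests the column is expressed in metres.

```python
from math import asin, cos, radians, sin, sqrt

# Mean Earth radius in metres (an assumed value; the library may use another).
EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees.

    Hypothetical helper for illustration; not part of NumMobility.
    """
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# First two rows of the table above, with the columns taken as labelled there.
points = [(11.905735, 57.681092), (11.90574, 57.68107)]
step = haversine_m(*points[0], *points[1])
print(f"{step:.3f} m")
```

The first row of each trajectory has no previous point, which is why its distance is empty in the output above.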


In [3]:
%%time

# Drop the rows whose consecutive-point distance is flagged as an
# outlier by the Hampel filter.
filt_ships = Filters.hampel_outlier_detection(dist_ships,
                                              column_name='Distance_prev_to_curr')
print(f"Length of original DF: {len(dist_ships)}")
print(f"Length of Filtered DF: {len(filt_ships)}")

Length of original DF: 84702
Length of Filtered DF: 61394
CPU times: user 224 ms, sys: 68.2 ms, total: 292 ms
Wall time: 7.08 s
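
The filter above removed 23,308 of the 84,702 rows. As a rough illustration of how a Hampel filter decides what counts as an outlier, the sketch below flags a value when it lies more than a few scaled median absolute deviations (MAD) away from the median of its sliding window. The window size, the threshold, and the helper name `hampel_outliers` are assumptions; the defaults used by `Filters.hampel_outlier_detection` are not shown in this notebook and may differ.

```python
from statistics import median

# Scale factor that makes the MAD comparable to a standard deviation
# for normally distributed data.
MAD_SCALE = 1.4826


def hampel_outliers(values, half_window=2, n_sigmas=3.0):
    """Return one boolean per value: True where the value is an outlier.

    Hypothetical helper for illustration; not part of NumMobility.
    """
    flags = []
    for i, x in enumerate(values):
        # Window centred on i, truncated at the ends of the sequence.
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        med = median(window)
        mad = median(abs(v - med) for v in window)
        flags.append(abs(x - med) > n_sigmas * MAD_SCALE * mad)
    return flags


# Made-up distances with one obvious spike at index 4.
distances = [1.0, 1.1, 0.9, 1.0, 50.0, 1.05, 0.95, 1.0, 1.1]
flags = hampel_outliers(distances)
kept = [d for d, is_outlier in zip(distances, flags) if not is_outlier]
print(flags)
print(kept)
```

Because the median and the MAD are barely affected by a single extreme value, the spike does not inflate its own threshold, which is the main reason to prefer this rule over a mean and standard deviation test.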


