# Voyages API: Voyages Data Like Use Case

## Run this example in [Colab](https://colab.research.google.com/github/SignalOceanSdk/SignalSDK/blob/master/docs/examples/jupyter/VoyagesAPI/VoyagesAPI-VoyagesDataLike.ipynb). 

A Voyage is defined as a sequence of Load operations followed by a sequence of Discharges. Users of the **Signal Ocean Platform** interact with the concept of a voyage at different levels of detail. For example, in the Voyages tab of Vessels Data (https://app.signalocean.com/vessels), users can see all the operations of a voyage, even at jetty level.  
However, users often need to analyze the voyages of a specific vessel class over a specific time window. The **Voyages Data Dashboard** (https://app.signalocean.com/reportsindex/voyagesdatalive) serves this need.  

The level of detail in the Voyages Data Dashboard is tailored to provide the information necessary for such an analysis, without overwhelming the user with the full voyage data that the Signal Ocean Platform holds.  

While both the `get_voyages` and `get_voyages_flat` functions of the Signal SDK return the full low-level data available, in this example we construct a dataframe that resembles the form of the ***Voyages Data Dashboard***.

## Setup
Install the Signal Ocean SDK:
```
pip install signal-ocean
```
Set your subscription key, which you can obtain here: https://apis.signalocean.com/profile

In [1]:
pip install signal-ocean

In [2]:
signal_ocean_api_key = '' #replace with your subscription key

In [3]:
from signal_ocean import Connection
from signal_ocean.voyages import VoyagesAPI
import pandas as pd
import numpy as np
from datetime import date, timedelta, datetime, timezone
from dateutil.relativedelta import relativedelta

In [4]:
pd.set_option('display.max_columns', None)

In [5]:
connection = Connection(signal_ocean_api_key)
api = VoyagesAPI(connection)

### Get voyages

In [6]:
vlcc_id = 84
date_from = date.today() - relativedelta(months=6)

In [7]:
voyages = api.get_voyages(vessel_class_id=vlcc_id, date_from=date_from)

In [8]:
voyages = pd.DataFrame(v.__dict__ for v in voyages)
events = pd.DataFrame(e.__dict__ for voyage_events in voyages['events'].dropna() for e in voyage_events)
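
To see what the two comprehensions above produce, here is a minimal sketch that uses stand-in objects instead of real SDK results. `FakeVoyage` and `FakeEvent` are hypothetical classes invented for this illustration; the objects returned by the SDK carry many more attributes.

```python
import pandas as pd

# Hypothetical stand-ins for SDK voyage and event objects (illustration only).
class FakeEvent:
    def __init__(self, purpose, port_name):
        self.purpose = purpose
        self.port_name = port_name

class FakeVoyage:
    def __init__(self, imo, voyage_number, events):
        self.imo = imo
        self.voyage_number = voyage_number
        self.events = events

sample_voyages = [
    FakeVoyage(1111111, 1, (FakeEvent('Load', 'Port A'),
                            FakeEvent('Discharge', 'Port B'))),
    FakeVoyage(1111111, 2, None),  # a voyage without any events
]

# One row per voyage; every attribute becomes a column.
voyages_df = pd.DataFrame(v.__dict__ for v in sample_voyages)

# One row per event, skipping voyages whose event list is missing.
events_df = pd.DataFrame(
    e.__dict__
    for voyage_events in voyages_df['events'].dropna()
    for e in voyage_events
)
```

The `events` column of the voyages frame still holds the original event objects, which is what the later cells rely on.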

In [9]:
# Convert end_date to timezone-aware datetimes; values that cannot be converted become NaT
voyages.end_date = pd.to_datetime(voyages.end_date, errors = 'coerce', utc = True)
# Drop voyages without a valid end date
voyages.dropna(subset = ['end_date'], inplace = True)
# Filter out predicted voyages that have not yet started
voyages = voyages[voyages.start_date.dt.date <= date.today()]

In [10]:
def get_open_load_discharge_events(voyage_events):
    # Pick the Start event, the first Load event and the last Discharge event of a voyage
    open_event = next((e.__dict__ for e in voyage_events or [] if e.purpose=='Start'), None)
    load_event = next((e.__dict__ for e in voyage_events or [] if e.purpose=='Load'), None)
    # Apply the fallback before reversing: reversed(None) would raise a TypeError
    discharge_event = next((e.__dict__ for e in reversed(voyage_events or []) if e.purpose=='Discharge'), None)
    return pd.Series((open_event,load_event, discharge_event))
    
voyages[['open_event','load_event','discharge_event']] = voyages['events'].apply(get_open_load_discharge_events)
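
The selection rule used above (the Start event, the first Load and the last Discharge) can be checked in isolation. The following is a minimal sketch on hypothetical events built with `SimpleNamespace`; the port names are placeholders, and `pick_key_events` is a helper named for this illustration only.

```python
from types import SimpleNamespace

def pick_key_events(voyage_events):
    """Return (first Start, first Load, last Discharge); None where absent."""
    events = list(voyage_events or [])  # tolerate a missing event list

    def first_with_purpose(purpose, candidates):
        return next((e for e in candidates if e.purpose == purpose), None)

    return (
        first_with_purpose('Start', events),
        first_with_purpose('Load', events),
        # Scanning the reversed list makes the first match the last Discharge.
        first_with_purpose('Discharge', reversed(events)),
    )

# Hypothetical voyage: two loads followed by two discharges.
sample_events = [
    SimpleNamespace(purpose='Start', port_name='Port A'),
    SimpleNamespace(purpose='Load', port_name='Port B'),
    SimpleNamespace(purpose='Load', port_name='Port C'),
    SimpleNamespace(purpose='Discharge', port_name='Port D'),
    SimpleNamespace(purpose='Discharge', port_name='Port E'),
]

picked_ports = [e.port_name for e in pick_key_events(sample_events)]
empty_result = pick_key_events(None)
```

Here `picked_ports` holds Port A, Port B and Port E, and a voyage without events yields three `None` values.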

In [11]:
mapping_dict = {'port_name':['starting_port','first_load_port','last_discharge_port'],
                'area_name_level0':['starting_area','first_load_area','last_discharge_area'], 
                'country':['starting_country','first_load_country','last_discharge_country'],
                'arrival_date':['open_port_arrival_date','first_load_port_arrival_date','last_discharge_port_arrival_date'],
                'sailing_date':['open_port_sailing_date','first_load_port_sailing_date','last_discharge_port_sailing_date'], 
                }

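# Position in each target list -> column holding the corresponding event.
# Named event_columns so it does not overwrite the events dataframe built earlier.
event_columns = {0:'open_event',1:'load_event',2:'discharge_event'}

In [12]:
# For each event attribute, fill the target columns from the open, load and discharge events
for feature,targets in mapping_dict.items():
    for num,target in enumerate(targets):
        voyages[target] = voyages[event_columns[num]].apply(lambda e: e[feature] if isinstance(e,dict) else None)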

In [13]:
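def get_start_time_of_operation(event):
    # Skip events that are not port calls, and port calls that lie in the future
    if (event['event_type'] == 'PortCall') and (event['event_horizon'] != 'Future'):
        next_event_detail = next((ed.__dict__ for ed in event['event_details'] or []), None)
        # Guard against port calls without any event details
        if next_event_detail is not None:
            return next_event_detail['start_time_of_operation']
    return None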

In [14]:
voyages.loc[voyages.load_event.notna(),'first_load_port_start_time_of_operation'] = (
   voyages.loc[voyages.load_event.notna()].load_event.apply(get_start_time_of_operation)
)
# Use the discharge_event mask on both sides of the assignment
voyages.loc[voyages.discharge_event.notna(),'last_discharge_port_start_time_of_operation'] = (
   voyages.loc[voyages.discharge_event.notna()].discharge_event.apply(get_start_time_of_operation)
)

voyages.first_load_port_start_time_of_operation = pd.to_datetime(voyages.first_load_port_start_time_of_operation)
voyages.last_discharge_port_start_time_of_operation = pd.to_datetime(voyages.last_discharge_port_start_time_of_operation)

In [15]:
def get_sts_load_ind(load_event):
    return next((True for d in load_event["event_details"] or [] if d.event_detail_type =='StS'), False)

def get_sts_discharge_ind(discharge_event):
    return next((True for d in discharge_event["event_details"] or [] if d.event_detail_type =='StS'), False)


voyages.loc[voyages.discharge_event.notna(),'sts_discharge_ind'] = \
voyages.loc[voyages.discharge_event.notna(),'discharge_event'].apply(get_sts_discharge_ind)
voyages.loc[voyages.load_event.notna(),'sts_load_ind'] = \
voyages.loc[voyages.load_event.notna(),'load_event'].apply(get_sts_load_ind)
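
The two indicator functions above apply the same check to different events, so a single helper based on `any` would serve both. This is a minimal sketch on hypothetical event dictionaries; `has_sts_detail` is a name invented here, and `'Other'` is a placeholder detail type rather than a value taken from the SDK.

```python
from types import SimpleNamespace

def has_sts_detail(event):
    """True if any detail of the event dict is a ship-to-ship (StS) operation."""
    details = event.get('event_details') or []  # treat missing details as empty
    return any(d.event_detail_type == 'StS' for d in details)

# Hypothetical events shaped like the dictionaries stored in load_event / discharge_event.
sts_event = {'event_details': (SimpleNamespace(event_detail_type='Other'),
                               SimpleNamespace(event_detail_type='StS'))}
plain_event = {'event_details': (SimpleNamespace(event_detail_type='Other'),)}
no_details_event = {'event_details': None}

sts_flags = [has_sts_detail(e) for e in (sts_event, plain_event, no_details_event)]
```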

In [16]:
def get_repairs_ind(events):
    for ev in events:
        if ev.purpose == 'Dry dock':
            return True
    return False

In [17]:
voyages['repairs_ind'] = voyages.events.apply(get_repairs_ind)

In [18]:
def get_storage_ind(events):
    for ev in events:
        if ev.purpose == 'StorageVessel':
            return True
    return False

In [19]:
voyages['storage_ind'] = voyages.events.apply(get_storage_ind)
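
`get_repairs_ind` and `get_storage_ind` differ only in the purpose they look for. One possible generalization, sketched here under the assumption that an event list may also be missing (`None`), is a single helper specialized with `functools.partial`. The names `has_purpose`, `is_repairs` and `is_storage` are hypothetical.

```python
from functools import partial
from types import SimpleNamespace

def has_purpose(events, purpose):
    """True if any event in the (possibly missing) list has the given purpose."""
    return any(ev.purpose == purpose for ev in (events or []))

# Specialized versions for the two purposes used in the cells above.
is_repairs = partial(has_purpose, purpose='Dry dock')
is_storage = partial(has_purpose, purpose='StorageVessel')

# Hypothetical event lists for illustration.
dry_dock_voyage = [SimpleNamespace(purpose='Start'), SimpleNamespace(purpose='Dry dock')]
storage_voyage = [SimpleNamespace(purpose='StorageVessel')]

repairs_flags = [is_repairs(v) for v in (dry_dock_voyage, storage_voyage, None)]
storage_flags = [is_storage(v) for v in (dry_dock_voyage, storage_voyage, None)]
```

Either partial can be passed to `voyages.events.apply(...)` in place of the dedicated functions.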

In [20]:
voyages['local_trade_ind'] = voyages.apply(
    lambda row: row['first_load_country'] == row['last_discharge_country'],
    axis = 1
)
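
Note that the comparison above also evaluates to `True` when both countries are missing, because `None == None` holds. The first row of the output further down (a voyage with no load or discharge port and `LocalTradeInd` set to `True`) shows this case. If that is not the intended behaviour, a stricter variant could look like the following sketch; `is_local_trade` is a hypothetical helper, not part of the SDK.

```python
def is_local_trade(load_country, discharge_country):
    """True only when both countries are known and identical."""
    if not load_country or not discharge_country:
        return False  # a missing country never counts as local trade
    return load_country == discharge_country

same_country = is_local_trade('Malaysia', 'Malaysia')
different_country = is_local_trade('Jordan', 'Egypt')
both_missing = is_local_trade(None, None)
plain_comparison = (None == None)  # what the row-wise lambda above computes
```

It can be applied row-wise in the same way, for example with `voyages.apply(lambda row: is_local_trade(row['first_load_country'], row['last_discharge_country']), axis = 1)`.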

In [21]:
vessel_status_dict = {
    1:"Voyage", 2:"Breaking", 3:"Domestic Trade", 4:"FPSO", 5:"FPSO Conversion", 
    6:"Inactive", 7:"Storage Vessel", 9:"Conversion"
}
voyages['vessel_status'] = voyages.vessel_status_id.replace(vessel_status_dict)

In [22]:
commercial_status_dict = {
    0:"OnSubs", 1:"FullyFixed", 2:"Failed", 3:"Cancelled", 4:"Available", 
    -1:"Unknown", -2:"NotSet"
}
voyages['commercial_status'] = voyages.fixture_status_id.replace(commercial_status_dict)
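
Both status columns are built with `Series.replace`, which leaves any id that is not in the dictionary unchanged. `Series.map` would turn such ids into missing values instead. The sketch below contrasts the two on a made-up series; the id `8` is used only because it is absent from `vessel_status_dict`.

```python
import pandas as pd

status_names = {1: "Voyage", 2: "Breaking"}  # shortened copy of vessel_status_dict
status_ids = pd.Series([1, 8])               # 8 has no entry in the dictionary

replaced = status_ids.replace(status_names)  # unmapped ids are kept as they are
mapped = status_ids.map(status_names)        # unmapped ids become missing values
```

Which behaviour is preferable depends on whether unknown ids should stay visible in the final table.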

In [23]:
wanted_columns = ['vessel_name',
                  'imo',
                  'vessel_class',
                  'commercial_operator',
                  'voyage_number',
                  'start_date',
                  'end_date',
                  'starting_port',
                  'first_load_port',
                  'last_discharge_port',
                  'first_load_port_arrival_date',
                  'first_load_port_start_time_of_operation',
                  'first_load_port_sailing_date',
                  'last_discharge_port_arrival_date',
                  'last_discharge_port_start_time_of_operation',
                  'last_discharge_port_sailing_date',
                  'charterer',
                  'rate',
                  'rate_type',
                  'laycan_from',
                  'laycan_to',
                  'quantity',
                  'cargo_group',
                  'cargo_type',
                  'cargo_type_source',
                  'fixture_is_coa',
                  'fixture_is_hold',
                  'fixture_date',
                  'trade',
                  'vessel_status',
                  'commercial_status',
                  'starting_country',
                  'starting_area',
                  'first_load_country',
                  'first_load_area',
                  'last_discharge_country',
                  'last_discharge_area',
                  'sts_load_ind',
                  'sts_discharge_ind',
                  'storage_ind',
                  'repairs_ind',
                  'is_implied_by_ais',
                  'local_trade_ind',
                  'has_manual_entries',
                  'ballast_distance',
                  'laden_distance'
                 ]

# Take a copy so that later column assignments do not act on a slice of the original frame
voyages = voyages[wanted_columns].copy()

In [24]:
def snake_to_camel(word):
    # Capitalize each underscore-separated part; empty parts are kept as '_'
    return ''.join(x.capitalize() or '_' for x in word.split('_'))
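
A quick check of what the conversion does to a few column names, including a name with a leading underscore (a hypothetical case that does not occur in `wanted_columns`):

```python
def snake_to_camel(word):
    # Same logic as the cell above: empty parts (from leading or doubled
    # underscores) are emitted as '_' instead of being dropped.
    return ''.join(x.capitalize() or '_' for x in word.split('_'))

converted = [snake_to_camel(name) for name in
             ('first_load_port', 'imo', 'is_implied_by_ais', '_private_name')]
# converted == ['FirstLoadPort', 'Imo', 'IsImpliedByAis', '_PrivateName']
```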

In [25]:
voyages.columns = [*map(snake_to_camel, voyages.columns)]
voyages

Unnamed: 0,VesselName,Imo,VesselClass,CommercialOperator,VoyageNumber,StartDate,EndDate,StartingPort,FirstLoadPort,LastDischargePort,FirstLoadPortArrivalDate,FirstLoadPortStartTimeOfOperation,FirstLoadPortSailingDate,LastDischargePortArrivalDate,LastDischargePortStartTimeOfOperation,LastDischargePortSailingDate,Charterer,Rate,RateType,LaycanFrom,LaycanTo,Quantity,CargoGroup,CargoType,CargoTypeSource,FixtureIsCoa,FixtureIsHold,FixtureDate,Trade,VesselStatus,CommercialStatus,StartingCountry,StartingArea,FirstLoadCountry,FirstLoadArea,LastDischargeCountry,LastDischargeArea,StsLoadInd,StsDischargeInd,StorageInd,RepairsInd,IsImpliedByAis,LocalTradeInd,HasManualEntries,BallastDistance,LadenDistance
0,Hapon,9102241,VLCC,Bahri,131,2021-09-27 13:22:23.500000+00:00,2021-10-08 00:00:00+00:00,Dongjiangkou,,,NaT,NaT,NaT,NaT,NaT,NaT,,,,NaT,NaT,,Dirty,Crude,Estimated,,,NaT,Crude,Inactive,,China,North China,,,,,,,False,False,,True,,179.56,
1,Arman 114,9116412,VLCC,NITC,30,2021-10-11 19:59:10+00:00,2021-12-06 01:54:00.500000+00:00,Port Said,Aqabah,Suez,2021-10-13 14:11:21.500000+00:00,NaT,2021-11-21 04:08:52.500000+00:00,2021-11-23 23:50:54.500000+00:00,NaT,2021-12-06 01:54:00.500000+00:00,,,,NaT,NaT,,Dirty,Crude,Estimated,,,NaT,Crude,Voyage,,Egypt,East Mediterranean,Jordan,Red Sea,Egypt,Red Sea,False,False,False,False,,False,,372.28,340.55
7,Karo,9182291,VLCC,New Shipping,66,2021-07-28 15:59:44+00:00,2021-09-25 01:50:26+00:00,Tanjung Pelepas,Fujairah,Tanjung Pelepas,2021-08-17 23:59:41+00:00,NaT,2021-08-31 13:32:11+00:00,2021-09-19 15:38:47+00:00,2021-09-20 07:55:05+00:00,2021-09-25 01:50:26+00:00,,,,NaT,NaT,,Dirty,Crude,Estimated,,,NaT,Crude,Voyage,,Malaysia,Singapore / Malaysia,United Arab Emirates,Arabian Gulf,Malaysia,Singapore / Malaysia,False,True,False,False,,False,,3322.17,3204.19
8,Karo,9182291,VLCC,New Shipping,67,2021-09-25 01:50:26+00:00,2021-10-12 15:03:18+00:00,Tanjung Pelepas,Tanjung Pelepas,Tanjung Pelepas,2021-10-07 11:55:17+00:00,2021-10-07 11:55:17+00:00,2021-10-10 07:57:19+00:00,2021-10-10 11:42:16+00:00,2021-10-10 11:42:16+00:00,2021-10-12 15:03:18+00:00,,,,NaT,NaT,,Dirty,Fueloil,Estimated,,,NaT,Crude,Voyage,,Malaysia,Singapore / Malaysia,Malaysia,Singapore / Malaysia,Malaysia,Singapore / Malaysia,True,True,False,False,,True,,3.79,1.76
9,Karo,9182291,VLCC,New Shipping,68,2021-10-12 15:03:18+00:00,2021-12-15 11:52:08+00:00,Tanjung Pelepas,Fujairah,Tanjung Pelepas,2021-11-15 05:55:23.500000+00:00,NaT,2021-11-17 12:00:01.500000+00:00,2021-12-01 07:56:16+00:00,2021-12-01 11:59:43+00:00,2021-12-15 11:52:08+00:00,,,,NaT,NaT,,Dirty,Crude,Estimated,,,NaT,Crude,Voyage,,Malaysia,Singapore / Malaysia,United Arab Emirates,Arabian Gulf,Malaysia,Singapore / Malaysia,False,True,False,False,,False,,4536.70,2068.36
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1927,Grand Ambition,9909807,VLCC,Not set,1,2021-09-03 14:20:42+00:00,2021-12-15 11:59:36+00:00,Okpo/Geoje,Yosu,Lome,2021-10-14 12:33:04+00:00,2021-10-15 11:56:00+00:00,2021-10-20 07:47:13+00:00,2021-12-01 15:57:26+00:00,2021-12-04 07:59:40+00:00,2021-12-15 11:59:36+00:00,,,,NaT,NaT,,Dirty,Crude,Estimated,,,NaT,Product,Voyage,,"Korea, Republic of",Korea,"Korea, Republic of",Korea,Togo,Africa Atlantic Coast,False,True,False,True,,False,,1133.20,11038.99
1928,Grand Ambition,9909807,VLCC,Trafigura,2,2021-12-15 11:59:36+00:00,2022-01-02 00:32:33.430000+00:00,Lome,Lome,Lome,2021-12-15 15:55:01+00:00,2021-12-15 15:55:01+00:00,2021-12-19 15:57:58+00:00,2021-12-19 19:54:44+00:00,2021-12-19 19:54:44+00:00,2022-01-02 00:32:33.430000+00:00,,,,NaT,NaT,,Dirty,Crude,Estimated,,,NaT,Product,Voyage,,Togo,Africa Atlantic Coast,Togo,Africa Atlantic Coast,Togo,Africa Atlantic Coast,True,True,False,False,,True,,3.93,42.88
1929,Tateshina,9910117,VLCC,Not set,1,2021-10-27 01:04:54+00:00,2022-01-08 23:00:00+00:00,Qushan Island,Ruwais,Southwold,2021-12-07 19:33:57+00:00,2021-12-09 15:58:59+00:00,2021-12-12 07:58:22+00:00,2022-01-06 17:32:13.298000+00:00,NaT,2022-01-08 23:00:00+00:00,,,,NaT,NaT,,Dirty,Crude,Estimated,,,NaT,Crude,Voyage,,China,Central China,United Arab Emirates,Arabian Gulf,United Kingdom,British Isles,False,False,False,True,,False,,6216.48,4805.88
1931,Grand Bonanza,9915569,VLCC,Koch,1,2021-10-20 05:16:43+00:00,2021-12-12 11:55:27+00:00,Okpo/Geoje,Balboa,Singapore,2021-12-06 23:48:37+00:00,2021-12-07 14:11:13+00:00,2021-12-09 15:50:45+00:00,2021-12-11 19:55:27+00:00,NaT,2021-12-12 11:55:27+00:00,,,,NaT,NaT,,Dirty,Crude,Estimated,,,NaT,Product,Voyage,,"Korea, Republic of",Korea,Panama,West Coast Central America,Singapore,Singapore / Malaysia,True,False,False,True,,False,,1841.19,573.52


In [26]:
# Excel cannot store timezone-aware datetimes, so drop the timezone before exporting
datetime_columns = voyages.select_dtypes(include=['datetimetz']).columns

# Assign column by column so that each column takes the timezone-naive dtype
for column in datetime_columns:
    voyages[column] = voyages[column].dt.tz_localize(None)
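
The effect of the timezone removal can be verified on a small frame. The following sketch uses made-up data and selects the columns with the `'datetimetz'` selector of `select_dtypes`, which matches any timezone-aware datetime column.

```python
import pandas as pd

frame = pd.DataFrame({
    'start': pd.to_datetime(['2021-09-27 13:22:23', '2021-10-11 19:59:10'], utc=True),
    'name': ['A', 'B'],
})

# Only the timezone-aware datetime column is selected.
tz_columns = frame.select_dtypes(include=['datetimetz']).columns

# tz_localize(None) keeps the wall-clock time and removes the timezone.
for column in tz_columns:
    frame[column] = frame[column].dt.tz_localize(None)

first_start = frame['start'].iloc[0]
```

After the loop the timestamps are unchanged except that they no longer carry UTC.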

In [27]:
voyages.to_excel('voyages_data.xlsx', index = False)