# Voyages API: Voyages Data Like Use Case

## Run this example in [Colab](https://colab.research.google.com/github/SignalOceanSdk/SignalSDK/blob/master/docs/examples/jupyter/VoyagesAPI/VoyagesAPI-VoyagesDataLike.ipynb). 

<p style="text-align: justify"> 
     Floating Storages are laden vessels that remain stopped, instead of directly proceeding with the laden part of the voyage and the discharge of the cargo. This is usually done for trading reasons. The minimum duration for a stop to be classified as a Floating Storage is 7 days for concluded Stops and 3 days for ongoing ones. 
</p>
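
The duration rule above can be written as a small predicate. This is only an illustrative sketch: the function name `is_floating_storage`, its parameters, and the two constants are hypothetical and not part of the Signal SDK.

```python
# Minimum stop durations (in days), taken from the definition above.
MIN_DAYS_CONCLUDED = 7
MIN_DAYS_ONGOING = 3


def is_floating_storage(duration_days: float, ongoing: bool) -> bool:
    """Return True if a laden stop is long enough to count as floating storage.

    Hypothetical helper for illustration only; not an SDK function.
    """
    minimum = MIN_DAYS_ONGOING if ongoing else MIN_DAYS_CONCLUDED
    return duration_days >= minimum


# A 5-day stop qualifies only while it is still ongoing.
print(is_floating_storage(5, ongoing=True))
print(is_floating_storage(5, ongoing=False))
```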

<p style="text-align: justify"> 
    The need often arises to analyse the total quantity of oil held in floating storage, either globally or in a specific area or port, over a given time window. This is accommodated by the <b>VoyagesData API</b>.
</p>

Both `get_voyages_by_advanced_search` and `get_voyages_flat_by_advanced_search` of the Signal SDK facilitate this need. In this example, we will construct a dataframe with all the floating storage events of interest, from which a time series of total stored quantities will be derived.



## Setup
Install the Signal Ocean SDK:
```
pip install signal-ocean
```
Set your subscription key acquired here: https://apis.signalocean.com/profile

In [None]:
pip install signal-ocean

In [None]:
signal_ocean_api_key = '' #replace with your subscription key

In [None]:
from signal_ocean import Connection
from signal_ocean.voyages import VoyagesAPI
from signal_ocean.voyages import VesselClass, VesselClassFilter
import pandas as pd
from datetime import date, timedelta, datetime, timezone
from dateutil.relativedelta import relativedelta
import matplotlib.pyplot as plt
plt.style.use('seaborn-v0_8-darkgrid')

In [None]:
pd.set_option('display.max_columns', None)

In [None]:
connection = Connection(signal_ocean_api_key)
api = VoyagesAPI(connection)

### Get voyages

For this tutorial we will retrieve the voyages of VLCC vessels that started between July 2019 and July 2020.

In [None]:
#get vessel class id for vlcc
vc = api.get_vessel_classes(VesselClassFilter('vlcc'))[0]
vlcc_id = vc.vessel_class_id
vlcc_id

In [None]:
start_date_to = date(2020,7,31)
start_date_from = start_date_to - relativedelta(months=12)

In [None]:
%%time
voyages = api.get_voyages_by_advanced_search(
    vessel_class_id = vlcc_id,
    start_date_from = start_date_from,
    start_date_to = start_date_to
)
voyages = pd.DataFrame(v.__dict__ for v in voyages)
events = pd.DataFrame(e.__dict__ for voyage_events in voyages['events'].dropna() for e in voyage_events)
event_details = pd.DataFrame(d.__dict__ for event_detail in events['event_details'].dropna() for d in event_detail)

In [None]:
# extracting floating storage events
floating_storage_events_data = []

for iVoyage, r in voyages.iterrows():
    imo = r['imo']
    voyage_number = r['voyage_number']
    vessel_class = r['vessel_class']
    cargo_group = r['cargo_group']
    cargo_type = r['cargo_type']
    quantity = r['quantity']

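    # a voyage may have no events; use a separate name so the `events` dataframe is not overwritten
    voyage_events = r['events'] or []
    
    for event in voyage_events: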
        if not event.event_details:
            continue
        port_name = event.port_name
        country = event.country
        
        for event_detail in event.event_details:
            if not event_detail.floating_storage_start_date:
                continue
            floating_storage_start_date = event_detail.floating_storage_start_date
            floating_storage_duration = event_detail.floating_storage_duration
            
            floating_storage_events_data.append([
                imo, voyage_number, vessel_class,cargo_group,cargo_type, quantity,
                port_name, country, floating_storage_start_date, 
                floating_storage_duration
            ])

floating_storage_events_df = pd.DataFrame(floating_storage_events_data, 
                                           columns=['imo', 'voyage_number', 'vessel_class','cargo_group', 'cargo_type',
                                                    'quantity', 'port_name', 'country','floating_storage_start_date', 
                                                    'floating_storage_duration'
                                                   ])

Here the user can set a higher threshold for a Stop to be considered as a Floating Storage. In this example, we use 20 days.

In [None]:
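# define the threshold in days to filter floating_storage_duration
threshold = 20
floating_event_details = event_details.loc[event_details.floating_storage_duration >= threshold,
                                           ['event_id','floating_storage_start_date', 'floating_storage_duration']].copy()
# apply the same threshold to the dataframe used in the rest of the example
floating_storage_events_df = floating_storage_events_df[
    floating_storage_events_df.floating_storage_duration >= threshold].copy()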

In [None]:
floating_storage_events_df.head(2)

In [None]:
# keeping only the date part of floating_storage_start_date, since the floating_storage_duration is given in days
floating_storage_events_df['floating_storage_start_date'] = floating_storage_events_df.floating_storage_start_date.apply(lambda x: x.date())

In [None]:
excluded = ['Fueloil', 'Crude Condensate', 'Algerian Condensate', 'Agbami Condensate', 
            'Ichthys Condensate', 'High Sulphur Vacuum Gasoil']
floating_storage_events_df = floating_storage_events_df[(floating_storage_events_df.cargo_group == 'Dirty') &
                                                        (~floating_storage_events_df.cargo_type.isin(excluded))
                                                       ].copy()

In [None]:
floating_storage_events_df['floating_storage_end_date'] = floating_storage_events_df.apply(
    lambda r: r['floating_storage_start_date'] + relativedelta(days=r['floating_storage_duration']), axis=1)
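
The end date is simply the start date shifted by the duration in days. A minimal sketch with made-up values, using only the standard library (`timedelta` is used here in place of `relativedelta`; for whole days the two give the same result):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical floating storage event: a timestamp and a duration in days.
start_timestamp = datetime(2020, 4, 1, 13, 30, tzinfo=timezone.utc)
duration_days = 20

# Keep only the date part, then shift by the duration.
start = start_timestamp.date()
end = start + timedelta(days=duration_days)

print(start, end)  # 2020-04-01 2020-04-21
```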

In [None]:
def snake_to_camel(word):
    return ''.join(x.capitalize() or '_' for x in word.split('_'))
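
To make the renaming concrete, here is the same helper applied to a few of the column names used in this example. The function is repeated only so that the snippet runs on its own. Strictly speaking the result is PascalCase, since the first word is capitalised too.

```python
def snake_to_camel(word):
    # Capitalise each underscore-separated part; an empty part
    # (from a leading or doubled underscore) is kept as '_'.
    return ''.join(x.capitalize() or '_' for x in word.split('_'))


print(snake_to_camel('imo'))                          # Imo
print(snake_to_camel('voyage_number'))                # VoyageNumber
print(snake_to_camel('floating_storage_start_date'))  # FloatingStorageStartDate
```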

In [None]:
floating_storage_events_df.head(2)

In [None]:
floating_storage_events_df.columns = [*map(snake_to_camel, floating_storage_events_df.columns)]
floating_storage_events_df = floating_storage_events_df[['Imo', 'VoyageNumber', 'VesselClass', 'CargoType', 'Quantity',
                                                         'PortName', 'Country', 'FloatingStorageStartDate',
                                                         'FloatingStorageEndDate'
                                                        ]].copy()
floating_storage_events_df.head(2)

In [None]:
# min and max dates for consideration
date_min = date(2020, 2, 1)
date_max = date(2020, 8, 1)

delta = (date_max - date_min).days

In [None]:
oil_on_water_data = []

for iDay in range(delta):
    curr_date = date_min + relativedelta(days=iDay)
    quantity_on_water = floating_storage_events_df[(floating_storage_events_df.FloatingStorageStartDate <= curr_date) &
                                                   (floating_storage_events_df.FloatingStorageEndDate >= curr_date)
                                                  ].Quantity.sum()
    oil_on_water_data.append([curr_date, quantity_on_water])
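
The loop above counts, for each day, the quantity of every event whose start and end dates bracket that day. The following sketch reproduces the same logic in plain Python on two made-up events, so that the result can be checked by hand. The dates and quantities are hypothetical and do not come from the API.

```python
from datetime import date, timedelta

# Hypothetical events: (start date, end date, quantity in metric tonnes).
events = [
    (date(2020, 4, 1), date(2020, 4, 3), 270_000),
    (date(2020, 4, 2), date(2020, 4, 5), 280_000),
]

day_min = date(2020, 4, 1)
n_days = 5

series = []
for i in range(n_days):
    day = day_min + timedelta(days=i)
    # An event contributes on every day between its start and end, inclusive.
    total = sum(q for start, end, q in events if start <= day <= end)
    series.append((day, total))

for day, total in series:
    print(day, total)
```

On 2 and 3 April both events overlap, so the daily total is the sum of the two quantities; on the other days only one event contributes.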

In [None]:
oil_on_water_series = pd.DataFrame(oil_on_water_data, columns=['Date', 'Quantity'])
oil_on_water_series['Quantity'] = oil_on_water_series['Quantity'] / 10 ** 6 # million metric tonnes

In [None]:
oil_on_water_series.head(3)

In [None]:
fig1 = plt.figure(figsize=(5, 3))
axes1 = fig1.add_axes([0, 0, 1, 1])
axes1.plot(oil_on_water_series.Date, oil_on_water_series.Quantity)
axes1.set_title('Oil on water between February and August, 2020.')
axes1.set_xlabel('Date')
axes1.set_ylabel('Quantity (million MT)')
plt.show()

In [None]:
#voyages.to_excel('voyages_data.xlsx', index = False)