## Vessel Count & Cargo Quantity Transiting a Waypoint

This notebook builds a list of vessels transiting a given waypoint or polygon using the VoyagesSearchEnriched endpoint. For each transit it provides the entry and exit dates at the waypoint, vessel information, cargo information, etc. With this dataset, we can then look at:

1. How many vessels transit the waypoint (Suez Canal, Cape of Good Hope, etc.) each day.
2. Which cargoes transit the waypoint each day, and in what quantity.

This is mainly intended for analysing the recent Red Sea incidents.
For **Bab-el-Mandeb**, use the following filters for VoyagesSearchEnriched:

1. Northbound
   - origin: East of Suez (EoS), excluding the Red Sea
   - destination: anywhere
   - location: Bab-el-Mandeb

2. Southbound
   - origin: West of Suez (WoS) plus the Red Sea
   - destination: anywhere
   - location: Bab-el-Mandeb

For the **Suez Canal**, the VoyagesSearchEnriched filters depend on what the client wants to see.

To see all Suez transits, use:

1. Northbound: origin = EoS, destination = anywhere

2. Southbound: origin = WoS, destination = anywhere

To see only the Suez transits "affected" by the attacks from Yemen, use:

1. Northbound: origin = EoS excl. Red Sea, destination = anywhere

2. Southbound: origin = WoS, destination = anywhere excl. Red Sea
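
The filter combinations above can be written down as a small lookup function. This is only a sketch: `build_filters` and the placeholder ids are hypothetical, and the keyword names `origins`, `origins_excluded` and `destinations_excluded` are assumptions about what the search call accepts, so check them against the SDK version you use.

```python
def build_filters(waypoint, direction, ids, affected_only=False):
    """Translate the filter lists above into search keyword arguments.

    `ids` maps the names 'eos', 'wos', 'red_sea', 'bab_el_mandeb' and 'suez'
    to lists of geography ids. Destination "anywhere" means that no
    destination filter is set.
    """
    if waypoint not in ("bab_el_mandeb", "suez"):
        raise ValueError(f"unknown waypoint: {waypoint!r}")
    if direction not in ("northbound", "southbound"):
        raise ValueError(f"unknown direction: {direction!r}")

    filters = {"locations": ids[waypoint]}
    if direction == "northbound":
        filters["origins"] = ids["eos"]
        # Bab-el-Mandeb always excludes Red Sea origins;
        # Suez does so only in the "affected" view.
        if waypoint == "bab_el_mandeb" or affected_only:
            filters["origins_excluded"] = ids["red_sea"]
    else:
        if waypoint == "bab_el_mandeb":
            # Southbound Bab-el-Mandeb: West of Suez plus the Red Sea
            filters["origins"] = ids["wos"] + ids["red_sea"]
        else:
            filters["origins"] = ids["wos"]
            if affected_only:
                filters["destinations_excluded"] = ids["red_sea"]
    return filters


# Placeholder ids for illustration only; real ids come from v.Geographies().search(...)
example_ids = {
    "eos": ["eos-id"],
    "wos": ["wos-id"],
    "red_sea": ["red-sea-id"],
    "bab_el_mandeb": ["bab-id"],
    "suez": ["suez-id"],
}
print(build_filters("suez", "southbound", example_ids, affected_only=True))
```

If the keyword names match, the returned dictionary could be unpacked into the search call alongside `time_min`, `time_max` and `vessels`.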


### Example

In this notebook, we query all vessels transiting the Cape of Good Hope, regardless of direction.

## Import Libraries

In [7]:
import vortexasdk as v
from datetime import datetime,timedelta,date
import pandas as pd
import plotly.express as px

## Extracting ids

In [8]:
eos = [p.id for p in v.Geographies().search('East of Suez').to_list() if p.layer==['alternative_region']]
wos = [p.id for p in v.Geographies().search('West of Suez').to_list() if p.layer==['alternative_region']]
red_sea = [p.id for p in v.Geographies().search('Red Sea').to_list() if p.layer==['shipping_region']]
bab_el_mandeb= [p.id for p in v.Geographies().search('bab-el-mandeb').to_list() if p.layer==['waypoint']]
panama= [p.id for p in v.Geographies().search('Panama Canal').to_list() if p.layer==['waypoint']]
suez= [p.id for p in v.Geographies().search('Suez Canal').to_list() if p.layer==['waypoint']]
cape= [p.id for p in v.Geographies().search('Cape of Good Hope').to_list() if p.layer==['waypoint']]
india= [p.id for p in v.Geographies().search('India').to_list() if p.layer==['country']]
cape

['d10056e8bf4109ebf0370c511953f046ff75552b3dc6ed788fcb4d5ea6b71d3f',
 '5637a23e1ec27f7598abfe0166b014a65eeb7561cf46f6770e460dca65f15f71',
 'c9a1a65ac7d63fe1ed1fe115e5c4c5919482bbc669f3c8c7e963945523874755']

In [9]:
cpp = [p.id for p in v.Products().search('Clean Petroleum Products').to_list() if p.name=='Clean Petroleum Products']
dpp = [p.id for p in v.Products().search('Dirty Petroleum Products').to_list() if p.name=='Dirty Petroleum Products']
crude = [p.id for p in v.Products().search('Crude/Condensates').to_list() if p.name=='Crude/Condensates']
diesel = [p.id for p in v.Products().search('Diesel/Gasoil').to_list() if p.name=='Diesel/Gasoil']
gasoline = [p.id for p in v.Products().search('Gasoline/Blending Components').to_list() if p.name=='Gasoline/Blending Components']
jet = [p.id for p in v.Products().search('Jet/Kero').to_list() if p.name=='Jet/Kero']
lng = [p.id for p in v.Products().search('LNG').to_list() if p.name=='LNG']
lpg = [p.id for p in v.Products().search('LPG+').to_list() if p.name=='LPG+']

In [10]:
other_cpp = [p.id for p in v.Products().search('Other Clean Products').to_list() if p.name=='Other Clean Products']
cond = [p.id for p in v.Products().search('Condensates').to_list() if p.name=='Condensates']
biodiesel = [p.id for p in v.Products().search('Biodiesel').to_list() if p.name=='Biodiesel']

assert len(other_cpp)==1
assert len(cond)==1
assert len(biodiesel)==1
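
The lookup cells above all follow the same pattern: search by name, keep the exact match, and (in the last cell) assert that exactly one id remains. A minimal sketch of that pattern as a reusable helper is shown here. `exact_match_ids` and `single_id` are hypothetical names, and the `SimpleNamespace` objects stand in for real search results so that the block runs without API access.

```python
from types import SimpleNamespace


def exact_match_ids(results, name):
    """Return the ids of search results whose name equals `name` exactly."""
    return [r.id for r in results if r.name == name]


def single_id(results, name):
    """Return the one matching id, or raise if there are zero or several."""
    ids = exact_match_ids(results, name)
    if len(ids) != 1:
        raise ValueError(f"expected exactly one match for {name!r}, found {len(ids)}")
    return ids[0]


# Stand-in for v.Products().search('Condensates').to_list(); the ids are made up
fake_results = [
    SimpleNamespace(id="aaa111", name="Crude/Condensates"),
    SimpleNamespace(id="bbb222", name="Condensates"),
]
print(single_id(fake_results, "Condensates"))
```

Raising an error instead of using a bare `assert` keeps the check active even when Python runs with assertions disabled.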

## Search Enriched Calculation (change filter)

In [12]:
test = v.VoyagesSearchEnriched().search(
    time_min = datetime(2023,11,1),
    time_max = datetime.today(),
    vessels = 'oil',
    locations = cape,
    unit = 't',
).to_list()

2024-09-17 16:26:45,096 vortexasdk.client — ERROR — Could not decode response


In [13]:
cargo_events = pd.DataFrame(i.__dict__ for i in test[0].events)
cargo_events = cargo_events[cargo_events['event_type']=='cargo']

## Pre-processing function

In [17]:
def extract_element_from_list(l1):
    """Return the first element of l1, or None if l1 is None or empty."""
    if l1 is None:
        return None
    if len(l1)>0:
        return l1[0]
    else:
        return None
def calculating_visiting_time(list_of_voyage,location_id):
    voyage_rows = []
    idx = 0
    for voyage in list_of_voyage:
        cargo_events = pd.DataFrame(i.__dict__ for i in voyage.events)
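        # Work on a copy so the assignment below does not hit a view of the original frame
        cargo_events = cargo_events[cargo_events['event_type']=='cargo'].copy()
        cargo_movement_id = cargo_events['cargo_movement_id'].unique()
        # Open-ended cargo events get a placeholder end one day in the future
        open_end = (datetime.now()+timedelta(days = 1)).strftime("%Y-%m-%dT%H:%M:%S.%fZ")
        cargo_events['end_timestamp'] = cargo_events['end_timestamp'].fillna(open_end)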
        
        if len(voyage.latest_product_details) == 0:
            latest_product_details = 'None'
        else: 
            latest_product_details = [i.label for i in voyage.latest_product_details[0] if i.layer == 'group'][0]
        record_row = {}
        for event in voyage.events:
            if event.event_type == 'visit' and event.location_id == location_id:
                record_row = {
                  'voyage_id': voyage.voyage_id,
                  'vessel_id': voyage.vessel.id,
                  'vessel name': voyage.vessel.name,
                  'vessel_imo': voyage.vessel.imo,
                  'vessel_class': voyage.vessel.vessel_class,
                  'cargo_movement_id': cargo_movement_id,
                  'entry_timestamp': event.start_timestamp,
                  'exit_timestamp': event.end_timestamp,
                  'location': event.location_details[0].label,
                    'voyage_status':voyage.voyage_status,
                    'latest_products_details':latest_product_details
                }
                if record_row['exit_timestamp'] is None:
                    record_row['exit_timestamp'] = datetime.now().strftime("%Y-%m-%dT%H:%M:%S.%fZ")
                    print(f"{record_row['vessel name']} in canal")
                filtered_cargo_event = cargo_events[(cargo_events['start_timestamp'] < record_row['entry_timestamp'])
                                                   &(cargo_events['end_timestamp'] > record_row['exit_timestamp'])].reset_index(drop = True)
                if 'quantity_barrels' not in filtered_cargo_event.columns:
                    cargo_type = 'None'
                    #cargo_category = 'None'
                    quantity_sum = 0
                elif len(filtered_cargo_event)==0:
                    cargo_type = 'None'
                    cargo_category = 'None'
                    quantity_sum = 0
                else:
                    object_list = filtered_cargo_event.loc[0,'product_details']
                    cargo_type = [obj.label for obj in object_list if obj.layer =='group'][0]
                    #cargo_category = [obj.label for obj in object_list if obj.layer =='category'][0]
                    
                    # To-change unit - quantity_barrels, quantity_tonnes
                    quantity_sum = filtered_cargo_event['quantity_barrels'].sum()
                
                # cargo origin
                if len(filtered_cargo_event)>0:
                    object_list = filtered_cargo_event.loc[0,'cargo_origin_details']
                    origin_port = [obj.label for obj in object_list if obj.layer =='port']
                    origin_country = [obj.label for obj in object_list if obj.layer =='country']

                    # cargo dest
                    object_list = filtered_cargo_event.loc[0,'cargo_destination_details']
                    dest_port = [obj.label for obj in object_list if obj.layer =='port']
                    dest_country = [obj.label for obj in object_list if obj.layer =='country']
                else:
                    origin_port,origin_country,dest_port,dest_country = None,None,None,None
                
                record_row['origin_port'] = extract_element_from_list(origin_port)
                record_row['origin_country'] = extract_element_from_list(origin_country)
                record_row['dest_port'] = extract_element_from_list(dest_port)
                record_row['dest_country'] = extract_element_from_list(dest_country)
                record_row['quantity'] = quantity_sum
                record_row['product'] = cargo_type
                #record_row['category'] = cargo_category
                voyage_rows.append(record_row)
        idx+=1
    voyage_df = pd.DataFrame(voyage_rows)
    return voyage_df
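
The core of the function is its cargo filter: a cargo event counts as on board during the transit only if it starts before the vessel enters the waypoint and ends after the vessel exits. The sketch below isolates that rule on made-up events. `on_board_during_visit` is a hypothetical helper, and it parses the timestamps before comparing them, whereas the function above compares the ISO strings directly. The format string is the one the notebook uses; that the API's timestamps follow it is an assumption.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"  # same format string the notebook uses


def parse_ts(ts):
    """Parse a timestamp string into a datetime."""
    return datetime.strptime(ts, TS_FORMAT)


def on_board_during_visit(cargo_events, entry_ts, exit_ts):
    """Keep cargo events that start before entry and end after exit."""
    entry, exit_ = parse_ts(entry_ts), parse_ts(exit_ts)
    return [
        ev for ev in cargo_events
        if parse_ts(ev["start_timestamp"]) < entry
        and parse_ts(ev["end_timestamp"]) > exit_
    ]


# Made-up cargo events for illustration
events = [
    {"cargo_movement_id": "a",
     "start_timestamp": "2024-01-01T00:00:00.000Z",
     "end_timestamp": "2024-01-20T00:00:00.000Z"},
    {"cargo_movement_id": "b",
     "start_timestamp": "2024-01-12T00:00:00.000Z",
     "end_timestamp": "2024-02-01T00:00:00.000Z"},
]
kept = on_board_during_visit(
    events, "2024-01-10T06:00:00.000Z", "2024-01-10T18:00:00.000Z"
)
print([ev["cargo_movement_id"] for ev in kept])
```

Only cargo `a` is kept, because cargo `b` is loaded after the vessel has already entered the waypoint.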

## Output

In [18]:
result_df = calculating_visiting_time(test,cape[0][:16])

TORM EVOLVE in canal
SFL LION in canal


The resulting dataframe, `result_df`, lists the vessels passing through the waypoint (Cape of Good Hope in this case). It also provides the entry and exit timestamps at the waypoint, so you can group along the time axis however you like.
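
To illustrate why the choice of time axis matters, here is a plain-Python sketch (no pandas) with two made-up transits. `daily_counts` is a hypothetical helper, and the timestamp format is assumed to match the one used in the pre-processing function.

```python
from collections import Counter
from datetime import datetime

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def daily_counts(rows, key):
    """Count transits per calendar day, bucketing on the given timestamp field."""
    return Counter(
        datetime.strptime(row[key], TS_FORMAT).date().isoformat() for row in rows
    )


# Made-up transits: vessel A crosses midnight inside the waypoint
transits = [
    {"vessel": "A",
     "entry_timestamp": "2024-01-10T22:00:00.000Z",
     "exit_timestamp": "2024-01-11T03:00:00.000Z"},
    {"vessel": "B",
     "entry_timestamp": "2024-01-11T05:00:00.000Z",
     "exit_timestamp": "2024-01-11T20:00:00.000Z"},
]
by_entry = daily_counts(transits, "entry_timestamp")
by_exit = daily_counts(transits, "exit_timestamp")
print(dict(by_entry))
print(dict(by_exit))
```

Bucketing on entry puts the two vessels on different days, while bucketing on exit puts both on 11 January, so it is worth stating which timestamp a chart is based on.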

## Further processing & plotting

In [19]:
result_df['entry_timestamp'] = pd.to_datetime(result_df['entry_timestamp'])
result_df['exit_timestamp'] = pd.to_datetime(result_df['exit_timestamp'])

result_df['duration'] = result_df['exit_timestamp'] - result_df['entry_timestamp']
# Assign the result: sort_values returns a new dataframe rather than sorting in place
result_df = result_df.sort_values('entry_timestamp',ascending = False)

#Filter (optional)
result_df = result_df[result_df['entry_timestamp']>='2023-11-01'].reset_index(drop = True)

In [20]:
time_series_df = result_df.groupby([pd.Grouper(key = 'entry_timestamp', freq = '1W'),'voyage_status']).agg({'voyage_id':'count'}).reset_index()

In [21]:
px.bar(time_series_df,x = 'entry_timestamp', y = 'voyage_id', color = 'voyage_status', title = 'Number of vessels passing through Cape of Good Hope')

In [22]:
cargo_quantity_df =  result_df.groupby([pd.Grouper(key = 'entry_timestamp', freq = '1D'),'product']).agg({'quantity':'sum'}).reset_index()

In [23]:
px.bar(cargo_quantity_df,x = 'entry_timestamp', y = 'quantity', color = 'product', title = 'Cargo quantity passing through Cape of Good Hope')

## Save

In [37]:
title = 'Vessels at Cape of Good Hope'
result_df.to_csv(f'{title}_transiting_list.csv')
cargo_quantity_df.to_csv(f'{title}_time_series.csv')
time_series_df.to_csv(f'{title}_count_time_series.csv')