# Tutorial 2: Access Vessel Particulars and Voyage Tables with CO2 Emission Estimates

Voyage tables contain emission data aggregated per voyage for every vessel. They were created by fusing the World Port Index with the hourly ship tracks (AIS).

The data contain the departure and arrival ports and the total CO2 emission of each voyage. In this example we will calculate the CO2 emissions from all bulk carriers going from Port Hedland in Australia to Serangoon Harbor in Singapore.

Note: Not all ship tracks are split into voyages correctly, since not all harbours are listed in the World Port Index.

In [1]:
import os
import dask.dataframe as dd
import pandas as pd

from hackathon_utils import get_files_from_blob


In [5]:
# The connection string to blob storage must be set when running outside the Ocean Data Connector
#os.environ['HACKATHON_CONNECTION_STR']="xxxxxxxxxx"
if 'HACKATHON_CONNECTION_STR' not in os.environ:
    print('HACKATHON_CONNECTION_STR must be set to access data')
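The check above only prints a message, so later cells still fail with a `KeyError` if the variable is missing. A minimal sketch of a fail-fast alternative follows; the helper name `get_connection_string` is hypothetical and not part of `hackathon_utils`.

```python
import os


def get_connection_string(var_name="HACKATHON_CONNECTION_STR"):
    """Return the connection string from the environment, or raise a clear error.

    Hypothetical helper for illustration; it is not part of hackathon_utils.
    """
    value = os.environ.get(var_name)
    if not value:
        # Stop here instead of failing later inside a read call.
        raise RuntimeError(f"{var_name} must be set to access data")
    return value
```

The result could then be passed as `storage_options={"connection_string": get_connection_string()}` in the read calls further down.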

In [None]:
from dask.distributed import Client
client = Client()  # Specify the number of workers with n_workers

### Retrieve voyage data from storage

Select a specific vessel type by choosing the appropriate folder below.

In [7]:


# Available folders divided into vessel categories
folders=['parquet/voyage_tables/Bulk carrier/',
 'parquet/voyage_tables/Chemical tanker/',
 'parquet/voyage_tables/Container/',
 'parquet/voyage_tables/Cruise/',
 'parquet/voyage_tables/Ferry-pax only/',
 'parquet/voyage_tables/Ferry-ro-pax/',
 'parquet/voyage_tables/General cargo/',
 'parquet/voyage_tables/Liquefied gas tanker/',
 'parquet/voyage_tables/Offshore/',
 'parquet/voyage_tables/Oil tanker/',
 'parquet/voyage_tables/Other liquid tankers/',
 'parquet/voyage_tables/Refrigerated bulk/',
 'parquet/voyage_tables/Ro-ro/',
 'parquet/voyage_tables/Vehicle/']

file_list=get_files_from_blob(folders[0])
print(f'Total number of voyage files : {len(file_list)}')



Total number of voyage files : 11724


Open a dask dataframe from the voyage parquet files in blob storage.

In [8]:
df=dd.read_parquet(file_list, storage_options={"connection_string": os.environ['HACKATHON_CONNECTION_STR']})

In [9]:
df

| | Unnamed: 0 | voyage_departure | voyage_arrival | from_port | to_port | co2_kg | duration_hours | distance_nm | interpolated_ratio | avg_speed_knts | mmsi | vessel_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| npartitions=11724 | | | | | | | | | | | | |
| | int64 | object | object | object | object | float64 | int64 | float64 | float64 | float64 | int64 | object |
| | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |


Filter and compute the dask dataframe to find all voyages between the two harbours with departure dates in 2020. The result is an in-memory pandas dataframe.

In [10]:
%%time
df_route=df[(df.from_port=='PORT HEDLAND') & (df.to_port=='SERANGOON HARBOR')
           & (df.voyage_departure>='2020-01-01')  & (df.voyage_departure<'2021-01-01')].compute()

CPU times: user 1min 6s, sys: 6.8 s, total: 1min 13s
Wall time: 3min 5s
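The date filter compares `voyage_departure` against plain strings. The column has dtype `object`, so this is most likely a lexicographic string comparison, which works because ISO-8601 timestamps sort in chronological order. The sketch below uses a small made-up frame (not real voyage data) to show that the string comparison and an explicit datetime comparison select the same rows; it assumes the departures are stored as ISO-8601 strings.

```python
import pandas as pd

# Toy frame standing in for the voyage table (values are made up for illustration).
toy = pd.DataFrame({
    "voyage_departure": [
        "2019-12-31 23:00:00+00:00",
        "2020-01-01 00:00:00+00:00",
        "2020-12-31 23:00:00+00:00",
        "2021-01-01 00:00:00+00:00",
    ],
    "co2_kg": [1.0, 2.0, 3.0, 4.0],
})

# String comparison: ISO-8601 timestamps sort lexicographically.
mask_str = (toy.voyage_departure >= "2020-01-01") & (toy.voyage_departure < "2021-01-01")

# Datetime comparison: parse once, then compare against timezone-aware bounds.
departure = pd.to_datetime(toy.voyage_departure, utc=True)
start = pd.Timestamp("2020-01-01", tz="UTC")
end = pd.Timestamp("2021-01-01", tz="UTC")
mask_dt = (departure >= start) & (departure < end)

# Both masks keep only the two departures that fall inside 2020.
in_2020 = toy[mask_dt]
```

Parsing to datetimes is the more robust choice if the timestamp format is not guaranteed to be uniform across files.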


In [15]:
df_route.head()

| | Unnamed: 0 | voyage_departure | voyage_arrival | from_port | to_port | co2_kg | duration_hours | distance_nm | interpolated_ratio | avg_speed_knts | mmsi | vessel_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 17 | 17 | 2020-11-22 04:00:00+00:00 | 2020-11-28 14:00:00+00:00 | PORT HEDLAND | SERANGOON HARBOR | 916461.9 | 154 | 1700.37304 | 0.281046 | 11.201396 | 215350000 | Bulk carrier |
| 11 | 11 | 2020-06-12 14:00:00+00:00 | 2020-06-19 03:00:00+00:00 | PORT HEDLAND | SERANGOON HARBOR | 383365.2 | 157 | 3143.758234 | 0.339744 | 10.767093 | 229266000 | Bulk carrier |
| 2 | 2 | 2020-03-20 23:00:00+00:00 | 2020-04-22 12:00:00+00:00 | PORT HEDLAND | SERANGOON HARBOR | 2373303.0 | 781 | 14190.120651 | 0.468454 | 8.595073 | 240446000 | Bulk carrier |
| 18 | 18 | 2020-12-08 21:00:00+00:00 | 2020-12-15 19:00:00+00:00 | PORT HEDLAND | SERANGOON HARBOR | 376112.1 | 166 | 1679.649591 | 0.187879 | 10.247879 | 241282000 | Bulk carrier |
| 5 | 5 | 2020-06-09 14:00:00+00:00 | 2020-07-16 14:00:00+00:00 | PORT HEDLAND | SERANGOON HARBOR | 2718785.0 | 888 | 13487.281342 | 0.573201 | 7.468103 | 241358000 | Bulk carrier |
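The sample above mixes voyages of roughly 1,700 nm with voyages of more than 13,000 nm on the same route, which is consistent with the earlier note that some tracks are not split into voyages correctly. One possible way to flag such rows is sketched below, using the five rows shown above. The factor of 2 times the median distance is an arbitrary assumption, not a rule from the dataset authors.

```python
import pandas as pd

# The five rows shown by df_route.head(), reduced to the relevant columns.
sample = pd.DataFrame({
    "mmsi": [215350000, 229266000, 240446000, 241282000, 241358000],
    "duration_hours": [154, 157, 781, 166, 888],
    "distance_nm": [1700.37304, 3143.758234, 14190.120651, 1679.649591, 13487.281342],
})

MAX_FACTOR = 2.0  # assumption: tune this for the route being studied

# Flag voyages whose sailed distance is far above the typical distance for the route.
limit = MAX_FACTOR * sample.distance_nm.median()
sample["suspect"] = sample.distance_nm > limit

# Keep only the voyages that look like direct sailings.
direct = sample[~sample.suspect]
```

On the full `df_route` the same two lines would apply unchanged, although a threshold based on the great-circle distance between the two ports would be less sensitive to the share of bad rows.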


Calculate the total amount of CO2 (in kg) emitted on this route:

In [16]:
df_route.co2_kg.sum()

118551025.69746687
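The sum is in kilograms because it comes from the `co2_kg` column. As a small illustration, the sketch below converts the printed total to tonnes and computes an emission intensity (kg CO2 per nautical mile) for the five sample voyages shown earlier. The column name `co2_kg_per_nm` is introduced here for illustration only.

```python
import pandas as pd

total_co2_kg = 118551025.69746687  # value printed by df_route.co2_kg.sum()
total_co2_tonnes = total_co2_kg / 1000
print(f"{total_co2_tonnes:,.0f} tonnes CO2")  # prints "118,551 tonnes CO2"

# Per-voyage intensity for the five rows shown by df_route.head().
sample = pd.DataFrame({
    "co2_kg": [916461.9, 383365.2, 2373303.0, 376112.1, 2718785.0],
    "distance_nm": [1700.37304, 3143.758234, 14190.120651, 1679.649591, 13487.281342],
})
sample["co2_kg_per_nm"] = sample.co2_kg / sample.distance_nm
```

Intensity makes voyages of different lengths comparable, which the raw total does not.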

In [17]:
df_route=df_route.sort_values('co2_kg',ascending=False)

In [18]:
# Pick the voyage with the third-highest CO2 emission (df_route is sorted in descending order)
voyage=df_route.iloc[2]
voyage

Unnamed: 0                                    2
voyage_departure      2020-03-27 03:00:00+00:00
voyage_arrival        2020-04-27 11:00:00+00:00
from_port                          PORT HEDLAND
to_port                        SERANGOON HARBOR
co2_kg                            4352795.86194
duration_hours                              752
distance_nm                         7440.171078
interpolated_ratio                     0.434483
avg_speed_knts                         9.125969
mmsi                                  636014327
vessel_type                        Bulk carrier
Name: 2, dtype: object

### Access vessel particulars

The vessel particulars contain information about each vessel, including vessel class (ICCT_class) and MMSI number ('MaritimeMobileServiceIdentityMMSINumber').

In [19]:
df_vessel_particulars=pd.read_csv(get_files_from_blob('csv/vessel_particulars/')[0], storage_options={"connection_string": os.environ['HACKATHON_CONNECTION_STR']})

In [20]:
df_vessel_particulars[df_vessel_particulars['MaritimeMobileServiceIdentityMMSINumber']==voyage.mmsi].iloc[0]

LRIMOShipNo                                                    9334882
ShipName                                                     ABIGAIL N
ShiptypeLevel5                                             Ore Carrier
YearOfBuild                                                       2009
GrossTonnage                                                    151448
Deadweight                                                      297430
ShipStatus                                       In Service/Commission
FlagName                                                       Liberia
FuelType1First                                         Distillate Fuel
LengthRegistered                                                320.84
MainEngineType                                                     Oil
MaritimeMobileServiceIdentityMMSINumber                    636014327.0
PropulsionType                             Oil Engine(s), Direct Drive
Speedmax                                                          16.9
Speeds
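The lookup above handles one vessel at a time. To attach particulars to every voyage at once, a merge on the MMSI number is one option. In the output above the MMSI is printed as a float (636014327.0), whereas `mmsi` in the voyage table is `int64`; that may indicate missing MMSI values in the particulars file, which is an assumption. The sketch below uses toy frames, and the second vessel row is entirely hypothetical.

```python
import pandas as pd

# Toy stand-ins for df_route and df_vessel_particulars.
voyages = pd.DataFrame({
    "mmsi": [636014327],  # int64, as in the voyage table
    "co2_kg": [4352795.86194],
})
particulars = pd.DataFrame({
    # float64 with a gap; the second row is hypothetical
    "MaritimeMobileServiceIdentityMMSINumber": [636014327.0, None],
    "ShipName": ["ABIGAIL N", "HYPOTHETICAL VESSEL"],
    "Deadweight": [297430, 100000],
})

# Drop vessels without an MMSI, then cast the key to int64 so both sides match.
particulars = particulars.dropna(subset=["MaritimeMobileServiceIdentityMMSINumber"])
particulars = particulars.assign(
    mmsi=particulars["MaritimeMobileServiceIdentityMMSINumber"].astype("int64")
)

# A left merge keeps every voyage, even if no particulars are found for it.
enriched = voyages.merge(
    particulars[["mmsi", "ShipName", "Deadweight"]], on="mmsi", how="left"
)
```

With the particulars attached, emissions can be grouped by attributes such as `ShiptypeLevel5` or `YearOfBuild`.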

### Access port information

The World Port Index is a dataset with many of the larger ports in the world. See https://msi.nga.mil/Publications/WPI for more information.

In [21]:
df_wpi=pd.read_csv(get_files_from_blob('csv/world_port_index/')[0], storage_options={"connection_string": os.environ['HACKATHON_CONNECTION_STR']})

In [22]:
df_wpi[df_wpi['Main Port Name']=='Port Hedland'].iloc[0]

World Port Index Number                 54620
Region Name                Australia -- 53290
Main Port Name                   Port Hedland
Alternate Port Name                          
UN/LOCODE                              AU PHE
                                  ...        
Repairs                                 Major
Dry Dock                              Unknown
Railway                                 Small
Latitude                           -20.316667
Longitude                          118.583333
Name: 3138, Length: 107, dtype: object
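Note that the voyage tables spell port names in upper case ('PORT HEDLAND'), while the World Port Index uses mixed case ('Port Hedland'). A minimal sketch of looking up departure coordinates for voyages follows. It assumes the voyage port names are simply the upper-cased 'Main Port Name' values, which this tutorial does not confirm for every port.

```python
import pandas as pd

# Toy stand-ins for df_wpi and df_route, using values shown in this tutorial.
wpi = pd.DataFrame({
    "Main Port Name": ["Port Hedland"],
    "Latitude": [-20.316667],
    "Longitude": [118.583333],
})
voyages = pd.DataFrame({"from_port": ["PORT HEDLAND"], "co2_kg": [916461.9]})

# Build an upper-case join key so the two naming conventions line up (assumption).
wpi_key = wpi.assign(from_port=wpi["Main Port Name"].str.upper())

# A left merge keeps voyages whose port name has no match; their coordinates become NaN.
with_coords = voyages.merge(
    wpi_key[["from_port", "Latitude", "Longitude"]], on="from_port", how="left"
)
```

Rows with missing coordinates after the merge point to ports whose names do not match under this assumption and would need manual mapping.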