# Tutorial 2: Access Vessel Particulars and Voyage Tables with CO2 Emission Estimates

Voyage tables are emission data broken down and accumulated per voyage for every vessel. The voyage tables were created by fusing the World Port Index and the hourly ship tracks (AIS). 

The data contain the departure and destination ports and the total CO2 emissions for each voyage. In this example we will calculate the CO2 emissions from all bulk carriers sailing from Port Hedland in Australia to Serangoon Harbour in Singapore.

Note: Not all ship tracks are split into voyages correctly, since not all harbours are listed in the World Port Index.

In [1]:
import os
import dask.dataframe as dd
import pandas as pd

from hackathon_utils import get_files_from_blob


In [2]:
# Connection string to blob storage has to be set if being run outside the Ocean Data Connector
#os.environ['HACKATHON_CONNECTION_STR']="xxxxxxxxxx"

# Warn early if the connection string is missing, without a bare except
if 'HACKATHON_CONNECTION_STR' not in os.environ:
    print('HACKATHON_CONNECTION_STR must be set to access data')

In [3]:
from dask.distributed import Client
client=Client() #Specify number of workers with n_workers

### Retrieve voyage data from storage

Select a specific vessel type by choosing the appropriate folder below.

In [4]:


# Available folders divided into vessel categories
folders=['parquet/voyage_tables/Bulk carrier/',
 'parquet/voyage_tables/Chemical tanker/',
 'parquet/voyage_tables/Container/',
 'parquet/voyage_tables/Cruise/',
 'parquet/voyage_tables/Ferry-pax only/',
 'parquet/voyage_tables/Ferry-ro-pax/',
 'parquet/voyage_tables/General cargo/',
 'parquet/voyage_tables/Liquefied gas tanker/',
 'parquet/voyage_tables/Offshore/',
 'parquet/voyage_tables/Oil tanker/',
 'parquet/voyage_tables/Other liquid tankers/',
 'parquet/voyage_tables/Refrigerated bulk/',
 'parquet/voyage_tables/Ro-ro/',
 'parquet/voyage_tables/Vehicle/']

file_list=get_files_from_blob(folders[0])
print(f'Total number of voyage files : {len(file_list)}')




Open a Dask dataframe from the voyage Parquet files in blob storage.

In [None]:
df=dd.read_parquet(file_list, storage_options={"connection_string": os.environ['HACKATHON_CONNECTION_STR']})

In [None]:
df

Filter and compute the Dask dataframe to find all voyages between the two harbours with departure dates in 2020. The result is an in-memory pandas dataframe.

In [None]:
%%time
df_route=df[(df.from_port=='PORT HEDLAND') & (df.to_port=='SERANGOON HARBOR')
           & (df.voyage_departure>='2020-01-01')  & (df.voyage_departure<'2021-01-01')].compute()
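
The filter above can be tried out without access to blob storage. The following is a minimal sketch on a small pandas dataframe: the column names follow the voyage table used in this tutorial, but all rows are invented, and it is an assumption that `voyage_departure` is stored as a datetime column. Note that the upper bound is exclusive, so a departure at exactly 2021-01-01 is not counted.

```python
import pandas as pd

# Toy voyage table: column names follow the tutorial, the values are invented.
toy = pd.DataFrame({
    "from_port": ["PORT HEDLAND", "PORT HEDLAND", "DAMPIER", "PORT HEDLAND"],
    "to_port": ["SERANGOON HARBOR", "SERANGOON HARBOR", "SERANGOON HARBOR", "QINGDAO"],
    # Assumption: departures are stored as datetimes.
    "voyage_departure": pd.to_datetime(
        ["2020-03-01", "2021-01-01", "2020-05-10", "2020-07-04"]
    ),
    "co2_kg": [1000.0, 2000.0, 3000.0, 4000.0],
})

# Same boolean mask as in the Dask version: route match plus a half-open date range.
mask = (
    (toy.from_port == "PORT HEDLAND")
    & (toy.to_port == "SERANGOON HARBOR")
    & (toy.voyage_departure >= "2020-01-01")
    & (toy.voyage_departure < "2021-01-01")
)
toy_route = toy[mask]
print(toy_route)
```

On a Dask dataframe the same expression only builds a task graph; the work happens when `.compute()` is called.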

In [None]:
df_route.head()

In [None]:
df_route.head().to_markdown(tablefmt="grid")

Calculate the total CO2 emissions (in kg) for this route.

In [None]:
df_route.co2_kg.sum()
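
The route total is in kilograms, which can be hard to read for large numbers. As a rough sketch, the example below converts a total to tonnes and breaks the emissions down per vessel. The dataframe is invented; only the column names `mmsi` and `co2_kg` are taken from this tutorial.

```python
import pandas as pd

# Toy stand-in for df_route: invented values, tutorial column names.
toy_route = pd.DataFrame({
    "mmsi": [111, 111, 222],
    "co2_kg": [1500.0, 2500.0, 1000.0],
})

# Total for the route, converted from kg to tonnes.
total_tonnes = toy_route.co2_kg.sum() / 1000.0

# Emissions and number of voyages per vessel.
per_vessel = toy_route.groupby("mmsi").co2_kg.agg(["sum", "count"])

print(total_tonnes)
print(per_vessel)
```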

In [None]:
df_route=df_route.sort_values('co2_kg',ascending=False)

In [None]:
voyage=df_route.iloc[2]
voyage

### Access vessel particulars

The vessel particulars contain information about each vessel, including the vessel class (ICCT_class) and the MMSI number (MaritimeMobileServiceIdentityMMSINumber).

In [None]:
df_vessel_particulars=pd.read_csv(get_files_from_blob('csv/vessel_particulars/')[0], storage_options={"connection_string": os.environ['HACKATHON_CONNECTION_STR']})

In [None]:
df_vessel_particulars[df_vessel_particulars['MaritimeMobileServiceIdentityMMSINumber']==voyage.mmsi].iloc[0]
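
The lookup above handles one voyage at a time, and `.iloc[0]` raises an `IndexError` if the MMSI number is not found. To attach particulars to many voyages at once, a left merge is one option. The sketch below uses invented rows and assumes that the MMSI column has the same integer type in both tables.

```python
import pandas as pd

# Toy voyage table and toy vessel particulars (invented values).
voyages = pd.DataFrame({
    "mmsi": [111, 222, 333],
    "co2_kg": [1500.0, 1000.0, 700.0],
})
particulars = pd.DataFrame({
    "MaritimeMobileServiceIdentityMMSINumber": [111, 222],
    "ICCT_class": ["Bulk carrier", "Bulk carrier"],
})

# A left merge keeps every voyage; indicator=True adds a _merge column
# that shows whether a matching vessel was found.
merged = voyages.merge(
    particulars,
    how="left",
    left_on="mmsi",
    right_on="MaritimeMobileServiceIdentityMMSINumber",
    indicator=True,
)

# Voyages whose vessel is missing from the particulars table.
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched[["mmsi", "co2_kg"]])
```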

### Access port information

The World Port Index is a dataset with many of the larger ports in the world. See https://msi.nga.mil/Publications/WPI for more information.

In [None]:
df_wpi=pd.read_csv(get_files_from_blob('csv/world_port_index/')[0], storage_options={"connection_string": os.environ['HACKATHON_CONNECTION_STR']})

In [None]:
df_wpi[df_wpi['Main Port Name']=='Port Hedland'].iloc[0]
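
Judging from the string literals used in this tutorial, port names appear in upper case in the voyage tables ('PORT HEDLAND') and in mixed case in the World Port Index ('Port Hedland'). If that holds for the full datasets, which is an assumption, a case-insensitive comparison makes it possible to look up a voyage port directly. A minimal sketch with invented rows:

```python
import pandas as pd

# Toy World Port Index with the column name used in the tutorial.
wpi = pd.DataFrame({"Main Port Name": ["Port Hedland", "Dampier"]})

# Port name as it appears in the voyage table (upper case).
voyage_port = "PORT HEDLAND"

# Normalise both sides to upper case before comparing.
match = wpi[wpi["Main Port Name"].str.upper() == voyage_port.upper()]
print(match)
```

Spelling differences between the two sources would still need to be handled separately.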