# Intersection of AIS data with vessel lists

This notebook intersects SPIRE AIS data with lists of IUU (illegal, unreported and unregulated) fishing vessels and with the proactive vessel registry (PVR). The IUU lists come from multiple sources, the most comprehensive being `combined_iuu_list`. In addition, the MMSI spoofing indicator from Global Fishing Watch (GFW) is used. Legitimate vessels are taken from the proactive vessel registry.

Description of MMSI spoofing from GFW:

<em>List of MMSIs that experience substantial ID spoofing

By ID spoofing, we mean two or more vessels that are using the same MMSI at the same time. 

All the messages for an MMSI are grouped into sets of tracks that are contiguous spatially and temporally.  
Each continuous track has a unique seg_id field added. Some tracks contain invalid lat/lon (like 91, 181) and
are put into a special 'BAD' segment. 

The test for spoofing is fairly naive - we simply compute the extent of each segment in time, add them all up,
and compare that to the extent of time that the vessel is active.  If the segment time is longer than the 
active time, then we know that some of the segments must overlap, and this is the indication of ID spoofing.
</em>
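
The test described above can be sketched in pandas. This is a minimal illustration and not GFW's implementation: the function name `has_id_spoofing`, the `timestamp` column, and the convention that invalid positions carry `seg_id == 'BAD'` and are left out of the comparison are all assumptions made for the example.

```python
import pandas as pd

def has_id_spoofing(messages: pd.DataFrame) -> bool:
    """Return True if the summed segment durations exceed the active time.

    `messages` is a hypothetical frame holding the AIS messages of a single
    MMSI, with columns `seg_id` and `timestamp`.
    """
    # Assumption: the special segment for invalid positions is labelled 'BAD'
    # and is excluded from the comparison.
    valid = messages[messages["seg_id"] != "BAD"]
    if valid.empty:
        return False
    # Extent of each segment in time (last message minus first message).
    per_segment = valid.groupby("seg_id")["timestamp"].agg(["min", "max"])
    segment_time = (per_segment["max"] - per_segment["min"]).sum()
    # Extent of time over which the MMSI is active at all.
    active_time = valid["timestamp"].max() - valid["timestamp"].min()
    return bool(segment_time > active_time)

# Invented example: segments 'a' and 'b' overlap between 05:00 and 10:00.
example = pd.DataFrame({
    "seg_id": ["a", "a", "b", "b"],
    "timestamp": pd.to_datetime([
        "2018-01-01 00:00", "2018-01-01 10:00",
        "2018-01-01 05:00", "2018-01-01 12:00",
    ]),
})
spoofed = has_id_spoofing(example)
```

In the example the segments last 10 and 7 hours (17 hours in total) while the MMSI is active for only 12 hours, so the segments must overlap and the MMSI is flagged.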

In [12]:
import pandas as pd
import ais_query

In [20]:
vessel_lists = pd.read_csv('iuu_list_of_lists.csv')

In [13]:
cols_position = ais_query.columns_position
cols_static = ais_query.columns_static

In [31]:
vessel_lists.list_source.value_counts()

gfw_spoofing         3385
pvr_purse_seiner      571
combined_iuu_list     120
pvr_longline           45
pvr_other              35
IATTC                   4
ICCAT                   3
Name: list_source, dtype: int64

In [78]:
sql_statement = """
SELECT imo, mmsi, name, ship_and_cargo_type, length, width, eta_date, destination
FROM ais_messages.ais_static;
"""
ais_static = ais_query.connect(sql_statement)
ais_static.columns = ['imo', 'mmsi', 'name', 'ship_and_cargo_type', 'length', 'width', 'eta_date', 'destination']

In [79]:
iuu_vessel_list = vessel_lists[vessel_lists.IUU=='yes']
legitimate_vessel_list = vessel_lists[vessel_lists.IUU=='no']

In [80]:
iuu_imo_filter = iuu_vessel_list[iuu_vessel_list.id_type=='IMO']
iuu_mmsi_filter = iuu_vessel_list[iuu_vessel_list.id_type=='mmsi']
legitimate_imo_filter = legitimate_vessel_list[legitimate_vessel_list.id_type=='IMO']
legitimate_mmsi_filter = legitimate_vessel_list[legitimate_vessel_list.id_type=='mmsi']
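
The sections below repeat the same filter, merge and de-duplicate pattern for each combination of list and identifier type. As a sketch only, the pattern could be wrapped in a small helper; `match_vessels` is a hypothetical name, and the toy frames are invented for illustration and contain no real vessel data.

```python
import pandas as pd

def match_vessels(ais_static, vessel_list, id_type, ais_column):
    """Inner-join AIS static rows with list entries of one identifier type.

    Hypothetical helper; column names follow the frames used in this notebook.
    """
    entries = vessel_list[vessel_list["id_type"] == id_type]
    matches = pd.merge(ais_static, entries, how="inner",
                       left_on=ais_column, right_on="id")
    return matches.drop_duplicates()

# Toy frames, invented for illustration only (not real vessel data).
toy_ais = pd.DataFrame({
    "imo": [1000001, 1000001, 1000002],
    "mmsi": [111, 111, 222],
    "name": ["A", "A", "B"],
})
toy_lists = pd.DataFrame({
    "IUU": ["yes", "no", "yes"],
    "id": [1000001, 1000002, 222],
    "id_type": ["IMO", "IMO", "mmsi"],
    "list_source": ["combined_iuu_list", "pvr_other", "gfw_spoofing"],
})
toy_iuu = toy_lists[toy_lists["IUU"] == "yes"]

imo_matches = match_vessels(toy_ais, toy_iuu, "IMO", "imo")
mmsi_matches = match_vessels(toy_ais, toy_iuu, "mmsi", "mmsi")
```

With the real frames, a call such as `match_vessels(ais_static, iuu_vessel_list, 'IMO', 'imo')` would be expected to give the same rows as the explicit merge, provided the key columns have compatible dtypes.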

## IUU vessels intersection

In [115]:
# Match AIS static records to IUU-listed vessels by IMO number; the printed count is the number of distinct MMSIs among the matches.
iuu_imo_matches = pd.merge(ais_static, iuu_imo_filter, how='inner', left_on=['imo'], right_on=['id']).drop_duplicates()
print("unique_imo = ", len(iuu_imo_matches.mmsi.unique()))
iuu_imo_matches.head()

unique_imo =  13


| | imo | mmsi | name | ship_and_cargo_type | length | width | eta_date | destination | IUU | id | id_type | list_source |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8913990.0 | 370070000 | FONG KUO NO.819 | 70.0 | 115.0 | 16.0 | 2018-05-30 11:30:00 UTC | MAJURO | yes | 8913992 | IMO | combined_iuu_list |
| 4 | 8913990.0 | 370070000 | FONG KUO NO.819 | 70.0 | 115.0 | 16.0 | 2017-05-30 11:30:00 UTC | MAJURO | yes | 8913992 | IMO | combined_iuu_list |
| 9 | 8913990.0 | 370070000 | FONG KUO NO.819 | 70.0 | 115.0 | 16.0 | 2017-06-26 08:18:00 UTC | BANGKOK | yes | 8913992 | IMO | combined_iuu_list |
| 45 | 8028420.0 | 616999336 | DUBREKA | 70.0 | 81.0 | 15.0 | 2018-04-30 08:00:00 UTC | FREETOWN | yes | 8028424 | IMO | combined_iuu_list |
| 46 | 8028420.0 | 616999336 | DUBREKA | 70.0 | 81.0 | 15.0 | 2017-06-09 08:00:00 UTC | FREETOWN | yes | 8028424 | IMO | combined_iuu_list |


## IUU vessels with MMSI spoofing detected by GFW

In [114]:
# Match AIS static records to the GFW spoofing list by MMSI and count the distinct MMSIs found.
iuu_mmsi_matches = pd.merge(ais_static, iuu_mmsi_filter, how='inner', left_on=['mmsi'], right_on=['id']).drop_duplicates()
print("unique_mmsi = ", len(iuu_mmsi_matches.mmsi.unique()))
iuu_mmsi_matches.head()

unique_mmsi =  861


| | imo | mmsi | name | ship_and_cargo_type | length | width | eta_date | destination | IUU | id | id_type | list_source |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | 416088900 | HUNG SHING NO.212 | | | | | | yes | 416088900 | mmsi | gfw_spoofing |
| 5 | | 416088900 | | 30.0 | 50.0 | 6.0 | | | yes | 416088900 | mmsi | gfw_spoofing |
| 372 | | 111111114 | A07 | | | | | | yes | 111111114 | mmsi | gfw_spoofing |
| 715 | | 111111114 | A47BB | | | | | | yes | 111111114 | mmsi | gfw_spoofing |
| 753 | 0.0 | 0 | | 80.0 | 338.0 | 1.0 | | B) | yes | 0 | mmsi | gfw_spoofing |


## Legitimate vessels

In [113]:
# Match AIS static records to registry vessels by IMO number; the printed count is the number of distinct MMSIs among the matches.
legitimate_imo_matches = pd.merge(ais_static, legitimate_imo_filter, how='inner', left_on=['imo'], right_on=['id']).drop_duplicates()
print("unique_imo = ", len(legitimate_imo_matches.mmsi.unique()))
legitimate_imo_matches.head()

unique_imo =  267


| | imo | mmsi | name | ship_and_cargo_type | length | width | eta_date | destination | IUU | id | id_type | list_source |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9517280.0 | 367344000 | SEA HONOR | 30.0 | 63.0 | 12.0 | | | no | 9517276 | IMO | pvr_purse_seiner |
| 94 | 9517280.0 | 365246848 | CEA HONOR | 30.0 | 63.0 | 44.0 | | | no | 9517276 | IMO | pvr_purse_seiner |
| 169 | 9517280.0 | 300235136 | W%A0HONO2! | 30.0 | 59.0 | 12.0 | | | no | 9517276 | IMO | pvr_purse_seiner |
| 198 | 8134650.0 | 354622000 | LA PENA | 30.0 | 73.0 | 19.0 | 2018-04-05 19:00:00 UTC | QUTZAL, GUATEMALA | no | 8134651 | IMO | pvr_purse_seiner |
| 229 | 8996100.0 | 553111756 | ATUN STA | 30.0 | 80.0 | 13.0 | | | no | 8996097 | IMO | pvr_purse_seiner |


In [109]:
# Match AIS static records to registry vessels by MMSI and count the distinct MMSIs found.
legitimate_mmsi_matches = pd.merge(ais_static, legitimate_mmsi_filter, how='inner', left_on=['mmsi'], right_on=['id']).drop_duplicates()
print("unique_mmsi = ", len(legitimate_mmsi_matches.mmsi.unique()))
legitimate_mmsi_matches.head()

unique_mmsi =  0


| | imo | mmsi | name | ship_and_cargo_type | length | width | eta_date | destination | IUU | id | id_type | list_source |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
