# Abstract

The ocean is a vital resource supporting global biodiversity, food security, and economic activity, yet its sustainable use is threatened by overfishing and illicit practices. Automatic Identification System (AIS) data is a key tool for monitoring vessel activity, but vessels can disable their AIS transponders, and the resulting "AIS gaps" raise concerns about potential illegal fishing. This study explores whether AIS gaps are primarily used to mask illegal activity or to conceal highly productive fishing locations from competitors. Using a dataset of AIS disabling events from Global Fishing Watch, it analyzes patterns in these events, focusing on vessel types, spatial distributions, and event frequency. By examining these dynamics, the study aims to provide insight into the motivations behind AIS disabling and its implications for fisheries management and conservation.

# Introduction

The ocean is an essential global resource, providing critical ecosystem services, food, and livelihoods for millions worldwide. As marine resources face increasing pressure from overfishing, technological tools like the Automatic Identification System (AIS) have become indispensable for monitoring and managing vessel activity. AIS is designed to enhance maritime safety by broadcasting vessel positions, but disabling AIS creates data gaps that complicate oversight and raise concerns about illegal fishing activities. 



# Methods (data source and wrangling)

This dataset was obtained from Global Fishing Watch: https://globalfishingwatch.org/data-download/datasets/public-welch-et-al-disabling-events:v20221102



The dataset is already fairly clean, but let's look at some summary statistics anyway.


In [3]:
import pandas as pd


df = pd.read_csv('ais_disabling_events.csv')
df.describe()

statistic,mmsi,vessel_length_m,vessel_tonnage_gt,gap_start_lat,gap_start_lon,gap_start_distance_from_shore_m,gap_end_lat,gap_end_lon,gap_end_distance_from_shore_m,gap_hours
count,55368.0,55365.0,55368.0,55368.0,55368.0,55368.0,55368.0,55368.0,55368.0,55368.0
mean,415003400.0,53.1139,857.755972,0.207077,10.602479,515707.1,0.261127,11.273168,501061.5,100.392152
std,120679100.0,21.321896,711.376427,31.963962,116.896361,348032.0,31.946682,116.780426,356581.6,371.332756
min,612.0,10.62,12.0,-76.095333,-179.983,93000.0,-75.934333,-179.99936,1000.0,12.0
25%,412056300.0,36.52,276.0,-19.853636,-79.861616,279000.0,-19.818421,-79.581464,254000.0,15.583333
50%,412499900.0,54.999773,736.0,-2.366897,-16.121833,413000.0,-2.364694,-14.700157,410000.0,23.483333
75%,416772000.0,69.9,1269.0,26.938788,152.517194,661000.0,27.216765,152.467873,650000.0,67.8
max,999763600.0,255.39,9499.0,78.214127,179.993508,2245000.0,80.191548,179.993225,2262000.0,17215.933333


So we have columns like vessel_length_m and vessel_tonnage_gt, and any column with "gap" in its name refers to the gap between where AIS was disabled (gap_start_*) and where it was re-enabled (gap_end_*).
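Because each gap has a start and an end position, one natural derived quantity is the straight-line distance between the two points. The sketch below is only an illustration, not part of the analysis above: it uses the haversine formula on a spherical Earth (an approximation), and the `toy` DataFrame holds made-up coordinates in place of the real file. The helper name `haversine_km` and the columns `gap_distance_km` and `implied_speed_kmh` are hypothetical.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical-Earth assumption


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    # Clip guards against tiny floating-point overshoot outside [0, 1]
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


# Toy rows with made-up values, using the same column names as the dataset
toy = pd.DataFrame({
    "gap_start_lat": [0.0, 10.0],
    "gap_start_lon": [0.0, 20.0],
    "gap_end_lat": [0.0, 10.0],
    "gap_end_lon": [1.0, 20.0],
    "gap_hours": [12.0, 24.0],
})

# Distance between the gap start and gap end positions
toy["gap_distance_km"] = haversine_km(
    toy["gap_start_lat"], toy["gap_start_lon"],
    toy["gap_end_lat"], toy["gap_end_lon"],
)

# Lower bound on average speed: the vessel's true path may be much longer
toy["implied_speed_kmh"] = toy["gap_distance_km"] / toy["gap_hours"]

print(toy[["gap_distance_km", "implied_speed_kmh"]])
```

The distance is only a lower bound on how far a vessel travelled while dark, since it says nothing about the route taken between the two points.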

In [4]:
# Ensure datetime columns are properly formatted
df['gap_start_timestamp'] = pd.to_datetime(df['gap_start_timestamp'])
df['gap_end_timestamp'] = pd.to_datetime(df['gap_end_timestamp'])

# Calculate total gap hours per vessel
vessel_gap_summary = df.groupby('mmsi')['gap_hours'].sum().reset_index()
vessel_gap_summary.rename(columns={'gap_hours': 'total_gap_hours'}, inplace=True)
print(vessel_gap_summary.head())

# Group by vessel class and calculate the total, average, and number of gaps
vessel_class_summary = df.groupby('vessel_class')['gap_hours'].agg(['sum', 'mean', 'count']).reset_index()
vessel_class_summary.rename(columns={
    'sum': 'total_gap_hours',
    'mean': 'average_gap_hours',
    'count': 'number_of_gaps'
}, inplace=True)

# Sort by total gap hours to see which vessel classes accumulate the most AIS-off time
vessel_class_summary.sort_values('total_gap_hours', ascending=False, inplace=True)

print(vessel_class_summary)

In [11]:
# Identify unique vessels with at least one gap per class
vessels_with_gaps = df.groupby('vessel_class')['mmsi'].nunique().reset_index()
vessels_with_gaps.rename(columns={'mmsi': 'vessels_with_gaps'}, inplace=True)

# Count the total number of unique vessels per class.
# NOTE: df contains only disabling events, so every vessel in it has at least
# one gap. This count is therefore identical to vessels_with_gaps.
total_vessels = df.groupby('vessel_class')['mmsi'].nunique().reset_index()
total_vessels.rename(columns={'mmsi': 'unique_vessels'}, inplace=True)

# Merge the two datasets
vessel_class_summary = vessels_with_gaps.merge(total_vessels, on='vessel_class')

# Calculate the percentage of vessels with gaps.
# With both counts drawn from the same data this is 100% by construction;
# a meaningful rate needs a fleet-wide vessel list as the denominator.
vessel_class_summary['percent_with_gaps'] = (
    vessel_class_summary['vessels_with_gaps'] / vessel_class_summary['unique_vessels'] * 100
)

# Display the summary
print(vessel_class_summary)


         vessel_class  vessels_with_gaps  unique_vessels  percent_with_gaps
0  drifting_longlines               2191            2191              100.0
1               other                945             945              100.0
2        squid_jigger                806             806              100.0
3            trawlers                917             917              100.0
4   tuna_purse_seines                419             419              100.0
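Every class shows 100% here because the dataset lists only vessels that had at least one disabling event, so the numerator and denominator are the same count. A comparison that can be made from this dataset alone is how often vessels in each class disable AIS. The sketch below is one possible approach under that assumption, not the method used above: `toy` contains made-up values, and the output columns `gaps_per_vessel` and `median_gap_hours` are hypothetical names.

```python
import pandas as pd

# Toy data with made-up values, using the same column names as the dataset
toy = pd.DataFrame({
    "mmsi": [1, 1, 2, 3, 3, 3],
    "vessel_class": ["trawlers", "trawlers", "trawlers",
                     "squid_jigger", "squid_jigger", "squid_jigger"],
    "gap_hours": [12.0, 20.0, 40.0, 15.0, 15.0, 30.0],
})

# One row per class: number of events, distinct vessels, and typical gap length
per_class = (
    toy.groupby("vessel_class")
    .agg(
        number_of_gaps=("gap_hours", "count"),
        unique_vessels=("mmsi", "nunique"),
        median_gap_hours=("gap_hours", "median"),
    )
    .reset_index()
)

# Average number of disabling events per vessel in each class
per_class["gaps_per_vessel"] = (
    per_class["number_of_gaps"] / per_class["unique_vessels"]
)

print(per_class.sort_values("gaps_per_vessel", ascending=False))
```

The median is used for gap length because the summary statistics show gap_hours is strongly right-skewed (median about 23.5 hours, mean about 100 hours), so a mean would be dominated by a few very long gaps.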


# Results (viz and stats)



# Discussion (who cares?)