# US Ports
- Folder ais_noaa_gov: vessel detections along the coasts of the United States.
- _Ports are missing_
- CBP_drug_seizures: drug seizures by fiscal year

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler, MinMaxScaler
import numpy as np



Review the cargo types that are present, and sort by total size.

In [2]:
# Load the NOAA AIS vessel data:
barcos_dia_10 = pd.read_csv('../ais_noaa_gov/df_procesado_AIS_2024_09_10.csv', header = 0, sep=',')

In [3]:
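# Drop the 'Cargo' column and build a size proxy (length x width x draft)
barcos_dia_10 = barcos_dia_10.drop(columns='Cargo')
barcos_dia_10['size'] = barcos_dia_10['Length']*barcos_dia_10['Width']*barcos_dia_10['Draft']
# Reshape to a 2-D column vector, as scikit-learn scalers expect
size_array = barcos_dia_10['size'].values.reshape(-1, 1)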

In [4]:
# Rescale the size proxy to the range [2, 20] so it can be used as a marker size
scaler = MinMaxScaler(feature_range=(2,20))
scaler.fit(size_array)
barcos_dia_10['size_scaled'] = scaler.transform(size_array)

In [5]:
barcos_dia_10['size_scaled'] = pd.to_numeric(barcos_dia_10['size_scaled'])
# Note: dropna() removes every row that has a missing value in any column
barcos_dia_10 = barcos_dia_10.dropna()
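
The review of vessel types and the ordering by total size mentioned earlier could be sketched as follows. This is a minimal illustration on a made-up three-row frame (the values are invented for the example); it only assumes the `VesselType` and `size` columns used in this notebook.

```python
import pandas as pd

# Toy stand-in for barcos_dia_10 (values invented for illustration only)
toy = pd.DataFrame({
    'VesselType': [70.0, 74.0, 70.0],
    'size': [100.0, 500.0, 250.0],
})

# Count vessels and add up the size proxy for each vessel type,
# then sort the types from largest to smallest total size
type_summary = (
    toy.groupby('VesselType')['size']
       .agg(n_vessels='count', total_size='sum')
       .sort_values('total_size', ascending=False)
)
print(type_summary)
```

The same two steps should apply unchanged to `barcos_dia_10` once it is loaded.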

Comments on the dataframe:
- The data was filtered to the 'cargo' vessel categories -> check the documentation.
- MMSI is a unique vessel identifier.
- BaseDateTime varies, but since the data is treated as one update every 24 hours, the time of day is ignored.
- Latitude and longitude: the datum (reference system) still needs to be confirmed.
- Sorting by length (and, to a lesser extent, by width) shows the actual dimensions of the vessels.
- We can also obtain the mean speed at which the vessels travel.
- It might be possible to estimate whether a voyage is domestic (from one port to another in the same country) or international.
- SOG, COG, Heading: in AIS these are speed over ground, course over ground, and true heading; their units still need to be checked in the documentation.
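
Two of the points above, dropping the time of day from `BaseDateTime` and computing a mean speed per vessel, could look like this. It is a sketch on invented rows; it assumes `SOG` is the speed column to average, and its units should be checked against the AIS documentation.

```python
import pandas as pd

# Invented sample rows with the same column names as the AIS files
toy = pd.DataFrame({
    'MMSI': [111, 111, 222],
    'BaseDateTime': ['2024-09-10T00:00:04', '2024-09-11T07:17:28',
                     '2024-09-10T00:02:44'],
    'SOG': [10.0, 14.0, 0.0],
})

# Keep only the calendar date, treating each file as one update per day
toy['date'] = pd.to_datetime(toy['BaseDateTime']).dt.date

# Mean speed over ground per vessel across the available days
mean_sog = toy.groupby('MMSI')['SOG'].mean()
print(mean_sog)
```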

In [6]:
import plotly.express as px
import plotly.graph_objects as go

For drawing maps: https://medium.com/@alexroz/6-python-libraries-to-make-beautiful-maps-9fb9edb28b27

In [7]:
# Plot the vessels on a map (note: this figure is overwritten by the go.Figure defined below):
fig = px.scatter_geo(barcos_dia_10, lat='LAT', lon='LON',
                     color='VesselType', size='size') # other options: projection, size, hover_name, mapbox_style

fig = go.Figure(data=go.Scattergeo(
        locationmode = 'USA-states',
        lon = barcos_dia_10['LON'],
        lat = barcos_dia_10['LAT'],
        mode = 'markers',
        marker = dict(
            size = barcos_dia_10['size_scaled'], #
            opacity = 0.8,
            reversescale = True,
            autocolorscale = False,
            symbol = 'square',
            line = dict(
                width=1,
                color='rgb(102, 102, 102)'
            ),
            colorscale = 'Blues',
            cmin = barcos_dia_10['VesselType'].min(),
            color = barcos_dia_10['VesselType'],
            cmax = barcos_dia_10['VesselType'].max(),
            colorbar=dict(
                title=dict(
                    text="Cargo vessel type"
                )
            )
        )))

fig.update_layout(
    title='Vessel distribution in North America',
    geo = dict(
            scope='north america',
            # projection_type='albers usa',
            showland = True,
            landcolor = "rgb(250, 250, 250)",
            subunitcolor = "rgb(175, 175, 175)",
            countrycolor = "rgb(175, 175, 175)",
            countrywidth = 0.5,
            subunitwidth = 1
        ),
    width=800, height=600
)


fig.show()

Clustering algorithm to group cargo vessels by their "nearest port".
"Nearest port": like a centroid, but located on the coastline. The algorithm will need to be rethought.

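One possible starting point, before designing a custom clustering algorithm, is to label each vessel with its closest port using the great-circle (haversine) distance. The sketch below assumes a hypothetical `ports` table with `name`, `LAT` and `LON` columns; no port list exists in this notebook yet, and the port coordinates used here are rough values for illustration only. The function names are also illustrative.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; inputs in degrees, NumPy-broadcastable."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def assign_nearest_port(vessels, ports):
    """Return a copy of `vessels` with the closest port and its distance."""
    # Distance matrix of shape (n_vessels, n_ports) via broadcasting
    dist = haversine_km(
        vessels['LAT'].to_numpy()[:, None], vessels['LON'].to_numpy()[:, None],
        ports['LAT'].to_numpy()[None, :], ports['LON'].to_numpy()[None, :],
    )
    nearest = dist.argmin(axis=1)
    out = vessels.copy()
    out['nearest_port'] = ports['name'].to_numpy()[nearest]
    out['port_distance_km'] = dist[np.arange(len(vessels)), nearest]
    return out

# Hypothetical port table (approximate coordinates, for illustration only)
ports = pd.DataFrame({
    'name': ['Los Angeles', 'New York/New Jersey'],
    'LAT': [33.74, 40.67],
    'LON': [-118.27, -74.15],
})
# Two positions taken from the AIS tables in this notebook
vessels = pd.DataFrame({'LAT': [33.74571, 40.67875],
                        'LON': [-118.27178, -74.14622]})
labelled = assign_nearest_port(vessels, ports)
print(labelled)
```

Grouping by `nearest_port` would then give clusters anchored on the coastline, unlike ordinary centroids, which can end up in open water.
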
In [8]:
# Colour the vessels by CallSign:
fig2 = px.scatter_geo(barcos_dia_10, lat='LAT', lon='LON',
                     color='CallSign', size='size_scaled') # other options: projection, size, hover_name, mapbox_style

fig2.update_layout(title='Vessels grouped by CallSign',
                   geo_scope = 'north america')

fig2.show()

The "Call Sign" variable does not appear to have any geographic relevance.

In [None]:
# other go.Figure trace types: carpet, candlestick, heatmap, histogram, sankey

Adding the data for days 11 and 12 to the dataframe:

In [9]:
barcos_dia_11 = pd.read_csv('../ais_noaa_gov/df_procesado_AIS_2024_09_11.csv', header = 0, sep=',')
barcos_dia_12 = pd.read_csv('../ais_noaa_gov/df_procesado_AIS_2024_09_12.csv', header = 0, sep=',')

In [10]:
barcos_dia_11 = barcos_dia_11.drop(columns='Cargo')
barcos_dia_11['size'] = barcos_dia_11['Length']*barcos_dia_11['Width']*barcos_dia_11['Draft']
size_array = barcos_dia_11['size'].values.reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(2,20))
scaler.fit(size_array)
barcos_dia_11['size_scaled'] = scaler.transform(size_array)

In [11]:
barcos_dia_12 = barcos_dia_12.drop(columns='Cargo')
barcos_dia_12['size'] = barcos_dia_12['Length']*barcos_dia_12['Width']*barcos_dia_12['Draft']
size_array = barcos_dia_12['size'].values.reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(2,20))
scaler.fit(size_array)
barcos_dia_12['size_scaled'] = scaler.transform(size_array)

In [12]:
barcos_dia_11['size_scaled'] = pd.to_numeric(barcos_dia_11['size_scaled'])
barcos_dia_11 = barcos_dia_11.dropna()

barcos_dia_12['size_scaled'] = pd.to_numeric(barcos_dia_12['size_scaled'])
barcos_dia_12 = barcos_dia_12.dropna()

In [14]:
ais_noaa_df = pd.concat([barcos_dia_10, barcos_dia_11, barcos_dia_12], ignore_index=True, axis=0)
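
Because each day was scaled with its own `MinMaxScaler`, the same `size` can map to different `size_scaled` values depending on the day. A possible alternative is to rescale once over the combined frame. The sketch below applies the min-max formula directly with pandas, which should be equivalent to fitting a single `MinMaxScaler(feature_range=(2, 20))` on all rows; the helper name `rescale_size` and the toy values are illustrative only.

```python
import pandas as pd

def rescale_size(df, lo=2.0, hi=20.0):
    """Min-max scale the 'size' column to [lo, hi] using the whole frame."""
    s_min, s_max = df['size'].min(), df['size'].max()
    out = df.copy()
    out['size_scaled'] = lo + (df['size'] - s_min) * (hi - lo) / (s_max - s_min)
    return out

# Toy frames standing in for two daily files (values invented)
day_a = pd.DataFrame({'MMSI': [1, 2], 'size': [100.0, 300.0]})
day_b = pd.DataFrame({'MMSI': [2, 3], 'size': [300.0, 500.0]})

# Concatenate first, then scale, so every day shares the same min and max
combined = rescale_size(pd.concat([day_a, day_b], ignore_index=True))
print(combined)
```

With a shared scale, vessel 2 gets the same marker size on both days.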

If the same MMSI appears in several records, draw a trajectory line:
- For the 50 or 100 vessels with the largest size, grouped by MMSI.
- Draw the marker only for the last day, but show the full trajectory.
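
The plan above could be sketched as follows: pick the N largest vessels by `size`, order each vessel's records by time, and keep the last record as its marker position. Data preparation is kept separate from plotting; the function names and the toy rows are illustrative, and `plot_trajectories` is only one possible way to draw the result with `go.Scattergeo`.

```python
import pandas as pd

def top_trajectories(df, n=50):
    """Return (tracks, heads) for the n vessels with the largest size."""
    # One size per vessel, then the n vessels with the largest value
    top_mmsi = df.groupby('MMSI')['size'].max().nlargest(n).index
    tracks = (df[df['MMSI'].isin(top_mmsi)]
              .sort_values(['MMSI', 'BaseDateTime']))  # ISO strings sort chronologically
    heads = tracks.groupby('MMSI').tail(1)  # last known position per vessel
    return tracks, heads

def plot_trajectories(tracks, heads):
    """Draw one line per vessel plus a marker at its last position."""
    import plotly.graph_objects as go  # imported here so the data step runs without plotly
    fig = go.Figure()
    for mmsi, track in tracks.groupby('MMSI'):
        fig.add_trace(go.Scattergeo(lat=track['LAT'], lon=track['LON'],
                                    mode='lines', name=f'MMSI: {mmsi}'))
    fig.add_trace(go.Scattergeo(lat=heads['LAT'], lon=heads['LON'],
                                mode='markers', name='last position'))
    fig.update_layout(geo_scope='north america')
    return fig

# Toy rows (invented) with the notebook's column names
toy = pd.DataFrame({
    'MMSI': [1, 1, 2, 3],
    'BaseDateTime': ['2024-09-11T00:00:00', '2024-09-10T00:00:00',
                     '2024-09-10T00:00:00', '2024-09-12T00:00:00'],
    'LAT': [30.0, 29.0, 40.0, 18.0],
    'LON': [-124.0, -123.0, -74.0, -69.0],
    'size': [300.0, 300.0, 100.0, 200.0],
})
tracks, heads = top_trajectories(toy, n=2)
print(heads)
```

Calling `plot_trajectories(tracks, heads).show()` would then render the lines and the last-position markers.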

In [37]:
barcos_dia_12

Unnamed: 0,MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselName,IMO,CallSign,VesselType,Status,Length,Width,Draft,TransceiverClass,size,size_scaled
0,636093186,2024-09-12T00:00:00,29.99542,-124.18539,18.9,129.5,128.0,UMM SALAL,IMO9525857,5LLQ5,74.0,0.0,366.0,48.0,14.5,A,254736.0,17.093052
1,414047000,2024-09-12T00:00:00,46.22080,-124.66064,1.4,73.6,15.0,GUO YUAN 12,IMO9579250,BFDV,70.0,0.0,224.0,32.0,7.4,A,53043.2,5.142798
2,209470000,2024-09-12T00:00:02,28.99710,-94.55552,8.8,82.8,82.0,MSC BOSPHORUS,IMO9247742,5BDF5,79.0,0.0,300.0,40.0,12.1,A,145200.0,10.603068
4,255806494,2024-09-12T00:00:05,45.84103,-125.32340,16.5,358.0,358.0,MSC ELODIE,IMO9704972,CQEV3,70.0,0.0,300.0,48.0,9.9,A,142560.0,10.446649
5,316043882,2024-09-12T00:00:00,44.02211,-82.42380,12.9,301.0,300.0,ALGOMA INTREPID,IMO9773387,VABC,70.0,0.0,198.0,24.0,6.4,A,30412.8,3.801952
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1386,338796000,2024-09-12T00:04:46,13.46036,144.66981,0.0,88.8,270.0,MATSONIA,IMO9814612,KIAB,74.0,5.0,265.0,35.0,10.1,A,93677.5,7.550371
1387,538008048,2024-09-12T13:25:50,13.85330,145.01191,11.9,237.5,228.0,KYOWA STORK,IMO9820790,V7IM5,70.0,0.0,143.0,23.0,6.6,A,21707.4,3.286159
1388,477999700,2024-09-12T17:22:59,11.71044,143.97356,10.3,353.2,354.0,MOUNT RAINIER,IMO9336799,VRBG6,70.0,0.0,177.0,28.0,10.5,A,52038.0,5.083240
1389,219324000,2024-09-12T11:24:26,12.13238,145.00933,15.1,288.7,289.0,SALLY MAERSK,IMO9120865,OZHS2,70.0,0.0,346.0,42.0,12.6,A,183103.2,12.848824


In [38]:
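# Note: despite the variable name, head(10) keeps only the 10 largest records (rows, not unique vessels)
ais_noaa_df_top100 = ais_noaa_df.sort_values(by='size', ascending=False).head(10)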

In [39]:
ais_noaa_df_top100

Unnamed: 0,MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselName,IMO,CallSign,VesselType,Status,Length,Width,Draft,TransceiverClass,size,size_scaled
3694,374886000,2024-09-12T04:57:49,36.2341,-127.82341,17.2,126.0,125.0,MSC HAMBURG,IMO9647461,3FYY8,72.0,0.0,399.0,54.0,14.1,A,303798.6,20.0
290,636023269,2024-09-10T00:00:04,28.16848,-118.96883,17.8,130.6,128.0,MSC CARMELITA,IMO9946879,5LMW3,70.0,0.0,366.0,51.0,15.9,A,296789.4,20.0
1823,563190500,2024-09-11T00:00:28,33.74571,-118.27178,0.0,156.4,207.0,EVER MAX,IMO9935208,9V7625,74.0,5.0,366.0,51.0,15.7,A,293056.2,20.0
3039,636022606,2024-09-12T00:00:05,30.55117,-124.49359,16.0,128.4,127.0,MSC NOA ARIELA,IMO9946843,5LJP8,73.0,0.0,366.0,51.0,15.4,A,287456.4,19.031728
2347,477347400,2024-09-11T00:47:55,40.17341,-131.9998,16.6,121.5,121.0,YM WORTH,IMO9704635,VROO6,74.0,0.0,368.0,51.0,15.2,A,285273.6,19.52198
2168,563226700,2024-09-11T00:02:12,40.67875,-74.14622,0.0,48.3,311.0,EVER MAST,IMO9935246,9V7629,74.0,5.0,366.0,51.0,15.2,A,283723.2,19.426752
833,563229400,2024-09-10T00:02:44,33.74592,-118.2719,0.0,128.0,207.0,EVER MEGA,IMO9935260,9V7638,74.0,5.0,366.0,51.0,15.0,A,279990.0,18.981132
2431,636021036,2024-09-11T07:17:28,37.74872,-69.41397,12.2,300.1,302.0,BALTIMORE EXPRESS,IMO9665621,5LBY2,70.0,0.0,368.0,51.0,14.8,A,277766.4,19.060875
2508,636019213,2024-09-11T17:40:11,17.91265,-69.90508,0.8,137.3,225.0,MSC FAITH,IMO9842085,D5TM2,71.0,0.0,366.0,48.0,15.8,A,277574.4,19.049082
3658,636019213,2024-09-12T01:29:42,18.21422,-69.69285,1.1,323.5,27.0,MSC FAITH,IMO9842085,D5TM2,71.0,0.0,366.0,48.0,15.8,A,277574.4,18.446222


In [15]:
grouped_ais_noaa_df = ais_noaa_df.groupby('MMSI')

In [31]:
# Create a list to store the traces
traces = []

# Iterate through each group (each vessel's trajectory)
for mmsi, group in grouped_ais_noaa_df:
    # Sort by timestamp so the line follows the order in which positions were reported
    trace = px.line_geo(group.sort_values('BaseDateTime'), lat='LAT', lon='LON')  # Basic line plot
    # Adjust line width and opacity, and label the trace with its MMSI
    trace.update_traces(line=dict(width=2), opacity=0.7, name=f'MMSI: {mmsi}')

    traces.append(trace.data[0])


In [33]:
grouped_ais_noaa_df

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x73d76e2f0520>

In [36]:
# Create the figure
fig = px.scatter_geo(ais_noaa_df, lat='LAT', lon='LON', color='VesselType',
                     size='size_scaled'
                    )  # Scatter plot for all vessels #hover_name='MMSI'

# Add the trajectory lines to the figure
fig.add_traces(traces)

fig.update_layout(
    title='Vessel distribution in North America',
    geo = dict(
            scope='north america',
            # projection_type='albers usa',
            showland = True,
            landcolor = "rgb(250, 250, 250)",
            subunitcolor = "rgb(175, 175, 175)",
            countrycolor = "rgb(175, 175, 175)",
            countrywidth = 0.5,
            subunitwidth = 1
        ),
    width=800, height=600
)


fig.show()