## Merging Detected Anomalies
The outputs of the Isolation Forest and the Autoencoder are merged on their shared identifier columns to find the records that both models flagged as anomalous.
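
To make the merge logic concrete, here is a minimal sketch on toy data. The rows and the variable names `forest`, `autoenc`, `merged`, and `both` are hypothetical, made up for illustration. The label convention (`-1` for an Isolation Forest anomaly, `1` for an Autoencoder anomaly) mirrors the filter used in the cell below and is an assumption about how the two CSV files were written.

```python
import pandas as pd

# Hypothetical toy data; the real CSV files carry many more columns.
forest = pd.DataFrame({
    "MMSI": [11, 11, 111, 222],
    "BaseDateTime": ["2023-01-01 01:08:06", "2023-01-01 02:00:00",
                     "2023-01-01 03:34:21", "2023-01-01 04:00:00"],
    "forest_anomaly": [-1, 1, -1, -1],      # assumed: -1 marks an anomaly
})
autoenc = pd.DataFrame({
    "MMSI": [11, 11, 111, 333],
    "BaseDateTime": ["2023-01-01 01:08:06", "2023-01-01 02:00:00",
                     "2023-01-01 03:34:21", "2023-01-01 05:00:00"],
    "autoencoder_anomaly": [1, 1, 0, 1],    # assumed: 1 marks an anomaly
})

# Inner merge keeps only (MMSI, BaseDateTime) pairs present in both frames.
# validate="one_to_one" raises if a key pair is duplicated on either side,
# which would otherwise silently multiply rows.
merged = forest.merge(
    autoenc,
    on=["MMSI", "BaseDateTime"],
    how="inner",
    validate="one_to_one",
)

# Keep the rows that both models flagged.
both = merged[
    (merged["forest_anomaly"] == -1) &
    (merged["autoencoder_anomaly"] == 1)
]
print(both)
```

In this sketch only the first record of vessel 11 survives: the vessels 222 and 333 appear in just one frame and are dropped by the inner merge, and the remaining rows are flagged by at most one model.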

In [5]:
import pandas as pd

# Load the CSV files containing the anomalies detected by each model
forest_anomalies = pd.read_csv('isolation_forest_anomalies.csv')
autoencoder_anomalies = pd.read_csv('autoencoder_anomalies.csv')

# Using 'MMSI' and 'BaseDateTime' to merge the results
common_columns = ['MMSI', 'BaseDateTime']

# Merge the two datasets on the common identifier columns
merged_anomalies = pd.merge(
    forest_anomalies,
    autoencoder_anomalies,
    on=common_columns,
    suffixes=('_iso', '_auto')
)

# Assumes each model stores its flag in a '*_anomaly' column:
# the autoencoder marks anomalies with 1, the Isolation Forest with -1
common_anomalies = merged_anomalies[
    (merged_anomalies['autoencoder_anomaly'] == 1) &
    (merged_anomalies['forest_anomaly'] == -1)
]

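# Keep one row per vessel (the first occurrence of each MMSI),
# so the count below is the number of distinct vessels, not of records
common_anomalies = common_anomalies.drop_duplicates(subset=['MMSI'])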

# Print the number of common anomalies and preview the first rows
print(f"Number of common anomalies detected: {len(common_anomalies)}")
print(common_anomalies.head())

# Save common anomalies to a CSV file
common_anomalies.to_csv('common_anomalies.csv', index=False)

Number of common anomalies detected: 4381
       MMSI         BaseDateTime   LAT_iso   LON_iso  SOG_iso  COG_iso  \
0        11  2023-01-01 01:08:06  27.29230 -90.96793      0.1    207.2   
9       111  2023-01-01 03:34:21  27.35387 -94.62561      0.1    201.1   
32  3660489  2023-01-01 09:17:51  27.37022 -89.92430      0.0    135.0   
40  3669883  2023-01-01 06:05:32  41.34564 -72.09592      0.0    360.0   
42  3791472  2023-01-01 10:03:13  26.13172 -92.04032      0.0    298.5   

    Heading_iso VesselName_iso     IMO_iso CallSign_iso  ...  Draft_auto  \
0         511.0   CONSTITUTION  IMO0000007       GC 680  ...         0.0   
9         511.0       BOOMVANG  IMO0000001        EB643  ...         0.0   
32          0.0    NEPTUNE TLP         NaN      WQGV318  ...         NaN   
40        511.0            NaN         NaN          NaN  ...         0.0   
42          1.0    LUCIUS SPAR  IMO1108561       WQXP40  ...         0.0   

    Cargo_auto  TransceiverClass_auto  distance_auto  ti