### 0. Thoughts and ideas 

#### 0.1 General

One possible strategy:
- Treat the prediction of future AIS data as a prediction task itself (X: Historic AIS and positions, Y: AIS data in next timestep) and create a model for this
- Use the predicted AIS data as well as historic AIS data and positions to predict new position. 


Another:
- Let a model use the previous timesteps to predict all information about the next timestep.

#### 0.2 About the datasets

#### 0.3 Research


##### Article evaluating several models to predict ship trajectories

Definitions:
- Ship trajectory is the sequence of timestamped points Pi = {Ti, LATi, LONi, SOGi, COGi}


Methodology:
- Information from the first four timestamps is used to predict the next.
- Implemented in PyTorch
- Uses Adam as the optimizer
- Uses the following hyperparameters: learning rate: 0.0001, epochs: 100, dropout: 0.5, hidden size: 128 (15), input/output dimensions: 2, hidden layers: 1
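
The windowing described in the methodology (four consecutive points in, the fifth point out) can be sketched as follows. This is only an illustration on invented numbers, not the article's code; `make_windows` and the toy `track` are hypothetical names, and the two features per point are assumed to be latitude and longitude because the notes list input/output dimensions of 2.

```python
import numpy as np

def make_windows(track, n_inputs=4):
    """Split one time-ordered trajectory into (input window, next point) pairs."""
    track = np.asarray(track, dtype=np.float32)
    n_samples = len(track) - n_inputs
    if n_samples <= 0:
        raise ValueError("trajectory is too short to form a window")
    # X[i] holds points i .. i+n_inputs-1, Y[i] holds point i+n_inputs
    X = np.stack([track[i:i + n_inputs] for i in range(n_samples)])
    Y = track[n_inputs:]
    return X, Y

# Invented trajectory with six (LAT, LON) points
track = [[59.00, 10.00], [59.01, 10.02], [59.02, 10.04],
         [59.03, 10.06], [59.04, 10.08], [59.05, 10.10]]
X, Y = make_windows(track)
print(X.shape, Y.shape)
```

Six points with a window of four give two training samples; each target is the point that directly follows its window.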


Interesting points
- "Deep learning exhibits remarkable performance in AIS data-driven ship trajectory prediction"
- "Deep learning are in general better than machine learning for this application"
- Transformer, Bi-GRU and GRU perform best; the Transformer only outperforms the others on medium-sized datasets
- SVR is the best machine learning algorithm

##### Brainstorming - 18.09.2024

AIS - data:
- Parameters that intuitively give us the next position: COG, SOG and ROT, together with current position, etaRaw and portId
- Should try merging navstat codes used to describe the same activity
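
The intuition that COG and SOG largely determine the next position can be made concrete with a dead-reckoning baseline. The sketch below is not part of the project code. It assumes constant speed and course during the step, a spherical Earth, SOG in knots and COG in degrees clockwise from north, and it ignores ROT. The function name `dead_reckon` is hypothetical.

```python
import math

EARTH_RADIUS_NM = 3440.065  # approximate mean Earth radius in nautical miles

def dead_reckon(lat, lon, sog_knots, cog_deg, dt_seconds):
    """Project a position forward along a fixed course at a fixed speed."""
    distance_nm = sog_knots * dt_seconds / 3600.0
    angular = distance_nm / EARTH_RADIUS_NM  # distance as an angle in radians
    lat1 = math.radians(lat)
    lon1 = math.radians(lon)
    bearing = math.radians(cog_deg)

    lat2 = math.asin(
        math.sin(lat1) * math.cos(angular)
        + math.cos(lat1) * math.sin(angular) * math.cos(bearing)
    )
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(angular) * math.cos(lat1),
        math.cos(angular) - math.sin(lat1) * math.sin(lat2),
    )
    # Wrap longitude into [-180, 180)
    lon2_deg = (math.degrees(lon2) + 180.0) % 360.0 - 180.0
    return math.degrees(lat2), lon2_deg

# Sailing due north at 60 knots for one hour covers about one degree of latitude
new_lat, new_lon = dead_reckon(0.0, 0.0, 60.0, 0.0, 3600)
print(new_lat, new_lon)
```

A baseline like this could also serve as a sanity check for a learned model over short horizons.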



General:
- Should somehow allow the algorithm to keep the last values - research different strategies (CNN or LSTM?)
- Might want to use a classifier to predict features
- Ship-ID is probably a pointless input for the classifier


##### Other


- Gustav & co used AutoGluon: https://auto.gluon.ai/stable/index.html


In [6]:
# IMPORTS

import numpy as np
import pandas as pd
import xgboost as xgb

### 1. Data 

#### 1.1 Load data into dataframes

In [5]:



ais_train_data_path = '../../Project materials/ais_train.csv'


ais_data_train = pd.read_csv(ais_train_data_path, sep='|')


ais_data_train.head()

Unnamed: 0,time,cog,sog,rot,heading,navstat,etaRaw,latitude,longitude,vesselId,portId
0,2024-01-01 00:00:25,284.0,0.7,0,88,0,01-09 23:00,-34.7437,-57.8513,61e9f3a8b937134a3c4bfdf7,61d371c43aeaecc07011a37f
1,2024-01-01 00:00:36,109.6,0.0,-6,347,1,12-29 20:00,8.8944,-79.47939,61e9f3d4b937134a3c4bff1f,634c4de270937fc01c3a7689
2,2024-01-01 00:01:45,111.0,11.0,0,112,0,01-02 09:00,39.19065,-76.47567,61e9f436b937134a3c4c0131,61d3847bb7b7526e1adf3d19
3,2024-01-01 00:03:11,96.4,0.0,0,142,1,12-31 20:00,-34.41189,151.02067,61e9f3b4b937134a3c4bfe77,61d36f770a1807568ff9a126
4,2024-01-01 00:03:51,214.0,19.7,0,215,0,01-25 12:00,35.88379,-5.91636,61e9f41bb937134a3c4c0087,634c4de270937fc01c3a74f3


In [7]:
#Create dataframes for each ship-id:

ship_train_groups = ais_data_train.groupby('vesselId')
ship_train_dataframes = {ship_id: group for ship_id, group in ship_train_groups}

#Split data into input and output. Input can then be accessed as ship_train_dataframes[shipID][0] and output as ship_train_dataframes[shipID][1]

# for key in ship_train_dataframes:
#     ship_train_dataframes[key] = [ship_train_dataframes[key].drop(columns=['latitude', 'longitude']), ship_train_dataframes[key][['latitude', 'longitude']]]


# print(ship_train_dataframes)

#### 1.2 Split data into X and Y

### 2. Try to create predictions using simple models:

#### 2.1 XG-boost

In [13]:
# xgb_simple = xgb.XGBClassifier()


# for key in ship_train_dataframes:
#     xgb_simple.fit(ship_train_dataframes[key][0], ship_train_dataframes[key][1])

### 3. Attempting to implement an approach similar to the one in the article

##### Status

Result: 701

Weaknesses
- Only uses COG, SOG and previous position to predict
- Hyperparameters (e.g. learning rate) are chosen arbitrarily, not tuned
- Model architecture is untuned (number of layers, hidden size, number of previous timesteps used as input)
- No regularization techniques (dropout, etc.)
- No interpolation to fill missing values
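
One way to address the missing regularization is to add dropout to a GRU model. The sketch below is an assumption about how that could look, not the model used in this notebook. The class name `GRUWithDropout` is hypothetical, the rate 0.5 is the value listed in the article notes, and the output activation is left out to keep the example short.

```python
import torch
import torch.nn as nn

class GRUWithDropout(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_layers, dropout=0.5):
        super().__init__()
        # nn.GRU applies dropout between stacked layers only,
        # so it has no effect when num_layers == 1
        self.gru = nn.GRU(
            input_size, hidden_size, num_layers,
            batch_first=True,
            dropout=dropout if num_layers > 1 else 0.0,
        )
        # Extra dropout on the last hidden output before the linear head
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.gru(x)  # initial hidden state defaults to zeros
        last = self.dropout(out[:, -1, :])
        return self.fc(last)

model = GRUWithDropout(input_size=5, hidden_size=64, output_size=5, num_layers=2)
model.eval()  # dropout is only active in train() mode
with torch.no_grad():
    y = model(torch.zeros(3, 5, 5))  # (batch, sequence_length, features)
print(y.shape)
```

Dropout is disabled by `model.eval()`, so predictions stay deterministic at inference time.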


Next steps
- Figure out how to add input features that are not timestep dependent
- Create new features (moored, etc.)
- Figure out how to handle holes in the data (interpolation)
- Research how to tune hyperparameters
- Add features to the timesteps
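
For the holes in the data, one option is to resample each vessel's track onto a regular time grid and interpolate linearly in time. The sketch below uses invented sample values and a hypothetical helper `resample_track`; it assumes unique timestamps per vessel and only handles latitude and longitude. Circular quantities such as COG would need separate treatment, and linear interpolation across very long gaps is probably unreliable.

```python
import pandas as pd

# Invented sample: one vessel with irregularly spaced reports
track = pd.DataFrame({
    'time': pd.to_datetime(['2024-01-01 00:00:00',
                            '2024-01-01 00:20:00',
                            '2024-01-01 01:20:00']),
    'latitude': [10.0, 10.2, 10.8],
    'longitude': [20.0, 20.1, 20.4],
})

def resample_track(df, freq='20min', columns=('latitude', 'longitude')):
    """Interpolate the given numeric columns onto a regular time grid."""
    df = df.sort_values('time').set_index('time')[list(columns)]
    grid = pd.date_range(df.index.min(), df.index.max(), freq=freq)
    # Keep the original observations while interpolating, then select the grid
    combined = df.reindex(df.index.union(grid))
    combined = combined.interpolate(method='time')
    return combined.loc[grid]

resampled = resample_track(track)
print(resampled)
```

The one-hour gap between 00:20 and 01:20 is filled with two interpolated rows, at 00:40 and 01:00.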

In [2]:
#Imports

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from collections import defaultdict

#### 3.1 Preprocess data into timeseries

In [8]:
def categorize_navstat_contrast(navstat):
    if navstat in [0, 8]:
        return 1  # Underway
    elif navstat in [2, 3, 4]:
        return 0.5  # Restricted Movement
    elif navstat in [1, 5, 6]:
        return -1  # Stationary
    else:
        return 0  # Unknown

all_timeseries = []
scaler = MinMaxScaler()

sequence_length = 5


for ship_id, df in ship_train_dataframes.items():

    df['movement_status'] = df['navstat'].apply(categorize_navstat_contrast)
    features = df[['longitude', 'latitude', 'sog', 'cog', 'movement_status']].values

    features_normalized = scaler.fit_transform(features)

    for i in range(len(features_normalized) - sequence_length):
        timeseries = features_normalized[i:i+sequence_length+1]
        all_timeseries.append(timeseries)


all_timeseries = np.array(all_timeseries)

X_data = all_timeseries[:, :-1, :]
Y_data = all_timeseries[:, -1, :]




print(all_timeseries.shape)
print(X_data.shape)
print(Y_data.shape)

(1518629, 6, 5)
(1518629, 5, 5)
(1518629, 5)


#### 3.2 GRU - model

In [21]:
class GRUnet(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_layers):
        super(GRUnet, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers

        self.gru = nn.GRU(input_size, hidden_size, num_layers, batch_first = True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, X):

        hidden_initialize = torch.zeros(self.num_layers, X.size(0), self.hidden_size).to(X.device)

        out, _ = self.gru(X, hidden_initialize)

        out = self.fc(out[:, -1, :])

        out = torch.tanh(out)

        return out
    

#Model parameters:
input_size = 5
hidden_size = 64   #Random guess on what is best
output_size = 5     
num_layers = 2 

GRU_model = GRUnet(input_size=input_size, hidden_size=hidden_size, output_size=output_size, num_layers=num_layers)

print(GRU_model)

GRUnet(
  (gru): GRU(5, 64, num_layers=2, batch_first=True)
  (fc): Linear(in_features=64, out_features=5, bias=True)
)


#### 3.3 Train Model

In [22]:
# Hyperparameters 

learning_rate = 0.001
num_epochs = 10
batch_size = 64


optimizer = optim.Adam(GRU_model.parameters(), lr = learning_rate)
loss_function = nn.MSELoss()


#Preprocess data:

X_tensor = torch.tensor(X_data, dtype=torch.float32)
Y_tensor = torch.tensor(Y_data, dtype=torch.float32)

train_dataset = torch.utils.data.TensorDataset(X_tensor, Y_tensor)
data_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle = True)

for epoch in range(num_epochs):
    for inputs, targets in data_loader:

        optimizer.zero_grad()

        outputs = GRU_model(inputs)

        loss = loss_function(outputs, targets)

        loss.backward()
        optimizer.step()

    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')




Epoch [1/10], Loss: 0.0138
Epoch [2/10], Loss: 0.0134
Epoch [3/10], Loss: 0.0090
Epoch [4/10], Loss: 0.0147
Epoch [5/10], Loss: 0.0196
Epoch [6/10], Loss: 0.0082
Epoch [7/10], Loss: 0.0193
Epoch [8/10], Loss: 0.0130
Epoch [9/10], Loss: 0.0217
Epoch [10/10], Loss: 0.0186


In [23]:
##Save model

torch.save(GRU_model.state_dict(), "gru_model_test.pth")

#### 3.4 Make Predictions

In [24]:
loaded_GRU_model = GRUnet(input_size=input_size, hidden_size=hidden_size, output_size=output_size, num_layers=num_layers)
loaded_GRU_model.load_state_dict(torch.load("gru_model_test.pth"))

loaded_GRU_model.eval()

GRUnet(
  (gru): GRU(5, 64, num_layers=2, batch_first=True)
  (fc): Linear(in_features=64, out_features=5, bias=True)
)

In [22]:


# def predict_ship_position(ship_id, time, model, sequence_length = 5):

#     if ship_id not in ship_train_dataframes:
#         print(f"No training data available for ship_id: {ship_id}")
#         return None
    
#     ship_df = ship_train_dataframes[ship_id]

#     ship_df['time'] = pd.to_datetime(ship_df['time'])

#     ship_df = ship_df.sort_values(by='time').reset_index(drop=True)

#     prediction_time = pd.to_datetime(time)

#     last_known_time = df['time'].iloc[-1]
#     time_diff = (prediction_time - last_known_time).total_seconds()
    
#     if time_diff <= 0:
#         print(f"Prediction time {prediction_time} is before or equal to the last known time {last_known_time}")
#         return None
    
#     steps_needed = int(time_diff//20)

#     df['hour'] = df['time'].dt.hour
#     df['minute'] = df['time'].dt.minute
#     df['second'] = df['time'].dt.second

#     features = df[['hour', 'minute', 'second', 'longitude', 'latitude', 'sog', 'cog']].values

#     if len(features) < sequence_length:
#         print(f"Not enough historical data to predict for ship_id: {ship_id}")
#         return None

#     input_sequence = features[-sequence_length:]

#     input_sequence_normalized =scaler.transform(input_sequence)
#     input_tensor = torch.tensor(input_sequence_normalized, dtype=torch.float32).unsqueeze(0)

#     for _ in range(steps_needed):
#         with torch.no_grad():
           
#             next_position = model(input_tensor)

        
#         next_position_tensor = next_position.unsqueeze(0)
#         input_tensor = torch.cat((input_tensor[:, 1:, :], next_position_tensor), dim=1)
    
#     next_position_np = next_position.numpy()
#     next_position_inverse = scaler.inverse_transform(next_position_np)

#     return next_position_inverse


In [25]:
ais_test_data_path = '../../Project materials/ais_test.csv'
ais_data_test = pd.read_csv(ais_test_data_path)
unique_ship_ids = ais_data_test['vesselId'].unique()

In [29]:

# for idx, row in ais_data_test.iterrows():
#     ship_id_sample = row['vesselId']
#     prediction_time = row['time']

#     # Predict the position at the specified future time
#     predicted_position = predict_ship_position(ship_id=ship_id_sample, time=prediction_time, model=loaded_GRU_model)

#     if predicted_position is not None:
#         #print(f"Predicted position for ship_id {ship_id} at {prediction_time}: {predicted_position}")

In [26]:
##Attempt at a faster approach:


ship_predictions = defaultdict(dict)

for ship_id in unique_ship_ids:
    if ship_id not in ship_train_dataframes:
        print(f"No training data available for ship_id: {ship_id}")
        continue

    df = ship_train_dataframes[ship_id]
    
    df['time'] = pd.to_datetime(df['time'])
    
    df['hour'] = df['time'].dt.hour
    df['minute'] = df['time'].dt.minute
    df['second'] = df['time'].dt.second

    features = df[['longitude', 'latitude', 'sog', 'cog', 'movement_status']].values
    if len(features) < sequence_length:
        print(f"Not enough historical data to predict for ship_id: {ship_id}")
        continue

    input_sequence = features[-sequence_length:]
    scaler = MinMaxScaler().fit(features)
    input_sequence_normalized = scaler.transform(input_sequence)

    input_tensor = torch.tensor(input_sequence_normalized, dtype=torch.float32).unsqueeze(0)

    ship_test_times = pd.to_datetime(ais_data_test[ais_data_test['vesselId'] == ship_id]['time'])
    farthest_time = ship_test_times.max()

    last_known_time = df['time'].iloc[-1]
    total_steps_needed = int((farthest_time - last_known_time).total_seconds() // (20 * 60))

    current_input = input_tensor.clone()

    for step in range(1, total_steps_needed + 1):
        with torch.no_grad():
            next_position = loaded_GRU_model(current_input)

        prediction_time = last_known_time + pd.Timedelta(minutes=20 * step)
        prediction_np = next_position.cpu().numpy()
        prediction_original_scale = scaler.inverse_transform(prediction_np.reshape(1, -1))
        
        ship_predictions[ship_id][prediction_time] = prediction_original_scale[0]

        current_input = torch.cat((current_input[:, 1:, :], next_position.unsqueeze(0)), dim=1)

predictions_list = []

for idx, row in ais_data_test.iterrows():
    ship_id = row['vesselId']
    target_time = pd.to_datetime(row['time']).round('min')

    if ship_id in ship_predictions:
        # The keys of ship_predictions[ship_id] are Timestamps, so the lookup
        # must use a Timestamp (a formatted string would never match)
        if target_time in ship_predictions[ship_id]:
            prediction = ship_predictions[ship_id][target_time]
        else:
            # Fall back to the prediction closest in time
            available_times = list(ship_predictions[ship_id].keys())
            closest_time = min(available_times, key=lambda x: abs(pd.Timestamp(x) - target_time))
            prediction = ship_predictions[ship_id][closest_time]

        predictions_list.append({
            'ship_id': ship_id,
            'time': target_time,
            'predicted_latitude': prediction[1],  
            'predicted_longitude': prediction[0],
            'predicted_sog': prediction[2],
            'predicted_cog': prediction[3],
            'predicted_movement_status': prediction[4]
        })

predictions_df = pd.DataFrame(predictions_list)

print(predictions_df.head())

                    ship_id                time  predicted_latitude  \
0  61e9f3aeb937134a3c4bfe3d 2024-05-08 00:03:00           31.631407   
1  61e9f473b937134a3c4c02df 2024-05-08 00:06:00           14.870845   
2  61e9f469b937134a3c4c029b 2024-05-08 00:10:00           38.603489   
3  61e9f45bb937134a3c4c0221 2024-05-08 00:11:00          -42.677868   
4  61e9f38eb937134a3c4bfd8d 2024-05-08 00:12:00           48.747246   

   predicted_longitude  predicted_sog  predicted_cog  \
0           -83.850716       0.074459     194.464005   
1           120.107689       0.042288      29.438742   
2            10.837335      18.094183      32.100861   
3           172.166824       0.359098     189.451187   
4            -6.218265       1.373619     228.019608   

   predicted_movement_status  
0                  -0.975770  
1                  -0.988630  
2                   0.994855  
3                  -0.928127  
4                   0.319609  


In [29]:

# Check the minimum and maximum values for longitude
longitude_min = predictions_df['predicted_longitude'].min()
longitude_max = predictions_df['predicted_longitude'].max()

# Check the minimum and maximum values for latitude
latitude_min = predictions_df['predicted_latitude'].min()
latitude_max = predictions_df['predicted_latitude'].max()

# Print the results
print(f"Longitude: Min = {longitude_min}, Max = {longitude_max}")
print(f"Latitude: Min = {latitude_min}, Max = {latitude_max}")

Longitude: Min = -213.85166931152344, Max = 177.6414031982422
Latitude: Min = -59.9287223815918, Max = 60.745361328125
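
The minimum longitude printed above (about -213.85) lies outside the valid range of -180 to 180 degrees. One possible post-processing step, shown here only as a sketch, is to wrap longitudes back into range and clip latitudes to the range -90 to 90. The helper names are hypothetical, and whether wrapping is the right fix depends on why the model overshoots, which the output alone does not tell us.

```python
import numpy as np

def wrap_longitude(lon):
    """Map longitudes in degrees into the interval [-180, 180)."""
    return (np.asarray(lon, dtype=float) + 180.0) % 360.0 - 180.0

def clip_latitude(lat):
    """Limit latitudes in degrees to the interval [-90, 90]."""
    return np.clip(np.asarray(lat, dtype=float), -90.0, 90.0)

wrapped = wrap_longitude([-213.85, 177.64, 190.0])
clipped = clip_latitude([-95.0, 60.7])
print(wrapped, clipped)
```

Values that are already valid are left unchanged.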


In [19]:
csv_data = predictions_df[['predicted_longitude', 'predicted_latitude']]

# Write to CSV file with a header
csv_file_path = 'submission.csv'

# Save the DataFrame to CSV
csv_data.to_csv(csv_file_path, index=True, index_label = 'ID', header=['longitude_predicted','latitude_predicted'])


#### 3.5 First attempt to improve model accuracy

##### 3.51 Inspecting ship-specific data

Vessel-specific data:
- Ship-type
- Draft
- Length
- Breadth
- Depth
- Engine Power
- Top speed
- longitude and latitude of destination port
- longitude and latitude of home port
- longitude and latitude of AIS port

Time-series data:
- Time since sailing
- Time until arrival
- Mooring status (True if: sog is close to zero; longitude and latitude do not change for a certain time; the position is close to one of the known destination ports)
- AIS data (position, cog, sog, rot, heading, ETARAW, navstat)
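
The mooring status idea can be sketched with pandas. The example below covers only the first two criteria (low speed and an unchanged position over the last few reports) and leaves out the distance to known ports. The function name `add_mooring_flag`, the thresholds and the sample values are all assumptions that would need tuning on the real data.

```python
import pandas as pd

def add_mooring_flag(df, sog_threshold=0.5, position_tolerance=0.001, window=3):
    """Flag rows where the vessel is slow and has not moved over `window` reports."""
    df = df.sort_values('time').copy()
    slow = df['sog'] < sog_threshold

    def spread(series):
        # Range of the values over the last `window` reports
        rolled = series.rolling(window, min_periods=window)
        return rolled.max() - rolled.min()

    # The first window-1 rows have an undefined spread and are never flagged
    stationary = (
        (spread(df['latitude']) < position_tolerance)
        & (spread(df['longitude']) < position_tolerance)
    )
    df['moored'] = slow & stationary
    return df

# Invented sample: the vessel arrives and then stays in place
sample = pd.DataFrame({
    'time': pd.date_range('2024-01-01', periods=5, freq='20min'),
    'sog': [10.0, 0.1, 0.0, 0.0, 0.0],
    'latitude': [1.0, 2.0, 2.0, 2.0, 2.0],
    'longitude': [5.0, 6.0, 6.0, 6.0, 6.0],
})
flagged = add_mooring_flag(sample)
print(flagged['moored'].tolist())
```

The flag only turns on once the vessel has been stationary for a full window, so the first slow reports after arrival are still marked as not moored.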


Ideas for derived features:
- Current speed percentage of max speed
- Historical average speed over different time frames
- Heading relative to destination port - suggests if the ship is taking a detour or not
- Changes in heading
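
Some of these derived features can be sketched with plain Python. The helpers below are hypothetical and assume headings and bearings in degrees clockwise from north. The angle difference handles the wraparound at 360 degrees, which matters both for changes in heading and for the heading relative to the destination port.

```python
import math

def initial_bearing(lat1, lon1, lat2, lon2):
    """Bearing in degrees (0 to 360, clockwise from north) from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360.0

def angle_difference(a, b):
    """Signed smallest difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def speed_fraction(sog, max_speed):
    """Current speed as a fraction of max speed, or None if max speed is unusable."""
    if max_speed is None or math.isnan(max_speed) or max_speed <= 0:
        return None
    return sog / max_speed

# Heading relative to the destination: 0 means sailing straight towards the port
bearing_to_port = initial_bearing(0.0, 0.0, 0.0, 10.0)   # port lies due east
relative_heading = angle_difference(100.0, bearing_to_port)
heading_change = angle_difference(10.0, 350.0)           # crosses north
print(bearing_to_port, relative_heading, heading_change)
```

`speed_fraction` returns `None` when no usable max speed is available, which is relevant because `maxSpeed` is missing for many vessels.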


##### 3.52 Preprocessing relevant features

In [24]:
ais_train_data_path = '../../Project materials/ais_train.csv'
ports_data_path = '../../Project materials/ports.csv'
vessels_data_path = '../../Project materials/vessels.csv'


ais_data_train = pd.read_csv(ais_train_data_path, sep='|')
ports = pd.read_csv(ports_data_path, sep='|')
vessels = pd.read_csv(vessels_data_path, sep='|')

ship_train_ais_groups = ais_data_train.groupby('vesselId')
ship_train_dataframes = {ship_id: group for ship_id, group in ship_train_ais_groups}


vessels = vessels.set_index('vesselId')
#ports = ports.set_index('portId')


#Handle NAN values:
columns_to_fill = ['GT', 'length', 'breadth', 'enginePower']

for column in columns_to_fill:
    median_value = vessels[column].median()  # Calculate the median value of the column
    vessels[column] = vessels[column].fillna(median_value)

In [74]:
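# Start from +/- infinity so the first ship always updates the running extremes
longitude_min = float('inf')
longitude_max = float('-inf')

latitude_min = float('inf')
latitude_max = float('-inf')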

for ship_id, df in ship_train_dataframes.items():


# Check the minimum and maximum values for longitude
    longitude_ship_min = df['longitude'].min()
    longitude_ship_max = df['longitude'].max()

# Check the minimum and maximum values for latitude
    latitude_ship_min = df['latitude'].min()
    latitude_ship_max = df['latitude'].max()

    if longitude_max < longitude_ship_max:
        longitude_max = longitude_ship_max
    if longitude_min > longitude_ship_min:
        longitude_min = longitude_ship_min
    if latitude_max < latitude_ship_max:
        latitude_max = latitude_ship_max
    if latitude_min > latitude_ship_min:
        latitude_min = latitude_ship_min
    

# Print the results
print(f"Longitude: Min = {longitude_min}, Max = {longitude_max}")
print(f"Latitude: Min = {latitude_min}, Max = {latitude_max}")

Longitude: Min = -167.54093, Max = 178.80538
Latitude: Min = -47.53287, Max = 70.5572


In [26]:
# Count the missing values in each column of the vessels table
nan_summary = vessels.isna().sum()

# Print out the summary of NaN values for each column
print("Number of NaN values in each column:")
print(nan_summary)


Number of NaN values in each column:
shippingLineId      0
CEU                 0
DWT                 8
GT                  0
NT                524
vesselType         12
breadth             0
depth             469
draft             701
enginePower         0
freshWater        490
fuel              490
homePort          138
length              0
maxHeight         676
maxSpeed          498
maxWidth          676
rampCapacity      677
yearBuilt           0
dtype: int64


In [48]:

# Define scalers
time_series_scaler = MinMaxScaler()
vessel_features_scaler = MinMaxScaler()

sequence_length = 5

# Collect all vessel features for scaling
all_vessel_features = []
all_features = []

for ship_id, df in ship_train_dataframes.items():
    if ship_id in vessels.index:
        vessel_features = vessels.loc[ship_id][['length', 'breadth', 'enginePower', 'GT']].values
        all_vessel_features.append(vessel_features)

    # Collect time-series features for scaling
    df['time'] = pd.to_datetime(df['time'])
    #df['etaRaw'] = df['etaRaw'].apply(lambda x: f"2024-{x}" if pd.notna(x) and isinstance(x, str) else np.nan)
    #df['etaRaw'] = pd.to_datetime(df['etaRaw'], format='%Y-%m-%d %H:%M', errors='coerce')
    df['hour'] = df['time'].dt.hour
    df['minute'] = df['time'].dt.minute
    df['second'] = df['time'].dt.second
    #df['time_to_eta'] = (df['etaRaw'] - df['time']).dt.total_seconds() / 3600

    # Collect all rows at once instead of looping with iterrows()
    all_features.extend(
        df[['hour', 'minute', 'second', 'longitude', 'latitude', 'sog',
            'cog', 'rot', 'heading', 'navstat']].values.tolist()
    )

# Convert collected features to numpy arrays and fit the scalers
all_vessel_features = np.array(all_vessel_features, dtype=np.float32)
all_features = np.array(all_features, dtype=np.float32)

vessel_features_scaler.fit(all_vessel_features)
time_series_scaler.fit(all_features)

# Transform each ship's data consistently using the fitted scalers
all_timeseries = []
vessel_features_list = []

for ship_id, df in ship_train_dataframes.items():
    if ship_id in vessels.index:
        # Extract and scale vessel-specific features
        vessel_features = vessels.loc[ship_id][['length', 'breadth', 'enginePower', 'GT']].values.reshape(1, -1)
        vessel_features = vessel_features_scaler.transform(vessel_features).flatten()
    else:
        continue

    # Extract and transform time-series features using the same fitted scaler
    df['time'] = pd.to_datetime(df['time'])
    # df['etaRaw'] = df['etaRaw'].apply(lambda x: f"2024-{x}" if pd.notna(x) and isinstance(x, str) else np.nan)
    # df['etaRaw'] = pd.to_datetime(df['etaRaw'], format='%Y-%m-%d %H:%M', errors='coerce')
    df['hour'] = df['time'].dt.hour
    df['minute'] = df['time'].dt.minute
    df['second'] = df['time'].dt.second
    #df['time_to_eta'] = (df['etaRaw'] - df['time']).dt.total_seconds() / 3600

    # Select the feature columns in one step instead of looping with iterrows()
    features_list = df[['hour', 'minute', 'second', 'longitude', 'latitude', 'sog',
                        'cog', 'rot', 'heading', 'navstat']].values

    # Convert to numpy array and transform using the scaler
    features = np.array(features_list, dtype=np.float32)
    features_normalized = time_series_scaler.transform(features)

    # Create time-series data for training
    for i in range(len(features_normalized) - sequence_length):
        timeseries = features_normalized[i:i + sequence_length + 1]
        all_timeseries.append(timeseries)

        # Store vessel-specific features for each sequence
        vessel_features_list.append(vessel_features)

# Convert lists to numpy arrays
all_timeseries = np.array(all_timeseries, dtype=np.float32)
vessel_features_list = np.array(vessel_features_list, dtype=np.float32)

# Split time-series into X (input sequences) and Y (target values)
X_data = all_timeseries[:, :-1, :]  # Shape: (num_samples, sequence_length, num_features)
Y_data = all_timeseries[:, -1, :]  # Shape: (num_samples, num_features)


In [49]:
nan_summary = np.isnan(all_features).sum(axis=0)


print(nan_summary)

[0 0 0 0 0 0 0 0 0 0]


In [50]:
print(all_timeseries.shape)
print(X_data.shape)
print(Y_data.shape)
print(vessel_features_list.shape)

(1518629, 6, 10)
(1518629, 5, 10)
(1518629, 10)
(1518629, 4)


##### 3.53 Adapting the GRU model to handle non-timestep features:

In [75]:
class GRUnetExtended(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_layers, non_timestep_features_size):
        super(GRUnetExtended, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers

        self.gru = nn.GRU(input_size, hidden_size, num_layers, batch_first = True)
        self.fc = nn.Linear(hidden_size + non_timestep_features_size, output_size)

    def forward(self, X, non_timestep_features):

        hidden_initialize = torch.zeros(self.num_layers, X.size(0), self.hidden_size).to(X.device)

        out, _ = self.gru(X, hidden_initialize)

        out = out[:, -1, :]

        combined = torch.cat((out, non_timestep_features), dim=1)

        out = self.fc(combined)

        out = torch.tanh(out)

        return out
    



#Model parameters:
input_size = 10
hidden_size = 64   #Random guess on what is best
output_size = 10     
num_layers = 2 
non_timestep_features_size = 4

extended_GRU_model = GRUnetExtended(input_size=input_size, hidden_size=hidden_size, output_size=output_size, num_layers=num_layers, non_timestep_features_size=non_timestep_features_size)


##### 3.54 Training the extended model

In [76]:
class ShipDataset(Dataset):
    def __init__(self, X_data, vessel_features_list, Y_data):
        self.X_data = torch.tensor(X_data, dtype=torch.float32)
        self.vessel_features = torch.tensor(vessel_features_list, dtype=torch.float32)
        self.Y_data = torch.tensor(Y_data, dtype=torch.float32)

    def __len__(self):
        return len(self.X_data)

    def __getitem__(self, idx):
        return {
            'time_series': self.X_data[idx],
            'vessel_features': self.vessel_features[idx],
            'target': self.Y_data[idx]
        }

In [77]:
all_timeseries = np.array(all_timeseries, dtype=np.float32)
vessel_features_list = np.array(vessel_features_list, dtype=np.float32)
Y_data = np.array(Y_data, dtype=np.float32)



dataset = ShipDataset(X_data, vessel_features_list, Y_data)
train_loader = DataLoader(dataset, batch_size=64, shuffle=True)

criterion = nn.MSELoss()
optimizer = optim.Adam(extended_GRU_model.parameters(), lr=0.0001)

# Training the model
num_epochs = 10
for epoch in range(num_epochs):
    extended_GRU_model.train()
    running_loss = 0.0
    for batch in train_loader:
        # Extract the features from the batch
        time_series = batch['time_series']
        vessel_features = batch['vessel_features']
        target = batch['target']


        # Forward pass
        outputs = extended_GRU_model(time_series, vessel_features)

        # Calculate loss
        loss = criterion(outputs, target)

        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()


        # Warn about parameters whose gradient norm is unusually large
        for name, param in extended_GRU_model.named_parameters():
            if param.grad is not None:
                grad_norm = param.grad.norm()
        
                if grad_norm > 10:  # Threshold for detecting unusually large gradients
                    print(f"Warning: Unusually large gradient detected for {name}: {grad_norm}")


        #torch.nn.utils.clip_grad_norm_(extended_GRU_model.parameters(), max_norm=1.0)

        optimizer.step()

        # Track the loss
        running_loss += loss.item()

    # Print the average loss for each epoch
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(train_loader):.4f}")


Epoch [1/10], Loss: 0.0191
Epoch [2/10], Loss: 0.0163
Epoch [3/10], Loss: 0.0155
Epoch [4/10], Loss: 0.0146
Epoch [5/10], Loss: 0.0144
Epoch [6/10], Loss: 0.0143
Epoch [7/10], Loss: 0.0142
Epoch [8/10], Loss: 0.0141
Epoch [9/10], Loss: 0.0140
Epoch [10/10], Loss: 0.0140


In [78]:

torch.save(extended_GRU_model.state_dict(), "extended_gru_model_test.pth")

In [79]:
loaded_extended_GRU_model = GRUnetExtended(input_size=input_size, hidden_size=hidden_size, output_size=output_size, num_layers=num_layers, non_timestep_features_size=non_timestep_features_size)
loaded_extended_GRU_model.load_state_dict(torch.load("extended_gru_model_test.pth"))

loaded_extended_GRU_model.eval()

GRUnetExtended(
  (gru): GRU(10, 64, num_layers=2, batch_first=True)
  (fc): Linear(in_features=68, out_features=10, bias=True)
)

In [80]:
ais_test_data_path = '../../Project materials/ais_test.csv'
ais_data_test = pd.read_csv(ais_test_data_path)
unique_ship_ids = ais_data_test['vesselId'].unique()

In [81]:

# Dictionary to store predictions for each ship
ship_predictions = defaultdict(dict)

for ship_id in unique_ship_ids:
    if ship_id not in ship_train_dataframes:
        print(f"No training data available for ship_id: {ship_id}")
        continue

    df = ship_train_dataframes[ship_id]

    # Convert time column to datetime
    df['time'] = pd.to_datetime(df['time'])

    # Extract time features
    df['hour'] = df['time'].dt.hour
    df['minute'] = df['time'].dt.minute
    df['second'] = df['time'].dt.second

    # Calculate `time_to_eta`
    # df['etaRaw'] = df['etaRaw'].apply(lambda x: f"2024-{x}" if pd.notna(x) else x)
    # df['etaRaw'] = pd.to_datetime(df['etaRaw'], format='%Y-%m-%d %H:%M', errors='coerce')
    # df['time_to_eta'] = (df['etaRaw'] - df['time']).dt.total_seconds() / 3600  # Time to ETA in hours

    # # Extract schedule-specific features (e.g., port latitude and longitude)
    # df['port_lat'] = df['portId'].apply(lambda x: ports.loc[x]['latitude'] if pd.notna(x) and x in ports.index else np.nan)
    # df['port_lon'] = df['portId'].apply(lambda x: ports.loc[x]['longitude'] if pd.notna(x) and x in ports.index else np.nan)

    # Extract features used during training
    features = df[['hour', 'minute', 'second', 'longitude', 'latitude', 'sog', 'cog', 'rot', 'heading', 'navstat']]

    # Handle missing values with different strategies based on feature context using .loc[]
    # features.loc[:, 'longitude'] = features['longitude'].fillna(features['longitude'].mean())
    # features.loc[:, 'latitude'] = features['latitude'].fillna(features['latitude'].mean())
    # features.loc[:, 'sog'] = features['sog'].fillna(features['sog'].median())
    # features.loc[:, 'cog'] = features['cog'].fillna(features['cog'].median())
    # features.loc[:, 'time_to_eta'] = features['time_to_eta'].fillna(features['time_to_eta'].median())
    # features.loc[:, ['port_lat', 'port_lon']] = features[['port_lat', 'port_lon']].fillna(features[['port_lat', 'port_lon']].median())

    # # Convert features to numpy array after handling NaN values
    features = features.values

    if len(features) < sequence_length:
        print(f"Not enough historical data to predict for ship_id: {ship_id}")
        continue

    # Use the already fitted time-series scaler to transform features
    input_sequence_normalized = time_series_scaler.transform(features[-sequence_length:])
    input_tensor = torch.tensor(input_sequence_normalized, dtype=torch.float32).unsqueeze(0)  # Shape: (1, sequence_length, num_features)

    # Extract and normalize vessel-specific features using the already fitted scaler
    if ship_id in vessels.index:
        vessel_features = vessels.loc[ship_id][['length', 'breadth', 'enginePower', 'GT']].values.reshape(1, -1)
        vessel_features_normalized = vessel_features_scaler.transform(vessel_features).flatten()
    else:
        print(f"No vessel-specific features available for ship_id: {ship_id}")
        continue

    vessel_features_tensor = torch.tensor(vessel_features_normalized, dtype=torch.float32).unsqueeze(0)  # Shape: (1, vessel_feature_size)

    # Get the farthest prediction time needed from test data
    ship_test_times = pd.to_datetime(ais_data_test[ais_data_test['vesselId'] == ship_id]['time'])
    farthest_time = ship_test_times.max()

    # Last known time in the training data
    last_known_time = df['time'].iloc[-1]
    total_steps_needed = int((farthest_time - last_known_time).total_seconds() // (20 * 60))  # Time difference in steps of 20 minutes

    # Make predictions recursively for the required number of steps
    current_input = input_tensor.clone()

    for step in range(1, total_steps_needed + 1):
        with torch.no_grad():
            # Make the prediction using the model
            next_position = loaded_extended_GRU_model(current_input, vessel_features_tensor)

        # Prediction time for the next step
        prediction_time = last_known_time + pd.Timedelta(minutes=20 * step)

        # Convert the prediction back to the original scale for storing
        prediction_np = next_position.cpu().numpy()
        prediction_original_scale = time_series_scaler.inverse_transform(prediction_np.reshape(1, -1))

        # Store the prediction
        ship_predictions[ship_id][prediction_time] = prediction_original_scale[0]

        # Re-normalize the prediction to feed it back into the model
        prediction_normalized = time_series_scaler.transform(prediction_original_scale)

        # Update the input tensor by removing the oldest time step and appending the new normalized prediction
        next_position_tensor = torch.tensor(prediction_normalized, dtype=torch.float32).unsqueeze(0)
        current_input = torch.cat((current_input[:, 1:, :], next_position_tensor), dim=1)


# Create a list to store the final predictions for the test data
predictions_list = []

for idx, row in ais_data_test.iterrows():
    ship_id = row['vesselId']
    target_time = pd.to_datetime(row['time']).round('min')

    if ship_id in ship_predictions:
        if target_time in ship_predictions[ship_id]:
            prediction = ship_predictions[ship_id][target_time]
        else:
            # If the exact target time is not found, find the closest prediction time
            available_times = list(ship_predictions[ship_id].keys())
            closest_time = min(available_times, key=lambda x: abs(pd.Timestamp(x) - target_time))
            prediction = ship_predictions[ship_id][closest_time]

        predictions_list.append({
            'ship_id': ship_id,
            'time': target_time,
            'predicted_latitude': prediction[4],  # Assuming latitude is at index 4
            'predicted_longitude': prediction[3],  # Assuming longitude is at index 3
            'predicted_sog': prediction[5],  # Assuming speed over ground (sog) is at index 5
            'predicted_cog': prediction[6]  # Assuming course over ground (cog) is at index 6
        })

# Convert the predictions list to a DataFrame
predictions_df = pd.DataFrame(predictions_list)


In [84]:
# Check the minimum and maximum values for longitude
longitude_min = predictions_df['predicted_longitude'].min()
longitude_max = predictions_df['predicted_longitude'].max()

# Check the minimum and maximum values for latitude
latitude_min = predictions_df['predicted_latitude'].min()
latitude_max = predictions_df['predicted_latitude'].max()

# Print the results
print(f"Longitude: Min = {longitude_min}, Max = {longitude_max}")
print(f"Latitude: Min = {latitude_min}, Max = {latitude_max}")

Longitude: Min = -122.76395416259766, Max = 166.85350036621094
Latitude: Min = -72.84368133544922, Max = 61.318416595458984


In [85]:
csv_data = predictions_df[['predicted_longitude', 'predicted_latitude']]

# Write to CSV file with a header
csv_file_path = 'submission2.csv'

# Save the DataFrame to CSV
csv_data.to_csv(csv_file_path, index=True, index_label = 'ID', header=['longitude_predicted','latitude_predicted'])