### Imports and CUDA

In [1]:
!pip uninstall -y typing_extensions
!pip install typing_extensions==4.11.0
!pip uninstall wandb -y
!pip install wandb
!pip install matplotlib
!pip install scikit-learn
!pip install pandas

Found existing installation: typing_extensions 4.13.2
Uninstalling typing_extensions-4.13.2:
  Successfully uninstalled typing_extensions-4.13.2
Collecting typing_extensions==4.11.0
  Using cached typing_extensions-4.11.0-py3-none-any.whl.metadata (3.0 kB)
Using cached typing_extensions-4.11.0-py3-none-any.whl (34 kB)
Installing collected packages: typing_extensions
[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pydantic 2.11.3 requires typing-extensions>=4.12.2, but you have typing-extensions 4.11.0 which is incompatible.
typing-inspection 0.4.0 requires typing-extensions>=4.12.0, but you have typing-extensions 4.11.0 which is incompatible.[0m[31m
[0mSuccessfully installed typing_extensions-4.11.0
Found existing installation: wandb 0.19.9
Uninstalling wandb-0.19.9:
  Successfully uninstalled wandb-0.19.9
Collecting wandb
  Using cached wandb-0.19.9-py

In [1]:
# Matplotlib
import requests
import matplotlib.pyplot as plt
# Numpy
import numpy as np
# Torch
import torch
from torch.utils.data import TensorDataset, DataLoader
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import pandas as pd
import wandb
wandb.login()

[34m[1mwandb[0m: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.
[34m[1mwandb[0m: Currently logged in as: [33mdarrenchanyuhao[0m ([33mdarrenchanyuhao-singapore-university-of-technology-and-d[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin


True

In [2]:
# Use GPU if available, else use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


### Objective

#### To develop a model that predicts taxi availability within a specific area for the next three hours. This means that if the model is run at 12 PM, it will provide predicted taxi availability for 1 PM, 2 PM, and 3 PM.

The area of interest is defined by the following geographical boundaries:

    North: 1.35106
    South: 1.32206
    East: 103.97839
    West: 103.92805

To identify the taxis currently available within this region, we use the TaxiAvailabilityScript.py.

This script collects real-time data, which serves as input for our predictive model.

By leveraging historical taxi availability trends and real-time data, our model aims to provide accurate forecasts, helping commuters, ride-hailing services, and urban planners make informed decisions.


# **To-Do List for Taxi Availability Prediction**

## **Step 1: Cleaning the Taxi Availability Data**
The first step involves retrieving and preprocessing the taxi availability dataset. The dataset consists of the following columns:

1. **DateTime**  
2. **Taxi Available Throughout Singapore**  
3. **Taxi Available in Selected Box Area**  
4. **Coordinates[]**  

For our specific use case, **the coordinates column will not be used for now**.  

To prepare the data for the neural network:  
- **Inputs:** We will use `DateTime` and `Taxi Available Throughout Singapore` as features.  
- **Output:** `Taxi Available in Selected Box Area` will be the target variable.  
- **DateTime Conversion:** Since `DateTime` is not in a format suitable for neural networks, we will extract relevant features:
  - **IsWeekend**: A binary feature (1 if it's a weekend, 0 otherwise).  
  - **Hour**: Transformed into a numerical value between **1 and 24** (avoiding 0, which may cause training issues).  

---

## **Step 2: Adding Additional Features**  
*(Partially completed; will be refined over time)*  

Aside from the existing columns, we aim to incorporate additional features that may improve prediction accuracy:  

1. **ERP Rates (Electronic Road Pricing) at the given time and location**  
   - Uncertain if this will significantly impact predictions. Further analysis is needed.  

2. **Number of LTA (Land Transport Authority) gantry locations**  
   - Again, its usefulness remains uncertain—further evaluation required.  

3. **Traffic Incidents in the Selected Area**  
   - A script (`TrafficIncidentScript.py`) has been written to update `traffic_incident.csv` with the latest traffic incidents.  
   - Over time, as the dataset grows, we hope this feature will become useful.  

4. **Number of Taxi Stands in the Area**  
   - Currently **not useful** because our area of interest is fixed.  
   - However, if we allow dynamic selection of areas in the future, this could become relevant.  

5. **Temperature at a Given Time and Date** *(To be implemented)*  

6. **Rainfall Data** *(To be implemented)*  

To ensure all features align properly, we will **synchronize all datasets based on DateTime** before feeding them into the model.  

---

## **Step 3: Creating the Training-Test Split**  
- Initially, we will perform an **80/20 Training-Test split** for simplicity.  
- In the future, we may introduce a **Training-Validation-Test split** to further refine model performance.  

---

## **Step 4: Building the Model**  
We will begin with an **LSTM model**, as LSTMs are well-suited for time-series forecasting.  
- **Initial Limitation:** The model, in its basic form, will only predict the next hour.  
- **Future Improvement:** A **sliding window approach** will be explored and implemented to extend predictions further.  

---

## **Step 5: Model Evaluation and Improvement**  
- After the initial model is trained, we will assess its performance.  
- Based on results, we will explore potential improvements, such as hyperparameter tuning, architectural modifications, or additional feature engineering.  

---

This structured approach will guide the development of a robust and accurate taxi availability prediction model. 🚖💡


## **Preparing the taxi_availability data here.**

Normalization of certain inputs are done as well, but I am unsure if it is the right thing to do as well.

In [3]:
# taxi_availability_file_path = "taxi_availability.csv"

# taxi_df = pd.read_csv(taxi_availability_file_path, delimiter=",")

merged_weather_taxi_df = "merged_file_with_mean.csv"
taxi_df = pd.read_csv(merged_weather_taxi_df, delimiter = ",")

#Adjusting for weather parameters
taxi_df = taxi_df.drop(columns = "stationId")

#Adjusting for taxi_vailability parameters
taxi_df_coordinates = taxi_df["Coordinates[]"]
taxt_df_datetime = taxi_df["DateTime"]
taxi_df = taxi_df.drop(columns = "Coordinates[]")
taxi_df["DateTime"] = pd.to_datetime(taxi_df["DateTime"])
taxi_df = taxi_df.drop(columns = "Taxi Available in Selected Box Area")

taxi_df["IsWeekend"] = (taxi_df["DateTime"].dt.weekday >= 5).astype(int)
taxi_df["Hour"] = taxi_df["DateTime"].dt.hour + 1  # Convert 0-23 to 1-24
taxi_df = taxi_df.drop(columns = "DateTime")

# print(taxi_df.dtypes)
print(taxi_df.head)

<bound method NDFrame.head of        Taxi Available throughout SG  temp_value  humidity_value  \
0                              1924        27.1            84.1   
1                              2259        27.3            82.5   
2                              2400        27.4            81.2   
3                              2677        27.5            81.9   
4                              2437        27.7            78.0   
...                             ...         ...             ...   
25586                          1962        27.0            82.9   
25587                          2025        27.3            81.7   
25588                          2144        27.4            82.2   
25589                          2447        27.3            82.9   
25590                          2615        27.6            82.3   

       rainfall_value  peak_period  Average Taxi Availability  IsWeekend  Hour  
0                 0.0            1                 102.416667          0    24  
1  

### Converting all dtypes into float32

In [4]:
# taxi_df=taxi_df[:5120]
numeric_columns = taxi_df.select_dtypes(include=['int64', 'int32','float64','object']).columns
print("numeric_columns",numeric_columns)
taxi_df[numeric_columns] = taxi_df[numeric_columns].astype('float32')
numeric_columns = taxi_df.select_dtypes(include=['int64', 'int32','float64','object']).columns

# Convert selected columns to float32
taxi_df[numeric_columns] = taxi_df[numeric_columns].astype('float32')

numeric_columns Index(['Taxi Available throughout SG', 'temp_value', 'humidity_value',
       'rainfall_value', 'peak_period', 'Average Taxi Availability',
       'IsWeekend', 'Hour'],
      dtype='object')


### Normalizing all values

In [5]:
#---------------Normalise-----------------------
data_min = taxi_df.min(axis=0)
data_max = taxi_df.max(axis=0)
taxi_df_normalized = (taxi_df - data_min) / (data_max - data_min)


taxi_df_output_normalized  = taxi_df_normalized["Average Taxi Availability"]
taxi_df_normalized = taxi_df_normalized.drop(columns = "Average Taxi Availability")
taxi_df_normalized.to_csv("normalized_data.csv", index=False)  # Set index=False to exclude row numbers

# Convert to NumPy arrays
input_data = taxi_df_normalized.values  # Shape: (5120, num_features)
output_data = taxi_df_output_normalized.values  # Shape: (5120,)

print("Input Data: ",input_data.shape)
print("Output Data: ",output_data.shape)


Input Data:  (25591, 7)
Output Data:  (25591,)


### No Normalization Style

In [6]:
# #---------------No Normalization-----------------------

# # Drop 'DateTime' as it's no longer needed
# taxi_df_no_norm = taxi_df  # Remove DateTime but keep raw values

# # Separate input and output data
# taxi_df_output_no_norm = taxi_df_no_norm["Taxi Available in Selected Box Area"]
# taxi_df_no_norm = taxi_df_no_norm.drop(columns=["Taxi Available in Selected Box Area"])

# # Save to CSV for checking
# taxi_df_no_norm.to_csv("checker_no_norm.csv", index=False)  # Set index=False to exclude row numbers

# # Convert to NumPy arrays (raw values)
# input_data = taxi_df_no_norm.values  # Shape: (5120, num_features)
# output_data = taxi_df_output_no_norm.values  # Shape: (5120,)

# print("Input Data: ", input_data.shape)
# print("Input Data: ", input_data[0])

# print("Output Data: ", output_data.shape)
# print("Input Data: ", output_data[0])


### Create Sequence Function

In [7]:
seq_length = 24
pred_horizon = 3  # Number of future time steps to predict

def create_sequences(data, labels, seq_length, pred_horizon):
    xs, ys = [], []
    for i in range(0, len(data), seq_length):  # Start from 0 and increment by seq_length
        if i + seq_length + pred_horizon <= len(data):  # Ensure enough data for prediction horizon
            xs.append(data[i:i + seq_length])  # Input sequence (continuous)
            ys.append(labels[i + seq_length : i + seq_length + pred_horizon])  # Next 3 values
    return np.array(xs), np.array(ys)


In [8]:

# X, y = create_sequences(input_data, output_data, seq_length,pred_horizon)
X, y = create_sequences(input_data, output_data, seq_length,pred_horizon)

# Convert to PyTorch tensors
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y[:, None], dtype=torch.float32)
y = y.permute(0, 2, 1)  # Shape: (samples, pred_horizon, 1)

# Split sizes
total_samples = len(X)
train_size = int(0.8 * total_samples)
val_size = int(0.1 * total_samples)
test_size = total_samples - train_size - val_size

# Split the data
trainX, valX, testX = X[:train_size], X[train_size:train_size+val_size], X[train_size+val_size:]
trainY, valY, testY = y[:train_size], y[train_size:train_size+val_size], y[train_size+val_size:]

# Create TensorDatasets
train_dataset = TensorDataset(trainX, trainY)
val_dataset = TensorDataset(valX, valY)
test_dataset = TensorDataset(testX, testY)

# DataLoaders
batch_size = 17
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=True, drop_last=True)

# Example of accessing a batch of data
for inputs, targets in train_loader:
    print(f'Inputs: {inputs.shape}, Targets: {targets.shape}')
    break  # Only print the first batch for verification

Inputs: torch.Size([17, 24, 7]), Targets: torch.Size([17, 3, 1])


In [9]:
print("test_loader",len(test_loader))

test_loader 6


In [10]:
class BiLSTM_pt(torch.nn.Module):
    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim):
        super(BiLSTM_pt, self).__init__()
        self.hidden_dim = hidden_dim
        self.layer_dim = layer_dim
        self.num_directions = 2  # Since it's bidirectional
        
        # LSTM layer
        self.lstm = torch.nn.LSTM(input_dim, hidden_dim, layer_dim, batch_first=True, bidirectional=True)

        # Fully connected layer
        self.fc = torch.nn.Linear(hidden_dim * 2, output_dim)  # Fix here

    def forward(self, x, h0=None, c0=None):
        if h0 is None or c0 is None:
            h0 = torch.randn(self.layer_dim * self.num_directions, x.size(0), self.hidden_dim).to(x.device)
            c0 = torch.randn(self.layer_dim * self.num_directions, x.size(0), self.hidden_dim).to(x.device)

        # LSTM forward pass
        out, (hn, cn) = self.lstm(x, (h0, c0))

        # Pass only the last timestep's output to the FC layer
        out = self.fc(out[:, -1, :])  

        return out, hn, cn


In [55]:
def train(model, dataloader, val_loader, num_epochs, learning_rate):
    # Set the loss function and optimizer
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    model.train()  # Set the model to training mode
    loss_graph = []
    val_graph = []
    
    for epoch in range(num_epochs):
        epoch_loss = 0.0

        hidden_state, cell_state = None, None  # Reset for each epoch

        for batch_idx, (inputs, targets) in enumerate(dataloader):
            inputs, targets = inputs.to(device), targets.to(device)

            optimizer.zero_grad()

            output, hidden_state, cell_state = model(inputs, hidden_state, cell_state)
            output = output.unsqueeze(-1)
            output = output.permute(0, 2, 1)

            # Compute loss
            loss = criterion(output, targets)
            loss.backward()
            optimizer.step()

            hidden_state = hidden_state.detach()
            cell_state = cell_state.detach()

            epoch_loss += loss.item()

        avg_loss = epoch_loss / len(dataloader)
        loss_graph.append(avg_loss)

        # Validation step (optional)
        model.eval()
        val_loss = 0.0
        with torch.no_grad():
            for val_inputs, val_targets in val_loader:
                val_inputs, val_targets = val_inputs.to(device), val_targets.to(device)

                val_output, _, _ = model(val_inputs, hidden_state, cell_state)
                val_output = val_output.unsqueeze(-1)
                val_output = val_output.permute(0, 2, 1)

                # Compute validation loss
                v_loss = criterion(val_output, val_targets)
                val_loss += v_loss.item()
        avg_val_loss = v_loss / len(dataloader)
        val_graph.append(avg_val_loss)
        model.train()
        
        if epoch % 50 == 0 or epoch == num_epochs - 1:
            print(f'Epoch {epoch+1}/{num_epochs}, Loss: {avg_loss:.6f}, Validation Loss: {avg_val_loss:.6f}')
            # Save model after every epoch
            save_path = os.path.join(f'./bi_LSTM/bi_LSTM{epoch+1}_loss_{avg_loss:.6f}.pth')
            torch.save(model.state_dict(), save_path)

    return loss_graph, val_graph

In [56]:
import os
# Define the model parameters
# Following the research paper's instructions
input_size = 7
hidden_size = 256
num_layers = 2 # Can be changed to stack multiple LSTM layers!
output_size = 3
num_epochs = 1000
dataloader = train_loader
learning_rate = 1e-3

#Create the model
model = BiLSTM_pt(input_size, hidden_size, num_layers, output_size).to(device)
loss_graph, val_graph = train(dataloader = dataloader, val_loader=val_loader, model = model, num_epochs = num_epochs, learning_rate = learning_rate)

# Plot the loss graph
plt.plot(loss_graph)
plt.plot(val_graph)
plt.legend(['Training Loss', 'Validation Loss'])
plt.title("Loss Graph")
plt.xlabel("Epochs")
plt.ylabel("Loss")

Epoch 1/1000, Loss: 0.015474, Validation Loss: 0.000313
Epoch 51/1000, Loss: 0.004468, Validation Loss: 0.000132
Epoch 101/1000, Loss: 0.003814, Validation Loss: 0.000193
Epoch 151/1000, Loss: 0.002647, Validation Loss: 0.000107
Epoch 201/1000, Loss: 0.002149, Validation Loss: 0.000215
Epoch 251/1000, Loss: 0.001719, Validation Loss: 0.000244


KeyboardInterrupt: 

In [46]:
# Saving the final model
import os

# Create the 'models' directory if it doesn't exist
os.makedirs('./final_models', exist_ok=True)
torch.save(model.state_dict(), './final_mod els/Bi_LSTM.pth')

### This will be to set up the Sweep

In [None]:
sweep_config = {
    "method": "bayes",
    "metric": {
        "name": "loss",
        "goal": "minimize"
    },
    "parameters": {
        "learning_rate": {
            "min": 1e-3,
            "max": 0.01
        },
        "hidden_size": {
            "values": [64, 128, 256]
        },
        "num_layers": {
            "values": [1, 2, 3]
        },
        "num_epochs": {
            "values": [300, 500, 1000]
        }
    }
}

In [26]:
def sweep_train():
    config_defaults = {
        "learning_rate": 0.01,
        "num_epochs": 300,
        "hidden_size": 50,
        "num_layers": 2
    }

    # Initialize wandb
    wandb.init(config=config_defaults)
    config = wandb.config
    model = BiLSTM_pt(6, config.hidden_size, config.num_layers, 3).to(device)

    # Same training as above
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=config.learning_rate)
    model.train()
    loss_graph = []

    for epoch in range(config.num_epochs):
        epoch_loss = 0.0
        hidden_state, cell_state = None, None

        for batch_idx, (inputs, targets) in enumerate(train_loader):
            inputs, targets = inputs.to(device), targets.to(device)

            optimizer.zero_grad()
            output, hidden_state, cell_state = model(inputs, hidden_state, cell_state)
            output = output.unsqueeze(-1).permute(0, 2, 1)

            loss = criterion(output, targets)
            loss.backward()
            optimizer.step()

            hidden_state = hidden_state.detach()
            cell_state = cell_state.detach()

            epoch_loss += loss.item()

        avg_loss = epoch_loss / len(train_loader)
        loss_graph.append(avg_loss)

        # Log to wandb
        wandb.log({"epoch": epoch, "loss": avg_loss})

        if epoch % 50 == 0 or epoch == config.num_epochs - 1:
            print(f"Epoch {epoch+1}/{config.num_epochs}, Loss: {avg_loss:.6f}")

In [27]:
sweep_id = wandb.sweep(sweep_config, project="DeepLearning Project")
wandb.agent(sweep_id, function=sweep_train, count=30)

Create sweep with ID: fe6fgats
Sweep URL: https://wandb.ai/darrenchanyuhao-singapore-university-of-technology-and-d/DeepLearning%20Project/sweeps/fe6fgats


[34m[1mwandb[0m: Agent Starting Run: oi1oz7u8 with config:
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.002730549256187793
[34m[1mwandb[0m: 	num_epochs: 500
[34m[1mwandb[0m: 	num_layers: 2


[1;34mwandb[0m: 
[1;34mwandb[0m: 🚀 View run [33mscarlet-sweep-5[0m at: [34mhttps://wandb.ai/darrenchanyuhao-singapore-university-of-technology-and-d/DeepLearning%20Project/runs/nz3wi9d8[0m
[1;34mwandb[0m: Find logs at: [1;35mwandb/run-20250414_183820-nz3wi9d8/logs[0m


Epoch 1/500, Loss: 0.022477
Epoch 51/500, Loss: 0.006083
Epoch 101/500, Loss: 0.003258
Epoch 151/500, Loss: 0.001554
Epoch 201/500, Loss: 0.001047
Epoch 251/500, Loss: 0.000739
Epoch 301/500, Loss: 0.000480
Epoch 351/500, Loss: 0.000347
Epoch 401/500, Loss: 0.000295
Epoch 451/500, Loss: 0.000437
Epoch 500/500, Loss: 0.000321


0,1
epoch,▁▁▁▁▂▂▂▂▂▃▃▄▄▄▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇█████
loss,███▆▆▆▅▅▅▅▄▄▄▃▃▂▂▂▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,499.0
loss,0.00032


[34m[1mwandb[0m: Agent Starting Run: ejxiu4jd with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.0024207902763121276
[34m[1mwandb[0m: 	num_epochs: 500
[34m[1mwandb[0m: 	num_layers: 3


Epoch 1/500, Loss: 0.027981
Epoch 51/500, Loss: 0.006172
Epoch 101/500, Loss: 0.004013
Epoch 151/500, Loss: 0.001187
Epoch 201/500, Loss: 0.000444
Epoch 251/500, Loss: 0.000284
Epoch 301/500, Loss: 0.000257
Epoch 351/500, Loss: 0.000353
Epoch 401/500, Loss: 0.000170
Epoch 451/500, Loss: 0.000456
Epoch 500/500, Loss: 0.000160


0,1
epoch,▁▁▁▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▄▅▅▅▅▅▆▆▆▆▆▆▆▆▇▇▇▇███
loss,█▅▄▄▄▃▃▃▃▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,499.0
loss,0.00016


[34m[1mwandb[0m: Agent Starting Run: x91ft5fu with config:
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.0044657173680859915
[34m[1mwandb[0m: 	num_epochs: 500
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/500, Loss: 0.019934
Epoch 51/500, Loss: 0.006537
Epoch 101/500, Loss: 0.005222
Epoch 151/500, Loss: 0.003832
Epoch 201/500, Loss: 0.002879
Epoch 251/500, Loss: 0.002034
Epoch 301/500, Loss: 0.001350
Epoch 351/500, Loss: 0.000924
Epoch 401/500, Loss: 0.001083
Epoch 451/500, Loss: 0.000824
Epoch 500/500, Loss: 0.000526


0,1
epoch,▁▁▁▁▁▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▅▅▅▅▆▆▇▇▇▇▇▇██
loss,█▇▇▇▆▅▄▄▄▄▄▃▃▃▃▃▂▂▂▂▂▂▂▂▁▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,499.0
loss,0.00053


[34m[1mwandb[0m: Agent Starting Run: ckvtl4gt with config:
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.003706092218269441
[34m[1mwandb[0m: 	num_epochs: 500
[34m[1mwandb[0m: 	num_layers: 2


Epoch 1/500, Loss: 0.016937
Epoch 51/500, Loss: 0.006171
Epoch 101/500, Loss: 0.003947
Epoch 151/500, Loss: 0.001879
Epoch 201/500, Loss: 0.001273
Epoch 251/500, Loss: 0.000730
Epoch 301/500, Loss: 0.000599
Epoch 351/500, Loss: 0.000405
Epoch 401/500, Loss: 0.000334
Epoch 451/500, Loss: 0.000371
Epoch 500/500, Loss: 0.000412


0,1
epoch,▁▁▁▁▁▁▂▂▂▂▃▃▃▃▃▄▄▄▅▅▅▅▅▆▆▆▆▆▆▆▇▇▇▇▇█████
loss,█▆▆▆▆▆▆▅▅▅▄▄▄▃▃▃▃▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,499.0
loss,0.00041


[34m[1mwandb[0m: Agent Starting Run: 10pbypug with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.0014371643773841217
[34m[1mwandb[0m: 	num_epochs: 500
[34m[1mwandb[0m: 	num_layers: 3


Epoch 1/500, Loss: 0.032382
Epoch 51/500, Loss: 0.006630
Epoch 101/500, Loss: 0.003676
Epoch 151/500, Loss: 0.001228
Epoch 201/500, Loss: 0.000716
Epoch 251/500, Loss: 0.000219
Epoch 301/500, Loss: 0.000236
Epoch 351/500, Loss: 0.000680
Epoch 401/500, Loss: 0.000178
Epoch 451/500, Loss: 0.000384
Epoch 500/500, Loss: 0.000165


0,1
epoch,▁▁▁▁▁▂▂▂▂▂▂▂▂▃▃▃▃▃▄▄▅▅▅▅▅▆▆▆▆▆▆▆▆▇▇▇████
loss,█▇▆▆▅▅▅▄▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,499.0
loss,0.00016


[34m[1mwandb[0m: Agent Starting Run: wb8n0kf9 with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.0013877079015223214
[34m[1mwandb[0m: 	num_epochs: 500
[34m[1mwandb[0m: 	num_layers: 3


Epoch 1/500, Loss: 0.030115
Epoch 51/500, Loss: 0.006311
Epoch 101/500, Loss: 0.003959
Epoch 151/500, Loss: 0.001876
Epoch 201/500, Loss: 0.000500
Epoch 251/500, Loss: 0.000697
Epoch 301/500, Loss: 0.000315
Epoch 351/500, Loss: 0.000196
Epoch 401/500, Loss: 0.000447
Epoch 451/500, Loss: 0.000129
Epoch 500/500, Loss: 0.000147


0,1
epoch,▁▁▁▁▂▂▂▂▂▂▃▃▃▃▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▆▆▆▆▆▇▇▇██
loss,█▄▃▃▃▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,499.0
loss,0.00015


[34m[1mwandb[0m: Agent Starting Run: 6wmjfq8w with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.009830837032974552
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 3


Epoch 1/1000, Loss: 0.157720
Epoch 51/1000, Loss: 0.020292
Epoch 101/1000, Loss: 0.021495
Epoch 151/1000, Loss: 0.025379
Epoch 201/1000, Loss: 0.024240
Epoch 251/1000, Loss: 0.022292
Epoch 301/1000, Loss: 0.020497
Epoch 351/1000, Loss: 0.021286
Epoch 401/1000, Loss: 0.021485
Epoch 451/1000, Loss: 0.021101
Epoch 501/1000, Loss: 0.019973
Epoch 551/1000, Loss: 0.019683
Epoch 601/1000, Loss: 0.018457
Epoch 651/1000, Loss: 0.018479
Epoch 701/1000, Loss: 0.018022
Epoch 751/1000, Loss: 0.018163
Epoch 801/1000, Loss: 0.018759
Epoch 851/1000, Loss: 0.018844
Epoch 901/1000, Loss: 0.018512
Epoch 951/1000, Loss: 0.018689
Epoch 1000/1000, Loss: 0.018389


0,1
epoch,▁▁▂▂▂▂▂▂▂▂▂▂▃▃▃▃▃▄▄▄▅▅▅▅▅▅▅▆▆▆▆▆▆▆▆▇████
loss,▂▃▁▃▃▂▃▅▃▃▅▃▂▂▃▂▃▃▂▃▃▂▂▃▂▄█▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.01839


[34m[1mwandb[0m: Agent Starting Run: 3e284wmx with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.001270022830404644
[34m[1mwandb[0m: 	num_epochs: 300
[34m[1mwandb[0m: 	num_layers: 2


Epoch 1/300, Loss: 0.028555
Epoch 51/300, Loss: 0.006564
Epoch 101/300, Loss: 0.004727
Epoch 151/300, Loss: 0.002557
Epoch 201/300, Loss: 0.001288
Epoch 251/300, Loss: 0.000796
Epoch 300/300, Loss: 0.000641


0,1
epoch,▁▁▁▁▁▂▂▂▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇██████
loss,█▇▇▅▅▄▄▄▄▄▄▄▄▄▄▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▁▂▁▁▁▁▁▁▁

0,1
epoch,299.0
loss,0.00064


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: jftvjnsr with config:
[34m[1mwandb[0m: 	hidden_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.0023563504028319138
[34m[1mwandb[0m: 	num_epochs: 300
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/300, Loss: 0.018930
Epoch 51/300, Loss: 0.006103
Epoch 101/300, Loss: 0.004988
Epoch 151/300, Loss: 0.004007
Epoch 201/300, Loss: 0.003015
Epoch 251/300, Loss: 0.001871
Epoch 300/300, Loss: 0.001120


0,1
epoch,▁▁▁▁▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▄▅▅▅▆▆▆▆▆▆▆▇▇▇▇▇█████
loss,█▆▆▅▆▅▅▅▅▄▄▄▄▄▄▄▃▃▃▃▃▃▃▃▂▂▂▃▂▂▂▂▂▂▂▁▁▂▁▁

0,1
epoch,299.0
loss,0.00112


[34m[1mwandb[0m: Agent Starting Run: i7aise80 with config:
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.0015800746543442473
[34m[1mwandb[0m: 	num_epochs: 300
[34m[1mwandb[0m: 	num_layers: 3


Epoch 1/300, Loss: 0.023433
Epoch 51/300, Loss: 0.006153
Epoch 101/300, Loss: 0.003062
Epoch 151/300, Loss: 0.001319
Epoch 201/300, Loss: 0.000403
Epoch 251/300, Loss: 0.000240
Epoch 300/300, Loss: 0.000251


0,1
epoch,▁▁▁▂▂▂▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▆▆▆▆▇▇▇▇█████
loss,██▇▇▆▆▅▆▅▅▅▄▅▄▄▃▃▃▂▂▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,299.0
loss,0.00025


[34m[1mwandb[0m: Agent Starting Run: jrokk55d with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.0012486355677667
[34m[1mwandb[0m: 	num_epochs: 500
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/500, Loss: 0.018938
Epoch 51/500, Loss: 0.006096
Epoch 101/500, Loss: 0.005892
Epoch 151/500, Loss: 0.004410
Epoch 201/500, Loss: 0.003786
Epoch 251/500, Loss: 0.003224
Epoch 301/500, Loss: 0.001782
Epoch 351/500, Loss: 0.001282
Epoch 401/500, Loss: 0.001021
Epoch 451/500, Loss: 0.000740
Epoch 500/500, Loss: 0.000718


0,1
epoch,▁▁▁▁▁▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▄▅▅▅▆▆▆▆▇▇▇▇▇▇█████
loss,█▇▇▇▇▇▆▅▅▅▅▄▅▄▄▄▄▄▄▄▃▃▃▃▂▂▂▂▂▂▂▂▁▂▂▁▁▁▁▁

0,1
epoch,499.0
loss,0.00072


[34m[1mwandb[0m: Agent Starting Run: l6j3o85c with config:
[34m[1mwandb[0m: 	hidden_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.0016927739077731783
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/1000, Loss: 0.020901
Epoch 51/1000, Loss: 0.005894
Epoch 101/1000, Loss: 0.004934
Epoch 151/1000, Loss: 0.004002
Epoch 201/1000, Loss: 0.002787
Epoch 251/1000, Loss: 0.002276
Epoch 301/1000, Loss: 0.001564
Epoch 351/1000, Loss: 0.001472
Epoch 401/1000, Loss: 0.001217
Epoch 451/1000, Loss: 0.000814
Epoch 501/1000, Loss: 0.000658
Epoch 551/1000, Loss: 0.000476
Epoch 601/1000, Loss: 0.000373
Epoch 651/1000, Loss: 0.000360
Epoch 701/1000, Loss: 0.000264
Epoch 751/1000, Loss: 0.000374
Epoch 801/1000, Loss: 0.000240
Epoch 851/1000, Loss: 0.000478
Epoch 901/1000, Loss: 0.000171
Epoch 951/1000, Loss: 0.000186
Epoch 1000/1000, Loss: 0.000311


0,1
epoch,▁▁▁▁▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇█████
loss,█▇▇▇▆▆▆▅▄▄▄▃▃▃▂▂▂▂▂▁▂▁▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.00031


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: 27dvgp9j with config:
[34m[1mwandb[0m: 	hidden_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.00266899012927148
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/1000, Loss: 0.018729
Epoch 51/1000, Loss: 0.006006
Epoch 101/1000, Loss: 0.005073
Epoch 151/1000, Loss: 0.004079
Epoch 201/1000, Loss: 0.003366
Epoch 251/1000, Loss: 0.002562
Epoch 301/1000, Loss: 0.001743
Epoch 351/1000, Loss: 0.001476
Epoch 401/1000, Loss: 0.001500
Epoch 451/1000, Loss: 0.001520
Epoch 501/1000, Loss: 0.000606
Epoch 551/1000, Loss: 0.000584
Epoch 601/1000, Loss: 0.000657
Epoch 651/1000, Loss: 0.000480
Epoch 701/1000, Loss: 0.000432
Epoch 751/1000, Loss: 0.000462
Epoch 801/1000, Loss: 0.000459
Epoch 851/1000, Loss: 0.000340
Epoch 901/1000, Loss: 0.000480
Epoch 951/1000, Loss: 0.000419
Epoch 1000/1000, Loss: 0.000172


0,1
epoch,▁▁▁▁▂▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▄▄▅▅▅▅▆▆▆▆▇▇▇▇▇▇███
loss,█▅▅▅▄▄▄▄▄▄▃▃▃▃▃▂▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.00017


[34m[1mwandb[0m: Agent Starting Run: k2kmeott with config:
[34m[1mwandb[0m: 	hidden_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.004722567174197851
[34m[1mwandb[0m: 	num_epochs: 500
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/500, Loss: 0.020423
Epoch 51/500, Loss: 0.006328
Epoch 101/500, Loss: 0.005198
Epoch 151/500, Loss: 0.004679
Epoch 201/500, Loss: 0.003356
Epoch 251/500, Loss: 0.002343
Epoch 301/500, Loss: 0.002143
Epoch 351/500, Loss: 0.001399
Epoch 401/500, Loss: 0.001181
Epoch 451/500, Loss: 0.001267
Epoch 500/500, Loss: 0.001282


0,1
epoch,▁▁▁▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇██
loss,█▃▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,499.0
loss,0.00128


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: sh291mvv with config:
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.0019307308447955496
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/1000, Loss: 0.017830
Epoch 51/1000, Loss: 0.006120
Epoch 101/1000, Loss: 0.005402
Epoch 151/1000, Loss: 0.003828
Epoch 201/1000, Loss: 0.002879
Epoch 251/1000, Loss: 0.002018
Epoch 301/1000, Loss: 0.001474
Epoch 351/1000, Loss: 0.000970
Epoch 401/1000, Loss: 0.000761
Epoch 451/1000, Loss: 0.000472
Epoch 501/1000, Loss: 0.000407
Epoch 551/1000, Loss: 0.000574
Epoch 601/1000, Loss: 0.000324
Epoch 651/1000, Loss: 0.000429
Epoch 701/1000, Loss: 0.000330
Epoch 751/1000, Loss: 0.000182
Epoch 801/1000, Loss: 0.000351
Epoch 851/1000, Loss: 0.000252
Epoch 901/1000, Loss: 0.000236
Epoch 951/1000, Loss: 0.000225
Epoch 1000/1000, Loss: 0.000110


0,1
epoch,▁▁▁▁▁▁▂▂▂▂▂▂▂▂▃▃▃▃▃▃▃▄▄▄▄▅▅▅▅▅▆▆▆▆▆▇▇███
loss,█▄▄▃▃▃▂▂▂▂▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.00011


[34m[1mwandb[0m: Agent Starting Run: xskwkopr with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.0046455279583774885
[34m[1mwandb[0m: 	num_epochs: 300
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/300, Loss: 0.026437
Epoch 51/300, Loss: 0.009766
Epoch 101/300, Loss: 0.005978
Epoch 151/300, Loss: 0.004329
Epoch 201/300, Loss: 0.003414
Epoch 251/300, Loss: 0.002650
Epoch 300/300, Loss: 0.002703


0,1
epoch,▁▁▁▂▂▂▃▃▃▃▄▄▄▄▄▄▄▅▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇▇████
loss,▆▇█▇▇▆▅▆▅▅▄▃▄▃▄▃▃▃▄▄▃▃▂▃▄▂▃▂▂▃▁▂▂▁▂▁▁▁▁▁

0,1
epoch,299.0
loss,0.0027


[34m[1mwandb[0m: Agent Starting Run: p6is7vhl with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.001775493541872476
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/1000, Loss: 0.017522
Epoch 51/1000, Loss: 0.007214
Epoch 101/1000, Loss: 0.005662
Epoch 151/1000, Loss: 0.004344
Epoch 201/1000, Loss: 0.003669
Epoch 251/1000, Loss: 0.003437
Epoch 301/1000, Loss: 0.002026
Epoch 351/1000, Loss: 0.001474
Epoch 401/1000, Loss: 0.001171
Epoch 451/1000, Loss: 0.000956
Epoch 501/1000, Loss: 0.000911
Epoch 551/1000, Loss: 0.000597
Epoch 601/1000, Loss: 0.000475
Epoch 651/1000, Loss: 0.000259
Epoch 701/1000, Loss: 0.000194
Epoch 751/1000, Loss: 0.000323
Epoch 801/1000, Loss: 0.000242
Epoch 851/1000, Loss: 0.000229
Epoch 901/1000, Loss: 0.000133
Epoch 951/1000, Loss: 0.000212
Epoch 1000/1000, Loss: 0.000153


0,1
epoch,▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▃▃▃▄▄▄▄▄▄▄▅▅▅▅▆▆▆▆▇▇▇▇▇█
loss,█▇▇▆▆▆▅▅▅▄▄▄▄▃▃▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.00015


[34m[1mwandb[0m: Agent Starting Run: m2f1ew4f with config:
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.00212003751635999
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/1000, Loss: 0.018340
Epoch 51/1000, Loss: 0.006100
Epoch 101/1000, Loss: 0.004986
Epoch 151/1000, Loss: 0.004078
Epoch 201/1000, Loss: 0.002607
Epoch 251/1000, Loss: 0.002061
Epoch 301/1000, Loss: 0.001173
Epoch 351/1000, Loss: 0.001064
Epoch 401/1000, Loss: 0.000656
Epoch 451/1000, Loss: 0.000290
Epoch 501/1000, Loss: 0.000323
Epoch 551/1000, Loss: 0.000307
Epoch 601/1000, Loss: 0.000364
Epoch 651/1000, Loss: 0.000185
Epoch 701/1000, Loss: 0.000141
Epoch 751/1000, Loss: 0.000206
Epoch 801/1000, Loss: 0.000156
Epoch 851/1000, Loss: 0.000362
Epoch 901/1000, Loss: 0.000146
Epoch 951/1000, Loss: 0.000188
Epoch 1000/1000, Loss: 0.000104


0,1
epoch,▁▁▂▂▂▃▃▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▆▆▆▆▆▆▆▇▇▇▇▇█████
loss,██▇▆▅▆▅▄▃▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.0001


[34m[1mwandb[0m: Agent Starting Run: uq5g8vls with config:
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.001486474826746672
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/1000, Loss: 0.020812
Epoch 51/1000, Loss: 0.005816
Epoch 101/1000, Loss: 0.005207
Epoch 151/1000, Loss: 0.004432
Epoch 201/1000, Loss: 0.004115
Epoch 251/1000, Loss: 0.002354
Epoch 301/1000, Loss: 0.001422
Epoch 351/1000, Loss: 0.001124
Epoch 401/1000, Loss: 0.000772
Epoch 451/1000, Loss: 0.000756
Epoch 501/1000, Loss: 0.000458
Epoch 551/1000, Loss: 0.000625
Epoch 601/1000, Loss: 0.000269
Epoch 651/1000, Loss: 0.000254
Epoch 701/1000, Loss: 0.000368
Epoch 751/1000, Loss: 0.000227
Epoch 801/1000, Loss: 0.000175
Epoch 851/1000, Loss: 0.000180
Epoch 901/1000, Loss: 0.000261
Epoch 951/1000, Loss: 0.000242
Epoch 1000/1000, Loss: 0.000108


0,1
epoch,▁▁▁▁▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▆▆▆▆▇▇▇▇▇▇▇▇▇██
loss,█▆▆▆▆▅▅▅▄▄▃▃▂▁▂▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.00011


[34m[1mwandb[0m: Agent Starting Run: 3nhtuq53 with config:
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.00280666302398226
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/1000, Loss: 0.019388
Epoch 51/1000, Loss: 0.006429
Epoch 101/1000, Loss: 0.005438
Epoch 151/1000, Loss: 0.004182
Epoch 201/1000, Loss: 0.003310
Epoch 251/1000, Loss: 0.002141
Epoch 301/1000, Loss: 0.001710
Epoch 351/1000, Loss: 0.001280
Epoch 401/1000, Loss: 0.000853
Epoch 451/1000, Loss: 0.000597
Epoch 501/1000, Loss: 0.000660
Epoch 551/1000, Loss: 0.000606
Epoch 601/1000, Loss: 0.000435
Epoch 651/1000, Loss: 0.000546
Epoch 701/1000, Loss: 0.000501
Epoch 751/1000, Loss: 0.000792
Epoch 801/1000, Loss: 0.000202
Epoch 851/1000, Loss: 0.000091
Epoch 901/1000, Loss: 0.000217
Epoch 951/1000, Loss: 0.000197
Epoch 1000/1000, Loss: 0.000226


0,1
epoch,▁▁▁▁▁▂▂▂▂▂▂▂▂▃▃▃▃▃▃▃▄▄▄▄▅▅▅▆▆▆▆▇▇▇▇▇▇███
loss,█▇▇▆▆▅▃▃▃▂▃▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.00023


[34m[1mwandb[0m: Agent Starting Run: b1i646ws with config:
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.001515426960485965
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/1000, Loss: 0.019925
Epoch 51/1000, Loss: 0.006399
Epoch 101/1000, Loss: 0.005406
Epoch 151/1000, Loss: 0.004252
Epoch 201/1000, Loss: 0.003574
Epoch 251/1000, Loss: 0.002624
Epoch 301/1000, Loss: 0.001869
Epoch 351/1000, Loss: 0.001343
Epoch 401/1000, Loss: 0.000802
Epoch 451/1000, Loss: 0.000651
Epoch 501/1000, Loss: 0.000382
Epoch 551/1000, Loss: 0.000309
Epoch 601/1000, Loss: 0.000285
Epoch 651/1000, Loss: 0.000280
Epoch 701/1000, Loss: 0.000308
Epoch 751/1000, Loss: 0.000215
Epoch 801/1000, Loss: 0.000296
Epoch 851/1000, Loss: 0.000122
Epoch 901/1000, Loss: 0.000299
Epoch 951/1000, Loss: 0.000181
Epoch 1000/1000, Loss: 0.000348


0,1
epoch,▁▁▁▁▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▆▆▆▆▇▇▇▇▇▇█████
loss,█▅▅▅▅▄▃▄▃▃▃▂▃▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.00035


[34m[1mwandb[0m: Agent Starting Run: yq97gbm5 with config:
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.0013395865691722096
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 1


Epoch 1/1000, Loss: 0.019662
Epoch 51/1000, Loss: 0.005995
Epoch 101/1000, Loss: 0.004983
Epoch 151/1000, Loss: 0.004540
Epoch 201/1000, Loss: 0.002998
Epoch 251/1000, Loss: 0.001912
Epoch 301/1000, Loss: 0.001212
Epoch 351/1000, Loss: 0.000953
Epoch 401/1000, Loss: 0.000721
Epoch 451/1000, Loss: 0.000323
Epoch 501/1000, Loss: 0.000371
Epoch 551/1000, Loss: 0.000222
Epoch 601/1000, Loss: 0.000286
Epoch 651/1000, Loss: 0.000218
Epoch 701/1000, Loss: 0.000436
Epoch 751/1000, Loss: 0.000321
Epoch 801/1000, Loss: 0.000242
Epoch 851/1000, Loss: 0.000126
Epoch 901/1000, Loss: 0.000129
Epoch 951/1000, Loss: 0.000121
Epoch 1000/1000, Loss: 0.000106


0,1
epoch,▁▁▁▁▂▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▄▄▄▅▅▅▅▆▆▆▇▇▇▇▇████
loss,██▇▇▆▄▄▄▄▃▃▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.00011


[34m[1mwandb[0m: Agent Starting Run: ideuhj75 with config:
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.007481059678338014
[34m[1mwandb[0m: 	num_epochs: 300
[34m[1mwandb[0m: 	num_layers: 3


Epoch 1/300, Loss: 0.059969
Epoch 51/300, Loss: 0.011907
Epoch 101/300, Loss: 0.012815
Epoch 151/300, Loss: 0.013267
Epoch 201/300, Loss: 0.013763
Epoch 251/300, Loss: 0.013299
Epoch 300/300, Loss: 0.013535


0,1
epoch,▁▁▁▁▂▂▂▂▂▂▂▂▂▃▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▆▆▇▇▇▇▇▇██
loss,▃▆▂▂▆▁▁▁▁▃▁▁▁▁▁▃▄▃▃▃▅▄▆▄▄▅▆█▇█▅▅▆▇▅▅▅▅▅▆

0,1
epoch,299.0
loss,0.01354


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: 5w860dqf with config:
[34m[1mwandb[0m: 	hidden_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.0019164722630955469
[34m[1mwandb[0m: 	num_epochs: 300
[34m[1mwandb[0m: 	num_layers: 2


Epoch 1/300, Loss: 0.020374
Epoch 51/300, Loss: 0.005888
Epoch 101/300, Loss: 0.003405
Epoch 151/300, Loss: 0.001606
Epoch 201/300, Loss: 0.001146
Epoch 251/300, Loss: 0.000631
Epoch 300/300, Loss: 0.000507


0,1
epoch,▁▁▁▁▂▂▂▃▃▃▃▃▃▄▄▄▄▄▄▄▅▅▅▅▅▅▅▆▆▆▆▆▆▆▇▇▇▇▇█
loss,█▆▅▅▄▄▄▄▄▃▃▃▃▃▃▂▃▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,299.0
loss,0.00051


[34m[1mwandb[0m: Agent Starting Run: 6vpcrzke with config:
[34m[1mwandb[0m: 	hidden_size: 128
[34m[1mwandb[0m: 	learning_rate: 0.008288380951227105
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 3


Epoch 1/1000, Loss: 0.028999
Epoch 51/1000, Loss: 0.012555
Epoch 101/1000, Loss: 0.015141
Epoch 151/1000, Loss: 0.013447
Epoch 201/1000, Loss: 0.013098
Epoch 251/1000, Loss: 0.012822
Epoch 301/1000, Loss: 0.013526
Epoch 351/1000, Loss: 0.013000
Epoch 401/1000, Loss: 0.012718
Epoch 451/1000, Loss: 0.012855
Epoch 501/1000, Loss: 0.012854
Epoch 551/1000, Loss: 0.012853
Epoch 601/1000, Loss: 0.012855
Epoch 651/1000, Loss: 0.013003
Epoch 701/1000, Loss: 0.012888
Epoch 751/1000, Loss: 0.012857
Epoch 801/1000, Loss: 0.012337
Epoch 851/1000, Loss: 0.012355
Epoch 901/1000, Loss: 0.012386
Epoch 951/1000, Loss: 0.012345
Epoch 1000/1000, Loss: 0.013301


0,1
epoch,▁▁▁▂▂▂▂▂▃▃▃▄▄▄▄▄▄▄▅▅▅▅▅▅▅▆▆▆▇▇▇▇▇▇██████
loss,▁▄▄▅█▆▆▇▇▇▅▆▅▅▅▅▅▅▅▅▅▅▆▅▅▅▅▅▃▃▄▄▄▄▄▄▄▄▄▄

0,1
epoch,999.0
loss,0.0133


[34m[1mwandb[0m: Agent Starting Run: 48olryl6 with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.002820130638489579
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 2


Epoch 1/1000, Loss: 0.022001
Epoch 51/1000, Loss: 0.006605
Epoch 101/1000, Loss: 0.004763
Epoch 151/1000, Loss: 0.002298
Epoch 201/1000, Loss: 0.001470
Epoch 251/1000, Loss: 0.000503
Epoch 301/1000, Loss: 0.000250
Epoch 351/1000, Loss: 0.000271
Epoch 401/1000, Loss: 0.000370
Epoch 451/1000, Loss: 0.000502
Epoch 501/1000, Loss: 0.000200
Epoch 551/1000, Loss: 0.000303
Epoch 601/1000, Loss: 0.000624
Epoch 651/1000, Loss: 0.000582
Epoch 701/1000, Loss: 0.000370
Epoch 751/1000, Loss: 0.000176
Epoch 801/1000, Loss: 0.000344
Epoch 851/1000, Loss: 0.000220
Epoch 901/1000, Loss: 0.000076
Epoch 951/1000, Loss: 0.000201
Epoch 1000/1000, Loss: 0.000194


0,1
epoch,▁▁▁▁▁▂▂▂▂▂▂▃▃▃▄▄▄▄▄▅▅▅▅▅▆▆▆▆▆▆▆▇▇▇▇▇████
loss,█▆▅▅▅▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.00019


[34m[1mwandb[0m: Agent Starting Run: ppksxukg with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.0018977014214646068
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 2


Epoch 1/1000, Loss: 0.029718
Epoch 51/1000, Loss: 0.006826
Epoch 101/1000, Loss: 0.004456
Epoch 151/1000, Loss: 0.002332
Epoch 201/1000, Loss: 0.001318
Epoch 251/1000, Loss: 0.000699
Epoch 301/1000, Loss: 0.000531
Epoch 351/1000, Loss: 0.000159
Epoch 401/1000, Loss: 0.000390
Epoch 451/1000, Loss: 0.000122
Epoch 501/1000, Loss: 0.000185
Epoch 551/1000, Loss: 0.000255
Epoch 601/1000, Loss: 0.000116
Epoch 651/1000, Loss: 0.000274
Epoch 701/1000, Loss: 0.000410
Epoch 751/1000, Loss: 0.000299
Epoch 801/1000, Loss: 0.000203
Epoch 851/1000, Loss: 0.000310
Epoch 901/1000, Loss: 0.000230
Epoch 951/1000, Loss: 0.000176
Epoch 1000/1000, Loss: 0.000118


0,1
epoch,▁▁▁▁▁▂▂▂▂▂▃▃▃▃▄▄▄▄▄▄▄▅▅▅▅▆▆▆▆▆▆▇▇▇▇█████
loss,██▇▅▅▅▄▄▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.00012


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: wyqf35mq with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.0026271764502430583
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 2


Epoch 1/1000, Loss: 0.019094
Epoch 51/1000, Loss: 0.006644
Epoch 101/1000, Loss: 0.004606
Epoch 151/1000, Loss: 0.002067
Epoch 201/1000, Loss: 0.001514
Epoch 251/1000, Loss: 0.000446
Epoch 301/1000, Loss: 0.000738
Epoch 351/1000, Loss: 0.000504
Epoch 401/1000, Loss: 0.000233
Epoch 451/1000, Loss: 0.000293
Epoch 501/1000, Loss: 0.000303
Epoch 551/1000, Loss: 0.000146
Epoch 601/1000, Loss: 0.000484
Epoch 651/1000, Loss: 0.000186
Epoch 701/1000, Loss: 0.000081
Epoch 751/1000, Loss: 0.000362
Epoch 801/1000, Loss: 0.000218
Epoch 851/1000, Loss: 0.000286
Epoch 901/1000, Loss: 0.000231
Epoch 951/1000, Loss: 0.000199
Epoch 1000/1000, Loss: 0.000062


0,1
epoch,▁▁▁▁▁▁▂▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
loss,█▆▅▅▄▃▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,6e-05


[34m[1mwandb[0m: Agent Starting Run: xd2gc8bz with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.002562080247089086
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 2


Epoch 1/1000, Loss: 0.019193
Epoch 51/1000, Loss: 0.006152
Epoch 101/1000, Loss: 0.004618
Epoch 151/1000, Loss: 0.002365
Epoch 201/1000, Loss: 0.001086
Epoch 251/1000, Loss: 0.000639
Epoch 301/1000, Loss: 0.000275
Epoch 351/1000, Loss: 0.000337
Epoch 401/1000, Loss: 0.000394
Epoch 451/1000, Loss: 0.000324
Epoch 501/1000, Loss: 0.000312
Epoch 551/1000, Loss: 0.000234
Epoch 601/1000, Loss: 0.000171
Epoch 651/1000, Loss: 0.000473
Epoch 701/1000, Loss: 0.000365
Epoch 751/1000, Loss: 0.000293
Epoch 801/1000, Loss: 0.000200
Epoch 851/1000, Loss: 0.000100
Epoch 901/1000, Loss: 0.000252
Epoch 951/1000, Loss: 0.000142
Epoch 1000/1000, Loss: 0.000034


0,1
epoch,▁▁▁▁▁▁▂▂▂▂▂▂▃▃▃▃▄▄▄▄▅▅▅▅▅▅▅▆▆▇▇▇▇███████
loss,█▆▅▅▄▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,3e-05


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: bwusyetc with config:
[34m[1mwandb[0m: 	hidden_size: 256
[34m[1mwandb[0m: 	learning_rate: 0.003370260102354819
[34m[1mwandb[0m: 	num_epochs: 1000
[34m[1mwandb[0m: 	num_layers: 2


Epoch 1/1000, Loss: 0.029390
Epoch 51/1000, Loss: 0.007093
Epoch 101/1000, Loss: 0.004531
Epoch 151/1000, Loss: 0.002134
Epoch 201/1000, Loss: 0.001447
Epoch 251/1000, Loss: 0.000552
Epoch 301/1000, Loss: 0.000306
Epoch 351/1000, Loss: 0.000518
Epoch 401/1000, Loss: 0.000489
Epoch 451/1000, Loss: 0.000495
Epoch 501/1000, Loss: 0.000337
Epoch 551/1000, Loss: 0.000308
Epoch 601/1000, Loss: 0.000246
Epoch 651/1000, Loss: 0.000156
Epoch 701/1000, Loss: 0.000176
Epoch 751/1000, Loss: 0.000226
Epoch 801/1000, Loss: 0.000227
Epoch 851/1000, Loss: 0.000303
Epoch 901/1000, Loss: 0.000276
Epoch 951/1000, Loss: 0.000455
Epoch 1000/1000, Loss: 0.000195


0,1
epoch,▁▁▁▁▁▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▅▅▅▅▆▆▆▇▇▇████
loss,█▆▆▅▃▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
epoch,999.0
loss,0.0002


In [30]:
print(len(test_loader))

6


In [70]:
model = BiLSTM_pt(input_size, hidden_size, num_layers, output_size).to(device)
model.load_state_dict(torch.load('./bi_LSTM/bi_LSTM251_loss_0.001719.pth'))

<All keys matched successfully>

In [None]:
model.eval()

# Initialize variables to track loss
loss_value = 0
num_batches = 0

# Define the loss function
criterion = torch.nn.MSELoss()

# Initialize hidden state and cell state
hidden_state, cell_state = None, None  
mae_list = []

# Disable gradient computation for validation
with torch.no_grad():
    for batch_idx, (inputs, targets) in enumerate(test_loader):
        if batch_idx == len(test_loader) - 1:  
            break  # Skip the last batch

        # Forward pass
        inputs = inputs.to(device)
        output, cell_state, hidden_state = model(inputs, cell_state, hidden_state)
        output = output.unsqueeze(-1).permute(0, 2, 1)

        output = output.to(device)
        targets = targets.to(device)
        
        # Denormalize predictions and targets (for all 3 time steps)
        output_denorm = output * (data_max["Average Taxi Availability"] - data_min["Average Taxi Availability"]) + data_min["Average Taxi Availability"]
        targets_denorm = targets * (data_max["Average Taxi Availability"] - data_min["Average Taxi Availability"]) + data_min["Average Taxi Availability"]

        # Compute loss on normalized data
        loss_value += criterion(output, targets)
        mae = torch.mean(torch.abs(output_denorm - targets_denorm))
        mae_list.append(mae)

        # Print a sample of the normalized and denormalized values
        print("Normalized output[0]:", output[0].tolist())  
        print("Normalized target[0]:", targets[0].tolist())  
        print("Denormalized output[0]:", output_denorm[0].tolist())  
        print("Denormalized target[0]:", targets_denorm[0].tolist())  
        print("-" * 50)

# Compute average loss
loss_value = loss_value / (len(test_loader) - 1)
print("Predicted output shape:", output.shape)
print("True output shape:", targets.shape)
print(f'Average Validation Loss: {loss_value:.4f}')

mae = torch.mean(torch.tensor(mae_list))
print(f'Mean Absolute Error: {mae:.4f}')

: 