## LSTM_training.ipynb

Original file is located at
https://github.com/KyujinHan/Object-Depth-detection-based-hybrid-Distance-estimator/blob/master/odd_train/LSTM_train_sample.ipynb

### @author: ODD team
### This original file has been modified by our team on 2024-10-15.

# LSTM Training

This notebook trains an LSTM-based model that estimates the real distance (in meters) to an object from its bounding box coordinates. The model predicts the object's position along the z-axis in camera coordinates (`z_loc`).

## 1. Distance Estimator
- **Purpose**: To estimate the real distance (unit: meter) of an object.
- **Input**: Bounding box coordinates `(xmin, ymin, xmax, ymax)`.
- **Output**: The object's `z` location in camera coordinates (`z_loc`).

## 2. Load Module
- **Required Libraries**: 
  - `tqdm`, `os`, `pandas`, `matplotlib.pyplot`, `numpy`, `time`, `torch`, `sklearn.preprocessing`, `custom_datasets`, `sklearn.metrics`, `math`.

In [None]:
!pip install tqdm pandas matplotlib numpy torch scikit-learn

In [None]:
import os
import tqdm
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import time
import torch
from sklearn.preprocessing import OneHotEncoder, LabelEncoder
from sklearn.metrics import mean_absolute_error, mean_squared_error
import math
import custom_datasets  # Assuming this is a custom module in the project

- **Directory Setup**: Ensure the weights directory exists.

In [None]:
# Create a directory for model weights if it doesn't exist
weights_dir = './weights'
os.makedirs(weights_dir, exist_ok=True)

## 3. Dataset Preparation
- **Data Loading**: Load training, validation, and test datasets from CSV files.

In [None]:
# Load datasets
train_data = pd.read_csv('data/train_dataset.csv')
val_data = pd.read_csv('data/val_dataset.csv')
test_data = pd.read_csv('data/test_dataset.csv')
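
Before going further, it can help to confirm that each CSV carries the columns the later cells rely on. The sketch below assumes the column names used elsewhere in this notebook (`class`, `xmin`, `ymin`, `xmax`, `ymax`, `z_loc`); the helper name `check_columns` and the toy values are illustrative only.

```python
import pandas as pd

# Columns referenced by the later cells of this notebook
REQUIRED_COLUMNS = ['class', 'xmin', 'ymin', 'xmax', 'ymax', 'z_loc']

def check_columns(df, required=REQUIRED_COLUMNS):
    """Return the required columns that are missing from df (hypothetical helper)."""
    return [col for col in required if col not in df.columns]

# Toy frame standing in for one of the CSV files; the values are made up
toy = pd.DataFrame({
    'class': ['car', 'person'],
    'xmin': [100, 40], 'ymin': [120, 60],
    'xmax': [220, 80], 'ymax': [200, 180],
})
missing = check_columns(toy)
print(missing)
```

An empty list means the frame has everything the notebook expects.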

- **Data Cleaning**: Check for missing values and sort the training rows by `z_loc`.


In [None]:
# Count missing values per column (printed explicitly, since only a cell's last expression is displayed)
print(train_data.isnull().sum())

# Sort dataset by z_loc
train_data = train_data.sort_values(by='z_loc')

- **One-Hot Encoding**: Apply one-hot encoding to the class variable.


In [None]:
# One-hot encoding for the 'class' column
# scikit-learn >= 1.2 uses `sparse_output`; older releases expect `sparse=False` instead
onehot_encoder = OneHotEncoder(sparse_output=False)
class_onehot = onehot_encoder.fit_transform(train_data[['class']])

- **Label Encoding**: Transform class labels into numerical format.

In [None]:
# Label encoding for 'class' column
label_encoder = LabelEncoder()
train_data['class_encoded'] = label_encoder.fit_transform(train_data['class'])
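
To make the label-encoding step concrete, here is a minimal sketch on toy class names. The labels below are made up; the real label set comes from the dataset's `class` column.

```python
from sklearn.preprocessing import LabelEncoder

# Toy class names, used only for illustration
classes = ['car', 'person', 'car', 'truck']

encoder = LabelEncoder()
codes = encoder.fit_transform(classes)      # integer codes follow the sorted label order
decoded = encoder.inverse_transform(codes)  # map codes back to the original labels

print(list(encoder.classes_))
print(codes.tolist())
```

Fitting the encoder on the training split and calling `transform` on the validation and test splits keeps the codes consistent across all three.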

## 4. Data Information
- **Dataset Overview**: Display information about the training dataset.

In [None]:
# Display information about the dataset
train_data.info()
train_data.describe()

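- **Variable Selection**: Define the variables used for training and convert them to float32 tensors.

In [None]:
# Select features and target, then convert to tensors
feature_cols = ['xmin', 'ymin', 'xmax', 'ymax']

def to_tensors(df):
    # Features: shape (N, 1, 4). Assumption: each box is fed to the LSTM as a length-1 sequence
    X = torch.tensor(df[feature_cols].values, dtype=torch.float32).unsqueeze(1)
    # Target: shape (N, 1), matching the model output
    y = torch.tensor(df['z_loc'].values, dtype=torch.float32).unsqueeze(1)
    return X, y

X_train, y_train = to_tensors(train_data)
X_val, y_val = to_tensors(val_data)
X_test, y_test = to_tensors(test_data)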


## 5. Model Definition
- **Model Architecture**: Define the `Zloc_Estimator` class: an LSTM layer followed by a fully connected layer applied to the output of the last time step.

In [None]:
import torch.nn as nn

class Zloc_Estimator(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(Zloc_Estimator, self).__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)
    
    def forward(self, x):
        lstm_out, _ = self.lstm(x)
        out = self.fc(lstm_out[:, -1, :])  # Take the output from the last time step
        return out
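
A quick shape check with random inputs can confirm that the model maps a batch of boxes to one value per box. This is a sketch, assuming each box is fed as a sequence of length 1 (the notebook does not state the sequence length); the class is repeated so the snippet runs on its own.

```python
import torch
import torch.nn as nn

class Zloc_Estimator(nn.Module):
    # Same architecture as the cell above, repeated so this snippet is self-contained
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        lstm_out, _ = self.lstm(x)
        return self.fc(lstm_out[:, -1, :])  # output of the last time step

model = Zloc_Estimator(input_size=4, hidden_size=128, output_size=1)

# 8 random boxes laid out as (batch, sequence, features); sequence length 1 is assumed
dummy = torch.rand(8, 1, 4)
with torch.no_grad():
    out = model(dummy)
print(out.shape)
```

With `batch_first=True`, the LSTM expects input shaped `(batch, sequence, features)`, so a plain `(N, 4)` table of boxes needs an extra sequence dimension before it reaches the model.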
    

- **Alternative Model**: A simpler version of the model (`Zloc_Estimator_s`) that uses a single fully connected layer and no LSTM.

In [None]:
class Zloc_Estimator_s(nn.Module):
    def __init__(self, input_size, output_size):
        super(Zloc_Estimator_s, self).__init__()
        self.fc = nn.Linear(input_size, output_size)
    
    def forward(self, x):
        return self.fc(x)


## 6. Training Setup
- **Hyperparameters**: Specify the input, hidden, and output dimensions.

In [None]:
input_size = 4  # Bounding box coordinates
hidden_size = 128  # Hidden units in LSTM layer
output_size = 1  # z_loc (distance)

- **Loss Function and Optimizer**: Use L1 loss and Adam optimizer.

In [None]:
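# Instantiate the model, then define the loss function and optimizer
model = Zloc_Estimator(input_size, hidden_size, output_size)
loss_fn = nn.L1Loss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)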
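
As a small worked example of what the L1 loss measures, the sketch below evaluates `nn.L1Loss` on three made-up predictions. The numbers are illustrative only.

```python
import torch
import torch.nn as nn

loss_fn = nn.L1Loss()  # mean absolute error, averaged over all elements

# Made-up predictions and targets in meters, both shaped (3, 1)
pred = torch.tensor([[12.0], [18.0], [33.0]])
target = torch.tensor([[10.0], [20.0], [30.0]])

loss = loss_fn(pred, target)
print(loss.item())  # (2 + 2 + 3) / 3
```

Keeping predictions and targets in the same shape matters: a `(N, 1)` prediction compared against a `(N,)` target is broadcast to `(N, N)` and produces a misleading loss value.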

- **Early Stopping**: Set the patience for early stopping, which halts training when validation loss stops improving and so limits overfitting.

In [None]:
# Early stopping criterion
early_stopping = 5  # Number of epochs to wait for validation loss improvement
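
The patience value can be wrapped in a small helper that tracks the best validation loss. This is a minimal sketch, assuming the stopping rule is "no improvement for `patience` consecutive epochs"; the class name `EarlyStopping` and the loss values are illustrative and not part of the original notebook.

```python
class EarlyStopping:
    """Track validation loss and signal when it has stalled (illustrative helper)."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best_loss = float('inf')
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Made-up loss curve: improves once, then stalls for three epochs
stopper = EarlyStopping(patience=3)
history = [1.0, 0.8, 0.9, 0.85, 0.81, 0.7]
stopped_at = None
for epoch, val_loss in enumerate(history, start=1):
    if stopper.step(val_loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best_loss)
```

Here training stops before the final value of 0.7 is ever seen, which is the trade-off of a short patience: it saves time but can end a run just before a late improvement.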

## 7. Training and Validation Functions
- **Training Function**: Define the training loop, including loss calculation and backpropagation.

In [None]:
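def train_model(model, X_train, y_train, epochs):
    # Full-batch training: each epoch is a single optimizer step over all of X_train
    model.train()
    last_loss = None
    for epoch in range(epochs):
        optimizer.zero_grad()
        outputs = model(X_train)
        loss = loss_fn(outputs, y_train)
        loss.backward()
        optimizer.step()
        last_loss = loss.item()
        print(f'Epoch {epoch+1}/{epochs}, Loss: {last_loss:.4f}')
    return last_loss  # Training loss of the final epoch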
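
The training function above updates the weights once per epoch on the full dataset. For larger datasets, a mini-batch variant is a common alternative. The sketch below uses `TensorDataset` and `DataLoader` on synthetic data; the batch size, the stand-in linear model, and the helper name `train_one_epoch` are assumptions for illustration, and the project's `custom_datasets` module may provide its own dataset class that is not reproduced here.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data: 64 boxes with 4 coordinates each, and a made-up target
X = torch.rand(64, 4)
y = X.sum(dim=1, keepdim=True)

loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Linear(4, 1)  # stand-in model; any nn.Module with matching shapes works
loss_fn = nn.L1Loss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

def train_one_epoch(model, loader):
    """Run one pass over the loader and return the sample-weighted mean batch loss."""
    model.train()
    total, count = 0.0, 0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
        total += loss.item() * len(xb)
        count += len(xb)
    return total / count

epoch_loss = train_one_epoch(model, loader)
print(epoch_loss)
```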
        

- **Evaluation Function**: Define the evaluation loop to assess model performance on validation data.

In [None]:
def evaluate_model(model, X_val, y_val):
    model.eval()
    with torch.no_grad():
        outputs = model(X_val)
        loss = loss_fn(outputs, y_val)
    return loss.item()


## 8. Training Process
- **Epochs**: Set the number of epochs for training.

In [None]:
epochs = 50

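- **Model Saving and Early Stopping**: Save the model weights whenever validation loss improves, and stop once it has not improved for `early_stopping` consecutive epochs.

In [None]:
best_loss = float('inf')
epochs_without_improvement = 0
train_losses, val_losses = [], []  # Loss history, plotted in the Visualization section

for epoch in range(epochs):
    train_model(model, X_train, y_train, 1)
    train_losses.append(evaluate_model(model, X_train, y_train))
    val_loss = evaluate_model(model, X_val, y_val)
    val_losses.append(val_loss)

    if val_loss < best_loss:
        best_loss = val_loss
        epochs_without_improvement = 0
        torch.save(model.state_dict(), f'{weights_dir}/best_model.pth')
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= early_stopping:
            break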


## 9. Visualization
- **Loss Visualization**: Plot training and validation loss over epochs.

In [None]:
# Plot training and validation loss
plt.plot(train_losses, label='Train Loss')
plt.plot(val_losses, label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()


## 10. Prediction and Evaluation
- **Model Loading**: Load the best model for predictions.

In [None]:
# Load the best model
model.load_state_dict(torch.load(f'{weights_dir}/best_model.pth'))
model.eval()
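
The save-and-restore round trip can be checked in isolation. The sketch below uses a stand-in linear model and a temporary directory rather than the notebook's `weights_dir`; both are assumptions made so the snippet runs on its own.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 1)  # stand-in for the trained estimator

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'best_model.pth')
    torch.save(model.state_dict(), path)

    # A fresh instance with the same architecture receives the saved weights
    restored = nn.Linear(4, 1)
    restored.load_state_dict(torch.load(path, map_location='cpu'))

restored.eval()
same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same)
```

Because only the `state_dict` is saved, the model class must be constructed with the same sizes before the weights are loaded.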

- **Prediction on Datasets**: Predict values for training, validation, and test datasets.

In [None]:
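# Make predictions without tracking gradients; results are flat NumPy arrays
def predict(features):
    # Assumption: each box is a length-1 sequence, so inputs are reshaped to (N, 1, 4)
    x = torch.as_tensor(np.asarray(features, dtype=np.float32)).reshape(-1, 1, 4)
    with torch.no_grad():
        return model(x).numpy().ravel()

y_train_pred = predict(X_train)
y_val_pred = predict(X_val)
y_test_pred = predict(X_test)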

- **Performance Metrics**: Calculate MAE and RMSE for the test-set predictions.

In [None]:
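# Calculate performance metrics on flat NumPy arrays
y_true = np.asarray(y_test, dtype=float).ravel()
y_pred = np.asarray(y_test_pred, dtype=float).ravel()
mae = mean_absolute_error(y_true, y_pred)
rmse = math.sqrt(mean_squared_error(y_true, y_pred))
print(f'MAE: {mae:.4f}, RMSE: {rmse:.4f}')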
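
To see how the two metrics differ, here is a worked example on three made-up distances, computed with NumPy directly rather than scikit-learn.

```python
import numpy as np

# Made-up true and predicted distances in meters
y_true = np.array([10.0, 20.0, 30.0])
y_pred = np.array([12.0, 18.0, 33.0])

errors = y_pred - y_true
mae = np.mean(np.abs(errors))         # (2 + 2 + 3) / 3
rmse = np.sqrt(np.mean(errors ** 2))  # sqrt((4 + 4 + 9) / 3)
print(f'MAE: {mae:.3f}, RMSE: {rmse:.3f}')
```

RMSE is never smaller than MAE, and a large gap between the two points to a few large errors dominating the result.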

## 11. Class-wise Performance
- **Performance by Class**: Calculate and display performance metrics for each class in the test dataset.

In [None]:
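# Calculate performance metrics for each class
feature_cols = ['xmin', 'ymin', 'xmax', 'ymax']
for class_label in test_data['class'].unique():
    class_data = test_data[test_data['class'] == class_label]
    # Shape (N, 1, 4). Assumption: each box is a length-1 sequence for the LSTM
    class_x = torch.tensor(class_data[feature_cols].values, dtype=torch.float32).unsqueeze(1)
    with torch.no_grad():
        class_pred = model(class_x).numpy().ravel()
    class_mae = mean_absolute_error(class_data['z_loc'], class_pred)
    print(f'Class: {class_label}, MAE: {class_mae:.4f}')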
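
When predictions are already stored next to the ground truth, the same per-class breakdown can be written as a pandas `groupby`. This is a sketch on made-up rows; the column name `z_pred` is hypothetical and does not appear in the original dataset.

```python
import pandas as pd

# Made-up test rows with precomputed predictions
results = pd.DataFrame({
    'class': ['car', 'car', 'person', 'person'],
    'z_loc': [10.0, 20.0, 5.0, 8.0],
    'z_pred': [11.0, 18.0, 5.5, 7.0],
})

results['abs_error'] = (results['z_loc'] - results['z_pred']).abs()
mae_by_class = results.groupby('class')['abs_error'].mean()
print(mae_by_class)
```

This form runs the model once over the whole test set instead of once per class.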


## 12. Additional Metrics
- **Relative Differences**: Calculate the average relative difference, the absolute error divided by the true distance, over the test set.

In [None]:
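# Calculate relative differences on flat NumPy arrays
y_true = np.asarray(y_test, dtype=float).ravel()
y_pred = np.asarray(y_test_pred, dtype=float).ravel()
relative_diff = np.abs(y_true - y_pred) / y_true
print(f'Average Relative Difference: {np.mean(relative_diff):.4f}')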
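
Written out, the quantity printed above is the mean absolute relative error over the $N$ test samples. The symbols $z_i$ (true distance) and $\hat{z}_i$ (predicted distance) are introduced here for illustration:

$$
\text{Average Relative Difference} = \frac{1}{N} \sum_{i=1}^{N} \frac{\lvert z_i - \hat{z}_i \rvert}{z_i}
$$

The ratio is undefined for $z_i = 0$, so any samples at zero distance would need to be excluded before averaging.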

## 13. Accuracy by Distance Range
- **Distance Segmentation**: Divide predictions by distance range and calculate accuracy, defined here as the share of predictions within 1 m of the true distance.

In [None]:
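# Segment by distance range and calculate accuracy
y_true = np.asarray(y_test, dtype=float).ravel()
y_pred = np.asarray(y_test_pred, dtype=float).ravel()
distance_bins = [0, 10, 20, 30, 40, 50]  # Range edges in meters; samples at 50 m or beyond are not counted
accuracy_by_range = []

for i in range(len(distance_bins) - 1):
    mask = (y_true >= distance_bins[i]) & (y_true < distance_bins[i+1])
    # Share of predictions within 1 m of the true distance; NaN when the range is empty
    accuracy = np.mean(np.abs(y_true[mask] - y_pred[mask]) < 1) if mask.any() else float('nan')
    accuracy_by_range.append(accuracy)
    print(f'Distance range {distance_bins[i]}-{distance_bins[i+1]}m, Accuracy: {accuracy}')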
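
The same segmentation can be expressed with `pd.cut`, which assigns each sample to a range in one call. This is a sketch on made-up values with a 1 m threshold, as in the loop above; the column names `range` and `hit` are illustrative.

```python
import numpy as np
import pandas as pd

# Made-up true and predicted distances in meters
y_true = np.array([3.0, 8.0, 12.0, 18.0, 25.0])
y_pred = np.array([3.5, 9.5, 12.2, 19.5, 25.4])

bins = [0, 10, 20, 30]
frame = pd.DataFrame({
    # right=False gives half-open ranges [0, 10), [10, 20), [20, 30)
    'range': pd.cut(y_true, bins=bins, right=False),
    # True when the prediction is within 1 m of the true distance
    'hit': np.abs(y_true - y_pred) < 1,
})

accuracy = frame.groupby('range', observed=False)['hit'].mean()
print(accuracy)
```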


## 14. Visualization of Results
- **Scatter Plots**: Create scatter plots to visualize the relationship between predicted and actual values.

In [None]:
# Scatter plot of actual vs predicted distances
plt.scatter(np.asarray(y_test).ravel(), np.asarray(y_test_pred).ravel())  # flat arrays work for tensors and Series alike
plt.xlabel('Actual Distance (z_loc)')
plt.ylabel('Predicted Distance (z_loc)')
plt.title('Actual vs Predicted Distance')
plt.show()
