![Traffic](traffic.png)

Traffic data fluctuates constantly and depends strongly on time. Predicting it can be challenging, but this task will help sharpen your time-series skills. With deep learning, you can exploit abstract patterns in the data to improve your predictions.

Your task is to build a system that predicts traffic volume, that is, the number of vehicles passing a specific point at a specific time. Accurate predictions can help reduce road congestion, support new designs for roads or intersections, improve safety, and more! Or, you can use it to plan your commute and avoid traffic!

The dataset provided contains the hourly traffic volume on an interstate highway in Minnesota, USA. It also includes weather features and holidays, which often impact traffic volume.

Time to predict some traffic!

### The data:

The dataset is collected and maintained by the UCI Machine Learning Repository. The target variable is `traffic_volume`. The data has already been normalized and saved into training and test sets, which contain the columns listed below:

`train_scaled.csv`, `test_scaled.csv`

| Column     | Type       | Description              |
|------------|------------|--------------------------|
|`temp`|Numeric|Average temperature in kelvin|
|`rain_1h`|Numeric|Amount of rain, in mm, that fell during the hour|
|`snow_1h`|Numeric|Amount of snow, in mm, that fell during the hour|
|`clouds_all`|Numeric|Percentage of cloud cover|
|`date_time`|DateTime|Hour of the data collection, in local time (CST)|
|`holiday_` (11 columns)|Categorical|US national holidays plus a regional holiday, the Minnesota State Fair|
|`weather_main_` (11 columns)|Categorical|Short textual description of the current weather|
|`weather_description_` (35 columns)|Categorical|Longer textual description of the current weather|
|`traffic_volume`|Numeric|Hourly I-94 ATR 301 reported westbound traffic volume|
|`hour_of_day`|Numeric|The hour of the day|
|`day_of_week`|Numeric|The day of the week (0=Monday, 6=Sunday)|
|`day_of_month`|Numeric|The day of the month|
|`month`|Numeric|The number of the month|

In [2]:
# Import the relevant libraries
import numpy as np
import pandas as pd

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

In [2]:
# Read the traffic data from the CSV training and test files
train_scaled_df = pd.read_csv('train_scaled.csv')
test_scaled_df = pd.read_csv('test_scaled.csv')

# Convert the DataFrame to NumPy arrays
train_scaled = train_scaled_df.to_numpy()
test_scaled = test_scaled_df.to_numpy()

In [6]:
# Start coding here
# Use as many cells as you like!

Build a deep learning model that predicts traffic volume and helps tackle challenges like congestion, road design, and smarter commutes:
* Build a deep learning model using PyTorch to predict the traffic volume using the provided dataset. Initialize and save this model as traffic_model.
* Train and evaluate your model using an appropriate loss function. Save the final training loss as a tensor variable, final_training_loss (aim for less than 20).
* Predict the traffic volume against the test set and evaluate the performance using Mean Squared Error (MSE). Save your result as a tensor float, test_mse.

**1. Prepare the data for modeling**

To model time-series data, you need to generate sequences of past values as inputs and predict the next value as the target. One way to do this is to write a function and pass it the available data. The resulting sequences then need to be converted to PyTorch tensors and loaded in batches.
- Creating sequences with a function
    - Define a function that takes the dataset, sequence length, and target column index as inputs.
    * Use a for loop to loop through the dataset: The loop should run from 0 to len(data) - sequence length. This ensures that you're able to get a full sequence of the desired length and a target value.
    * Extract the input sequence by slicing the dataset starting at index i and ending at i + sequence length.
    * Extract the target value that immediately follows the sequence (i.e., at index i + sequence length).
    * Append the input sequences to a list of inputs, and the targets to a list of outputs.
    * Return the inputs and targets as NumPy arrays at the end of the function using np.array().
    * Apply the function to both the training and test sets. You should end up with X_train, y_train, X_test, and y_test arrays, though not necessarily with these names.
- Converting to tensors
    - Convert the input and output sequences from NumPy arrays to PyTorch tensors using torch.tensor().
    - PyTorch expects floating-point numbers for neural network inputs and targets, so you’ll need to convert the NumPy arrays accordingly. Do this by combining .astype(), np.float32 and .float().
    - Apply this to both your training and testing splits.
    - Use TensorDataset() to combine the feature and label tensors into a dataset.
- Loading data
    - Use DataLoader() to load the dataset in batches.
    * Batch size 64 is a common choice to balance training speed and memory usage.
    * shuffle can be set to True for the training set to reduce the risk of overfitting, but can be False for the test set.
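
The steps above can be sketched as follows. This is a minimal sketch rather than the reference solution: it uses a small random array in place of `train_scaled`, and the names `create_sequences`, `demo_data`, `seq_length`, and `target_col`, along with the chosen values (12 time steps, target in the last column), are assumptions made for illustration.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader


def create_sequences(data, seq_length, target_col):
    """Return windows of all features plus the target value that follows each window."""
    xs, ys = [], []
    for i in range(len(data) - seq_length):
        xs.append(data[i:i + seq_length])            # shape (seq_length, n_features)
        ys.append(data[i + seq_length, target_col])  # value right after the window
    return np.array(xs), np.array(ys)


# Stand-in for train_scaled: 200 hourly rows with 5 numeric features (assumption)
rng = np.random.default_rng(0)
demo_data = rng.random((200, 5))

seq_length = 12   # hypothetical window length
target_col = -1   # assumes the target sits in the last column

X_demo, y_demo = create_sequences(demo_data, seq_length, target_col)

# Cast to float32 so the tensors match PyTorch's default parameter dtype
X_tensor = torch.tensor(X_demo.astype(np.float32))
y_tensor = torch.tensor(y_demo.astype(np.float32))

# Pair inputs with targets and serve them in shuffled batches of 64
demo_dataset = TensorDataset(X_tensor, y_tensor)
demo_loader = DataLoader(demo_dataset, batch_size=64, shuffle=True)

print(X_tensor.shape, y_tensor.shape)
```

With 200 rows and a window of 12, the loop produces 188 windows, each of shape (12, 5). For the real data, you would pass `train_scaled` and `test_scaled` and the actual index of `traffic_volume`.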

**2. Creating a neural network model**

Select and build a neural network that handles time-series data well. Consider using Recurrent Neural Networks, which are designed to capture temporal dependencies in sequential data.
- Choosing the right neural network
    * RNNs are ideal for handling time-series data because they can retain information from previous time steps.
    * Use nn.LSTM() or nn.GRU() to capture both short and long term dependencies.
- Defining the __init__() method and RNN Layer
    * In the __init__() method, define the layers of your neural network, starting with the RNN layer, for example, use nn.LSTM() to define the LSTM layer.
    * The LSTM layer could include the input_size, hidden_size, num_layers, and batch_first.
    * input_size should match the number of input features you have for each time step. If using the full dataset, that's 66 features.
    * hidden_size of 64 should provide a good balance for speed and capturing complex patterns.
    * Try more than one layer in num_layers so the model can learn more complex patterns.
    * batch_first=True ensures the input tensor has the shape (batch_size, sequence_length, input_size).
    * Add any additional layers after this, but it's recommended to include an activation function and a fully connected layer to map the output to the target value.
    * The fully connected layer can be added with nn.Linear(), defining the size of the input to and the output from the layer. The prediction here is a single value (traffic volume), so the output size is 1.
- Selecting an activation function
    - Include an activation function to help the model learn complex relationships between inputs and outputs.
    - Try experimenting with different activation functions and see how the model trains. For example: nn.LeakyReLU(), nn.ReLU(), or nn.Sigmoid().
- Writing the forward method
    * The forward() method defines how data flows through the network during training and inference.
    * Pass your data through the RNN layer, such as the LSTM layer. This returns two values: the hidden states from all time steps, and a tuple holding the hidden and cell states from the last time step for each layer of the LSTM. You only need the final hidden state (h_n) to predict traffic volume.
    * Pass the final hidden state through the fully connected layer.
    * Apply the activation function.
    * Return the final output, or prediction.
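
One way the model described above could look is sketched below. It is an illustration under stated assumptions, not the expected solution: the class name `TrafficLSTM`, the choice of `nn.LeakyReLU()`, and the demo sizes (5 features, 12 time steps, batch of 8) are all hypothetical.

```python
import torch
import torch.nn as nn


class TrafficLSTM(nn.Module):  # hypothetical class name
    def __init__(self, input_size, hidden_size=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=input_size,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,  # inputs are (batch, seq_len, input_size)
        )
        self.fc = nn.Linear(hidden_size, 1)   # map hidden state to one value
        self.activation = nn.LeakyReLU()

    def forward(self, x):
        # output: hidden states for every time step
        # h_n, c_n: final hidden and cell states, one per layer
        output, (h_n, c_n) = self.lstm(x)
        last_hidden = h_n[-1]        # final hidden state of the top layer
        out = self.fc(last_hidden)   # shape (batch, 1)
        return self.activation(out)


demo_model = TrafficLSTM(input_size=5)   # 5 features is an assumption for the demo
demo_batch = torch.randn(8, 12, 5)       # (batch, seq_len, features)
demo_pred = demo_model(demo_batch)
print(demo_pred.shape)
```

The forward pass returns one prediction per sequence, with shape (8, 1) for this demo batch. With the full dataset, `input_size` would be set to the number of features per time step.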

**3. Training the model**

Set up and train the neural network model to predict traffic volume. This involves initializing the model, selecting an appropriate loss function and optimizer, and running the training loop for multiple epochs to minimize the loss.
- Initializing the model
    * Create an instance of your model class to initialize the neural network, saving it to traffic_model.
    * For example, if your model is called Net(), this would look like traffic_model = Net().
- Choosing the loss function
    - Choose a loss function that is appropriate for regression tasks. One such loss function is Mean Squared Error (MSE) (nn.MSELoss()).
- Selecting an optimizer
    * Use an optimizer to update the model's weights during training.
    * Try experimenting with different optimizers such as optim.Adam(), optim.SGD() or optim.Adagrad().
    * Set the learning rate appropriately. Increase the learning rate if you don't seem to be reaching convergence, or decrease it if the training loss is fluctuating a lot. Smaller learning rates lead to more gradual updates, which can help with convergence but may take longer.
- Running a training loop
    - Run the training loop for 2 epochs to submit the project; you can increase this afterwards to see how it affects training.
    * For each epoch, iterate over the batches of training data with a for loop.
    * Zero the gradients with optimizer.zero_grad().
    * Pass the batch inputs through the model to generate predictions.
    * Compare the predictions to the actual labels using your chosen loss function.
    * Use loss.backward() to calculate the gradients.
    * Use optimizer.step() to adjust the model's weights based on the gradients.
    * Print the loss for each epoch for monitoring. After the final epoch, store the final training loss in a variable (final_training_loss).
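
The training steps above might be put together as in the sketch below. To stay self-contained it trains on random stand-in data with a small GRU-based `Net`, so the architecture, tensor sizes, and learning rate are assumptions rather than recommended settings; only the loop structure is the point.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)


class Net(nn.Module):  # small stand-in model (assumption)
    def __init__(self, input_size=5, hidden_size=16):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        _, h_n = self.gru(x)        # h_n: final hidden state per layer
        return self.fc(h_n[-1])     # shape (batch, 1)


# Random stand-in data: 256 sequences of 12 steps with 5 features
X = torch.randn(256, 12, 5)
y = torch.randn(256)
train_loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

traffic_model = Net()
criterion = nn.MSELoss()
optimizer = optim.Adam(traffic_model.parameters(), lr=0.001)

num_epochs = 2
for epoch in range(num_epochs):
    traffic_model.train()
    for seqs, labels in train_loader:
        optimizer.zero_grad()                       # clear old gradients
        outputs = traffic_model(seqs).squeeze(-1)   # (batch, 1) -> (batch,)
        loss = criterion(outputs, labels)
        loss.backward()                             # compute gradients
        optimizer.step()                            # update weights
    print(f"Epoch {epoch + 1}: loss = {loss.item():.4f}")

# Loss of the last batch in the last epoch, detached from the graph
final_training_loss = loss.detach()
```

Squeezing the model output to shape (batch,) before computing the loss keeps it aligned with the labels; comparing a (batch, 1) tensor with a (batch,) tensor would broadcast and give a misleading loss value.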

**4. Evaluating the model**

After training the model, it's essential to evaluate its performance on unseen data (test set). This step involves running the model in evaluation mode, collecting the predictions, and comparing them to the actual labels using an appropriate metric.
- Setting evaluation mode
    * Set the model to evaluation mode by using .eval().
- Running an evaluation loop
    * Start by disabling gradient calculation. This is not needed during evaluation. You can do this with torch.no_grad().
    * Loop over the test data using the test DataLoader. For each batch of sequences, pass them through the model to get predictions.
    * Squeeze the output with .squeeze() to ensure the shape is appropriate.
    * Store the predictions and actual labels in lists for later evaluation.
    * After the loop, use torch.cat() to concatenate all predictions and labels into two tensors, making it easier to compute the final evaluation metric over the entire test set.
- Calculating the MSE
    - Evaluate how well the model's predictions match the actual traffic volumes by computing the Mean Squared Error (MSE) with F.mse_loss().
    * Save this to test_mse.
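
The evaluation steps above can be sketched as follows. The model here is a deliberately simple linear stand-in and the test data is random, both assumptions made so the example runs on its own; in the project you would use `traffic_model` and your own test DataLoader.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)

# Stand-in model and test data (assumptions for the demo)
demo_model = nn.Sequential(nn.Flatten(), nn.Linear(12 * 5, 1))
X_test = torch.randn(100, 12, 5)
y_test = torch.randn(100)
test_loader = DataLoader(TensorDataset(X_test, y_test), batch_size=64, shuffle=False)

demo_model.eval()                 # switch layers such as dropout to inference mode
all_preds, all_labels = [], []
with torch.no_grad():             # gradients are not needed for evaluation
    for seqs, labels in test_loader:
        preds = demo_model(seqs).squeeze(-1)   # (batch, 1) -> (batch,)
        all_preds.append(preds)
        all_labels.append(labels)

# Join the per-batch results into one tensor each
all_preds = torch.cat(all_preds)
all_labels = torch.cat(all_labels)

test_mse = F.mse_loss(all_preds, all_labels)
print(f"Test MSE: {test_mse.item():.4f}")
```

Computing the MSE once over the concatenated tensors avoids averaging per-batch losses, which would weight a smaller final batch incorrectly.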

In [None]:
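# Create sequences

def create_sequences(data, seq_length, target_col=1):
    """Build (input window, next value) pairs from one column of a DataFrame."""
    xs, ys = [], []
    for i in range(len(data) - seq_length):
        # Window of seq_length consecutive values from the chosen column
        x = data.iloc[i:(i + seq_length), target_col]
        # The value that immediately follows the window is the target
        y = data.iloc[i + seq_length, target_col]
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)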

In [None]:
# Create training sequences
seq_length = 96  # number of past time steps per input window
X_train, y_train = create_sequences(train_scaled_df, seq_length)
print(X_train.shape, y_train.shape)

In [None]:
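# Convert to a Torch dataset and wrap it in a DataLoader
dataset_train = TensorDataset(
    torch.from_numpy(X_train).float(),
    torch.from_numpy(y_train).float()
)
# Batches of 32 sequences, reshuffled every epoch
dataloader_train = DataLoader(dataset_train, batch_size=32, shuffle=True)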

In [None]:
#RNN model
class RNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(
            input_size=1, 
            hidden_size=32,
            num_layers=2,
            batch_first=True,
        )
        self.fc = nn.Linear(32, 1)
    
    def forward(self, x):
        h0 = torch.zeros(2, x.size(0), 32)
        out, _ = self.rnn(x, h0)
        out = self.fc(out[:, -1, :])
        return out

In [3]:
# LSTM model
class LSTM(nn.Module):
    def __init__(self, input_size):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=input_size,  # number of features per time step
            hidden_size=32,
            num_layers=2,
            batch_first=True,
        )
        self.fc = nn.Linear(32, 1)

    def forward(self, x):
        # Initial hidden and cell states: (num_layers, batch_size, hidden_size)
        h0 = torch.zeros(2, x.size(0), 32)
        c0 = torch.zeros(2, x.size(0), 32)
        out, _ = self.lstm(x, (h0, c0))
        # Use the output at the last time step for the prediction
        out = self.fc(out[:, -1, :])
        return out

In [None]:
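# GRU model
class GRU(nn.Module):
    def __init__(self, input_size):
        super().__init__()
        self.gru = nn.GRU(
            input_size=input_size,  # number of features per time step
            hidden_size=32,
            num_layers=2,
            batch_first=True,
        )
        self.fc = nn.Linear(32, 1)

    def forward(self, x):
        # Initial hidden state: (num_layers, batch_size, hidden_size)
        h0 = torch.zeros(2, x.size(0), 32)
        out, _ = self.gru(x, h0)
        # Use the output at the last time step for the prediction
        out = self.fc(out[:, -1, :])
        return out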

In [None]:
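# Training loop
num_epochs = 2  # the project brief asks for 2 epochs at submission
model = LSTM(input_size=1)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

for epoch in range(num_epochs):
    for seqs, labels in dataloader_train:
        # Reshape (batch, seq_len) -> (batch, seq_len, 1); also works for a smaller last batch
        seqs = seqs.view(seqs.size(0), -1, 1)
        # Drop the trailing dimension so outputs match labels of shape (batch,)
        outputs = model(seqs).squeeze(-1)
        loss = criterion(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch + 1}, loss: {loss.item():.4f}")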

In [None]:
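# Evaluation loop
import torchmetrics

mse = torchmetrics.MeanSquaredError()

model.eval()
with torch.no_grad():
    # test_loader is assumed to be built from the test set the same way as dataloader_train
    for seqs, labels in test_loader:
        # Reshape (batch, seq_len) -> (batch, seq_len, 1), whatever the batch size
        seqs = seqs.view(seqs.size(0), -1, 1)
        outputs = model(seqs).squeeze(-1)
        # Update the running metric with this batch
        mse(outputs, labels)

test_mse = mse.compute()
print(f"Test MSE: {test_mse}")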