# Time Series Forecasting Project using LSTM and PyTorch

In this project, we'll use the PyTorch library to develop a time series forecasting model using Long Short-Term Memory (LSTM) networks.

The LSTM model, a type of recurrent neural network (RNN), is particularly well-suited for time series data due to its ability to capture temporal dependencies and patterns over time. This project will go through several stages, including data preprocessing, model design, training, and evaluation, leveraging PyTorch’s capabilities to build an effective forecasting model.

## Long Short-Term Memory Models

LSTM models are a type of recurrent neural network (RNN) designed to handle long-range dependencies in sequential data. Their architecture allows them to capture both short-term and long-term patterns, which is crucial for tasks like time series forecasting and natural language processing.

In standard RNNs, gradients are backpropagated through every time step, so they are repeatedly multiplied by the same recurrent weights. As the input sequence grows longer, this product tends to shrink exponentially (vanish) or grow exponentially (explode). [Vanishing gradients](https://en.wikipedia.org/wiki/Vanishing_gradient_problem) prevent the network from learning long-range dependencies because the error signal from distant time steps fades before it reaches the earlier ones. Exploding gradients, on the other hand, destabilize training: the model weights can oscillate or diverge.
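
The exponential effect described above can be illustrated with a toy calculation. This is only a sketch: it assumes the recurrent contribution to the gradient can be summarized by a single scalar factor (the values 0.9 and 1.1 and the helper `gradient_scale` are made up for illustration), whereas a real RNN multiplies by Jacobian matrices.

```python
# Toy illustration: backpropagating through T time steps multiplies the
# gradient by a recurrent factor at every step (here a hypothetical scalar).
T = 100


def gradient_scale(recurrent_factor: float, steps: int) -> float:
    """Magnitude of a unit gradient after `steps` repeated multiplications."""
    return abs(recurrent_factor) ** steps


vanishing = gradient_scale(0.9, T)  # factor below 1 shrinks toward zero
exploding = gradient_scale(1.1, T)  # factor above 1 keeps growing

print(f"factor 0.9 over {T} steps: {vanishing:.3e}")
print(f"factor 1.1 over {T} steps: {exploding:.3e}")
```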

LSTMs address the vanishing-gradient problem with a gated cell state that is updated additively, which lets gradients flow across many time steps. This ability to retain information over long periods makes them well-suited to time series data, where context and dependencies over time are crucial. Whether the task is predicting stock prices, analyzing weather patterns, or understanding speech, LSTMs tend to train more stably over long sequences than standard RNNs, which often translates into more accurate predictions.

### Mathematical definitions

<center>
    <img src="imgs/lstm-model.jpg" alt="Legend" style="width: 60%;">
    <br>
    <em>Architecture of an LSTM Unit (Credits: <a href="https://d2l.ai/chapter_recurrent-modern/lstm.html">DIVE INTO DEEP LEARNING</a>)</em>
</center>

The LSTM key components include:

1. **Cell State (C):** Acts as a container for storing memories, helping the LSTM retain information over time. It's updated as follows:

   $$
   C_t = F_t \odot C_{t-1} + I_t \odot \tilde{C}_t
   $$

   where $C_t$ is the new cell state, $C_{t-1}$ is the previous cell state, $F_t$ is the forget gate's activation, $I_t$ is the input gate's activation, $\tilde{C}_t$ is the candidate cell state, and $\odot$ denotes element-wise multiplication.

2. **Gates:** Function like valves that control the flow of information into and out of the memory cell.
   - **Forget Gate (F):** Determines which parts of the memory can be discarded.

     $$
     F_t = \sigma(X_tW_{xf} + H_{t-1}W_{hf} + b_f)
     $$

   - **Input Gate (I) and Candidate Cell State ($\tilde{C}$):** The input gate decides how much of the candidate cell state, computed from the current input, is written into the memory cell.

     $$
     I_t = \sigma(X_tW_{xi} + H_{t-1}W_{hi} + b_i)
     $$
     $$
     \tilde{C}_t = \tanh(X_tW_{xc} + H_{t-1}W_{hc} + b_c)
     $$

   - **Output Gate (O):** Decides the output based on the current contents of the memory cell.

     $$
     O_t = \sigma(X_tW_{xo} + H_{t-1}W_{ho} + b_o)
     $$

3. **Hidden State (H):** The hidden state is updated using the output gate and the cell state.

   $$
   H_t = O_t \odot \tanh(C_t)
   $$

In these equations, $X_t$ is the input at time step $t$, $H_{t-1}$ is the previous hidden state, $W$ and $b$ are the weights and biases specific to each gate, $\sigma$ is the sigmoid function, and $\tanh$ is the hyperbolic tangent function.
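
To make the equations concrete, here is a minimal sketch of a single LSTM step written directly from the definitions above. The tensor sizes, the random initialization, and the helper names `init_gate` and `lstm_step` are assumptions chosen for illustration; in practice `torch.nn.LSTM` provides an optimized implementation.

```python
import torch

torch.manual_seed(0)

batch_size, input_size, hidden_size = 4, 3, 5  # illustrative sizes


def init_gate():
    """Return (W_x, W_h, b) for one gate, with small random weights."""
    return (
        torch.randn(input_size, hidden_size) * 0.1,
        torch.randn(hidden_size, hidden_size) * 0.1,
        torch.zeros(hidden_size),
    )


W_xf, W_hf, b_f = init_gate()  # forget gate
W_xi, W_hi, b_i = init_gate()  # input gate
W_xc, W_hc, b_c = init_gate()  # candidate cell state
W_xo, W_ho, b_o = init_gate()  # output gate


def lstm_step(X_t, H_prev, C_prev):
    """One time step of an LSTM cell, following the equations term by term."""
    F_t = torch.sigmoid(X_t @ W_xf + H_prev @ W_hf + b_f)
    I_t = torch.sigmoid(X_t @ W_xi + H_prev @ W_hi + b_i)
    C_tilde = torch.tanh(X_t @ W_xc + H_prev @ W_hc + b_c)
    O_t = torch.sigmoid(X_t @ W_xo + H_prev @ W_ho + b_o)
    C_t = F_t * C_prev + I_t * C_tilde  # element-wise products
    H_t = O_t * torch.tanh(C_t)
    return H_t, C_t


X_t = torch.randn(batch_size, input_size)
H_prev = torch.zeros(batch_size, hidden_size)
C_prev = torch.zeros(batch_size, hidden_size)

H_t, C_t = lstm_step(X_t, H_prev, C_prev)
print(H_t.shape, C_t.shape)
```

Because $H_t$ is a sigmoid output multiplied by a $\tanh$ output, every entry of the hidden state lies strictly between -1 and 1.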

LSTM networks mitigate the vanishing-gradient problem common in standard RNNs, which makes them effective at learning from time series data. Exploding gradients can still occur and are typically handled with gradient clipping.


# Project: Time Series Forecasting with LSTM using PyTorch

## Step 1: Data Preprocessing

### Data Loading
- Load the dataset into a suitable format for analysis, using a library like Pandas.
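
A minimal loading sketch with Pandas follows. The inline CSV stands in for a real file, and the `Date` and `Sales` column names are assumptions; only `StateHoliday` is named in this outline.

```python
import io

import pandas as pd

# Stand-in for a real file; with actual data, pass the file path to
# pd.read_csv instead of the StringIO object.
csv_text = """Date,Sales,StateHoliday
2015-01-03,5300,0
2015-01-01,0,a
2015-01-02,5100,0
"""

df = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["Date"],          # parse the (assumed) date column
    dtype={"StateHoliday": str},   # keep holiday codes as text, not numbers
)

# Time series models need rows in chronological order.
df = df.sort_values("Date").reset_index(drop=True)
print(df.dtypes)
```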

### Data Cleaning
- Check for missing or inconsistent data and handle them appropriately.
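
One possible cleaning pass is sketched below on a small made-up frame. The specific choices (dropping duplicate dates, discarding negative sales, time-based interpolation of gaps) are assumptions; the right handling depends on the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with a duplicate date, missing values,
# and an impossible negative sales figure.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2015-01-01", "2015-01-02", "2015-01-02", "2015-01-03", "2015-01-04"]
    ),
    "Sales": [5000.0, np.nan, np.nan, 5200.0, -10.0],
})

print(df.isna().sum())  # count missing values per column

df = df.drop_duplicates(subset="Date")             # keep the first row per date
df = df[df["Sales"].isna() | (df["Sales"] >= 0)]   # drop negative sales rows
df = df.set_index("Date").sort_index()

# Fill gaps by interpolating linearly along the time index.
df["Sales"] = df["Sales"].interpolate(method="time")
print(df)
```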

### Feature Engineering
- Convert categorical variables like 'StateHoliday' into a format that can be fed into the model, such as one-hot encoding.
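
As a sketch, one-hot encoding can be done with `pandas.get_dummies`. The category codes `"0"`, `"a"`, and `"b"` below are made-up sample values.

```python
import pandas as pd

df = pd.DataFrame({
    "Sales": [0, 5100, 5300, 0],
    "StateHoliday": ["a", "0", "0", "b"],  # hypothetical category codes
})

# Replace the categorical column with one float indicator column per category.
encoded = pd.get_dummies(
    df, columns=["StateHoliday"], prefix="StateHoliday", dtype="float32"
)
print(encoded.columns.tolist())
```

Using a float dtype keeps the encoded columns ready for conversion to PyTorch tensors.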

### Normalization
- Normalize the sales data as LSTMs are sensitive to the scale of the input data.
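
A minimal min-max scaling sketch is shown below. It assumes the scaling statistics are computed on the training portion only, so that no information from later data leaks into training; the sample values and the `scale`/`unscale` helpers are illustrative.

```python
import numpy as np

sales = np.array([100.0, 150.0, 200.0, 250.0, 300.0, 400.0])
n_train = 4  # assumption: the first 4 points form the training set

# Fit the scaling range on training data only.
train_min = sales[:n_train].min()
train_max = sales[:n_train].max()


def scale(x):
    """Map values so the training range becomes [0, 1]."""
    return (x - train_min) / (train_max - train_min)


def unscale(x_scaled):
    """Invert `scale` to report predictions in original units."""
    return x_scaled * (train_max - train_min) + train_min


scaled = scale(sales)
print(scaled)
```

Values outside the training range can fall outside [0, 1] after scaling, which is expected with this approach.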

### Time Series Transformation
- Convert the data into a time series format suitable for LSTM, typically involving creating sequences of a fixed window size.
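
The windowing step can be sketched as follows. The function name `make_windows` and its `horizon` parameter are illustrative choices: each input is `window_size` consecutive values and the target is the value `horizon` steps after the window.

```python
import numpy as np


def make_windows(series, window_size, horizon=1):
    """Build (inputs, targets) pairs from a 1-D series with a sliding window."""
    X, y = [], []
    for start in range(len(series) - window_size - horizon + 1):
        end = start + window_size
        X.append(series[start:end])          # the input window
        y.append(series[end + horizon - 1])  # the value to forecast
    return np.array(X), np.array(y)


series = np.arange(10, dtype=np.float32)  # stand-in for scaled sales
X, y = make_windows(series, window_size=3)

print(X.shape, y.shape)
print(X[0], "->", y[0])
```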

## Step 2: Dataset Splitting

### Train-Test Split
- Split the dataset into training and testing sets, ensuring the temporal sequence is maintained.

### Validation Set Creation
- Optionally, create a validation set from the training set for model tuning.
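
A chronological split might look like the sketch below. The 70/15/15 proportions and the function name `chronological_split` are assumptions; the key point is that the data is sliced in order rather than shuffled, so validation and test data always come after the training data.

```python
import numpy as np


def chronological_split(data, train_frac=0.7, val_frac=0.15):
    """Split ordered data into train/validation/test without shuffling."""
    n = len(data)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return data[:train_end], data[train_end:val_end], data[val_end:]


series = np.arange(100)  # stand-in for time-ordered observations
train, val, test = chronological_split(series)

print(len(train), len(val), len(test))
```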

## Step 3: Model Design

### LSTM Architecture
- Design the LSTM model architecture using PyTorch, including the number of layers, hidden units, and other hyperparameters.
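
One way to express such an architecture is sketched below. The class name `LSTMForecaster` and all default hyperparameter values are assumptions for illustration, not tuned settings. The model reads a window of shape `(batch, window, n_features)` and predicts one value per sequence from the last time step's output.

```python
import torch
from torch import nn


class LSTMForecaster(nn.Module):
    """Stacked LSTM followed by a linear layer that outputs one forecast."""

    def __init__(self, n_features=1, hidden_size=32, num_layers=2, dropout=0.1):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=hidden_size,
            num_layers=num_layers,
            dropout=dropout,   # applied between stacked LSTM layers
            batch_first=True,  # inputs are (batch, window, n_features)
        )
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        output, _ = self.lstm(x)        # (batch, window, hidden_size)
        last_step = output[:, -1, :]    # output at the final time step
        return self.head(last_step).squeeze(-1)  # (batch,)


model = LSTMForecaster()
dummy_batch = torch.randn(8, 14, 1)  # 8 windows of 14 steps, 1 feature
prediction = model(dummy_batch)
print(prediction.shape)
```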

### Loss Function and Optimizer
- Choose an appropriate loss function (e.g., MSE for regression tasks) and an optimizer (like Adam).
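
A minimal setup with MSE loss and the Adam optimizer is sketched here. The placeholder model and the learning rate of `1e-3` are assumptions; the small tensors only show how the loss is computed.

```python
import torch
from torch import nn

# Placeholder model; any nn.Module with trainable parameters works here.
model = nn.LSTM(input_size=1, hidden_size=16, batch_first=True)

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# MSE is the mean of squared errors: ((2-1)^2 + (4-2)^2) / 2 = 2.5
predictions = torch.tensor([2.0, 4.0])
targets = torch.tensor([1.0, 2.0])
loss = criterion(predictions, targets)
print(loss.item())
```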

## Step 4: Model Training

### Data Loading in Batches
- Utilize DataLoader in PyTorch to load data in batches.
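
The batching step can be sketched with `TensorDataset` and `DataLoader`. The random tensors and the batch size of 32 are placeholders. Shuffling is applied to whole windows, so the order of values inside each window is preserved.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data: 100 windows of 14 time steps with 1 feature each.
X = torch.randn(100, 14, 1)
y = torch.randn(100)

train_loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

batch_X, batch_y = next(iter(train_loader))
print(batch_X.shape, batch_y.shape)
print(len(train_loader), "batches per epoch")
```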

### Model Training
- Train the model on the training set while validating on the validation set, if available.
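
A compact training loop with per-epoch validation is sketched below. Everything specific here is an assumption made to keep the example self-contained: the synthetic sine series, the `TinyLSTM` model, the hyperparameters, and the use of gradient clipping as a guard against exploding gradients.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic series standing in for scaled sales data.
series = torch.sin(torch.linspace(0, 20, 300))
window = 10
X = torch.stack(
    [series[i:i + window] for i in range(len(series) - window)]
).unsqueeze(-1)              # (290, 10, 1)
y = series[window:]          # next value after each window

split = 230                  # chronological train/validation boundary
train_loader = DataLoader(TensorDataset(X[:split], y[:split]), batch_size=32, shuffle=True)
val_loader = DataLoader(TensorDataset(X[split:], y[split:]), batch_size=32)


class TinyLSTM(nn.Module):
    def __init__(self, hidden_size=16):
        super().__init__()
        self.lstm = nn.LSTM(1, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :]).squeeze(-1)


model = TinyLSTM()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)


def run_epoch(loader, train):
    """Run one pass over `loader`; update weights only when `train` is True."""
    model.train(train)
    total, count = 0.0, 0
    with torch.set_grad_enabled(train):
        for xb, yb in loader:
            loss = criterion(model(xb), yb)
            if train:
                optimizer.zero_grad()
                loss.backward()
                nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
                optimizer.step()
            total += loss.item() * len(xb)
            count += len(xb)
    return total / count  # average loss per sample


history = []
for epoch in range(5):
    train_loss = run_epoch(train_loader, train=True)
    val_loss = run_epoch(val_loader, train=False)
    history.append((train_loss, val_loss))
    print(f"epoch {epoch + 1}: train={train_loss:.4f} val={val_loss:.4f}")
```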

### Hyperparameter Tuning
- Adjust hyperparameters like learning rate, batch size, and the number of epochs based on performance on the validation set.
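
A small grid search over two hyperparameters is sketched below. The grid values, the synthetic data, and the helper `train_and_validate` are assumptions, and full-batch training is used only to keep the example short. The configuration with the lowest validation loss is selected.

```python
import itertools

import torch
from torch import nn

torch.manual_seed(0)

# Synthetic series and sliding windows, split chronologically.
series = torch.sin(torch.linspace(0, 20, 200))
window = 8
X = torch.stack(
    [series[i:i + window] for i in range(len(series) - window)]
).unsqueeze(-1)
y = series[window:]
X_train, y_train, X_val, y_val = X[:150], y[:150], X[150:], y[150:]


def train_and_validate(lr, hidden_size, epochs=20):
    """Train a small LSTM with the given settings; return validation MSE."""
    torch.manual_seed(0)  # same initialization scheme for a fair comparison
    lstm = nn.LSTM(1, hidden_size, batch_first=True)
    head = nn.Linear(hidden_size, 1)
    params = list(lstm.parameters()) + list(head.parameters())
    optimizer = torch.optim.Adam(params, lr=lr)
    criterion = nn.MSELoss()

    def predict(inputs):
        out, _ = lstm(inputs)
        return head(out[:, -1, :]).squeeze(-1)

    for _ in range(epochs):
        optimizer.zero_grad()
        loss = criterion(predict(X_train), y_train)
        loss.backward()
        optimizer.step()

    with torch.no_grad():
        return criterion(predict(X_val), y_val).item()


grid = {"lr": [1e-3, 1e-2], "hidden_size": [8, 16]}
results = {}
for lr, hidden_size in itertools.product(grid["lr"], grid["hidden_size"]):
    results[(lr, hidden_size)] = train_and_validate(lr, hidden_size)

best_config = min(results, key=results.get)
print("best (lr, hidden_size):", best_config)
```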

## Step 5: Model Evaluation

### Testing
- Evaluate the model’s performance on the test set to gauge its forecasting ability.

### Error Analysis
- Calculate error metrics like MAE, RMSE, etc., to understand the accuracy of the model.
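
MAE and RMSE can be computed as in this sketch. The sample values are made up; in practice `y_true` and `y_pred` would be the test targets and model predictions, converted back to original units.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large errors more than MAE."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


y_true = np.array([100.0, 200.0, 300.0, 400.0])
y_pred = np.array([110.0, 190.0, 330.0, 400.0])

print("MAE: ", mae(y_true, y_pred))   # errors 10, 10, 30, 0 -> 12.5
print("RMSE:", rmse(y_true, y_pred))  # sqrt(275)
```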

## Step 6: Model Deployment (Optional)

### Saving the Model
- Save the trained model for later use or deployment.
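
A common way to persist a PyTorch model is to save its `state_dict` and load it into a freshly constructed model with the same architecture. The sketch below uses a placeholder model, a temporary directory, and a hypothetical file name.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.LSTM(input_size=1, hidden_size=8, batch_first=True)  # placeholder

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "lstm_forecaster.pt")  # hypothetical name
    torch.save(model.state_dict(), path)  # save weights only

    # Rebuild the same architecture, then load the saved weights into it.
    restored = nn.LSTM(input_size=1, hidden_size=8, batch_first=True)
    restored.load_state_dict(torch.load(path))

model.eval()
restored.eval()

# Both models should now produce identical outputs.
x = torch.randn(2, 5, 1)
with torch.no_grad():
    same_output = torch.allclose(model(x)[0], restored(x)[0])
print(same_output)
```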

### Deployment
- Deploy the model in a suitable environment for real-time or batch predictions.

## Step 7: Reporting and Visualization

### Performance Reporting
- Document the model’s performance, including error metrics and potential areas of improvement.

### Data Visualization
- Visualize predictions vs actual sales to get a qualitative sense of the model’s performance.
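
A simple comparison plot can be drawn with Matplotlib, as sketched below. The data is synthetic: `predicted` is just `actual` plus noise, standing in for real model output.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; omit this line in a notebook

import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
days = np.arange(60)
actual = 5000 + 800 * np.sin(2 * np.pi * days / 7)           # weekly pattern
predicted = actual + rng.normal(0, 200, size=days.shape)     # stand-in forecast

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(days, actual, label="Actual sales")
ax.plot(days, predicted, linestyle="--", label="Predicted sales")
ax.set_xlabel("Day")
ax.set_ylabel("Sales")
ax.set_title("Predicted vs. actual sales (synthetic data)")
ax.legend()
fig.tight_layout()
# Call plt.show() in an interactive session to display the figure.
```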

## Step 8: Iteration and Improvement

### Feedback Loop
- Incorporate feedback to refine the model.

### Continuous Improvement
- Regularly update the model with new data and adjust the model as necessary.
