<a href="https://colab.research.google.com/github/Patrikwork/Course/blob/main/Amazon_Stock_Forecasting_with_LSTM_Updated.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Amazon Stock Price Forecasting Using LSTM

<img src="https://moodle.lut.fi/pluginfile.php/1/theme_maker_lab/logo/1697517856/LAB_eng_NEG.png" alt="drawing" width="350"/>

## Overview
This Jupyter notebook presents an end-to-end workflow for forecasting Amazon's stock prices using a Long Short-Term Memory (LSTM) network, a type of recurrent neural network (RNN) well suited to time series prediction. It covers the full pipeline, from data preprocessing through model training to the final evaluation of the model's performance.

## Objectives
- **Data Preprocessing**: The notebook begins with loading and preprocessing the historical stock price data of Amazon. This step includes normalization of data to make the training process more efficient and to help the LSTM model learn the patterns more effectively.
- **Model Building**: An LSTM model is constructed using the PyTorch framework, which is designed to predict the closing stock price based on historical prices.
- **Training and Validation**: The model is trained and validated across several epochs, with visual feedback provided on the training and validation losses to monitor the model's learning progress.
- **Evaluation**: The model's predictive performance is evaluated by comparing the forecasted prices against the actual stock prices in the test dataset. This step involves inverse transforming the predictions to compare them on the same scale as the original prices.
- **Visualization**: The notebook includes various plots to visually assess the model's performance, displaying both the training and test predictions against the actual closing prices.

## Conclusion
By the end of this notebook, we will have a trained LSTM model that can forecast Amazon's stock prices with a certain degree of accuracy. The visualizations and evaluation metrics will help us understand the model's strengths and limitations in predicting stock market trends.

---

**Note**: Stock price prediction is inherently uncertain and complex due to the volatile nature of the financial markets. The results from this notebook are not financial advice and should not be used as the sole basis for investment decisions.


In [None]:
!curl -O https://raw.githubusercontent.com/Patrikwork/Course/main/AMZN.csv
!head -n 5 /content/AMZN.csv


# Data Import and Preparation
This code cell performs the following actions:
1. **Import Libraries**: It imports `pandas` for data handling, `numpy` for numerical operations, and `matplotlib.pyplot` for data visualization.
2. **Import PyTorch Modules**: It imports the PyTorch library and its neural network module `nn` for model building.
3. **Load Data**: It reads the Amazon stock price data from a CSV file named 'AMZN.csv' using pandas and displays the DataFrame.

These steps are essential for setting up the environment and loading the data needed for subsequent preprocessing and model training.


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

import torch
import torch.nn as nn

data = pd.read_csv('/content/AMZN.csv')

data


# Data Selection
This code cell narrows down the dataset to the columns of interest:
1. **Date**: The date of the stock data record.
2. **Close**: The closing price of Amazon's stock, which is often used as the primary variable for forecasting stock prices.

By reducing the dataset to these two columns, it streamlines the data for the modeling process that will follow, focusing on the time series aspect of the stock prices.


In [None]:
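# .copy() gives an independent DataFrame, which avoids a
# SettingWithCopyWarning when the 'Date' column is reassigned later
data = data[['Date', 'Close']].copy()
data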


# Device Configuration for PyTorch
This code cell determines the computing device to be used by PyTorch:
1. **GPU Check**: It checks if CUDA is available for GPU acceleration.
2. **Device Setting**: It sets the device to 'cuda:0' to use the GPU if available; otherwise, it falls back to the CPU.

GPU acceleration can greatly shorten training times for deep learning models, although a network as small as the one used in this notebook also trains in reasonable time on a CPU.


In [None]:
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
device


# Data Conversion and Visualization
In this code cell, the following steps are taken:
1. **Date Conversion**: The 'Date' column is converted from string format to `datetime` objects to facilitate time series analysis.
2. **Plotting**: It plots the 'Close' price against the 'Date' to visualize the trend of Amazon's stock price over time.

This visualization is a preliminary step that helps in understanding the general behavior of the stock price before diving into more complex analysis and forecasting.


In [None]:
data['Date'] = pd.to_datetime(data['Date'])

plt.plot(data['Date'], data['Close'])


# Preparing Data for LSTM Model
This code cell accomplishes the following:
1. **Deep Copy Import**: It imports the `deepcopy` function to ensure that the original DataFrame is not altered during the data preparation process.
2. **Function Definition**: A function `prepare_dataframe_for_lstm` is defined to transform the DataFrame into a format suitable for LSTM training. This includes creating lagged versions of the 'Close' price to serve as input features.
3. **Setting Index**: The 'Date' column is set as the DataFrame's index, which is a common practice in time series analysis.
4. **Feature Engineering**: New columns are generated by shifting the 'Close' price column to create a series of lagged features, providing the LSTM with the ability to learn from past observations.
5. **NaN Handling**: Rows with NaN values created by the shifting process are dropped to maintain data integrity.
6. **Applying Transformation**: The `prepare_dataframe_for_lstm` function is applied with a specified number of lookback steps (the 7 previous rows, i.e. trading days, in this case), producing a new DataFrame `shifted_df` with the additional lagged features.

The resulting DataFrame is well-suited for LSTM models, which are designed to recognize patterns in time series data.
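
To make the lagging step concrete, the sketch below applies the same `shift` idea to a made-up five-row price series with a lookback of 2. The values and the names `toy` and `lagged` are illustrative assumptions, not data from `AMZN.csv`.

```python
import pandas as pd

# Hypothetical toy series; the values are made up for illustration
toy = pd.DataFrame({'Close': [10.0, 11.0, 12.0, 13.0, 14.0]})

n_steps = 2
for i in range(1, n_steps + 1):
    # shift(i) moves each value down by i rows, so row t holds the price from t-i
    toy[f'Close(t-{i})'] = toy['Close'].shift(i)

# The first n_steps rows have no complete history and contain NaN
lagged = toy.dropna()
print(lagged)
```

Each remaining row pairs the current price with the two prices that preceded it, which is the layout the notebook builds with a lookback of 7.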


In [None]:
from copy import deepcopy as dc

def prepare_dataframe_for_lstm(df, n_steps):
    df = dc(df)

    df.set_index('Date', inplace=True)

    for i in range(1, n_steps+1):
        df[f'Close(t-{i})'] = df['Close'].shift(i)

    df.dropna(inplace=True)

    return df

lookback = 7
shifted_df = prepare_dataframe_for_lstm(data, lookback)
shifted_df


# Conversion to NumPy Array
This code cell converts the DataFrame into a NumPy array:
1. **Conversion**: The `shifted_df` DataFrame, which contains the lagged features for the LSTM model, is converted to a NumPy array using the `to_numpy()` method.
2. **Display Array**: It then displays the resulting NumPy array to ensure the conversion was successful.

This step is important as machine learning models, including LSTMs, typically require input data in the form of arrays rather than DataFrame structures.


In [None]:
shifted_df_as_np = shifted_df.to_numpy()

shifted_df_as_np


# Checking Array Dimensions
This code cell performs a simple yet important operation:
1. **Shape Checking**: It retrieves the shape of the NumPy array `shifted_df_as_np` using the `.shape` attribute.

This action is a quick check to verify the data structure's size and to ensure that the array has the expected number of rows and columns before it is used for training the LSTM model.


In [None]:
shifted_df_as_np.shape


# Data Normalization
This code cell is concerned with scaling the data:
1. **MinMaxScaler Import**: It imports the `MinMaxScaler` class from scikit-learn, which is a preprocessing module used to scale features.
2. **Scaler Initialization**: A `MinMaxScaler` instance is created to scale each feature to the range \([-1, 1]\). This range matches the output range of the tanh activation used inside LSTM cells.
3. **Scaling**: The scaler is fitted to the NumPy array `shifted_df_as_np` and the array is transformed in a single `fit_transform` call. This normalizes the closing prices and the generated lagged features, which helps the LSTM train more efficiently. Note that the scaler is fitted on the full dataset before the train/test split, so the rows later used for testing also influence the scaling parameters.
4. **Verification**: The scaled data is displayed for verification, ensuring that the scaling process has been applied correctly.

Normalized data is essential for the effective training of LSTM models, as it ensures that all input features contribute equally to the model's learning process.
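
The arithmetic behind min-max scaling to \([-1, 1]\) can be sketched by hand with NumPy. This is an illustration of the linear mapping on a made-up column, not the scikit-learn implementation; `prices`, `lo`, and `hi` are hypothetical names.

```python
import numpy as np

# Hypothetical column of prices, made up for illustration
prices = np.array([10.0, 15.0, 20.0])

lo, hi = -1.0, 1.0  # target feature range
col_min, col_max = prices.min(), prices.max()

# Min-max scaling: map [col_min, col_max] linearly onto [lo, hi]
scaled = (prices - col_min) / (col_max - col_min) * (hi - lo) + lo

# Inverting the same linear map recovers the original prices
restored = (scaled - lo) / (hi - lo) * (col_max - col_min) + col_min
print(scaled, restored)
```

The smallest value maps to -1, the largest to 1, and the inverse map is what later allows predictions to be converted back to prices.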


In [None]:
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(-1, 1))
shifted_df_as_np = scaler.fit_transform(shifted_df_as_np)

shifted_df_as_np


# Feature and Target Variable Separation
This code cell prepares the input and output for the LSTM model:
1. **Feature Separation**: All columns except the first one are designated as input features `X`. These include the lagged 'Close' prices which the LSTM will use to make its predictions.
2. **Target Variable**: The first column, which represents the current 'Close' price, is set as the target variable `y`.
3. **Shape Verification**: The shapes of `X` and `y` are displayed to confirm that the data has been structured correctly, with `X` containing multiple columns for the lagged features and `y` being a single column for the prediction target.

It's crucial that `X` and `y` are correctly defined and shaped for successful model training and forecasting.


In [None]:
X = shifted_df_as_np[:, 1:]
y = shifted_df_as_np[:, 0]

X.shape, y.shape


# Reordering Input Features
This code cell modifies the input feature array `X`:
1. **Reversal of Order**: The columns in `X` are flipped, meaning the sequence of input features is reversed. This is done so that the most recent observations are placed last, as is typically expected by LSTM models during training and prediction.
2. **Display Modified Array**: The reversed array is displayed to ensure that the transformation has been applied correctly.

Feeding each sequence in chronological order means that the LSTM's output at the final time step, which the model uses for its prediction, is computed right after the most recent observation.
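
A minimal sketch of the column reversal on a single made-up sample follows. The array `X_toy` and its three-column layout are assumptions for illustration.

```python
import numpy as np

# One hypothetical sample with columns ordered Close(t-1), Close(t-2), Close(t-3)
X_toy = np.array([[3.0, 2.0, 1.0]])

# Reverse the column order so the oldest value comes first
flipped = np.flip(X_toy, axis=1)

# np.flip returns a view of the input, so copy it to get an independent array
X_chrono = flipped.copy()
print(X_chrono)
```

After the flip the columns read Close(t-3), Close(t-2), Close(t-1), i.e. oldest to newest.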


In [None]:
X = dc(np.flip(X, axis=1))
X


# Dataset Splitting
The purpose of this code cell is to establish a division between the training and testing data:
1. **Determine Split Index**: It calculates the index at which to split the dataset, so that the first 95% of the rows are used for training and the remaining 5% for testing. Because the rows are in chronological order, the test set consists of the most recent observations.
2. **Index Display**: The split index is displayed to confirm the position at which the dataset will be divided.

Establishing a training and test set is a standard practice in model development, providing a way to train the model on one set of data and evaluate its generalization on another.
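
The positional split can be illustrated on a stand-in array of 100 chronologically ordered samples. The array `samples` is hypothetical; only the 95% ratio comes from the notebook.

```python
import numpy as np

# 100 hypothetical samples, already in chronological order
samples = np.arange(100)

split_index = int(len(samples) * 0.95)

# Everything before the index is training data, the rest is test data
train = samples[:split_index]
test = samples[split_index:]
print(len(train), len(test), test[0])
```

No shuffling happens before the split, so every test sample is later in time than every training sample.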


In [None]:
split_index = int(len(X) * 0.95)

split_index


# Executing the Dataset Split
This code cell applies the split to the dataset:
1. **Training and Test Sets for Features**: It uses the `split_index` to separate `X` into `X_train` for training and `X_test` for testing.
2. **Training and Test Sets for Target Variable**: Similarly, `y` is split into `y_train` and `y_test`.
3. **Shape Confirmation**: The shapes of `X_train`, `X_test`, `y_train`, and `y_test` are displayed to verify the correct partitioning of the dataset.

The split ensures that the model is trained on a large portion of the data while still being validated on a separate unseen portion, which is critical for evaluating model performance.


In [None]:
X_train = X[:split_index]
X_test = X[split_index:]

y_train = y[:split_index]
y_test = y[split_index:]

X_train.shape, X_test.shape, y_train.shape, y_test.shape


# Reshaping for LSTM Input
The LSTM model requires input data to be in a specific format, which is addressed in this code cell:
1. **Input Features Reshaping**: `X_train` and `X_test` are reshaped to a three-dimensional array format, which the LSTM expects. The dimensions correspond to the number of samples, the number of time steps to look back (defined by the `lookback` variable), and one feature per time step.
2. **Target Variable Reshaping**: `y_train` and `y_test` are reshaped to have a single feature column, suitable for regression predictions.
3. **Shape Confirmation**: The new shapes for all sets are displayed to ensure they match the LSTM's expected input format.

Reshaping the data appropriately is crucial for the model to process and learn from the time series sequence effectively.
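
The effect of the reshape can be seen on a small made-up array. Here `lookback` is set to 3 purely to keep the example short.

```python
import numpy as np

lookback = 3
# Two hypothetical samples, each a row of `lookback` lagged values
X_flat = np.array([[1.0, 2.0, 3.0],
                   [2.0, 3.0, 4.0]])

# (samples, lookback) -> (samples, time steps, features per step)
X_seq = X_flat.reshape((-1, lookback, 1))
print(X_seq.shape)
```

The values are unchanged; each scalar simply becomes a length-1 feature vector at its time step.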


In [None]:
X_train = X_train.reshape((-1, lookback, 1))
X_test = X_test.reshape((-1, lookback, 1))

y_train = y_train.reshape((-1, 1))
y_test = y_test.reshape((-1, 1))

X_train.shape, X_test.shape, y_train.shape, y_test.shape


# Conversion to PyTorch Tensors
This code cell is focused on preparing the data for PyTorch:
1. **Tensor Conversion**: The input features and target variables for both the training and test sets are converted into PyTorch tensors. This is essential since PyTorch models operate on tensors.
2. **Float Conversion**: The data is cast with `.float()` to 32-bit floating point (`torch.float32`), the default data type of PyTorch model parameters. NumPy arrays are 64-bit floats by default, so without the cast the inputs would not match the data type of the model's weights.
3. **Shape Verification**: The shapes of the PyTorch tensors for `X_train`, `X_test`, `y_train`, and `y_test` are displayed to verify that the data is correctly formatted for the model.

This preparation is a prerequisite for the training phase, ensuring that the data is in the appropriate form for processing by the LSTM model within the PyTorch framework.
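
The data type change can be checked directly. The small array `arr` below is a made-up example.

```python
import numpy as np
import torch

arr = np.array([[0.1, 0.2]])          # NumPy defaults to float64
as_is = torch.tensor(arr)             # keeps float64
as_float = torch.tensor(arr).float()  # cast to float32
print(as_is.dtype, as_float.dtype)
```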


In [None]:
X_train = torch.tensor(X_train).float()
y_train = torch.tensor(y_train).float()
X_test = torch.tensor(X_test).float()
y_test = torch.tensor(y_test).float()

X_train.shape, X_test.shape, y_train.shape, y_test.shape


# Custom Dataset for Time Series
This code cell involves the setup of a custom dataset for handling time series data:
1. **Dataset Class**: The `Dataset` class from PyTorch's utility module is imported to serve as a base for the custom dataset.
2. **Custom Class Definition**: The `TimeSeriesDataset` class is defined to handle the features and target variables specifically for time series forecasting.
3. **Initialization Method**: The class is initialized with `X` and `y`, storing the features and labels respectively.
4. **Length Method**: The `__len__` method enables the dataset to report its total number of samples.
5. **Item Access Method**: The `__getitem__` method allows for the retrieval of a specific sample and its corresponding target, making it compatible with PyTorch's data loading utilities.
6. **Training and Test Instances**: Instances of the `TimeSeriesDataset` class are created for both the training set (`train_dataset`) and the test set (`test_dataset`).

A custom dataset class gives PyTorch's data loading utilities a uniform way to index the feature and target tensors together.


In [None]:
from torch.utils.data import Dataset

class TimeSeriesDataset(Dataset):
    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __len__(self):
        return len(self.X)

    def __getitem__(self, i):
        return self.X[i], self.y[i]

train_dataset = TimeSeriesDataset(X_train, y_train)
test_dataset = TimeSeriesDataset(X_test, y_test)


# Verifying the Training Dataset
This brief code cell serves a simple purpose:
1. **Dataset Display**: It outputs the `train_dataset` instance to verify its creation.

Displaying the dataset instance is a quick check to ensure that the custom dataset has been instantiated correctly and is ready for use in the data loading and model training processes.


In [None]:
train_dataset


# Data Loaders for Training and Testing
This code cell is about setting up PyTorch data loaders:
1. **DataLoader Import**: It imports the `DataLoader` class from `torch.utils.data` which facilitates batching and shuffling of the data.
2. **Batch Size**: A batch size of 16 is defined, dictating how many samples are fed to the model at a time during training.
3. **Training DataLoader**: The `train_loader` is created with shuffling enabled, so each epoch visits the training samples in a different random order. Each sample already carries its own 7-step history, so shuffling the samples does not disturb the time order inside a sample.
4. **Testing DataLoader**: The `test_loader` is similarly established but without shuffling. The order of samples does not affect the evaluation loss, and keeping it chronological makes the predictions easy to plot against time.

Data loaders are crucial for managing data flow during model training and evaluation, ensuring efficient and randomized batch processing.
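
How a loader cuts a dataset into batches can be sketched with placeholder tensors. The sizes (10 samples, batch size 4) are assumptions chosen so that the last batch comes out smaller, and `TensorDataset` is PyTorch's built-in dataset for paired tensors, used here instead of the notebook's custom class to keep the example short.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten hypothetical samples shaped like the notebook's data: 7 time steps, 1 feature
X_toy = torch.zeros(10, 7, 1)
y_toy = torch.zeros(10, 1)

loader = DataLoader(TensorDataset(X_toy, y_toy), batch_size=4, shuffle=False)

# Collect the shape of the feature tensor in every batch
batch_shapes = [tuple(x.shape) for x, _ in loader]
print(batch_shapes)
```

Ten samples with a batch size of 4 give two full batches and one final batch of 2, so `len(loader)` is 3.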


In [None]:
from torch.utils.data import DataLoader

batch_size = 16

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)


# Batch Shape Verification
This code cell is used to verify the shape of data batches:
1. **Batch Iteration**: It begins by iterating through the data batches provided by `train_loader`.
2. **Device Allocation**: Each batch of input features and target variables is allocated to the appropriate device (CPU or GPU), preparing them for model training.
3. **Shape Display**: The shapes of the input features `x_batch` and target variables `y_batch` for the first batch are printed out. This is a sanity check to confirm that the data is in the correct format for the LSTM model.
4. **Early Stop**: The iteration is halted after the first batch as this cell is intended only for verification, not for actual training.

Ensuring that data batches are of the correct shape is a critical step before proceeding to model training.


In [None]:
for _, batch in enumerate(train_loader):
    x_batch, y_batch = batch[0].to(device), batch[1].to(device)
    print(x_batch.shape, y_batch.shape)
    break


# Defining the LSTM Model
In this code cell, an LSTM model class is defined:
1. **Class Definition**: A new class `LSTM` is defined, extending `nn.Module`, which is the standard neural network module in PyTorch.
2. **Initialization Method**: The `__init__` method sets up the LSTM layer and a fully connected layer. It also defines the size of the hidden layers and the number of stacked LSTM layers.
3. **Forward Pass**: The `forward` method outlines the operations that the model performs on the input data, including initializing the hidden state, passing data through the LSTM layer, and transforming the output with a fully connected layer.
4. **Model Instantiation**: An instance of the `LSTM` class is created with a specific input size, hidden layer size, and number of layers. It is then assigned to the computation device.

This model will be used for forecasting the stock prices using the prepared and processed input data.
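
The tensor shapes inside the forward pass can be traced with standalone layers. The layer sizes mirror `LSTM(1, 4, 1)` from the notebook; the random input batch and the names `lstm`, `fc`, and `last_step` are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

lstm = nn.LSTM(input_size=1, hidden_size=4, num_layers=1, batch_first=True)
fc = nn.Linear(4, 1)

x = torch.randn(16, 7, 1)      # (batch, time steps, features)
out, (h_n, c_n) = lstm(x)      # initial states default to zeros when omitted
last_step = out[:, -1, :]      # LSTM output at the final time step
pred = fc(last_step)           # one predicted value per sample

print(out.shape, last_step.shape, pred.shape)
```

For a single-layer, unidirectional LSTM the output at the last time step equals the final hidden state `h_n[-1]`, which is why indexing `out[:, -1, :]` is enough to summarize the whole sequence.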


In [None]:
class LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_stacked_layers):
        super().__init__()
        self.hidden_size = hidden_size
        self.num_stacked_layers = num_stacked_layers

        self.lstm = nn.LSTM(input_size, hidden_size, num_stacked_layers,
                            batch_first=True)

        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        batch_size = x.size(0)
        # Create the initial hidden and cell states on the same device as the input
        h0 = torch.zeros(self.num_stacked_layers, batch_size, self.hidden_size, device=x.device)
        c0 = torch.zeros(self.num_stacked_layers, batch_size, self.hidden_size, device=x.device)

        out, _ = self.lstm(x, (h0, c0))
        out = self.fc(out[:, -1, :])
        return out

model = LSTM(1, 4, 1)
model.to(device)
model


# Training Function for One Epoch
This code cell defines the `train_one_epoch` function, which is responsible for training the model for a single epoch:
1. **Training Mode**: It sets the model to training mode to enable certain layers to adjust their behavior accordingly.
2. **Loss Initialization**: A variable for accumulating the loss during training is initialized.
3. **Batch Processing**: The function loops over the data in `train_loader`, computes the output, calculates the loss, performs backpropagation, and updates the weights.
4. **Progress Monitoring**: After every 100th batch it prints the average loss over the last 100 batches and resets the running total, which gives insight into the training process and potential convergence.
5. **Epoch Completion**: An empty print statement at the end serves to separate the output between epochs for better readability.

This function is a key component of the training loop, allowing the model to learn from the data incrementally, batch by batch.
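
One iteration of the batch loop can be sketched in isolation. The tiny `nn.Linear` stand-in for the LSTM and the two made-up data points are assumptions chosen to keep the example short; the order of calls is the same as in `train_one_epoch`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

toy_model = nn.Linear(1, 1)  # hypothetical stand-in for the LSTM
toy_optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.001)
toy_loss_function = nn.MSELoss()

x = torch.tensor([[1.0], [2.0]])
y = torch.tensor([[2.0], [4.0]])

before = toy_model.weight.detach().clone()

loss = toy_loss_function(toy_model(x), y)  # forward pass and loss
toy_optimizer.zero_grad()                  # clear gradients from any previous step
loss.backward()                            # backpropagate to compute gradients
toy_optimizer.step()                       # update the weights

after = toy_model.weight.detach().clone()
print(loss.item(), before.item(), after.item())
```

The weight changes by a small amount after a single step; repeating this over all batches and epochs is what gradually reduces the loss.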


In [None]:
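def train_one_epoch():
    # Uses the notebook-level names model, train_loader, loss_function,
    # optimizer, device and epoch, so they must be defined before calling.
    model.train(True)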
    print(f'Epoch: {epoch + 1}')
    running_loss = 0.0

    for batch_index, batch in enumerate(train_loader):
        x_batch, y_batch = batch[0].to(device), batch[1].to(device)

        output = model(x_batch)
        loss = loss_function(output, y_batch)
        running_loss += loss.item()

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch_index % 100 == 99:  # print every 100 batches
            avg_loss_across_batches = running_loss / 100
            print('Batch {0}, Loss: {1:.3f}'.format(batch_index+1,
                                                    avg_loss_across_batches))
            running_loss = 0.0
    print()


# Validation Function for One Epoch
This code cell defines the `validate_one_epoch` function, which evaluates the model on the test dataset:
1. **Evaluation Mode**: The model is switched to evaluation mode, which is important for layers that have different behavior during training and evaluation (e.g., dropout).
2. **Loss Initialization**: A variable for accumulating the loss across batches in the test dataset is initialized.
3. **Batch Evaluation**: In a loop over the test data loader, the function calculates the model's output without gradient computation, measures the loss, and updates the running loss total.
4. **Average Loss Computation**: Once all batches are processed, the average loss is computed and printed, serving as an indicator of the model's performance.
5. **Output Formatting**: Additional print statements are used to visually separate the validation output, making it easier to distinguish in the console or log file.

This function allows the evaluation of the model's generalization to new data, which is essential for assessing the quality of the trained model.


In [None]:
def validate_one_epoch():
    model.train(False)
    running_loss = 0.0

    for batch_index, batch in enumerate(test_loader):
        x_batch, y_batch = batch[0].to(device), batch[1].to(device)

        with torch.no_grad():
            output = model(x_batch)
            loss = loss_function(output, y_batch)
            running_loss += loss.item()

    avg_loss_across_batches = running_loss / len(test_loader)

    print('Val Loss: {0:.3f}'.format(avg_loss_across_batches))
    print('***************************************************')
    print()

# Model Training Setup and Execution
This code block sets up and begins the training process for the machine learning model:

1. **Learning Rate**: `learning_rate = 0.001` sets the learning rate for the optimization algorithm. The learning rate controls how far the model's weights move along the gradient of the loss function at each update. A smaller learning rate usually needs more epochs to converge but makes the updates more stable, while a larger one trains faster but can overshoot a minimum.

2. **Number of Epochs**: `num_epochs = 10` specifies the number of complete passes through the entire training dataset. An epoch refers to one cycle through the full training dataset.

3. **Loss Function**: `loss_function = nn.MSELoss()` establishes the mean squared error loss function. This is a common loss function for regression problems, which measures the average of the squares of the errors, that is, the average squared difference between the predicted values and the actual values.

4. **Optimizer**: `optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)` selects the Adam optimizer and sets it up with the model's parameters and the previously defined learning rate. The Adam optimizer is an algorithm for gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.

5. **Training Loop**: The `for` loop iterates over the number of epochs, calling `train_one_epoch()` to perform the training steps and `validate_one_epoch()` to validate the model after each epoch. Both functions are defined in the preceding cells and read `model`, `optimizer`, `loss_function`, and `epoch` as global variables, so those cells must be run first.

By structuring the training process in this manner, the model is given the opportunity to learn from the data iteratively, adjusting its weights to minimize the loss function and improve its predictive accuracy with each epoch.
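
As a quick check of what the loss function computes, the sketch below compares `nn.MSELoss` with the same quantity calculated by hand. The two prediction/target pairs are made-up values.

```python
import torch
import torch.nn as nn

pred = torch.tensor([[0.5], [0.0]])
target = torch.tensor([[1.0], [-1.0]])

# Squared errors are 0.25 and 1.0, so their mean is 0.625
manual_mse = ((pred - target) ** 2).mean()
library_mse = nn.MSELoss()(pred, target)
print(manual_mse.item(), library_mse.item())
```

Because the notebook trains on scaled data, the loss values it prints are in squared scaled units, not squared dollars.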


In [None]:
learning_rate = 0.001
num_epochs = 10
loss_function = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    train_one_epoch()
    validate_one_epoch()

# Model Prediction and Performance Visualization on Training Data
This section of the code is dedicated to visualizing the performance of the trained model by making predictions on the training data and plotting the actual versus predicted values:

1. **Prediction Without Gradient Calculation**:
   - `with torch.no_grad():` - This context manager tells PyTorch that no gradients should be computed in the following block. This is important because it saves memory and computations during inference since gradients are only necessary during the training phase.
   - `predicted = model(X_train.to(device)).to('cpu').numpy()` - Within this context, the model makes predictions on the training set `X_train`. The data is first moved to the device (GPU or CPU) that the model is on, then the predictions are moved back to the CPU and converted to a NumPy array for plotting.

2. **Plotting Predictions versus Actual Values**:
   - The actual `Close` values from the training set (`y_train`) are plotted as a line on the graph. At this point they are still scaled to the range \([-1, 1]\), not expressed as prices.
   - The predicted `Close` values obtained from the model are plotted as a line on the same graph, on the same scaled axis.
   - `plt.xlabel('Day')` and `plt.ylabel('Close')` label the x-axis and y-axis with 'Day' and 'Close' respectively.
   - `plt.legend()` adds a legend to the plot to distinguish between the actual and predicted lines.
   - Finally, `plt.show()` displays the plot.

This visualization is crucial for a quick qualitative assessment of how well the model has learned to predict the closing price from the training data. It allows for a visual comparison between the predicted closing prices and the actual historical closing prices, providing immediate insights into the model's performance and potential areas for improvement.


In [None]:
with torch.no_grad():
    predicted = model(X_train.to(device)).to('cpu').numpy()

plt.plot(y_train, label='Actual Close')
plt.plot(predicted, label='Predicted Close')
plt.xlabel('Day')
plt.ylabel('Close')
plt.legend()
plt.show()


# Reversing Scaling on Training Predictions
This code block is concerned with preparing the predicted values from the model for comparison with the actual values by reversing the scaling transformation applied during preprocessing:

1. **Flatten Predictions**:
   - `train_predictions = predicted.flatten()` - The predictions returned by the model have shape `(n_samples, 1)`. Flattening converts them to a one-dimensional array of length `n_samples` so they can be assigned to a single column.

2. **Prepare Dummy Array for Inverse Scaling**:
   - `dummies = np.zeros((X_train.shape[0], lookback + 1))` - A dummy array is created with the same number of rows as the `X_train` dataset and `lookback + 1` columns. The `lookback` variable is the number of previous time steps used as input features, so `lookback + 1` is the number of columns the scaler was fitted on (the 'Close' column plus its lags). The scaler's `inverse_transform` expects an array with that many columns.
   - `dummies[:, 0] = train_predictions` - The predictions are placed in the first column, the position that 'Close' occupied when the scaler was fitted. `MinMaxScaler` scales each column independently, so the zeros in the other columns do not affect the result for the first column.

3. **Apply Inverse Scaling**:
   - `dummies = scaler.inverse_transform(dummies)` - The scaler's `inverse_transform` method is applied to the dummy array. This reverses the prior scaling that was applied to the data before training, bringing the predictions back to their original scale.

4. **Extract Inverse Scaled Predictions**:
   - `train_predictions = dc(dummies[:, 0])` - The inverse scaled predictions are extracted from the first column of the dummy array. `dc` is the alias under which `deepcopy` was imported earlier; copying ensures that the extracted predictions are independent of the dummy array, so later changes to one do not affect the other.

5. **Display Predictions**:
   - Finally, the `train_predictions` are displayed, which now represent the model's output in the same scale as the original training data.

This process is critical for evaluating the model's performance with metrics that require comparisons on the original data scale, such as root mean squared error (RMSE) or mean absolute error (MAE). It allows for a meaningful comparison between the model's predictions and the actual observed values.
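
To see why the padding is harmless, the following minimal sketch may help. It assumes the scaler is scikit-learn's `MinMaxScaler` with `feature_range=(-1, 1)` (this section does not name the scaler class), and it uses a synthetic table and a hypothetical `lookback` of 7 in place of the notebook's data.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

lookback = 7  # hypothetical window length, for illustration only

# Synthetic stand-in for the preprocessed price table (lookback + 1 columns).
rng = np.random.default_rng(0)
table = rng.uniform(50.0, 150.0, size=(100, lookback + 1))

# Assumption: a MinMaxScaler fitted on all lookback + 1 columns.
scaler = MinMaxScaler(feature_range=(-1, 1))
scaled = scaler.fit_transform(table)

# Pretend these scaled values are model outputs for the first column.
scaled_preds = scaled[:10, 0]

# Pad once with zeros and once with ones; only column 0 carries real values.
zeros_pad = np.zeros((10, lookback + 1))
zeros_pad[:, 0] = scaled_preds
ones_pad = np.ones((10, lookback + 1))
ones_pad[:, 0] = scaled_preds

restored_a = scaler.inverse_transform(zeros_pad)
restored_b = scaler.inverse_transform(ones_pad)

# Column 0 is identical for both paddings and matches the original prices.
print(np.allclose(restored_a[:, 0], restored_b[:, 0]))
print(np.allclose(restored_a[:, 0], table[:10, 0]))
# The padding columns were transformed too, so they are no longer zero.
print(np.allclose(restored_a[:, 1:], 0.0))
```

Because the scaler treats each column independently, the choice of padding value is arbitrary as long as only the first column is read back.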


In [None]:
train_predictions = predicted.flatten()

dummies = np.zeros((X_train.shape[0], lookback+1))
dummies[:, 0] = train_predictions
dummies = scaler.inverse_transform(dummies)

train_predictions = dc(dummies[:, 0])
train_predictions

# Inverse Scaling of Actual Training Data
The following code is used to revert the scaling applied to the actual training data values, which allows them to be compared to the model's predictions in their original units:

1. **Creating a Dummy Array**:
   - `dummies = np.zeros((X_train.shape[0], lookback + 1))` - A dummy array is initialized with the same number of rows as the training input data `X_train` and a number of columns equal to `lookback + 1`. The `lookback` parameter typically indicates how many previous time steps are used to predict the next time step. This array structure matches the format expected by the scaler used during data preprocessing.

2. **Populating Dummy Array with Actual Data**:
   - `dummies[:, 0] = y_train.flatten()` - The actual training data `y_train`, which may have multiple dimensions depending on the model structure, is flattened to a single dimension and then inserted into the first column of the dummy array. The flattening ensures that the data structure is consistent for the inverse scaling process.

3. **Applying Inverse Scaling**:
   - `dummies = scaler.inverse_transform(dummies)` - The `inverse_transform` method of the pre-fitted scaler is applied to the dummy array. Every column is transformed independently, but only the first column, which holds the actual data, is used afterwards. This converts the scaled values back to their original range.

4. **Extracting the Inverted Data**:
   - `new_y_train = dc(dummies[:, 0])` - After inverse scaling, the actual data is now in the original scale and is extracted from the dummy array. The use of `dc` (assumed to be an abbreviation for `deepcopy`) ensures that the extracted data is a standalone copy, which prevents any unintentional modifications to the dummy array from affecting the actual data.

5. **Output the Inverted Data**:
   - `new_y_train` is output, which should now be in the same scale as the original data prior to any preprocessing steps like normalization or standardization.

This procedure is crucial for ensuring that the actual data used for training the model is in the same scale as the predictions when performing comparisons or computing performance metrics. It facilitates a direct and meaningful comparison between the predicted values and the true values of the dataset.
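
Since the same four steps are applied to both predictions and actual values, they could be wrapped in a small helper. The sketch below is one possible way to do this, not the notebook's own code: the function name `inverse_scale_first_column` is hypothetical, the scaler is assumed to be a fitted scikit-learn scaler (which exposes `n_features_in_`), and the data is synthetic.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler


def inverse_scale_first_column(scaler, values):
    """Map scaled `values` back to the original scale of the scaler's first column."""
    values = np.asarray(values, dtype=float).reshape(-1)
    # Pad to the column count the scaler was fitted on.
    dummies = np.zeros((values.shape[0], scaler.n_features_in_))
    dummies[:, 0] = values
    # Copy so the result does not share memory with the temporary array.
    return scaler.inverse_transform(dummies)[:, 0].copy()


# Synthetic example: 8 columns, as if lookback were 7 (hypothetical value).
rng = np.random.default_rng(1)
table = rng.uniform(50.0, 150.0, size=(50, 8))
scaler = MinMaxScaler(feature_range=(-1, 1)).fit(table)

# Scaled targets shaped (n_samples, 1), like a column-vector `y_train`.
y_scaled = scaler.transform(table)[:, :1]
restored = inverse_scale_first_column(scaler, y_scaled)
print(restored.shape)
```

Reading the column count from the scaler avoids having to keep `lookback + 1` in sync by hand.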


In [None]:
dummies = np.zeros((X_train.shape[0], lookback+1))
dummies[:, 0] = y_train.flatten()
dummies = scaler.inverse_transform(dummies)

new_y_train = dc(dummies[:, 0])
new_y_train

# Visual Comparison of Actual vs. Predicted Training Prices
The code block below is designed to plot the actual and predicted closing prices of the stock using the training dataset to visually assess the model's performance:

1. **Plotting Actual Closing Prices**:
   - `plt.plot(new_y_train, label='Actual Close')` - This line plots the actual closing prices (`new_y_train`) that have been inversely scaled to their original values. The label 'Actual Close' will appear in the plot legend, helping to identify this line on the plot.

2. **Plotting Predicted Closing Prices**:
   - `plt.plot(train_predictions, label='Predicted Close')` - In this line, the predicted closing prices (`train_predictions`) obtained from the model are plotted. These predictions have also been inversely scaled to the original price scale. The label 'Predicted Close' distinguishes this line in the legend.

3. **Labeling the Axes**:
   - `plt.xlabel('Day')` - Sets the label of the x-axis to 'Day', which typically represents the time axis in a time-series plot.
   - `plt.ylabel('Close')` - Sets the label of the y-axis to 'Close', indicating that the plotted values represent closing prices of the stock.

4. **Adding a Legend**:
   - `plt.legend()` - This command adds a legend to the plot, which contains the labels 'Actual Close' and 'Predicted Close' for easy identification of the corresponding lines.

5. **Displaying the Plot**:
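   - `plt.show()` - Finally, this function call renders the figure. In a standalone Python script it is needed to display the plot window. In a notebook with inline plotting the figure is usually displayed at the end of the cell anyway, and calling `plt.show()` mainly keeps the text representation of the last plotting call out of the cell output.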

By plotting both the actual and predicted closing prices on the same chart, this visualization allows for a quick and intuitive comparison of the model's predictions against the true values, providing immediate visual feedback on the model's accuracy and highlighting any discrepancies between the predicted and actual prices.


In [None]:
plt.plot(new_y_train, label='Actual Close')
plt.plot(train_predictions, label='Predicted Close')
plt.xlabel('Day')
plt.ylabel('Close')
plt.legend()
plt.show()


# Generating and Rescaling Predictions on Test Data
This code block performs predictions on the test data using the trained model and applies an inverse scaling transformation to bring the predictions to their original scale:

1. **Generating Test Predictions**:
   - `test_predictions = model(X_test.to(device)).detach().cpu().numpy().flatten()` - The model makes predictions on the test set `X_test`. `to(device)` moves the test data to the same device as the model (CPU or GPU). `detach()` returns a tensor that is cut off from the autograd graph, which is required before a gradient-tracking tensor can be converted to NumPy. `cpu()` then copies the tensor to host memory if it lives on a GPU, `numpy()` converts it to a NumPy array, and `flatten()` reduces it to a one-dimensional array that is easier to work with in subsequent steps.

2. **Preparing Dummy Array for Inverse Transformation**:
   - A dummy array `dummies` is created with the same number of rows as `X_test` and `lookback + 1` columns. The extra columns are required to match the shape that the scaler expects based on how it was originally fitted.
   - `dummies[:, 0] = test_predictions` - The flattened predictions are placed in the first column of the dummy array.

3. **Applying Inverse Scaling**:
   - `dummies = scaler.inverse_transform(dummies)` - The inverse scaling transformation is applied to the dummy array using the `inverse_transform` method of the scaler. This operation reverts the predictions from the scaled representation back to the original data scale.

4. **Extracting the Rescaled Predictions**:
   - `test_predictions = dc(dummies[:, 0])` - The first column of the dummy array, which contains the inverse-scaled predictions, is extracted and stored in `test_predictions`. The use of `dc` (assumed to stand for `deepcopy`) ensures that a separate copy of the array is created.

5. **Outputting the Final Predictions**:
   - The variable `test_predictions` is output, containing the final predictions in the same scale as the original test data. These predictions are ready for comparison against the actual test data values for evaluation purposes.

This procedure is crucial for evaluating the model's performance on unseen data. By rescaling the predictions, we can compare them to the actual values in a meaningful way, using metrics such as RMSE or MAE, to understand how well the model generalizes to new data.
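
As an alternative to calling `detach()`, inference can be run with gradient tracking switched off. The sketch below illustrates this under stated assumptions: `TinyLSTM` is a hypothetical stand-in for the notebook's model, and the input tensor is random data with a made-up shape of 5 samples, 7 time steps, and 1 feature.

```python
import torch
from torch import nn


class TinyLSTM(nn.Module):
    """Hypothetical stand-in for the trained model, for illustration only."""

    def __init__(self, hidden_size=4):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        # Use the hidden state of the last time step for the prediction.
        return self.fc(out[:, -1, :])


device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinyLSTM().to(device)
X_test = torch.randn(5, 7, 1)  # (samples, lookback, features), random data

model.eval()  # put layers such as dropout into inference mode
with torch.no_grad():  # no autograd graph is built inside this block
    outputs = model(X_test.to(device))

# The outputs do not track gradients, so detach() is not needed here.
test_predictions = outputs.cpu().numpy().flatten()
print(test_predictions.shape)
```

Both approaches yield the same values; `torch.no_grad()` additionally avoids the memory cost of building the graph during the forward pass.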


In [None]:
test_predictions = model(X_test.to(device)).detach().cpu().numpy().flatten()

dummies = np.zeros((X_test.shape[0], lookback+1))
dummies[:, 0] = test_predictions
dummies = scaler.inverse_transform(dummies)

test_predictions = dc(dummies[:, 0])
test_predictions

# Reverting Scaling on Actual Test Data
The code snippet below converts the actual test data values back to their original scale, undoing the scaling applied during data preprocessing:

1. **Initialize Dummy Array for Inverse Transformation**:
   - `dummies = np.zeros((X_test.shape[0], lookback+1))` - A dummy array is created with dimensions that match the number of test samples (`X_test.shape[0]`) and `lookback + 1` columns. The `lookback` parameter signifies the number of past observations used as input features for the model. The additional columns align with the expected input shape for the scaler's inverse transformation.

2. **Assign Actual Test Data to Dummy Array**:
   - `dummies[:, 0] = y_test.flatten()` - The actual test data `y_test` is flattened into a one-dimensional array, if it isn't one already, and placed in the first column of the dummy array. This is the column that is read back after inverse scaling; the remaining columns are padding.

3. **Perform Inverse Scaling Transformation**:
   - `dummies = scaler.inverse_transform(dummies)` - The scaler fitted during preprocessing is now used to apply the inverse transformation to the dummy array. This rescales the test data in the first column back to its original range.

4. **Extract the Rescaled Actual Test Data**:
   - `new_y_test = dc(dummies[:, 0])` - The inverse scaled actual test data is then extracted from the dummy array. The use of `dc` (likely a shorthand for `deepcopy`) ensures that the `new_y_test` is a standalone copy, thus any further modifications to the dummy array will not affect the extracted data.

5. **Output the Inverted Test Data**:
   - The `new_y_test` array is output, containing the actual test data values in their original scale, which allows for direct comparison with the test predictions.

Inverse scaling of the actual test data is a critical step for the evaluation of the model's predictive performance. It ensures that the predicted values and the actual values are on the same scale, thereby making metrics such as RMSE or MAE directly applicable and meaningful.
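
Once both arrays are on the original price scale, RMSE and MAE can be computed directly. The following is a minimal NumPy sketch: the helper names `rmse` and `mae` and the four sample values are illustrative and do not come from the notebook.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean squared error between two equally sized arrays."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def mae(actual, predicted):
    """Mean absolute error between two equally sized arrays."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))


# Made-up prices standing in for new_y_test and test_predictions.
example_actual = np.array([100.0, 102.0, 101.0, 105.0])
example_predicted = np.array([99.0, 103.0, 103.0, 105.0])

print("RMSE:", rmse(example_actual, example_predicted))
print("MAE:", mae(example_actual, example_predicted))
```

With the notebook's arrays, the calls would be `rmse(new_y_test, test_predictions)` and `mae(new_y_test, test_predictions)`, and both results are expressed in the same units as the closing price.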


In [None]:
dummies = np.zeros((X_test.shape[0], lookback+1))
dummies[:, 0] = y_test.flatten()
dummies = scaler.inverse_transform(dummies)

new_y_test = dc(dummies[:, 0])
new_y_test

# Plotting Actual vs. Predicted Test Prices
This code block is for visualizing the performance of the trained model on the test dataset by plotting the actual versus predicted stock closing prices:

1. **Actual Test Data Plot**:
   - `plt.plot(new_y_test, label='Actual Close')` - This command plots the actual closing prices from the test data (`new_y_test`), which were converted back to their original scale in the previous step. The series is labeled 'Actual Close' for clarity in the legend.

2. **Predicted Data Plot**:
   - `plt.plot(test_predictions, label='Predicted Close')` - Similarly, this line plots the predicted closing prices (`test_predictions`), which should also be in the original scale due to the inverse scaling performed earlier. It's labeled 'Predicted Close' to differentiate it from the actual prices in the plot legend.

3. **Axis Labeling**:
   - `plt.xlabel('Day')` - This labels the x-axis as 'Day'. The x values are simply sample indices, so the label assumes that consecutive samples correspond to consecutive days in the data.
   - `plt.ylabel('Close')` - The y-axis is labeled 'Close', signifying that the plot's values represent the closing prices of the stock.

4. **Legend and Plot Display**:
   - `plt.legend()` - Adds a legend to the plot, which helps to distinguish between the two lines representing the actual and predicted closing prices.
   - `plt.show()` - This function call displays the plot. In a notebook with inline plotting the figure is usually rendered below the code cell anyway; calling `plt.show()` mainly keeps the text representation of the last plotting call out of the cell output.

Plotting the predicted closing prices alongside the actual prices for the test set is an essential step in the model evaluation process. It provides a visual representation of the model's predictive accuracy on unseen data and can quickly highlight any discrepancies between the predictions and the true values.
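
The training and test plots differ only in the arrays they draw, so the plotting steps could be collected in one function. The sketch below shows one way to do this. The function name `plot_actual_vs_predicted`, the `title` argument, and the sample arrays are hypothetical additions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_actual_vs_predicted(actual, predicted, title):
    """Draw actual and predicted closing prices on one set of axes."""
    fig, ax = plt.subplots()
    ax.plot(actual, label='Actual Close')
    ax.plot(predicted, label='Predicted Close')
    ax.set_xlabel('Day')
    ax.set_ylabel('Close')
    ax.set_title(title)
    ax.legend()
    return fig, ax


# Made-up series standing in for new_y_test and test_predictions.
example_actual = np.linspace(100.0, 110.0, 20)
example_predicted = example_actual + 0.5
fig, ax = plot_actual_vs_predicted(example_actual, example_predicted, 'Test set')
```

In the notebook, the same function could be called once with `new_y_train` and `train_predictions` and once with `new_y_test` and `test_predictions`, followed by `plt.show()`.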


In [None]:
plt.plot(new_y_test, label='Actual Close')
plt.plot(test_predictions, label='Predicted Close')
plt.xlabel('Day')
plt.ylabel('Close')
plt.legend()
plt.show()
