# Overview

This notebook demonstrates how to predict stock closing prices using a neural network with PyTorch.  
It walks through the full workflow for time series forecasting on financial data, including:

- **Data Loading and Visualization:**  
  Downloads historical stock data, displays it using pandas, and visualizes the closing prices over time.

- **Data Preprocessing:**  
  Scales the closing price data, then creates sliding windows of sequential data to prepare it for time series modeling.

- **Dataset Preparation:**  
  Splits the data into training and test sets, reshaping it for input into an LSTM (Long Short-Term Memory) neural network.

- **Model Definition:**  
  Defines an LSTM-based neural network architecture for sequence prediction.

- **Training:**  
  Trains the model using Mean Squared Error loss and the Adam optimizer, running multiple training loops to track performance.

- **Evaluation:**  
  Makes predictions on the test set, inverse-transforms the results to the original price scale, and calculates the Root Mean Squared Error (RMSE) to assess accuracy.

- **Visualization:**  
  Plots actual vs. predicted prices and visualizes prediction error over time to help interpret model performance.

This end-to-end example provides a practical introduction to time series forecasting with deep learning on financial data.


In [None]:
import time
import numpy as np
import pandas as pd
import pandas_datareader.data as web
import datetime
import yfinance as yf
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import torch
import torch.nn as nn
import torch.optim as optim

from sklearn.preprocessing import StandardScaler
from sklearn.metrics import root_mean_squared_error

# Select the best available device.
# torch.cuda.is_available() covers both NVIDIA CUDA and AMD ROCm builds of
# PyTorch: ROCm builds expose the GPU through the "cuda" device type and set
# torch.version.hip. Apple Silicon uses the "mps" backend.
# If no GPU backend is available, fall back to the CPU.

if torch.cuda.is_available():
    device = torch.device("cuda")
    if torch.version.hip:
        print("Using AMD ROCm device")
    else:
        print("Using CUDA device")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
    print("Using Apple MPS device")
else:
    device = torch.device("cpu")
    print("No usable GPU available, falling back to CPU")

# Load and Visualize Stock Data

This cell sets the stock ticker symbol, loads historical daily stock data for that ticker from Stooq starting from January 1, 2020, and sorts the data by date in ascending order.  
It then prints the DataFrame and plots the closing price over time to give a visual overview of the stock's historical performance.

In [None]:
# Enter Ticker to be predicted here.
ticker = 'AMD'

start = datetime.datetime(2020, 1, 1)
end = datetime.datetime.today()
# end = datetime.datetime(2025, 1, 31)
df = web.DataReader(ticker, 'stooq', start, end)
df = df.sort_index()  # Sort by date, oldest first
print(df)

df.Close.plot(figsize=(12, 8))
plt.title(f'{ticker} Stock Price')
plt.xlabel('Date')
plt.ylabel('Close')
plt.show()

# Scaler Explainer

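`StandardScaler` standardizes the closing prices by subtracting their mean and dividing by their standard deviation, so the scaled series has zero mean and unit variance. It does not bound the values to a range of 0 to 1 (that is what `MinMaxScaler` does), and it does not change the shape of the distribution.  
Putting the prices on a consistent scale helps the network train more stably when it compares past values to predict future movements.

Note that the scaler in the next cell is fit on the full series, including the period later used for testing, so the test data influences the scaling statistics.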
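
As a minimal sketch of what standardization does, the snippet below applies the formula by hand to a few made-up prices. The values and variable names are illustrative assumptions, not data from this notebook. `StandardScaler` uses the population standard deviation, which corresponds to NumPy's default `ddof=0`.

```python
import numpy as np

# Hypothetical closing prices, used only for illustration
prices = np.array([10.0, 12.0, 14.0, 16.0, 18.0])

mean = prices.mean()
std = prices.std()  # population standard deviation (ddof=0)

# Standardize: subtract the mean, divide by the standard deviation
scaled = (prices - mean) / std

# Invert the transform to recover the original prices
restored = scaled * std + mean

print(scaled)
print(restored)
```

The scaled values fall on both sides of zero and can exceed 1, which is why the result should not be read as a 0 to 1 range.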


In [None]:
scaler = StandardScaler()
df["Close"] = scaler.fit_transform(df[["Close"]])
df.Close

# Create Sliding Windows for Time Series Data

This cell creates overlapping sequences (windows) of 30 consecutive closing prices from the scaled stock data.  
Each window is used as a sample for the neural network to learn patterns in the time series.  
The commented-out print statements can be enabled to show the shape of the resulting array and the first and last windows, which helps verify the windowing process.
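
The windowing can be sketched on a toy series. The array below is a stand-in for the scaled closing prices, and a window length of 3 is assumed instead of 30 so the output stays readable.

```python
import numpy as np

# Toy series standing in for the scaled closing prices
series = np.arange(10, dtype=float)  # 0.0, 1.0, ..., 9.0
seq_length = 3  # the notebook uses 30

windows = []
for i in range(len(series) - seq_length):
    windows.append(series[i:i + seq_length])
windows = np.array(windows)

print(windows.shape)  # (7, 3)
print(windows[0])     # [0. 1. 2.]
print(windows[-1])    # [6. 7. 8.]
```

With this loop bound the final value of the series (9.0 here) never appears in a window, so the last row of the data is not used as a target.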


In [None]:
seq_length = 30
data = []

for i in range(len(df) - seq_length):
    data.append(df.Close[i:i + seq_length])
data = np.array(data)

# Print the shape and a sample
# print("Shape of data:", data.shape)
# print("First window:", data[0])
# print("Last window:", data[-1])

# Output to see the data and its shape [dimensions]
# data

# Reshape Data and Split into Training and Test Sets

This cell reshapes the windowed data to add a feature dimension, which is required for LSTM input.  
It then splits the data into training and test sets: the first 80% of the windows are used for training, and the last 20% for testing.  
For each window, the input (`x_train`/`x_test`) is the first 29 days, and the target (`y_train`/`y_test`) is the 30th day.  
All arrays are converted to PyTorch tensors and moved to the selected device (CPU or GPU).
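
The reshape and split can be checked on a small example. The sketch below assumes ten toy windows of length 4 in place of the real data and stops before the tensor conversion, so it only needs NumPy.

```python
import numpy as np

# Ten toy windows of length 4 (the notebook uses windows of length 30)
windows = np.arange(40, dtype=np.float32).reshape(10, 4)

# Add a trailing feature dimension: (samples, time steps, features)
data = windows.reshape(windows.shape[0], windows.shape[1], 1)

train_size = int(0.8 * len(data))  # first 80% of the windows

x_train = data[:train_size, :-1, :]  # every step except the last
y_train = data[:train_size, -1, :]   # the last step is the target
x_test = data[train_size:, :-1, :]
y_test = data[train_size:, -1, :]

print(x_train.shape, y_train.shape)  # (8, 3, 1) (8, 1)
print(x_test.shape, y_test.shape)    # (2, 3, 1) (2, 1)
```

Because the split is by position rather than random, the test windows always come from the most recent part of the series.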

In [None]:
# Reshape data to add an extra dimension
data = data.reshape(data.shape[0], data.shape[1], 1)

# Use the first 80% of the data for training and the last 20% for testing
train_size = int(0.8 * len(data))

# Training data
x_train = torch.from_numpy(
    data[:train_size, :-1, :]).type(torch.Tensor).to(device)
y_train = torch.from_numpy(
    data[:train_size, -1, :]).type(torch.Tensor).to(device)

# Test data
x_test = torch.from_numpy(
    data[train_size:, :-1, :]).type(torch.Tensor).to(device)
y_test = torch.from_numpy(
    data[train_size:, -1, :]).type(torch.Tensor).to(device)

# Define the LSTM Prediction Model

This cell defines the neural network architecture using PyTorch.  
It creates a class called `PredictionModel` that uses an LSTM (Long Short-Term Memory) layer to process sequences of stock prices, followed by a fully connected layer to produce the final prediction.  
The model is designed to learn patterns in time series data and predict the next closing price based on previous values.
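
To make the data flow concrete, the following sketch traces tensor shapes through an LSTM layer and a linear layer with the same dimensions as this notebook. The random input and the batch size of 4 are assumptions made for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch_size, seq_len, input_dim = 4, 29, 1
hidden_dim, num_layers, output_dim = 32, 2, 1

lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
fc = nn.Linear(hidden_dim, output_dim)

x = torch.randn(batch_size, seq_len, input_dim)  # random stand-in input

out, (hn, cn) = lstm(x)    # initial states default to zeros when omitted
last_step = out[:, -1, :]  # output of the top layer at the final time step
prediction = fc(last_step)

print(out.shape)         # torch.Size([4, 29, 32])
print(hn.shape)          # torch.Size([2, 4, 32])
print(prediction.shape)  # torch.Size([4, 1])
```

Only the final time step is passed to the linear layer, so each window of 29 inputs yields a single predicted value.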

In [None]:
class PredictionModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim, dropout=0.2):
        # Initialize the parent class
        super(PredictionModel, self).__init__()

        # Initialize the model parameters
        self.num_layers = num_layers
        self.hidden_dim = hidden_dim

        # Define the LSTM layer
        self.lstm = nn.LSTM(input_dim, hidden_dim,
                            num_layers, batch_first=True, dropout=dropout)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Initialize the hidden state (h) and cell state (c) with zeros,
        # on the same device as the input batch
        h0 = torch.zeros(self.num_layers, x.size(0),
                         self.hidden_dim, device=x.device)
        c0 = torch.zeros(self.num_layers, x.size(0),
                         self.hidden_dim, device=x.device)

        # Forward propagate the LSTM; out has shape (batch, seq_len, hidden_dim)
        out, (hn, cn) = self.lstm(x, (h0, c0))
        # Use only the last time step's output for the prediction
        out = self.fc(out[:, -1, :])

        return out

# Hidden Layers

A **hidden layer** in a neural network is any layer between the input and output layers. It’s called “hidden” because you don’t directly interact with it—the network learns what happens there.

**How it works:**

- Each hidden layer consists of neurons (nodes) that take inputs, apply weights and biases, and pass the result through an activation function.
- The hidden layers allow the network to learn complex patterns and representations from the data.
- In deep learning, having more hidden layers (a “deep” network) enables the model to learn more abstract features.

**In your LSTM model:**

- The LSTM’s `hidden_dim` parameter controls the size of the hidden state vector (how much information each LSTM cell can store).
- `num_layers` controls how many LSTM layers (hidden layers) are stacked.

**Summary:**  
Hidden layers are where the neural network “thinks”—they transform the input step by step, allowing the network to model complex relationships in your data.
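
One way to see the effect of `hidden_dim` and `num_layers` is to count trainable parameters. The helper below is a hypothetical illustration, not part of the notebook. It assumes the layout used by PyTorch's `nn.LSTM` for a unidirectional network without projection: four gates per layer, each with input weights, recurrent weights, and two bias vectors.

```python
def lstm_param_count(input_dim, hidden_dim, num_layers):
    """Count the weights and biases of a stacked, unidirectional LSTM."""
    total = 0
    for layer in range(num_layers):
        # The first layer reads the raw input; later layers read hidden states
        layer_input = input_dim if layer == 0 else hidden_dim
        # Input and recurrent weights for the four gates
        total += 4 * hidden_dim * (layer_input + hidden_dim)
        # Two bias vectors for the four gates
        total += 2 * 4 * hidden_dim
    return total


lstm_params = lstm_param_count(input_dim=1, hidden_dim=32, num_layers=2)
fc_params = 32 * 1 + 1  # final linear layer: weights plus bias

print(lstm_params)              # 12928
print(lstm_params + fc_params)  # 12961
```

The count grows roughly with the square of `hidden_dim`, so doubling it costs far more parameters than adding one more layer of the same size.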


# Set Loss Function and Model/Optimizer Factory

This cell sets the loss function for training (Mean Squared Error, which measures how close predictions are to actual values).  
It also defines a function that creates a new LSTM model and its optimizer, making it easy to initialize fresh models for each training run.
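
As a small worked example of Mean Squared Error, the sketch below computes it by hand for three made-up predictions. The numbers are illustrative assumptions.

```python
import numpy as np

predictions = np.array([1.0, 2.0, 3.0])
targets = np.array([1.0, 2.0, 5.0])

# Square each difference, then average
squared_errors = (predictions - targets) ** 2  # [0., 0., 4.]
mse = squared_errors.mean()

print(mse)  # 1.3333333333333333
```

Squaring makes the single miss of 2 dominate the result, which is why MSE penalizes large errors more heavily than small ones.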

In [None]:
# Set criterion for training
# Mean Squared Error loss: the mean of the squared differences between
# predictions and targets; its gradient drives the weight updates
criterion = nn.MSELoss()

# Factory function that creates a fresh model and optimizer for each training run


def opt_model():
    model = PredictionModel(input_dim=1, hidden_dim=32,
                            num_layers=2, output_dim=1).to(device)
    # Optimizer: lr is the learning rate; smaller values take smaller steps,
    # which is usually slower but more stable
    optimizer = optim.Adam(model.parameters(), lr=0.01)
    return model, optimizer


# Data Check
# print("Shape of data (after model definition):", data.shape)
# print("First window (after model definition):", data[0])
# print("Last window (after model definition):", data[-1])

# Train the LSTM Model

This cell runs the training loop for the LSTM neural network.  
For each run, it creates a new model and optimizer, then trains the model for a set number of epochs.  
During training, it predicts on the training data, calculates the loss, performs backpropagation, and updates the model weights.  
After each run, it records the training time and final loss, and finally prints the average loss and total training time across all runs.
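
The same sequence of steps (forward pass, loss, backpropagation, weight update) can be tried on a problem small enough to run in a moment. This is a sketch under stated assumptions: a single linear layer stands in for the LSTM, and the data is a synthetic line rather than stock prices.

```python
import time
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Toy regression data: y = 2x + 1
x = torch.linspace(-1, 1, 50).unsqueeze(1)
y = 2 * x + 1

model = nn.Linear(1, 1)  # stand-in for the LSTM model
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.05)

losses = []
start_time = time.time()  # start timing before the epoch loop
for epoch in range(200):
    pred = model(x)            # forward pass
    loss = criterion(pred, y)  # compute the loss
    optimizer.zero_grad()      # clear old gradients
    loss.backward()            # backpropagation
    optimizer.step()           # update the weights
    losses.append(loss.item())
elapsed = time.time() - start_time

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
print(f"elapsed: {elapsed:.3f}s")
```

Starting the timer before the epoch loop means `elapsed` covers all 200 epochs rather than only the last one.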

In [None]:
# Training loop
runs = 100  # Number of runs
run_times = []  # Store the run times
num_epochs = 1000  # Number of epochs
final_losses = []  # Store the final loss for each run

for j in range(runs):
    model, optimizer = opt_model()
    start_time = time.time()  # Time the whole run, not just the last epoch
    # Training the model
    for i in range(num_epochs):
        # Forward pass on the training data
        y_train_pred = model(x_train)
        # Calculate the loss
        loss = criterion(y_train_pred, y_train)
        # Optimization
        optimizer.zero_grad()  # Zero the gradients
        loss.backward()  # Backpropagation
        optimizer.step()  # Update the weights to reduce the loss
    elapsed = time.time() - start_time
    run_times.append(elapsed)
    final_losses.append(loss.item())
    print(
        f"Run {j+1}/{runs} complete | Time: {elapsed:.3f}s | Final Loss: {loss.item():.6f}")

# Summarize the average final loss and the total training time
avg_loss = sum(final_losses) / len(final_losses)
total_time = sum(run_times)

print(f"Total time over {runs} runs: {total_time:.4f}s")
print(f"Average loss over {runs} runs: {avg_loss:.8f}")


# NeuralNine original print
# if i % 25 == 0:  # Print the loss every 25 epochs (% is the modulo operator)
#     print(i, loss.item())

# Make Predictions, Inverse Transform, and Evaluate

This cell puts the trained model in evaluation mode and generates predictions for the test set.  
It then inverse-transforms both the predicted and actual values from the normalized scale back to the original stock price scale.  
Finally, it calculates and prints the Root Mean Squared Error (RMSE) for both the training and test sets, giving a quantitative measure of prediction accuracy.
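
The inverse transform and the RMSE calculation can be sketched by hand. The mean, standard deviation, and scaled values below are assumed for illustration and are not taken from this notebook.

```python
import numpy as np

# Assumed scaling statistics (illustrative only)
mean, std = 100.0, 20.0

actual_scaled = np.array([0.0, 0.5, 1.0, -0.5])
pred_scaled = np.array([0.1, 0.4, 1.2, -0.5])

# Inverse transform: multiply by the standard deviation, add the mean
actual_price = actual_scaled * std + mean
pred_price = pred_scaled * std + mean

# RMSE: square root of the mean squared difference
rmse_scaled = np.sqrt(np.mean((actual_scaled - pred_scaled) ** 2))
rmse_price = np.sqrt(np.mean((actual_price - pred_price) ** 2))

print(rmse_scaled)
print(rmse_price)
```

The RMSE in price units equals the RMSE in scaled units multiplied by the standard deviation, so inverse-transforming first is what makes the number readable as a typical error in the stock's own currency.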

In [None]:
# NOTE:
# If you get an error like "AttributeError: 'numpy.ndarray' object has no attribute 'detach'" here,
# it means y_train or y_test is already a NumPy array (not a PyTorch tensor).
# This happens if you run this cell (Cell 18) more than once without first re-running the data prep cell (Cell 13),
# which recreates y_train and y_test as tensors.
#
# To fix:
# - Always re-run Cell 13 before running this cell.
# - OR, add a type check before using .detach(), e.g.:
#     if isinstance(y_test, torch.Tensor):
#         y_test_np = y_test.detach().cpu().numpy()
#     else:
#         y_test_np = y_test
#     y_test = scaler.inverse_transform(y_test_np)
# Do the same for y_train if needed.

model.eval()  # Set the model to evaluation mode (disables dropout)
with torch.no_grad():  # Gradients are not needed for inference
    # Recompute the training predictions with the final weights; the value
    # left over from the training loop predates the last optimizer step
    y_train_pred = model(x_train)
    y_test_pred = model(x_test)  # Make predictions on the test set

# Reverse the scaling of the data, move it to the CPU, and convert to NumPy arrays

# Training Data
y_train_pred = scaler.inverse_transform(y_train_pred.detach().cpu().numpy())
y_train = scaler.inverse_transform(y_train.detach().cpu().numpy())

# Test Data
y_test_pred = scaler.inverse_transform(y_test_pred.detach().cpu().numpy())
y_test = scaler.inverse_transform(y_test.detach().cpu().numpy())
# Calculate the Root Mean Squared Error (RMSE),
# which is expressed in the same units as the stock price
train_rmse = root_mean_squared_error(y_train[:, 0], y_train_pred[:, 0])
test_rmse = root_mean_squared_error(y_test[:, 0], y_test_pred[:, 0])
print(f"Train RMSE: {train_rmse:.4f}")
print(f"Test RMSE: {test_rmse:.4f}")

# Plot Actual vs. Predicted Prices and Prediction Error

This cell visualizes the model's performance.  
It plots the actual and predicted stock prices for the test period on the first subplot.  
The second subplot shows the absolute prediction error over time and includes a dashed line for the RMSE value.  
This helps you see how closely the model tracks real prices and where prediction errors are largest.

In [None]:
# Plot the actual and predicted prices as line graphs
# Below them, plot the absolute prediction error with the RMSE as a reference line

# The windowing loop stops one row short of the end of the data, so the test
# targets end at the second-to-last date
test_index = df.index[-len(y_test) - 1:-1]

fig = plt.figure(figsize=(12, 10))
# Grid specification for the figure (rows, columns)
gs = fig.add_gridspec(4, 1)

# Top three rows for the prices, bottom row for the error
ax1 = fig.add_subplot(gs[:3, 0])
ax2 = fig.add_subplot(gs[3, 0])

# Use the dates of the test period on the x axis
ax1.plot(test_index, y_test, color='green', label='Actual Price')
ax1.plot(test_index, y_test_pred, color='red', label='Predicted Price')
ax1.legend()
ax1.set_title(f"{ticker} Stock Price Prediction")
ax1.set_xlabel('Date')
ax1.set_ylabel('Price')

# Plot the absolute error and the RMSE
ax2.axhline(test_rmse, color='black', linestyle='--', label='RMSE')
ax2.plot(test_index, abs(y_test - y_test_pred),
         color='blue', label='Absolute Error')
ax2.legend()
ax2.set_title('Absolute Prediction Error')
ax2.set_xlabel('Date')
ax2.set_ylabel('Error')

plt.tight_layout()
plt.show()