# PatchTST Model Development

In this section we will develop a PatchTST model to forecast the S&P 500 and Dow Jones closing prices.

## Model Configuration

We will configure the PatchTST model based on the `Economic_Data_1994-2025` dataset we processed.

In [None]:
from transformers import PatchTSTConfig, PatchTSTForPrediction, PatchTSTForPretraining
from torch.utils.data import TensorDataset, DataLoader
import pandas as pd
import numpy as np
import torch

In [None]:
dataset = pd.read_csv('../data/Economic_Data_1994-2025.csv')
dataset = dataset.drop(['DATE', 'Unnamed: 0'], axis=1)

# Map each column name to its positional index
col_id = {col: idx for idx, col in enumerate(dataset.columns)}

# Dictionary of what each index represents
col_id

### Understanding PatchTST

- **Context Length**

    Context length is the total number of past time steps the model sees. If we were trying to predict the S&P 500 closing price for tomorrow, the context length would be how many previous days feed into that prediction.

- **Patch Length**

    Patch length is the size of each sub-window of the context. Instead of treating every time step separately, the model cuts the context window into short consecutive segments (patches), for example one week of observations each, and uses each patch as one input token.

- **Patch Stride**

    Patch stride is how many steps the patch window moves forward before the next patch starts. A stride smaller than the patch length makes consecutive patches overlap, so neighbouring segments share some observations.
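
To make the relationship between the three settings concrete, here is a minimal sketch that cuts a 365-step context into patches of length 10 with a stride of 5. The helper `split_into_patches` is a hypothetical name used only for illustration; it is not part of the `transformers` API, and the library's internal patching may differ in details such as padding.

```python
def split_into_patches(series, patch_len, stride):
    """Cut a sequence into (possibly overlapping) patches.

    Hypothetical helper for illustration only.
    """
    last_start = len(series) - patch_len
    return [series[s:s + patch_len] for s in range(0, last_start + 1, stride)]


# A stand-in context window: time steps 0..364
context = list(range(365))
patches = split_into_patches(context, patch_len=10, stride=5)

print(len(patches))    # 72 patches
print(patches[0])      # steps 0..9
print(patches[1][:3])  # [5, 6, 7] -> overlaps the second half of patch 0
```

With a stride equal to half the patch length, every time step except those at the very edges lands in two patches.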

In [None]:
# How many features we are including 
NUM_INPUT = len(dataset.columns)

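# How many series we ultimately care about (S&P 500 close and Dow Jones close)
NUM_TARGET = 2

# How many past time steps the model sees (context length)
CONTEXT_LEN = 365

# How many time steps are in each patch
PATCH_LEN = 10

# How far the patch window moves between consecutive patches
PATCH_STRD = 5

# Number of attention heads in each transformer layer
NUM_ATT_HEADS = 8

# How many days to predict into the future (about one quarter: 91 days)
PRED_LEN = 365 // 4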

# Configuring Model
config = PatchTSTConfig(
    num_input_channels = NUM_INPUT,
    num_targets = NUM_TARGET,  # documented name is `num_targets`; used by the regression/classification heads
    context_length = CONTEXT_LEN,
    patch_length = PATCH_LEN,
    patch_stride = PATCH_STRD,
    num_attention_heads = NUM_ATT_HEADS,
    prediction_length = PRED_LEN
)

model = PatchTSTForPrediction(config)
model

In [None]:
# Split sizes: 70% train, 20% test, 10% validation
num_train = int(len(dataset) * .7)
num_test = int(len(dataset) * .2)
num_val = int(len(dataset) * .1)

# Inputs and targets are both taken from the full dataset
targets = dataset
features = dataset

**Getting Target/Input Features**

This part is a little odd. 

- **Input Features**

    To get the input features, all we need to do is construct a window that covers the past N days for each data point.
    The inputs include the series we want to predict, so the targets come from the data itself, which makes this **self-supervised**.

- **Output Features**

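    For each input window we take the actual values that follow it and build a future window from them. The two series we care about are the S&P 500 and Dow Jones closes, but note that as written `targets = dataset`, so the future windows hold every column, not only those two.
    In this case we look 91 days into the future (`PRED_LEN`), which is the horizon the model predicts, and grab those values. The model uses them
    to evaluate its prediction and compute the loss.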
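
The alignment between input and output windows is easier to see on a toy series. The sketch below assumes sample `i` uses rows `i .. i + context_len - 1` as input and the next `pred_len` rows as the target. `make_windows` and `toy` are hypothetical names for illustration, and the sizes are deliberately tiny.

```python
import numpy as np


def make_windows(data, context_len, pred_len):
    """Pair each context window with the rows that immediately follow it.

    Hypothetical helper for illustration only.
    """
    n_samples = len(data) - context_len - pred_len + 1
    past = np.stack([data[i:i + context_len] for i in range(n_samples)])
    future = np.stack([
        data[i + context_len:i + context_len + pred_len]
        for i in range(n_samples)
    ])
    return past, future


# Toy series: 10 time steps, 2 channels (row t holds [t, 10 * t])
toy = np.column_stack([np.arange(10), np.arange(10) * 10])
past, future = make_windows(toy, context_len=4, pred_len=2)

print(past.shape, future.shape)  # (5, 4, 2) (5, 2, 2)
print(past[0, :, 0])             # [0 1 2 3]
print(future[0, :, 0])           # [4 5]
```

Both arrays have the same number of samples, and each target window starts exactly where its input window ends.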

In [None]:
# Creates a context window for each data point to feed into the model during training
def create_sequence_windows(data, window_size):
    windows = []
    
    # Window i covers rows i .. i + window_size - 1, so consecutive windows
    # slide forward one row at a time until the last full window
    for i in range(len(data) - window_size + 1):
        windows.append(data.iloc[i:i+window_size].values)
    return np.array(windows)

input_windows = create_sequence_windows(features, CONTEXT_LEN)

# Remove windows at the end that don't have PRED_LEN rows of future data after them
input_windows = input_windows[:len(input_windows) - PRED_LEN]

In [None]:
# Gets indices for the target variables, starting from where we first start predicting with a full context length
# to the last index that will allow for a full prediction
target_indices = range(CONTEXT_LEN, len(features) - PRED_LEN + 1)

In [None]:
target_windows = [targets.iloc[i:i+PRED_LEN].values for i in target_indices]
target_windows = np.array(target_windows)
target_windows.shape

In [None]:
past_values = torch.tensor(input_windows, dtype=torch.float32)
future_values = torch.tensor(target_windows, dtype=torch.float32)

In [None]:
data = TensorDataset(past_values, future_values)
dataloader = DataLoader(data, batch_size=32, shuffle=True)

In [None]:
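# Use Apple's MPS backend when available, otherwise fall back to the CPU
device = torch.device('mps' if torch.backends.mps.is_available() else 'cpu')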
model = model.to(device)

In [None]:
from tqdm import tqdm

In [None]:
optimizer = torch.optim.Adam(model.parameters(), lr=.001)
epochs = 10

model.train()
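for epoch in range(epochs):

    loop = tqdm(dataloader, leave=True)
    loop.set_description(f'Epoch {epoch}')
    losses = []

    # Separate batch names keep the full past_values/future_values tensors intact
    for batch_past, batch_future in loop:
        optimizer.zero_grad()

        # Move the batch to the same device as the model
        batch_past = batch_past.to(device)
        batch_future = batch_future.to(device)

        # Passing future_values makes the model compute the loss for us
        outputs = model(past_values=batch_past, future_values=batch_future)

        loss = outputs.loss
        loss.backward()

        optimizer.step()

        # Track the per-batch loss so we can report an epoch average
        losses.append(loss.item())
        loop.set_postfix(loss=loss.item())

    print(f'Epoch {epoch} mean loss: {np.mean(losses):.4f}')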
