The reasoning behind our idea is the same as in the writeup we submitted, which is pasted below. Some details of the actual model and strategy differ from what the writeup says, but the overall idea is intact.

Idea:
Our idea is to train a multi-layer perceptron (MLP) to predict chaotic systems and then retrain it specifically on financial market data of some liquid asset. By first training the model on general chaotic systems, we hope that the model will generalize its behavior and predictions better than it would if trained purely on the historical financial data of some security. The chaotic systems the model is trained on can either be self-generated (Lorenz systems, Hénon maps, Mackey-Glass equations, etc.) or natural (weather patterns, animal populations, biological systems, etc.). Our choice of MLPs comes from chaotic systems research by our group member Badis Labbedi. When the predicted future price of the instrument exceeds a set deviation from the actual current value, we will trade against the direction of deviation and exit the position when the price falls back within our bounds. We will track the delta between the initial price and the final price over consistent time intervals (which will stay constant between training and testing).


Synthetic chaos:
We can generate our own datasets using pre-established equations that describe chaotic systems, which we can then feed into the model. Additionally, the wide range of possibilities means that there will be more unique data that the model can be trained with, allowing it to increase its accuracy while avoiding overfitting on any particular dataset. We also have a theoretically infinite amount of this data since we can just adjust the starting conditions and other parameters of the equations to generate an entirely new chaotic system.
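
As one illustration of how cheaply such data can be produced, here is a minimal sketch of a Hénon map generator. The helper name `henon_series` and the starting points are illustrative assumptions, not part of our pipeline; a = 1.4 and b = 0.3 are the classical parameter values for the map.

```python
import numpy as np

def henon_series(n_steps, a=1.4, b=0.3, x0=0.0, y0=0.0):
    """Iterate the Henon map x' = 1 - a*x^2 + y, y' = b*x (hypothetical helper)."""
    xs = np.empty(n_steps)
    ys = np.empty(n_steps)
    x, y = x0, y0
    for k in range(n_steps):
        xs[k], ys[k] = x, y
        x, y = 1.0 - a * x * x + y, b * x
    return xs, ys

# Two series whose starting points differ by only 1e-8
xs_a, _ = henon_series(1000)
xs_b, _ = henon_series(1000, x0=1e-8)

# Largest gap between the two series over the whole run
max_gap = np.max(np.abs(xs_a - xs_b))
```

Changing `x0`, `y0`, `a`, or `b` slightly yields a new series, which is the sense in which the supply of synthetic data is practically unlimited.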

Natural chaos:
We can also find natural instances of chaotic systems and train our model on them as well. These have the added benefit of being more “truly” chaotic than data generated by equations, so the results of training on these datasets may be more realistic than with the synthetic data. The caveats here are that the data may be very messy or difficult to use in its raw form, meaning that we will have to go through a lot of trouble to process it. Additionally, there is not a ton of publicly available data fitting our criteria in general, so the benefits of this part of the approach are rather limited.

Adaptation to markets:
After training on natural chaotic datasets and synthetic chaos datasets, we’ll remove the output layer and then create a new output layer which will be trained upon the financial data of some index or other security. We have yet to determine specifically what we want our model to trade, but for it to fit our criteria it should probably be a very liquid asset like an S&P 500 Index tracker or some large-cap stock.
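
The output-layer swap described above could look roughly like the following in PyTorch. This is a sketch under assumptions: the small `pretrained` network and its layer sizes are stand-ins for a model already trained on chaotic data, and the learning rate is illustrative.

```python
import torch
import torch.nn as nn

# Stand-in for a network already pretrained on chaotic data (sizes are illustrative)
pretrained = nn.Sequential(
    nn.Linear(20, 15), nn.ReLU(),
    nn.Linear(15, 15), nn.ReLU(),
    nn.Linear(15, 1),
)

# Freeze everything learned so far
for param in pretrained.parameters():
    param.requires_grad = False

# Swap in a freshly initialized output layer; new layers are trainable by default
pretrained[-1] = nn.Linear(15, 1)

# Hand the optimizer only the parameters that should change on market data
trainable = [p for p in pretrained.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=0.01)
```

Passing only the trainable parameters to the optimizer keeps the frozen layers untouched regardless of optimizer state.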

Backtesting/Tuning:
In backtesting, we will primarily tune two things for risk-adjusted returns: the time period of signal generation over which the model performs best, and the execution threshold on the proportional difference between the modeled price and the real price.
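
A threshold sweep of this kind might be sketched as follows. Everything here is an assumption for illustration: the returns are synthetic, the candidate thresholds are arbitrary, the helper `risk_adjusted_return` is hypothetical, and the score is a plain mean-over-standard-deviation ratio rather than an annualized Sharpe ratio.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: realized daily returns and noisy predictions of them
realized = rng.normal(0.0, 0.01, size=500)
predicted = realized + rng.normal(0.0, 0.01, size=500)

def risk_adjusted_return(predicted, realized, threshold):
    """Mean over standard deviation of per-period strategy returns (unannualized)."""
    # +1 when the prediction is above the threshold, -1 when below -threshold, else flat
    position = np.where(predicted > threshold, 1.0,
                        np.where(predicted < -threshold, -1.0, 0.0))
    strategy = position * realized
    spread = strategy.std()
    return 0.0 if spread == 0 else strategy.mean() / spread

# Score each candidate threshold and keep the best one
thresholds = [0.0, 0.004, 0.008, 0.012, 0.016]
scores = {t: risk_adjusted_return(predicted, realized, t) for t in thresholds}
best_threshold = max(scores, key=scores.get)
```

In practice the sweep would be run on a validation slice that is separate from the final test period, so that the chosen threshold is not fitted to the data it is evaluated on.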


Deployment:
We plan on feeding real-time financial data through our model by using APIs like Databento or whatever we find to work best. We expect real-world execution to deviate from our predicted P/L, as broker execution will be delayed due to hardware and brokerage constraints.

Resources/Proof of Concept:

https://www.mdpi.com/2227-7390/12/12/1920
https://link.springer.com/article/10.1007/s43069-021-00071-2
https://www.researchgate.net/publication/380583239_MLP_and_RBF_Algorithms_in_Finance_Predicting_and_Classifying_Stock_Prices_amidst_Economic_Policy_Uncertainty



In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy.integrate import solve_ivp
import os
import torch
from torch.utils.data import DataLoader, Dataset
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from sklearn.preprocessing import MinMaxScaler


We used this section of the code to generate datasets of x, y, and z values of a Lorenz attractor, a chaotic system of equations whose solutions vary widely with only small changes in the initial values. By generating a large number of separate datasets, we hope to reduce the chance that the model overfits even before we start transfer learning.

In [None]:
def lorenz(t, state, sigma=10, beta=8/3, rho=28):
    x, y, z = state
    dxdt = sigma * (y - x)
    dydt = x * (rho - z) - y
    dzdt = x * y - beta * z
    return [dxdt, dydt, dzdt]

# Initial conditions
state0 = [1.0, 1.0, 1.0]
time_span = (0, 10000)


# Sample the solution at integer time points
time_eval = np.arange(time_span[0], time_span[1] + 1, dtype=int)

output_dir = 'lorenz_output'
os.makedirs(output_dir, exist_ok=True)


for i in range(-1400,2000):

    #Scrambles x y and z starting conditions to generate a new system
    state0[i % 3] = i / 100000

    # Solve the system
    sol = solve_ivp(lorenz, time_span, state0, t_eval=time_eval)

    #Save the time series data of the x y and z axes as individual csvs
    dfx = pd.DataFrame({'Time': sol.t, 'X': sol.y[0]})
    dfx.to_csv("lorenz_output_x" + str(i) + ".csv", index=False)
    print("Data saved to lorenz_output_x" + str(i) + ".csv")

    dfy = pd.DataFrame({'Time': sol.t, 'Y': sol.y[1]})
    dfy.to_csv("lorenz_output_y" + str(i) + ".csv", index=False)
    print("Data saved to lorenz_output_y" + str(i) + ".csv")

    dfz = pd.DataFrame({'Time': sol.t, 'Z': sol.y[2]})
    dfz.to_csv("lorenz_output_z" + str(i) + ".csv", index=False)
    print("Data saved to lorenz_output_z" + str(i) + ".csv")

KeyboardInterrupt: 
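
The sensitivity to initial values that makes each generated dataset distinct can be checked directly. The sketch below integrates two Lorenz trajectories whose starting z values differ by 1e-5 and measures how far apart they drift; the time span of 40 and the sampling grid are illustrative choices, not the settings used for our datasets.

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, state, sigma=10, beta=8/3, rho=28):
    x, y, z = state
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

# Same system, two starting points that differ only slightly in z
t_eval = np.linspace(0, 40, 4001)
sol_a = solve_ivp(lorenz, (0, 40), [1.0, 1.0, 1.0], t_eval=t_eval)
sol_b = solve_ivp(lorenz, (0, 40), [1.0, 1.0, 1.0 + 1e-5], t_eval=t_eval)

# Euclidean distance between the two trajectories at each sample time
separation = np.linalg.norm(sol_a.y - sol_b.y, axis=0)
print(f"initial separation: {separation[0]:.1e}, largest separation: {separation.max():.1f}")
```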

This section of the code combines the various Lorenz datasets for x, y and z coordinates for ease of training. The cell may not work on a new runtime, but it's not necessary for the model and algorithm to actually function.

In [None]:
import pandas as pd
import glob
import os

def combine_csv_files(directory, output_prefix):
    """Combines CSV files in a directory based on their prefixes (x, y, z)."""

    for prefix in ['x', 'y', 'z']:
        all_files = glob.glob(os.path.join(directory, f'lorenz_output_{prefix}*.csv'))
        if not all_files:
            print(f"No files found with prefix '{prefix}' in the specified directory.")
            continue

        combined_df = pd.concat([pd.read_csv(f) for f in all_files], ignore_index=True)
        output_filename = os.path.join(directory, f'{output_prefix}_{prefix}.csv')
        combined_df.to_csv(output_filename, index=False)
        print(f"Combined '{prefix}' files into '{output_filename}'")

# Example usage:
combine_csv_files('/content', 'combined_lorenz')


No files found with prefix 'x' in the specified directory.
No files found with prefix 'y' in the specified directory.
No files found with prefix 'z' in the specified directory.


This section of the code defines the infrastructure we need to train the model. It should be largely self-explanatory.

In [None]:
class TimeSeriesDataset(Dataset):
    def __init__(self, data, sequence_length):
        self.data = data
        self.sequence_length = sequence_length

    def __len__(self):
        return len(self.data) - self.sequence_length  # Number of (input window, next value) pairs

    def __getitem__(self, index):
        x = self.data[index:index + self.sequence_length]  # Input: Past sequence_length values
        y = self.data[index + self.sequence_length]  # Target: The next value after the sequence
        return x, y

class MLP(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_hidden_layers=1):
        super(MLP, self).__init__()

        layers = [nn.Linear(input_size, hidden_size), nn.ReLU()]  # Start with input layer

        # Add hidden layers dynamically
        for _ in range(num_hidden_layers):
            layers.extend([nn.Linear(hidden_size, hidden_size), nn.ReLU()])

        layers.append(nn.Linear(hidden_size, output_size))  # Add output layer

        self.network = nn.Sequential(*layers)  # Unpack layers into nn.Sequential

    def forward(self, x):
        return self.network(x)

# Training function; the loss printed for each epoch is the loss of its final batch
def train_model(model, dataloader, criterion, optimizer, num_epochs):
    model.train()
    for epoch in range(num_epochs):
        for data, targets in dataloader:
            # Forward pass
            outputs = model(data).squeeze(1)  # Squeeze the output tensor along dimension 1
            loss = criterion(outputs, targets)

            # Backward pass and optimization
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

def normalize_tensor(tensor, ref_tensor, feature_range=(0, 1)):
    min_val, max_val = feature_range
    tensor_min = ref_tensor.min(dim=0, keepdim=True)[0]
    tensor_max = ref_tensor.max(dim=0, keepdim=True)[0]

    # Normalize the tensor
    normalized_tensor = (tensor - tensor_min) / (tensor_max - tensor_min)
    normalized_tensor = normalized_tensor * (max_val - min_val) + min_val

    return normalized_tensor

def unnormalize_tensor(normalized_tensor, original_tensor, feature_range=(0, 1)):

    min_val, max_val = feature_range
    tensor_min = original_tensor.min(dim=0, keepdim=True)[0]
    tensor_max = original_tensor.max(dim=0, keepdim=True)[0]

    # Unnormalize the tensor
    unnormalized_tensor = (normalized_tensor - min_val) / (max_val - min_val)
    unnormalized_tensor = unnormalized_tensor * (tensor_max - tensor_min) + tensor_min

    return unnormalized_tensor
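
The windowing that `TimeSeriesDataset` performs can be illustrated on a toy series. This sketch uses NumPy's `sliding_window_view` rather than the class itself, and the six-value series and window length of 3 are illustrative assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

series = np.arange(6, dtype=np.float32)   # toy series: 0, 1, 2, 3, 4, 5
sequence_length = 3

# Every run of `sequence_length` consecutive values, dropping the last run
# because no value follows it to serve as a target
inputs = sliding_window_view(series, sequence_length)[:-1]
targets = series[sequence_length:]

for x, y in zip(inputs, targets):
    print(x, "->", y)
```

A series of length 6 with a window of 3 yields 6 - 3 = 3 training pairs, which matches what `__len__` returns.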


This section defines our hyperparameters. After a little experimentation we adjusted these from our initial values, but not significantly.

In [None]:
# Hyperparameters
hidden_size = 15
num_hidden_layers = 3
output_size = 1
num_epochs = 10
batch_size = 64
learning_rate = 0.01
sequence_length = 20  # Length of the input sequence for time-series prediction

This section defines the model and sets our loss and optimization functions.

In [None]:
model = MLP(sequence_length, hidden_size, output_size, num_hidden_layers)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Loading Data

This cell loads all our different datasets, normalizes each one, and trains the model on them one after another. On top of the Lorenz datasets we generated, we also trained the model on various weather-based measurements for Delhi. Ideally we would have incorporated more diverse sources of real-world data, but we did not have enough time to find and process usable datasets.

In [None]:
# (CSV path, column) pairs, trained on one after another in this order
training_sources = [
    ('/content/combined_lorenz_x.csv', 'X'),
    ('/content/combined_lorenz_y.csv', 'Y'),
    ('/content/combined_lorenz_z.csv', 'Z'),
    ('/content/DailyDelhiClimateTrain.csv', 'meantemp'),
    ('/content/DailyDelhiClimateTrain.csv', 'humidity'),
    ('/content/DailyDelhiClimateTrain.csv', 'wind_speed'),
    ('/content/DailyDelhiClimateTrain.csv', 'meanpressure'),
]

for path, column in training_sources:
    dfx = pd.read_csv(path)
    data = torch.tensor(dfx[column].values, dtype=torch.float32)
    data = normalize_tensor(data, data)  # Min-max scale each series on its own range
    dataset = TimeSeriesDataset(data, sequence_length)
    # shuffle=False keeps the windows in time order
    dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
    train_model(model, dataloader, criterion, optimizer, num_epochs)


Epoch [1/10], Loss: 0.0434
Epoch [2/10], Loss: 0.0415
Epoch [3/10], Loss: 0.0413
Epoch [4/10], Loss: 0.0413
Epoch [5/10], Loss: 0.0413
Epoch [6/10], Loss: 0.0414
Epoch [7/10], Loss: 0.0420
Epoch [8/10], Loss: 0.0408
Epoch [9/10], Loss: 0.0410
Epoch [10/10], Loss: 0.0373
Epoch [1/10], Loss: 0.0207
Epoch [2/10], Loss: 0.0206
Epoch [3/10], Loss: 0.0206
Epoch [4/10], Loss: 0.0200
Epoch [5/10], Loss: 0.0197
Epoch [6/10], Loss: 0.0194
Epoch [7/10], Loss: 0.0193
Epoch [8/10], Loss: 0.0193
Epoch [9/10], Loss: 0.0195
Epoch [10/10], Loss: 0.0192
Epoch [1/10], Loss: 0.0198
Epoch [2/10], Loss: 0.0196
Epoch [3/10], Loss: 0.0197
Epoch [4/10], Loss: 0.0208
Epoch [5/10], Loss: 0.0210
Epoch [6/10], Loss: 0.0199
Epoch [7/10], Loss: 0.0209
Epoch [8/10], Loss: 0.0206
Epoch [9/10], Loss: 0.0207
Epoch [10/10], Loss: 0.0191
Epoch [1/10], Loss: 0.1022
Epoch [2/10], Loss: 0.0557
Epoch [3/10], Loss: 0.0672
Epoch [4/10], Loss: 0.0638
Epoch [5/10], Loss: 0.0644
Epoch [6/10], Loss: 0.0642
Epoch [7/10], Loss: 0.064

# Tuning/Backtesting


Here we process the data for the transfer learning part of our model. For simplicity, we decided to train the model to trade only SPY, since it is liquid and heavily traded. Given more time we could have trained the model on other equities so that it could be applied more generally, but that was not realistic under the time constraints. Since the value of SPY has gradually grown over time, training on min-max normalized prices would not work very well: the model would not be able to predict a value outside of the minimum and maximum of the training prices. Therefore, we decided to have the model predict the day-to-day percent change in SPY's open price and trade on that prediction instead. The caveat is that the opening price is not necessarily the price the model would realistically trade at, but the general idea behind the strategy shouldn't be majorly affected.
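
The switch from prices to percent changes can be shown on a toy series. The four prices and the value of `predicted_change` below are illustrative assumptions, not SPY data or model output.

```python
import pandas as pd

# Toy opening prices in chronological order (illustrative values, not SPY data)
opens = pd.Series([100.0, 102.0, 101.0, 105.0])

# Day-over-day percent change; the first entry has no predecessor and is NaN
pct = opens.pct_change().dropna()

# A predicted percent change converts back to a price via the last known open
predicted_change = 0.01            # hypothetical model output after unnormalizing
predicted_next_open = opens.iloc[-1] * (1 + predicted_change)
```

Because the percent changes stay in a narrow band around zero even as the price level rises, the range seen in training remains representative of later data.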

In [None]:
# Load your specific time-series financial data
dfx = pd.read_csv('/content/HistoricalData_1739678509429.csv')

print(dfx.head())

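# CSV data is in reverse chronological order; reversing 'Open' makes new_df and the
# 'Openpctdiff' column run oldest to newest (the other dfx columns stay newest first)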
new_df = pd.DataFrame(index=dfx.index, columns=['Open'])
reversed_open_values = dfx['Open'].iloc[::-1].values
new_df['Open'] = reversed_open_values
dfx['Openpctdiff'] = new_df['Open'].pct_change()


dfx.drop([0], inplace = True) # first value is nan

# creates the train-test split
data_train = torch.tensor(dfx['Openpctdiff'][:2004].values, dtype=torch.float32)
data_train = normalize_tensor(data_train, data_train)
# although there is potential for some weird outputs if the percent change exceeds the normalized range, it should be exceptionally rare

data_test = torch.tensor(dfx['Openpctdiff'][2004:].values, dtype=torch.float32)

actual_prices = new_df['Open']

dataset = TimeSeriesDataset(data_train,sequence_length)  # Only pass data here
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=False)


         Date  Close/Last    Volume    Open      High     Low
0  02/14/2025      609.70  26910450  609.94  610.9900  609.07
1  02/13/2025      609.73  40921300  604.48  609.9400  603.20
2  02/12/2025      603.36  45076080  599.20  604.5500  598.51
3  02/11/2025      605.31  30056740  602.55  605.8600  602.43
4  02/10/2025      604.85  26048710  604.03  605.4999  602.74
1       0.001242
2      -0.001192
3       0.000334
4       0.006970
5       0.000853
          ...   
2511   -0.004713
2512   -0.002450
2513   -0.005560
2514    0.008812
2515    0.009033
Name: Openpctdiff, Length: 2515, dtype: float64
tensor([0.5891, 0.5730, 0.5831,  ..., 0.5669, 0.5535, 0.6480])


This cell freezes the model weights for two of the three hidden layers, then retrains the remaining layers on the SPY data.

In [None]:
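# model.network alternates Linear and ReLU modules, so the Linear layers sit at the even
# indices: 0 is the input layer, 2, 4 and 6 are the hidden layers, 8 is the output layer.
# The odd indices are ReLU modules, which have no parameters to freeze.
for param in model.network[2].parameters():  # Freezing the first hidden Linear layer
    param.requires_grad = False
for param in model.network[4].parameters():  # Freezing the second hidden Linear layer
    param.requires_grad = False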
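
When freezing layers by position, it helps to confirm which entries of an `nn.Sequential` actually hold weights, since activation modules carry none. The sketch below rebuilds the same layer layout as an MLP with three hidden layers (the sizes are illustrative assumptions) and counts frozen versus trainable parameters.

```python
import torch.nn as nn

# Same layer layout as an MLP with three hidden layers (sizes are illustrative)
network = nn.Sequential(
    nn.Linear(20, 15), nn.ReLU(),
    nn.Linear(15, 15), nn.ReLU(),
    nn.Linear(15, 15), nn.ReLU(),
    nn.Linear(15, 15), nn.ReLU(),
    nn.Linear(15, 1),
)

# Positions that actually carry weights; ReLU entries have no parameters
linear_indices = [i for i, layer in enumerate(network) if isinstance(layer, nn.Linear)]
print(linear_indices)  # [0, 2, 4, 6, 8]

# Freeze two of the hidden-to-hidden layers, then count what is still trainable
for i in (2, 4):
    for param in network[i].parameters():
        param.requires_grad = False

frozen = sum(p.numel() for p in network.parameters() if not p.requires_grad)
trainable = sum(p.numel() for p in network.parameters() if p.requires_grad)
```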

In [None]:
train_model(model, dataloader, criterion, optimizer, num_epochs)

Epoch [1/10], Loss: 0.0089
Epoch [2/10], Loss: 0.0079
Epoch [3/10], Loss: 0.0080
Epoch [4/10], Loss: 0.0079
Epoch [5/10], Loss: 0.0079
Epoch [6/10], Loss: 0.0079
Epoch [7/10], Loss: 0.0079
Epoch [8/10], Loss: 0.0079
Epoch [9/10], Loss: 0.0079
Epoch [10/10], Loss: 0.0079


In this section we implemented a rolling window backtesting method and calculated the PnL of our model on the test SPY price data. We used a fairly simplistic trading signal: if the model predicted a percent change in the next period's open price outside of a certain range, our model would trade accordingly. For simplicity's sake our model only predicted day by day and would close the position at the next day's open. To calculate our PnL, we subtracted the current price from the price of the next period, then added that to or subtracted it from PnL depending on which direction the model recommended we trade. The values defining the range were chosen from backtesting. Given more time we may have been able to come up with a more interesting and complex trading signal, but the one we use should serve our purposes fine.
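
The signal and PnL arithmetic can be traced by hand on a toy example. The prices, forecasts, and share count below are illustrative assumptions; only the 0.012 threshold matches the value used in the notebook.

```python
import numpy as np

opens = np.array([100.0, 101.0, 99.0, 102.0])        # toy opening prices
predicted_change = np.array([0.02, -0.03, 0.001])    # hypothetical forecasts for each next open
threshold = 0.012
shares = 100

# +1 = buy, -1 = sell short, 0 = stay out
signal = np.where(predicted_change > threshold, 1,
                  np.where(predicted_change < -threshold, -1, 0))

# Each position is opened at today's open and closed at the next day's open
price_moves = np.diff(opens)                  # [ 1., -2.,  3.]
trade_pnl = shares * signal * price_moves     # [100., 200.,   0.]
total_pnl = trade_pnl.sum()                   # 300.0
```

The third forecast falls inside the band, so no position is taken that day and the 3-point move contributes nothing.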

In [None]:
def tradingsignal(predicted, ref_tensor, threshold=0.012):
    # Convert the normalized prediction back to a percent change, using the same
    # reference range (the training data) that was used to normalize the inputs
    pct_change = unnormalize_tensor(predicted, ref_tensor)[0]
    if pct_change > threshold:
        return 1
    elif pct_change < -threshold:
        return -1
    return 0


def backtest_model(model, data_train, data_test, actual_prices, window_size=50, threshold=0.012):
    pnl = 0
    wins = 0
    losses = 0
    # Normalize the test data with the training range: the model wouldn't have
    # access to future data, so it shouldn't be normalized on the test dataset
    data_test_normalized = normalize_tensor(data_test, ref_tensor=data_train)

    # Assumes data_test holds the last len(data_test) percent changes of actual_prices, so
    # data_test[i] is the move from actual_prices[offset + i] to actual_prices[offset + i + 1]
    offset = len(actual_prices) - 1 - len(data_test)

    model.eval()
    for i in range(window_size, len(data_test_normalized)):
        # Rolling window of past values; the model reads its last sequence_length entries
        window = data_test_normalized[i - window_size:i]
        X_input = window[-sequence_length:]

        # Test the model and get prediction
        with torch.no_grad():
            prediction = model(X_input)

        signal = tradingsignal(prediction, data_train, threshold)
        if signal == 0:
            continue

        # Open at the current open, close at the next day's open, 100 shares per trade
        current_price = actual_prices[offset + i]
        actual_future_price = actual_prices[offset + i + 1]
        trade_pnl = signal * 100 * (actual_future_price - current_price)
        pnl += trade_pnl

        if trade_pnl > 0:
            wins += 1
        else:
            losses += 1

    if wins + losses > 0:
        print(f'Winrate: {wins / (wins + losses)}')
    else:
        print('No trades were triggered')

    return pnl


pnl = backtest_model(model, data_train, data_test, actual_prices, window_size=50)
print(f'PnL: {pnl}')

NameError: name 'model' is not defined

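The second to last cell saves our model, which is attached to the email. To actually load the model, run the last cell after the cells that define the MLP class and the hyperparameters.

In [None]:
torch.save(model.state_dict(), 'mlp_model.pth')


In [None]:
# The file holds a state dict, so rebuild the architecture first and then load the weights into it
model = MLP(sequence_length, hidden_size, output_size, num_hidden_layers)
model.load_state_dict(torch.load('mlp_model.pth'))
model.eval()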