# LSTM

## Intro

Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) architecture designed to handle sequence prediction problems. It is particularly effective in capturing long-term dependencies in sequential data.

- Sequential Input: The LSTM processes the input one element at a time while maintaining two internal states: a cell state (the long-term memory) and a hidden state (the output of the previous step).
- Forget Gate: At each time step, the LSTM decides what to keep or discard from the cell state. A "forget gate" takes the current input and the previous hidden state and outputs a number between 0 and 1 for each element of the cell state. A value of 1 means "keep this" and 0 means "forget this".
- Input Gate: Next, the LSTM decides what new information to store. A tanh layer proposes candidate values from the current input and the previous hidden state, and an "input gate" outputs a number between 0 and 1 for each candidate, controlling how much of it is written to the cell state.
- Update the State: The new cell state is the old cell state scaled element-wise by the forget gate, plus the candidate values scaled by the input gate. The forget gate decides what to remove from the state, and the input gate decides what to add.
- Output Gate: Finally, the LSTM decides what to output. An "output gate" takes the current input and the previous hidden state and outputs a number between 0 and 1 for each element. The new hidden state is this gate multiplied by the tanh of the updated cell state, and it serves as the output for the current time step.
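The gate steps listed above can be sketched as a single time step in NumPy. This is only an illustration under stated assumptions: the weight names `W`, `U`, `b`, the order in which the four gates are stacked, and the toy sizes are all hypothetical choices, not the layout used by any particular library.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, squashes values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    Assumed (hypothetical) layout: the four gate blocks are stacked as
    [forget, input, candidate, output], so W is (4*H, D), U is (4*H, H)
    and b is (4*H,), where D is the input size and H the state size.
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b          # pre-activations for all four blocks
    f = sigmoid(z[0:H])                   # forget gate: what to keep from c_prev
    i = sigmoid(z[H:2 * H])               # input gate: how much of the candidate to write
    g = np.tanh(z[2 * H:3 * H])           # candidate values for the cell state
    o = sigmoid(z[3 * H:4 * H])           # output gate: what to expose as output
    c_t = f * c_prev + i * g              # update the cell state
    h_t = o * np.tanh(c_t)                # new hidden state, i.e. the step output
    return h_t, c_t

# Toy usage: run a random sequence of 6 steps with 13 features through the cell.
rng = np.random.default_rng(0)
D, H = 13, 4
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(6, D)):
    h, c = lstm_step(x_t, h, c, W, U, b)
```

After the loop, `h` holds the output for the last time step and `c` the final cell state.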

# GRU

## Intro

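Gated Recurrent Unit (GRU) is another type of recurrent neural network (RNN) architecture, similar to LSTM but simpler: it keeps a single hidden state instead of separate cell and hidden states, and it uses two gates instead of three. Like LSTM, it is designed to capture dependencies in sequential data.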

- Update Gate: GRU has an update gate that controls how much of the previous state to keep and how much of the new candidate state to add. It takes the input and the previous state and outputs a number between 0 and 1 for each element of the state.
- Reset Gate: There is also a reset gate that decides how much of the previous state to forget when computing the new candidate state. It takes the input and the previous state and outputs a number between 0 and 1 for each element of the state.
- Current Memory: GRU computes a candidate state from the input and the previous state scaled by the reset gate. The new state is then a blend of the previous state and the candidate, weighted by the update gate.
- Output: Finally, the new state is the GRU's output. It can be used for predictions and is passed to the next time step as the previous state.
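As a rough sketch of the steps above, one GRU time step can be written in NumPy as follows. The weight names `W`, `U`, `b` and the stacking order of the blocks are hypothetical. The blending convention is also an assumption: here the update gate `z` weights the previous state, while some references swap the roles of `z` and `1 - z`.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, squashes values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU time step.

    Assumed (hypothetical) layout: blocks are stacked as
    [update, reset, candidate], so W is (3*H, D), U is (3*H, H), b is (3*H,).
    """
    H = h_prev.shape[0]
    Wz, Wr, Wh = W[:H], W[H:2 * H], W[2 * H:]
    Uz, Ur, Uh = U[:H], U[H:2 * H], U[2 * H:]
    bz, br, bh = b[:H], b[H:2 * H], b[2 * H:]

    z = sigmoid(Wz @ x_t + Uz @ h_prev + bz)             # update gate
    r = sigmoid(Wr @ x_t + Ur @ h_prev + br)             # reset gate
    h_cand = np.tanh(Wh @ x_t + Uh @ (r * h_prev) + bh)  # candidate state
    # Blend: z keeps the previous state, (1 - z) lets the candidate in.
    return z * h_prev + (1.0 - z) * h_cand

# Toy usage: 6 time steps, 13 features, state size 4.
rng = np.random.default_rng(0)
D, H = 13, 4
W = rng.normal(scale=0.1, size=(3 * H, D))
U = rng.normal(scale=0.1, size=(3 * H, H))
b = np.zeros(3 * H)

h = np.zeros(H)
for x_t in rng.normal(size=(6, D)):
    h = gru_step(x_t, h, W, U, b)
```

Compared with the LSTM, there is only one state vector to carry between steps, which is where most of the simplification comes from.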


### TensorFlow

For the sake of my sanity, I am not building the models from scratch. Luckily, **tensorflow.keras** ships ready-made layers for constructing RNNs with either or both methods (or even some extra steps).
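To give an idea of what those ready-made layers look like, here is a minimal sketch of a small recurrent model. It is not the model used in this notebook: the layer width of 32, the learning rate, and the choice of a single recurrent layer are arbitrary assumptions. The input shape `(6, 13)` and the 3 outputs mirror the window length, feature count and number of targets used in this notebook.

```python
import numpy as np
import tensorflow as tf

def build_rnn(cell="lstm", window=6, n_features=13, n_targets=3, units=32):
    """Build a small recurrent regressor (illustrative sizes only)."""
    # Pick the recurrent layer; both take input of shape (batch, time, features).
    recurrent = tf.keras.layers.LSTM if cell == "lstm" else tf.keras.layers.GRU
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(window, n_features)),
        recurrent(units),                      # returns the last hidden state
        tf.keras.layers.Dense(n_targets),      # linear output, one unit per target
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss=tf.keras.losses.MeanSquaredError(),
        metrics=[tf.keras.metrics.RootMeanSquaredError()],
    )
    return model

model = build_rnn("lstm")
# Sanity check on dummy data: 2 samples in, 2 rows of 3 predictions out.
dummy_pred = model.predict(np.zeros((2, 6, 13), dtype="float32"), verbose=0)
```

Swapping `"lstm"` for `"gru"` changes only the recurrent layer, which makes it easy to compare the two architectures on the same data.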

In [2]:
import tensorflow as tf
import pandas as pd
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import *
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.losses import MeanSquaredError
from tensorflow.keras.metrics import RootMeanSquaredError
from tensorflow.keras.optimizers import Adam

from data_funcs import *




In [3]:
portfolio_dict, portfolio = get_data()

  0%|          | 0/31 [00:00<?, ?it/s]

100%|██████████| 31/31 [00:32<00:00,  1.04s/it]


In [4]:
portfolio['stocks']['AMD'].head()

| | Adj Close | Close | Dividends | High | Low | Open | Stock Splits | Volume | adx | atr | day sin | returns | rsi |
|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| 0 | 146.050003 | 146.050003 | 0.0 | 146.100006 | 145.75 | 145.828598 | 0.0 | 605769.0 | 0.343013 | 0.387335 | -0.999048 | 0.001526 | 0.701828 |
| 1 | 146.098099 | 146.098099 | 0.0 | 146.199997 | 145.914993 | 146.042099 | 0.0 | 571215.0 | 0.338793 | 0.380085 | -0.999762 | 0.000329 | 0.69725 |
| 2 | 145.779907 | 145.779907 | 0.0 | 146.085007 | 145.759995 | 146.085007 | 0.0 | 448473.0 | 0.323179 | 0.376199 | -1.0 | -0.002178 | 0.657009 |
| 3 | 145.898605 | 145.898605 | 0.0 | 145.970001 | 145.750198 | 145.779907 | 0.0 | 447584.0 | 0.308843 | 0.365207 | -0.999762 | 0.000814 | 0.57497 |
| 4 | 145.850006 | 145.850006 | 0.0 | 146.178406 | 145.720505 | 145.889999 | 0.0 | 623706.0 | 0.305946 | 0.371698 | -0.999048 | -0.000333 | 0.571113 |


The model will predict:

- rsi
- adx

And the necessary data to calculate the Sortino ratio of the ticker:

- Adj Close

Using **all** of the variables over a window of 6 consecutive periods from that specific stock (`df`), so each sample pairs rows t-5 to t with the targets of the row that follows:

| $\hat{x}$<sub>t-5</sub> | $\hat{x}$<sub>t-4</sub> | $\hat{x}$<sub>t-3</sub> | $\hat{x}$<sub>t-2</sub> | $\hat{x}$<sub>t-1</sub> | $\hat{x}$<sub>t</sub> | $\hat{y}$ |
|:---:|:---:|:---:|:---:|:---:|:---:|---:|
| df.iloc[ 0, : ] | df.iloc[ 1, : ] | df.iloc[ 2, : ] | df.iloc[ 3, : ] | df.iloc[ 4, : ] | df.iloc[ 5, : ] | [rsi<sub>6</sub>, adx<sub>6</sub>, Adj_Close<sub>6</sub>] |
| df.iloc[ 1, : ] | df.iloc[ 2, : ] | df.iloc[ 3, : ] | df.iloc[ 4, : ] | df.iloc[ 5, : ] | df.iloc[ 6, : ] | [rsi<sub>7</sub>, adx<sub>7</sub>, Adj_Close<sub>7</sub>] |
| df.iloc[ 2, : ] | df.iloc[ 3, : ] | df.iloc[ 4, : ] | df.iloc[ 5, : ] | df.iloc[ 6, : ] | df.iloc[ 7, : ] | [rsi<sub>8</sub>, adx<sub>8</sub>, Adj_Close<sub>8</sub>] |
| df.iloc[ 3, : ] | df.iloc[ 4, : ] | df.iloc[ 5, : ] | df.iloc[ 6, : ] | df.iloc[ 7, : ] | df.iloc[ 8, : ] | [rsi<sub>9</sub>, adx<sub>9</sub>, Adj_Close<sub>9</sub>] |
| ... | ... | ... | ... | ... | ... | ... |
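Since `Adj Close` is predicted in order to compute the Sortino ratio, a rough sketch of that calculation may help. It is an illustration under assumptions, not necessarily the formula used elsewhere in this project: simple period-over-period returns, a target return of 0, no annualization, and downside deviation taken over all observations. The function name `sortino_ratio` is hypothetical.

```python
import numpy as np

def sortino_ratio(prices, target_return=0.0):
    """Sortino ratio from a price series (assumptions: simple returns,
    per-period target return, no annualization)."""
    prices = np.asarray(prices, dtype=float)
    returns = prices[1:] / prices[:-1] - 1.0        # simple returns
    excess = returns - target_return
    downside = np.minimum(excess, 0.0)              # keep only shortfalls
    downside_dev = np.sqrt(np.mean(downside ** 2))  # downside deviation
    if downside_dev == 0.0:
        return float("nan")                         # undefined without any shortfall
    return float(np.mean(excess) / downside_dev)

# Toy usage with made-up prices (returns of +10%, -10%, +10%).
ratio = sortino_ratio([100.0, 110.0, 99.0, 108.9])
```

Unlike the Sharpe ratio, only returns below the target count towards the risk term, so upside volatility is not penalized.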

## Split the data

In [5]:
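df = portfolio['stocks']['AMD'].copy()
# Targets: the three columns the model predicts
targets = df.loc[:, ['rsi', 'adx', 'Adj Close']].to_numpy()
# Features: every column, as an array of shape (rows, features)
df = df.to_numpy()
X = []
Y = []
window = 6

# Each sample pairs `window` consecutive rows with the targets of the row that follows
for i in range(len(df) - window):
    X.append(df[i:i + window])
    Y.append(targets[i + window])

X = np.array(X)  # (samples, window, features)
Y = np.array(Y)  # (samples, targets)

X.shape, Y.shape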

((3052, 6, 13), (3052, 3))
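The same windowing can also be done without an explicit Python loop. The sketch below is an alternative, not the code used in this notebook; it relies on `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later), and the helper name `make_windows` and the toy data are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(data, targets, window):
    """Pair each run of `window` consecutive rows with the targets of the next row."""
    # Shape (rows - window + 1, features, window): the window axis is appended last.
    views = sliding_window_view(data, window_shape=window, axis=0)
    # Reorder to (samples, window, features) and drop the final window,
    # which has no following row to serve as a target.
    X = views.transpose(0, 2, 1)[:-1]
    Y = targets[window:]
    # The view is read-only and shares memory with `data`, so return a copy.
    return np.ascontiguousarray(X), Y

# Toy usage: 10 rows, 2 features, predicting the second feature.
data = np.arange(20.0).reshape(10, 2)
targets = data[:, [1]]
X_toy, Y_toy = make_windows(data, targets, window=3)
```

For a dataset of this size the loop is fast enough, so this is mainly a matter of taste.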

In [29]:
# Chronological 80/20 split: no shuffling, so the test set is the most recent data
idx = 8 * (X.shape[0] // 10)
X_train = X[:idx, :, :]
Y_train = Y[:idx, :]
X_test = X[idx:, :, :]
Y_test = Y[idx:, :]

In [30]:
X_train.shape, X_test.shape

((2440, 6, 13), (612, 6, 13))