# LSTM model
This notebook trains and tests an LSTM model for Bitcoin price prediction. We'll build the network with PyTorch, wrapped in a PyTorch Lightning module.

In [97]:
# import necessary libraries and read data
import torch 
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl
import pandas as pd
import numpy as np
df = pd.read_csv('data/raw.csv').drop(columns='market_caps')

With `batch_first=True`, the LSTM takes data of shape `(batch_len, seq_len, n_features)`, so the data needs some preprocessing to get it into that shape (still to be done). In this case, the batch length is the number of labeled samples, and the sequence length is the number of timesteps per sample.
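
The reshaping described above can be sketched with a simple sliding window. This is only an illustration under stated assumptions: `make_windows` is a hypothetical helper, and the `values` array is synthetic stand-in data, not the real columns from `data/raw.csv`. Each window covers `seq_len` consecutive rows, and stopping at `len(values) - seq_len` leaves one following row available as a label for every window.

```python
import numpy as np

def make_windows(values: np.ndarray, seq_len: int) -> np.ndarray:
    """Stack overlapping windows of `seq_len` consecutive rows.

    values:  array of shape (n_rows, n_features)
    returns: array of shape (n_rows - seq_len, seq_len, n_features)
    """
    n_samples = len(values) - seq_len
    return np.stack([values[i:i + seq_len] for i in range(n_samples)])

# Synthetic stand-in for 100 hourly rows with 2 features (assumption).
values = np.arange(200, dtype=np.float32).reshape(100, 2)
X = make_windows(values, seq_len=24)
print(X.shape)  # (76, 24, 2)
```

Here `X[i]` holds rows `i` through `i + 23`, so the row at index `i + 24` is the next timestep that a label for sample `i` could be derived from.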

In [98]:
# preprocessing (again)
# our data is hourly, so we'll use 24 hour sequences to predict
# our batch length will be len(df) - 24, and we have 2 features
seq_len = 24
batch_len = len(df) - seq_len
n_features = 2

# TODO: relabel y's to be within {0, 1, 2} (since cross entropy loss expects class indices)
labels = [-1, 0, 1, -1, 0]
y = [2 if label == -1 else label for label in labels]

In [99]:
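# testing & misc. work
# feed a dummy batch of 2 samples through an LSTM to check that it runs
fake_data = np.ones((2, seq_len, n_features))
t = torch.tensor(fake_data, dtype=torch.float32)
lstm = nn.LSTM(input_size=n_features, hidden_size=1, batch_first=True)
out, _ = lstm(t)  # out has shape (2, seq_len, 1)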
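
A quick shape check along the same lines may help when wiring the LSTM output into a linear layer. This is a minimal sketch with an arbitrary `hidden_size` of 4 (an assumption chosen only so the shapes are easy to tell apart). It shows the per-timestep output, the final hidden state, and the flattened form a `hidden * seq_len` linear layer would expect.

```python
import torch
import torch.nn as nn

seq_len, n_features, hidden_size = 24, 2, 4  # hidden_size is illustrative
lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size, batch_first=True)

x = torch.ones(2, seq_len, n_features)       # dummy batch of 2 samples
out, (h_n, c_n) = lstm(x)

print(out.shape)   # torch.Size([2, 24, 4]): one hidden state per timestep
print(h_n.shape)   # torch.Size([1, 2, 4]): final hidden state per sample

# Flatten each sample's hidden states into a single feature vector.
flat = out.reshape(out.size(0), -1)
print(flat.shape)  # torch.Size([2, 96])
```

For a single-layer, unidirectional LSTM, `out[:, -1, :]` matches `h_n[0]`, so a classifier could also use only the last timestep instead of the flattened sequence.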

In [89]:
# lightning module structure
class LSTM_Classifier(pl.LightningModule):
    def __init__(self, n_features, hidden, seq_len):
        super().__init__()
        
        self.hidden = hidden
        self.seq_len = seq_len
        
        # lstm layer and linear hidden-state to classes layer
        # with batch_first=True, lstm inputs a batch of shape (batch, seq_len, n_features),
        #   outputs hidden states of shape (batch, seq_len, hidden)
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=hidden, batch_first=True)
        self.h2c = nn.Linear(hidden * seq_len, 3)
    
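    # forward step - classification; returns raw class scores (logits)
    def forward(self, X):
        lstm_out, _ = self.lstm(X)
        # flatten each sample's hidden states: (batch, seq_len, hidden) -> (batch, seq_len * hidden)
        flat = lstm_out.reshape(lstm_out.size(0), -1)
        # no softmax here: F.cross_entropy applies log-softmax to the logits itself
        return self.h2c(flat)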
        
    def training_step(self, batch, batch_idx):
        X, y = batch
        y_hat = self(X)
        
        # y_hat: (batch, 3) class scores; y: (batch,) class indices in {0, 1, 2}
        train_loss = F.cross_entropy(y_hat, y)
        self.log("train_loss", train_loss)
        return train_loss
    
    def test_step(self, batch, batch_idx):
        X, y = batch
        y_hat = self(X)
        
        test_loss = F.cross_entropy(y_hat, y)
        self.log("test_loss", test_loss)
        return test_loss
        
    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters())

Notes/TODO:
- Softmax suits multiclass problems and pairs naturally with cross-entropy loss. However, torch's `F.cross_entropy` expects raw, unnormalized scores (logits) for each possible class and applies log-softmax internally. I've added a linear layer that maps LSTM output to a score for each class; those scores should go to the loss directly, and softmax is only needed to turn them into probabilities at prediction time. I also changed the label for "buy" from -1 to 2, to be within the expected $0 \leq l \leq C - 1$ range.
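
The cross-entropy behavior in the note above can be checked on a toy example. The logits and targets below are made-up values, used only to show that `F.cross_entropy` on raw scores matches log-softmax followed by negative log-likelihood, and that applying softmax first gives a different loss.

```python
import torch
import torch.nn.functional as F

# Made-up scores for 2 samples over 3 classes, with class-index targets.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

# cross_entropy consumes raw logits...
loss = F.cross_entropy(logits, targets)
# ...and is equivalent to log-softmax followed by negative log-likelihood.
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

# Softmax is still useful for reading predictions as probabilities.
probs = F.softmax(logits, dim=1)
# Feeding probabilities into cross_entropy applies softmax a second time.
double = F.cross_entropy(probs, targets)

print(torch.allclose(loss, manual))  # True
print(torch.allclose(loss, double))  # False
```

Because the second softmax squashes the scores into a narrow range, the loss computed from probabilities is less sensitive to the model's confidence, which tends to slow training.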