# Recurrent Neural Networks

Grzegorz Statkiewicz, Mateusz Matukiewicz

## Overview

The structure of the directory should be as follows:

```
.
├── data
│   ├── train.pkl
│   └── test_no_target.pkl
└── main.ipynb
```
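
Before running the notebook, it can help to confirm that the layout above is in place. The following is a minimal sketch; the helper name `missing_files` is hypothetical, and only the two paths shown in the tree are assumed.

```python
from pathlib import Path

# Paths taken from the directory tree above; the helper name is hypothetical.
EXPECTED_FILES = ["data/train.pkl", "data/test_no_target.pkl"]


def missing_files(root=".", expected=EXPECTED_FILES):
    """Return the expected data files that are absent under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_file()]


missing = missing_files()
if missing:
    print(f"Missing files: {missing}")
else:
    print("All expected data files are present.")
```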



## Setup

Select the device to use

In [2]:
import torch

# Prefer CUDA, then Apple MPS, then CPU; always wrap the choice in torch.device
device = torch.device("cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu")
print(f"Using device: {device}")

Using device: mps


## Data preparation

Load the data

In [3]:
train_path = "data/train.pkl"

In [4]:
import pickle

with open(train_path, "rb") as f:
    train_data = pickle.load(f)

print(f"Loaded {len(train_data)} training samples.")

Loaded 2939 training samples.


Print sample data

In [5]:
import random


idx = random.randint(0, len(train_data) - 1)
print(f"Sample data: {train_data[idx]}")

Sample data: (array([145., 145.,  80.,  92.,   5.,  65.,  15.,  32.,  33.,  78.,  78.,
        78.,  12., 145., 145.,  92.,  64.,  17.,  12.,  69.,  69.,  47.,
        93.,  78.,  12.,  12.,  88.,  78., 190.,  71.,  12.,  47., 156.,
        12.,  92.,  47.,  78.,  64.,  64.,  92., 149.,  39., 124., 126.,
        71., 156.,  78.,  78.,  12.,   5.,  80.,  30.,  78., 119., 140.,
        45.,  88.,  78.,  78.,  78.,  12.,  47.,  13., 124.,  79.,  77.,
        78.,  47.,  30.,  78.,  71.,  76.,  92.,  92.,  13., 159.,  76.,
       124.,  76., 159., 190.,  76.,  65.,  65.,  88.,  88., 159.,   8.,
         8., 159.,   5., 124.,  13., 124., 124., 125.,  44., 126.,  13.,
        93., 119.,  36.,  47.,  12.,  47.,  13.,  28.,  13.,  13.,  14.,
       127.,  13.,   7.,   7.,  92.,  13., 172., 127.,  12., 156.,  44.,
        47.,  85.,  13.,  13.,   7.,  47., 125.,  37., 127., 127., 127.,
        44.,  33.,  33.,  45.,  39.,  39., 124., 124., 159.,  12.,  92.,
        30., 141.,  92., 152.,  78., 

In [6]:
import numpy as np

sequences = [torch.tensor(seq, dtype=torch.long) for (seq, label) in train_data]
labels = [label for (seq, label) in train_data]

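# The largest chord index determines the embedding table size (chords are integer codes)
all_chords = set()
for seq in sequences:
    all_chords.update(seq.tolist())
# max + 1 rows cover indices 0..max; the extra +1 leaves one spare row.
# Note: sequences are not shifted, so padding (index 0) shares its index with chord 0 if that chord occurs.
vocab_size = int(max(all_chords)) + 2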

print(f"Vocab size: {vocab_size}")

Vocab size: 193
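
The `+ 2` above suggests that index 0 is meant to be reserved for padding, yet the sequences are used unshifted. If a chord with raw index 0 occurs in the data (not verified here), it would share its embedding row with the padding token. One possible remedy, shown as a sketch and not as what this notebook does, is to shift every chord index up by one. The helper name `shift_for_padding` and the toy values are hypothetical.

```python
import torch


def shift_for_padding(seqs):
    """Shift every chord index up by 1 so that index 0 is free for padding."""
    return [seq + 1 for seq in seqs]


# Toy sequences (hypothetical values, not taken from train.pkl).
toy_sequences = [torch.tensor([0, 5, 191]), torch.tensor([12, 0])]
shifted = shift_for_padding(toy_sequences)

# After the shift the smallest real index is 1 and the largest is max + 1,
# so the embedding table needs max + 2 rows (indices 0 .. max + 1).
toy_vocab_size = int(max(s.max() for s in shifted)) + 1
print(shifted[0].tolist(), toy_vocab_size)
```

With this shift, a raw maximum of 191 gives the same table size of 193 as the `+ 2` computation, and row 0 is used only by padding.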


In [7]:
from torch.utils.data import Dataset, DataLoader
from torch.nn.utils.rnn import pad_sequence

class ChordDataset(Dataset):
    """Pairs each variable-length chord sequence with its class label."""

    def __init__(self, sequences, labels):
        self.sequences = sequences
        self.labels = labels

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        return self.sequences[idx], self.labels[idx]

def collate_fn(batch):
    """Pad a batch of (sequence, label) pairs to the longest sequence in the batch."""
    seqs, labels = zip(*batch)
    # True lengths are kept so the model can ignore the padded positions.
    lengths = torch.tensor([len(s) for s in seqs], dtype=torch.long)
    padded_seqs = pad_sequence(seqs, batch_first=True, padding_value=0)
    return padded_seqs, lengths, torch.tensor(labels, dtype=torch.long)
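
To illustrate what the collate function produces, here is a small sketch on a toy batch. The chord indices and labels are made up, and `collate_fn` is repeated only so that the snippet runs on its own.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def collate_fn(batch):
    # Same logic as the collate_fn above, repeated so the snippet is self-contained.
    seqs, labels = zip(*batch)
    lengths = torch.tensor([len(s) for s in seqs], dtype=torch.long)
    padded_seqs = pad_sequence(seqs, batch_first=True, padding_value=0)
    return padded_seqs, lengths, torch.tensor(labels, dtype=torch.long)


# Toy batch with made-up chord indices and labels.
toy_batch = [
    (torch.tensor([3, 7, 2]), 1),
    (torch.tensor([5]), 0),
    (torch.tensor([9, 4]), 4),
]
padded, lengths, labels = collate_fn(toy_batch)
print(padded)                             # shorter sequences are right-padded with 0
print(lengths.tolist(), labels.tolist())  # original lengths and labels, in batch order
```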

In [8]:
import torch.nn as nn

class SimpleRNNClassifier(nn.Module):
    """Embedding -> single-layer GRU -> linear classifier over the final hidden state."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, output_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x, lengths):
        x = self.embedding(x)
        # Packing lets the GRU skip padded positions; lengths must live on the CPU.
        packed = nn.utils.rnn.pack_padded_sequence(x, lengths.cpu(), batch_first=True, enforce_sorted=False)
        _, h_n = self.rnn(packed)
        # h_n[-1] is the last layer's hidden state at each sequence's final real step.
        logits = self.fc(h_n[-1])
        return logits
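
A quick sanity check of the model can be sketched as follows: the output should have one row of logits per sequence, and because of packing, a padded sequence should give the same logits as the same sequence passed without padding. The class is repeated so the snippet runs on its own; the toy inputs are made up, and the sizes mirror the values used elsewhere in this notebook.

```python
import torch
import torch.nn as nn


class SimpleRNNClassifier(nn.Module):
    # Copy of the model above so that the snippet is self-contained.
    def __init__(self, vocab_size, embed_dim, hidden_dim, output_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x, lengths):
        x = self.embedding(x)
        packed = nn.utils.rnn.pack_padded_sequence(
            x, lengths.cpu(), batch_first=True, enforce_sorted=False
        )
        _, h_n = self.rnn(packed)
        return self.fc(h_n[-1])


toy_model = SimpleRNNClassifier(vocab_size=193, embed_dim=32, hidden_dim=64, output_dim=5)
toy_model.eval()

# Two made-up sequences; the second has true length 1 and two padded positions.
toy_x = torch.tensor([[3, 7, 2], [5, 0, 0]])
toy_lengths = torch.tensor([3, 1])

with torch.no_grad():
    toy_logits = toy_model(toy_x, toy_lengths)
    # The same short sequence without any padding.
    single_logits = toy_model(torch.tensor([[5]]), torch.tensor([1]))

print(toy_logits.shape)
print(torch.allclose(toy_logits[1], single_logits[0], atol=1e-5))
```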

In [11]:
from sklearn.model_selection import train_test_split

train_data_split, val_data_split = train_test_split(train_data, test_size=0.2, random_state=42)

train_sequences = [torch.tensor(seq, dtype=torch.long) for (seq, label) in train_data_split]
train_labels = [label for (seq, label) in train_data_split]
val_sequences = [torch.tensor(seq, dtype=torch.long) for (seq, label) in val_data_split]
val_labels = [label for (seq, label) in val_data_split]

In [12]:
# Also set in the hyperparameter cell below; defined here so the notebook runs top to bottom
BATCH_SIZE = 32

train_dataset = ChordDataset(train_sequences, train_labels)
val_dataset = ChordDataset(val_sequences, val_labels)

train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True, collate_fn=collate_fn)
val_loader = DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False, collate_fn=collate_fn)

In [13]:
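def evaluate(model, val_loader, device):
    """Return (average loss, accuracy) on val_loader; uses the global `criterion` defined in the training cell."""
    model.eval()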
    total_correct = 0
    total_samples = 0
    total_loss = 0
    with torch.no_grad():
        for batch_seqs, batch_lengths, batch_labels in val_loader:
            batch_seqs = batch_seqs.to(device)
            batch_lengths = batch_lengths.to(device)
            batch_labels = batch_labels.to(device)
            logits = model(batch_seqs, batch_lengths)
            loss = criterion(logits, batch_labels)
            preds = torch.argmax(logits, dim=1)
            total_correct += (preds == batch_labels).sum().item()
            total_samples += batch_labels.size(0)
            total_loss += loss.item() * batch_seqs.size(0)
    avg_loss = total_loss / total_samples
    accuracy = total_correct / total_samples
    return avg_loss, accuracy

In [None]:
BATCH_SIZE = 32
EPOCHS = 10
EMBED_DIM = 32
HIDDEN_DIM = 64
OUTPUT_DIM = 5

model = SimpleRNNClassifier(vocab_size, EMBED_DIM, HIDDEN_DIM, OUTPUT_DIM).to(device)

In [16]:
from tqdm import tqdm

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

for epoch in range(EPOCHS):
    model.train()
    total_loss = 0
    total_correct = 0
    total_samples = 0
    for batch_seqs, batch_lengths, batch_labels in tqdm(train_loader, desc=f"Epoch {epoch+1}/{EPOCHS}"):
        batch_seqs = batch_seqs.to(device)
        batch_lengths = batch_lengths.to(device)
        batch_labels = batch_labels.to(device)

        optimizer.zero_grad()
        logits = model(batch_seqs, batch_lengths)
        loss = criterion(logits, batch_labels)
        loss.backward()
        optimizer.step()
        
        total_loss += loss.item() * batch_seqs.size(0)
        preds = torch.argmax(logits, dim=1)
        total_correct += (preds == batch_labels).sum().item()
        total_samples += batch_labels.size(0)
    
    avg_loss = total_loss / total_samples
    train_acc = total_correct / total_samples

    val_loss, val_acc = evaluate(model, val_loader, device)
    
    print(f"Epoch {epoch+1}/{EPOCHS} | Train Loss: {avg_loss:.4f} | Train Acc: {train_acc:.4f} | Val Loss: {val_loss:.4f} | Val Acc: {val_acc:.4f}")

Epoch 1/10: 100%|██████████| 74/74 [02:42<00:00,  2.19s/it]


Epoch 1/10 | Train Loss: 0.0928 | Train Acc: 0.9783 | Val Loss: 0.3496 | Val Acc: 0.9031


Epoch 2/10: 100%|██████████| 74/74 [02:43<00:00,  2.21s/it]


Epoch 2/10 | Train Loss: 0.0868 | Train Acc: 0.9787 | Val Loss: 0.3467 | Val Acc: 0.9014


Epoch 3/10: 100%|██████████| 74/74 [02:43<00:00,  2.22s/it]


Epoch 3/10 | Train Loss: 0.0837 | Train Acc: 0.9792 | Val Loss: 0.3454 | Val Acc: 0.9031


Epoch 4/10: 100%|██████████| 74/74 [02:41<00:00,  2.19s/it]


Epoch 4/10 | Train Loss: 0.0811 | Train Acc: 0.9817 | Val Loss: 0.3546 | Val Acc: 0.9014


Epoch 5/10: 100%|██████████| 74/74 [02:43<00:00,  2.21s/it]


Epoch 5/10 | Train Loss: 0.0779 | Train Acc: 0.9817 | Val Loss: 0.3526 | Val Acc: 0.9014


Epoch 6/10: 100%|██████████| 74/74 [02:40<00:00,  2.16s/it]


Epoch 6/10 | Train Loss: 0.0757 | Train Acc: 0.9817 | Val Loss: 0.3607 | Val Acc: 0.9031


Epoch 7/10: 100%|██████████| 74/74 [02:43<00:00,  2.20s/it]


Epoch 7/10 | Train Loss: 0.0732 | Train Acc: 0.9830 | Val Loss: 0.3574 | Val Acc: 0.9048


Epoch 8/10: 100%|██████████| 74/74 [02:41<00:00,  2.18s/it]


Epoch 8/10 | Train Loss: 0.0717 | Train Acc: 0.9843 | Val Loss: 0.3606 | Val Acc: 0.9014


Epoch 9/10: 100%|██████████| 74/74 [02:46<00:00,  2.25s/it]


Epoch 9/10 | Train Loss: 0.0685 | Train Acc: 0.9847 | Val Loss: 0.3701 | Val Acc: 0.8997


Epoch 10/10: 100%|██████████| 74/74 [02:45<00:00,  2.24s/it]


Epoch 10/10 | Train Loss: 0.0674 | Train Acc: 0.9855 | Val Loss: 0.3664 | Val Acc: 0.9048
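
In the log above, the validation loss is lowest at epoch 3 (0.3454) and drifts upward afterwards while the training loss keeps falling. One common response is early stopping. The sketch below is not part of the original training loop; the class name `EarlyStopping` and the `patience` value are assumptions, and the loop simply replays the logged validation losses.

```python
class EarlyStopping:
    """Track the best validation loss and signal when it stops improving."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait after the last improvement
        self.min_delta = min_delta    # minimum decrease that counts as an improvement
        self.best_loss = float("inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Validation losses copied from the log above (epochs 1 to 10).
logged_val_losses = [0.3496, 0.3467, 0.3454, 0.3546, 0.3526,
                     0.3607, 0.3574, 0.3606, 0.3701, 0.3664]

stopper = EarlyStopping(patience=3)
for epoch, val_loss in enumerate(logged_val_losses, start=1):
    if stopper.step(epoch, val_loss):
        break

print(f"Best epoch: {stopper.best_epoch}, best val loss: {stopper.best_loss}, stopped at epoch {epoch}")
```

In a real training loop, `step` would be called with the value returned by `evaluate`, and the model weights from the best epoch could be saved with `torch.save(model.state_dict(), ...)` whenever the loss improves.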
