**Introduction**

In this assignment, you will implement a Recurrent Neural Network (RNN) for music generation.  

For this, you will use the Irish Massive ABC Notation (IrishMAN) dataset, which contains a collection of Irish folk tunes in ABC notation.  

The goal is to train an RNN to generate new tunes based on the patterns learned from the dataset.

**Dataset:**  
The IrishMAN dataset is available at [https://huggingface.co/datasets/sander-wood/irishman](https://huggingface.co/datasets/sander-wood/irishman).


**Tasks**

a) Data Preparation: Download the IrishMAN dataset and preprocess the ABC notation files to create a suitable input format for the RNN.  

This includes tokenizing the ABC notation and creating sequences of tokens.
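
As a rough illustration of this step, the sketch below tokenizes at the character level and slices a tune into (input, target) pairs where the target is the input shifted by one position. The helper names `build_vocab` and `make_windows` and the toy tune are made up for this example; a different tokenization (for instance, note-level tokens) is equally valid.

```python
def build_vocab(texts):
    """Map every distinct character to an integer id (sorted for determinism)."""
    vocab = sorted(set("".join(texts)))
    return {ch: i for i, ch in enumerate(vocab)}


def make_windows(text, char2idx, seq_length):
    """Return (input, target) index lists; the target is the input shifted by one."""
    ids = [char2idx[c] for c in text]
    pairs = []
    for i in range(len(ids) - seq_length):
        pairs.append((ids[i:i + seq_length], ids[i + 1:i + seq_length + 1]))
    return pairs


# Toy tune fragment (made up for illustration, not taken from the dataset)
toy_tune = "K:D\nABcd|"
char2idx = build_vocab([toy_tune])
pairs = make_windows(toy_tune, char2idx, seq_length=4)
```

Each pair overlaps its neighbour by all but one character, so the number of training pairs grows quickly with tune length; a larger stride is one way to reduce it.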

In [None]:
# from datasets import load_dataset

# import torch
# import numpy as np


# dataset = load_dataset("sander-wood/irishman")

# train_dataset = dataset["train"]
# validation_dataset = dataset["validation"]


# # print("Train dataset size:", len(train_dataset))

# # print("Validation dataset size:", len(validation_dataset))

# train_data = "\n".join(i['abc notation'] for i in train_dataset)

# # numberOfSequences = 10

# # sequenceLength = 100
# # # Create a list to hold the sequences


# # # Initialize the sequence list with None


# # sequence = [None] * numberOfSequences

# # # In the loop below, you can extract sequences of length 100 from the concatenated train_data string:
# # for i in range(numberOfSequences):
# #     start_idx = i * sequenceLength
# #     end_idx = start_idx + sequenceLength
# #     sequence[i] = train_data[start_idx:end_idx]









In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
from datasets import load_dataset
import random
from datetime import datetime
start_time = datetime.now()

# 0. Control the data size for quick testing
SAMPLE_SIZE = 50000     # number of tunes to use (set to None for full dataset)

# 1. Set parameters
SEQ_LENGTH = 75       # number of characters per training sequence
BATCH_SIZE = 32       # how many sequences per batch
EMBED_DIM = 64        # size of character embeddings
HIDDEN_DIM = 128      # size of LSTM hidden state
NUM_LAYERS = 1        # number of LSTM layers
LEARNING_RATE = 0.001 # optimizer learning rate
EPOCHS = 20            # number of passes through the data
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# 2. Load data from Hugging Face
print(f"Loading data on {DEVICE}...")
dataset = load_dataset('sander-wood/irishman', split='train')
# Copy the column into a plain Python list so it can be sliced and shuffled in place
texts = list(dataset['abc notation'])
if SAMPLE_SIZE is not None:
    texts = texts[:SAMPLE_SIZE]   # keep only the first SAMPLE_SIZE tunes
print(f"Using {len(texts)} tunes for training")

# 3. Build a character-level vocabulary
vocab = sorted(set(''.join(texts)))
char2idx = {ch: i for i, ch in enumerate(vocab)}
idx2char = {i: ch for ch, i in char2idx.items()}
VocabSize = len(vocab)
print(f"Vocab size: {VocabSize}")

# 4. Create a simple Dataset class
class MusicDataset(Dataset):
    def __init__(self, texts):
        self.data = []
        for t in texts:
            # turn text into list of indices
            seq = [char2idx.get(c, 0) for c in t]
            # slice into pairs of (input, target)
            for i in range(len(seq) - SEQ_LENGTH):
                inp = seq[i:i+SEQ_LENGTH]
                tgt = seq[i+1:i+SEQ_LENGTH+1]
                self.data.append((inp, tgt))
    def __len__(self):
        return len(self.data)
    def __getitem__(self, idx):
        inp, tgt = self.data[idx]
        return torch.tensor(inp), torch.tensor(tgt)

# 5. Split into train/val
random.shuffle(texts)
split_idx = int(0.9 * len(texts))
train_texts = texts[:split_idx]
val_texts = texts[split_idx:]
train_ds = MusicDataset(train_texts)
val_ds = MusicDataset(val_texts)
train_loader = DataLoader(train_ds, batch_size=BATCH_SIZE, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=BATCH_SIZE)

# 6. Define the LSTM model
class SimpleRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VocabSize, EMBED_DIM)
        self.lstm = nn.LSTM(EMBED_DIM, HIDDEN_DIM, NUM_LAYERS, batch_first=True)
        self.fc = nn.Linear(HIDDEN_DIM, VocabSize)
    def forward(self, x, hidden=None):
        x = self.embed(x)
        out, hidden = self.lstm(x, hidden)
        out = out.reshape(-1, HIDDEN_DIM)
        logits = self.fc(out)
        return logits, hidden

model = SimpleRNN().to(DEVICE)
optimizer = optim.Adam(model.parameters(), lr=LEARNING_RATE)
criterion = nn.CrossEntropyLoss()

# 7. Training loop
# Early stopping parameters
patience = 5              # how many epochs to wait without improvement
best_val_loss = float('inf')
epochs_no_improve = 0
early_stop = False

for epoch in range(1, EPOCHS+1):
    model.train()
    total_loss = 0
    for inputs, targets in train_loader:
        inputs = inputs.to(DEVICE)
        targets = targets.to(DEVICE).view(-1)
        optimizer.zero_grad()
        logits, _ = model(inputs)
        loss = criterion(logits, targets)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    avg_loss = total_loss / len(train_loader)
    print(f"Epoch {epoch}/{EPOCHS}, Loss: {avg_loss:.4f}")

    # Validation step
    model.eval()
    val_loss = 0
    with torch.no_grad():
        for inputs, targets in val_loader:
            inputs = inputs.to(DEVICE)
            targets = targets.to(DEVICE).view(-1)
            logits, _ = model(inputs)
            val_loss += criterion(logits, targets).item()
    val_loss /= len(val_loader)
    print(f"Validation Loss: {val_loss:.4f}")

    # Check for improvement
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        epochs_no_improve = 0
        # Optionally save the best model here
        torch.save(model.state_dict(), "best_model.pth")
    else:
        epochs_no_improve += 1
        if epochs_no_improve >= patience:
            print(f"Early stopping triggered after {epoch} epochs.")
            early_stop = True

    if early_stop:
        break
# 8. Generate a sample sequence with randomly chosen M, L, K headers
model.eval()
# Keep X and T fixed, but choose M, L, K randomly
M_options = ["M:6/8", "M:4/4", "M:3/4"]
L_options = ["L:1/8", "L:1/16", "L:1/4"]
K_options = ["K:D", "K:G", "K:C"]
# Build header
start = f"""X:1
T:Generated Tune
{random.choice(M_options)}
{random.choice(L_options)}
{random.choice(K_options)}
"""
# Convert header to indices
seq = [char2idx.get(c, 0) for c in start]
hidden = None
# Prime the LSTM with the whole header once; after that, feed only the newest
# character, because the carried hidden state already summarizes the history
# (re-feeding a sliding window together with `hidden` would count it twice)
inp = torch.tensor(seq).unsqueeze(0).to(DEVICE)
# Generate 800 additional characters of music
with torch.no_grad():
    for _ in range(800):
        logits, hidden = model(inp, hidden)
        # logits has one row per time step; the last row predicts the next character
        probs = torch.softmax(logits[-1], dim=0)
        idx = torch.multinomial(probs, 1).item()
        seq.append(idx)
        inp = torch.tensor([[idx]], device=DEVICE)

print("Total elapsed time (loading, training, generation):", datetime.now() - start_time)


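# Save the final-epoch model and optimizer state under a separate name so the
# best-validation weights already written to best_model.pth are not overwritten
torch.save({
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
}, 'final_checkpoint.pth')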




# Convert back to characters and print
gen = ''.join(idx2char[i] for i in seq)
print("Generated ABC notation with dynamic headers:", gen)

Loading data on cuda...
Using 50000 tunes for training
Vocab size: 95
Epoch 1/20, Loss: 1.1400
Validation Loss: 1.1254
Epoch 2/20, Loss: 1.1043
Validation Loss: 1.1183
Epoch 3/20, Loss: 1.0973
Validation Loss: 1.1120
Epoch 4/20, Loss: 1.0936
Validation Loss: 1.1099
Epoch 5/20, Loss: 1.0912
Validation Loss: 1.1093
Epoch 6/20, Loss: 1.0895
Validation Loss: 1.1077
Epoch 7/20, Loss: 1.0882
Validation Loss: 1.1065
Epoch 8/20, Loss: 1.0871
Validation Loss: 1.1061


In [2]:

# print(f"Sequence 1:\n{sequence[0]}\n")
# print(f"Length of sequence 1: {len(sequence[0])} characters\n")


In [None]:
# def simple_tokenizer(abc: str) -> list:
#     """
#     Tokenizes a cleaned ABC notation string using a basic stateful loop.
#     It treats symbols like ^, _ (accidentals), durations, and note groups as single tokens.
#     """
#     tokens = []
#     current_token = ''

#     for char in abc:
#         if char.isspace():
#             if current_token:
#                 tokens.append(current_token)
#                 current_token = ''
#         elif char in "|[]:()":
#             if current_token:
#                 tokens.append(current_token)
#                 current_token = ''
#             tokens.append(char)
#         elif char in "^_=,'/0123456789ABCDEFGabcdefgzxzZ":  # acceptable musical characters
#             current_token += char
#         else:
#             # Unknown character, break current token and skip
#             if current_token:
#                 tokens.append(current_token)
#                 current_token = ''
    
#     if current_token:
#         tokens.append(current_token)

#     return tokens


In [None]:
# def clean_abc_text(abc_text: str) -> str:
#     """
#     Removes ABC metadata headers and keeps only the music notation after the 'K:' line.
#     This avoids tokenizing title, meter, tempo, etc.
#     """
#     lines = abc_text.split('\n')
#     music_started = False
#     music_lines = []

#     for line in lines:
#         if line.startswith('K:'):  # Start music after key signature
#             music_started = True
#             continue
#         if music_started:
#             music_lines.append(line)

#     return ' '.join(music_lines).strip()


In [None]:

# # Sort the dataset first
# sorted_train_dataset = sorted(train_dataset, key=lambda x: len(x['abc notation']))

# unique_tokens = set()

# for tune in sorted_train_dataset:
#     abc_raw = tune['abc notation']
#     cleaned_abc = clean_abc_text(abc_raw)

#     # Sanity check: skip long or empty lines
#     if len(cleaned_abc) == 0 or len(cleaned_abc) > 2000:
#         print("Long token: ", cleaned_abc)
#         continue

#     tokens = simple_tokenizer(cleaned_abc)

#     # Only accept reasonably short tokens
#     for token in tokens:
#         if len(token) <= 15:
#             unique_tokens.add(token)

# print(f"✅ Final unique token count: {len(unique_tokens)}")






Long token:  
Long token:  "^Slow" A3 A2- A c2 e | d3 Td2 ^c d2 A | f3 Te2 c g2 e | d3 d2 e c2 G | A3 A2 A c2 e |   d3 (Td2 ^c) d2 A | f3 f2 A f2 f | g3 (Tg2 f) g2 g |{fg} a3 a2 g Ta2 g | f3 (f2 g){fg} a2 g |   f !fermata!e2 a Tg>ed c2 G :: A3 A2 A cde | d3 Td2 c dcA | f3 e2 c geg | d3 d2 e cGc |   A3 A2 A cde | d3 Td2 c dcA | f3 f2 A fef | g3 Tg2 f gfg | a3 Ta2 g aga | f3 f2 g agf |  {f} !fermata!e2 a ged cGc ::[M:6/8]"^\"the variations follow\"" A3 A2 A/A/ | c3 c2 e/e/ |   d3 d2 c/c/ | d3 d2 A/A/ | f3 e2 c/c/ | g3 Tg2 e/e/ | d3 d2 A/A/ | c3 c2 G/G/ | A3 A2 A/A/ |   c3 c2 E/E/ | [Dd]3 d2 c/c/ | d3 d2 A/A/ | f3 f2 A/A/ | f3 f2 f/f/ | g3 Tg2 f/f/ | g2 g2 f/f/ |   a3 a2 g/g/ | a3 a2 g/g/ | f3 f2 g/g/ | a3 Ta2 g/g/ |{f} e3 e2 d/d/ | c3 c2 G/G/ ::   A2 A/A/ A2 A/A/ | c2 c/c/ c2 e/e/ | d2 d/d/ d2 ^

b) Model Implementation: Implement an RNN model (RNN layer, LSTM layer) using a deep learning framework of your choice (e.g., PyTorch).  

The model should be able to take sequences of tokens as input and predict the next token in the sequence.  
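
As a reminder of what the recurrent layer computes, a vanilla (Elman) RNN with a tanh nonlinearity can be written as below. The symbols are notation chosen here for illustration, not taken from the assignment: $x_t$ is the embedding of the token at step $t$, $h_t$ is the hidden state, and $W_{xh}, W_{hh}, W_{hy}, b_h, b_y$ are learned parameters. An LSTM replaces the first equation with gated updates but is used the same way for next-token prediction.

$$
h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h), \qquad
p(y_{t+1} \mid x_1, \dots, x_t) = \mathrm{softmax}(W_{hy} h_t + b_y)
$$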

c) Training: Train your RNN model on the preprocessed dataset.  

Experiment with different hyperparameters such as learning rate, batch size, and number of epochs to achieve the best performance.
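
One simple way to organize such experiments is an exhaustive grid search, sketched below. The function `grid_search` and the stand-in `toy_objective` are hypothetical names for this example only; in the assignment, the objective would train the model with the given configuration and return its validation loss.

```python
import itertools


def grid_search(evaluate, grid):
    """Try every combination in `grid`; return (best_config, best_score).

    `evaluate` is any callable that takes a config dict and returns a
    validation loss (lower is better).
    """
    best_config, best_score = None, float("inf")
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        config = dict(zip(keys, values))
        score = evaluate(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Stand-in objective so the sketch runs on its own; it does NOT train anything.
# Replace it with a function that trains the model and returns validation loss.
def toy_objective(config):
    return abs(config["learning_rate"] - 0.001) + abs(config["batch_size"] - 32) / 1000


grid = {"learning_rate": [0.01, 0.001, 0.0001], "batch_size": [16, 32, 64]}
best_config, best_score = grid_search(toy_objective, grid)
```

A full grid grows multiplicatively with each added hyperparameter, so with long training runs it may be more practical to vary one setting at a time or to sample configurations at random.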

d) Music Generation: After training, use your RNN model to generate new music sequences.  

You can start with a seed sequence and let the model predict subsequent tokens to create a complete tune.
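
When predicting subsequent tokens, a common option is to sample from the model's output distribution with a temperature, which trades variety against safety. The sketch below is framework-free and uses made-up logits; temperature scaling is an optional extension here, not something the assignment requires, and `softmax_with_temperature` and `sample_next` are illustrative names.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next(logits, temperature=1.0, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.1]  # made-up scores for a 3-token vocabulary
rng = random.Random(0)        # seeded generator for reproducible sampling
idx = sample_next(toy_logits, temperature=0.8, rng=rng)
```

A temperature of 1.0 reproduces plain softmax sampling; values below 1.0 make the most likely token more dominant, and values above 1.0 flatten the distribution.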

e) Evaluation: Evaluate the quality of the generated music. You can do this by listening to the tunes or by using metrics such as top-1 or top-5 accuracy.  

You can also look for other metrics like BLEU score or perplexity.  
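
To make two of these metrics concrete, the framework-free sketch below computes top-k accuracy from per-step scores and perplexity from a mean cross-entropy. The function names and the toy scores are assumptions for illustration; with PyTorch the same quantities can be derived from the model's logits and the loss value.

```python
import math


def top_k_accuracy(logits_per_step, targets, k):
    """Fraction of steps whose true token is among the k highest-scoring tokens."""
    hits = 0
    for logits, target in zip(logits_per_step, targets):
        ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(targets)


def perplexity(mean_cross_entropy):
    """Perplexity is the exponential of the mean cross-entropy (in nats)."""
    return math.exp(mean_cross_entropy)


# Made-up scores for a 3-token vocabulary over 3 time steps
toy_logits = [[2.0, 1.0, 0.1], [0.2, 0.3, 3.0], [1.5, 1.4, 0.0]]
toy_targets = [0, 1, 1]
top1 = top_k_accuracy(toy_logits, toy_targets, k=1)
top2 = top_k_accuracy(toy_logits, toy_targets, k=2)
```

Because `nn.CrossEntropyLoss` uses the natural logarithm, exponentiating a validation loss gives a per-character perplexity; a validation loss of 1.1061, as in the training log above, corresponds to a perplexity of roughly 3.0.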

Please use TensorBoard to visualize the training process and the generated music.  

Here is a tutorial on how to use TensorBoard with PyTorch: https://colab.research.google.com/github/pytorch/tutorials/blob/gh-pages/_downloads/tensorboard_with_pytorch.ipynb.