## Introduction

This project focuses on symbolic music harmonization, a task in the field of music information retrieval (MIR) and AI-generated music. The goal is to predict a sequence of chords that complement a given melody, using symbolic representations of music—typically notes and chords encoded from MIDI files. This falls under the broader umbrella of conditioned generation, where one musical element (melody) is used to condition the generation of another (harmony).

## Problem Statement

Given a sequence of musical notes (melody), automatically generate a corresponding sequence of chords (harmony) that musically complements the melody. This mirrors tasks a human composer might perform and is especially useful for music composition assistance, educational tools, and interactive music applications.

## Symbolic Conditioned Generation & Harmonization

Symbolic refers to data formats such as MIDI, which encode music as discrete events (notes, durations, velocities) rather than audio waveforms. Unlike audio signals, symbolic representations allow structured, interpretable data manipulation.

Conditioned generation means the model generates chord sequences conditioned on an input melody. This makes the task similar to machine translation in NLP, where an input sentence (here, the melody) is "translated" into another sequence (the chords).

Harmonization is the process of adding chords to a melody according to music theory principles (e.g., chord functions, voice leading), creating a fuller musical texture.
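To make the idea of a discrete symbolic event concrete, here is a minimal sketch that turns MIDI note numbers into pitch-name tokens of the kind used later in this notebook (e.g., "C4"). The function name `midi_to_token` is hypothetical, and the sketch assumes the common convention that MIDI note 60 is C4 and spells every accidental as a sharp.

```python
# Illustrative sketch: map MIDI note numbers to pitch-name tokens.
# Assumptions: MIDI 60 == C4, accidentals are always spelled as sharps.
PITCH_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_token(midi_number):
    """Return a token such as 'C4' for a MIDI note number, or 'Rest' for None."""
    if midi_number is None:
        return "Rest"
    octave = midi_number // 12 - 1
    return f"{PITCH_NAMES[midi_number % 12]}{octave}"


melody_numbers = [60, 62, 64, None, 67]
print([midi_to_token(n) for n in melody_numbers])
```

A library such as music21 may spell the same pitch differently (for example as a flat), so tokens produced this way are not guaranteed to match the ones extracted from the dataset.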

## Dataset: Nottingham MIDI Dataset

Source:
A collection of 1,200+ folk tunes in MIDI format from the Nottingham Music Database. Each tune has a melody and a corresponding harmony.

Characteristics:
1. Primarily monophonic melodies with chord annotations
2. Contains folk, dance, and ballad styles
3. Typical structure: 8-32 bar phrases in 4/4 time

Each file is parsed to extract:
1. Melody: A sequence of notes or rests.
2. Chords: The chord's common name where the parsed part contains a chord; otherwise a placeholder label (this code uses "C" for every single note and rest).

Format: Standard MIDI files, parsed using music21



In [5]:
pip install music21

Collecting music21
  Using cached music21-9.7.0-py3-none-any.whl.metadata (5.1 kB)
Collecting chardet (from music21)
  Using cached chardet-5.2.0-py3-none-any.whl.metadata (3.4 kB)
Collecting jsonpickle (from music21)
  Using cached jsonpickle-4.1.1-py3-none-any.whl.metadata (8.1 kB)
Collecting more-itertools (from music21)
  Using cached more_itertools-10.7.0-py3-none-any.whl.metadata (37 kB)
Collecting numpy<2.0.0 (from music21)
  Using cached numpy-1.26.4.tar.gz (15.8 MB)
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
  Installing backend dependencies: started
  Installing backend dependencies: finished with status 'done'
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'error'
Note: you may need to restart the kernel to use updated packages.


  error: subprocess-exited-with-error
  
  Preparing metadata (pyproject.toml) did not run successfully.
  exit code: 1
  
  [21 lines of output]
  + C:\Users\Nush\AppData\Local\Programs\Python\Python313\python.exe C:\Users\Nush\AppData\Local\Temp\pip-install-radgrxnp\numpy_9a25f3fe41a542d1ab5837a46bd9f41c\vendored-meson\meson\meson.py setup C:\Users\Nush\AppData\Local\Temp\pip-install-radgrxnp\numpy_9a25f3fe41a542d1ab5837a46bd9f41c C:\Users\Nush\AppData\Local\Temp\pip-install-radgrxnp\numpy_9a25f3fe41a542d1ab5837a46bd9f41c\.mesonpy-c0ago6ov -Dbuildtype=release -Db_ndebug=if-release -Db_vscrt=md --native-file=C:\Users\Nush\AppData\Local\Temp\pip-install-radgrxnp\numpy_9a25f3fe41a542d1ab5837a46bd9f41c\.mesonpy-c0ago6ov\meson-python-native-file.ini
  The Meson build system
  Version: 1.2.99
  Source dir: C:\Users\Nush\AppData\Local\Temp\pip-install-radgrxnp\numpy_9a25f3fe41a542d1ab5837a46bd9f41c
  Build dir: C:\Users\Nush\AppData\Local\Temp\pip-install-radgrxnp\numpy_9a25f3fe41a542d1ab58

In [6]:
import os
import torch
import torch.nn as nn
import torch.optim as optim
from music21 import converter, instrument, note, chord, stream
import random
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

ModuleNotFoundError: No module named 'music21'

In [None]:
DATA_DIR = "nottingham-dataset-master/MIDI/"  # Update to your MIDI folder
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # Sets data path and computation device (CPU or GPU)

## Data Preprocessing

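File Parsing: Extract notes and chords using music21.

Simplification:

1. Melodies: Converted to pitch strings (e.g., "C4") or "Rest"; a chord in the parsed part contributes its root pitch.

2. Chords: Intended to be simplified to root note names; the current code uses the placeholder "C" for single notes and rests, and the music21 common name for chord elements.

Sequence Alignment: Each melody token gets exactly one chord label, so the two sequences always have the same length.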
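The parser in this notebook fills most chord slots with a placeholder. As an illustration of how a real 1:1 alignment could be built, the sketch below labels each melody note with the most recent chord that starts at or before the note's onset. The function name `align_chords_to_notes`, the "N.C." default label, and the input layout (onsets in beats, for example taken from each element's `offset` in music21) are assumptions for illustration, not part of the original code.

```python
from bisect import bisect_right


def align_chords_to_notes(note_onsets, chord_events, default="N.C."):
    """Label each note with the most recent chord at or before its onset.

    note_onsets: onset times in beats, one per melody note.
    chord_events: (onset_in_beats, label) pairs, sorted by onset.
    default: label used for notes that sound before the first chord.
    """
    chord_onsets = [onset for onset, _ in chord_events]
    labels = []
    for t in note_onsets:
        i = bisect_right(chord_onsets, t) - 1  # last chord starting at or before t
        labels.append(chord_events[i][1] if i >= 0 else default)
    return labels


note_onsets = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
chord_events = [(0.0, "G"), (4.0, "D")]
print(align_chords_to_notes(note_onsets, chord_events))
```

The result has one label per note by construction, which is the property the training code relies on.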

### MIDI Parsing
- Uses music21 to parse MIDI files.
- Extracts melodies and (placeholder) chords from each file.
- Converts musical elements to string-based symbolic representations.

In [None]:
# -----------------------------
# MIDI Parser
# -----------------------------
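def parse_midi_file(file_path):
    """Return aligned melody and chord token lists for one MIDI file."""
    midi = converter.parse(file_path)
    parts = instrument.partitionByInstrument(midi)
    melody = []
    chords = []

    # Use the first instrument part when available; otherwise the whole score.
    if parts:
        part = parts.parts[0]
    else:
        part = midi  # keep rests too; filtering with .notes here would drop them

    # flatten() replaces the deprecated .flat property (music21 v7+).
    for el in part.flatten().notesAndRests:
        if isinstance(el, note.Note):
            melody.append(str(el.pitch))
            chords.append("C")  # Placeholder chord
        elif isinstance(el, chord.Chord):
            melody.append(str(el.root()))
            chords.append(str(el.commonName))
        elif isinstance(el, note.Rest):
            melody.append("Rest")
            chords.append("C")  # Placeholder chord

    return melody, chords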

### Dataset Loading

- Iterates through all MIDI files in the dataset.
- Parses and collects melodies and chords.

In [None]:
# -----------------------------
# Load Dataset
# -----------------------------
all_melodies = []
all_chords = []

for fname in sorted(os.listdir(DATA_DIR)):  # sorted for a reproducible file order
    if fname.lower().endswith((".mid", ".midi")):
        try:
            mel, chd = parse_midi_file(os.path.join(DATA_DIR, fname))
            if len(mel) > 0:
                all_melodies.append(mel)
                all_chords.append(chd)
        except Exception as e:
            # Skip unreadable files but report them
            print(f"Error parsing {fname}: {e}")

### Vocabulary Creation

- Builds lookup dictionaries for notes and chords.
- Each unique token is mapped to an integer index.
- Includes an "UNK" token for notes that are not in the vocabulary.
- The chord vocabulary size depends on the chord labels found in the data (typically 10-50 classes).

In [None]:
# -----------------------------
# Vocab
# -----------------------------
note_vocab = sorted({n for mel in all_melodies for n in mel})
chord_vocab = sorted({c for chd in all_chords for c in chd})

note_to_idx = {n: i for i, n in enumerate(note_vocab)}
note_to_idx["UNK"] = len(note_to_idx)
idx_to_note = {i: n for n, i in note_to_idx.items()}

chord_to_idx = {c: i for i, c in enumerate(chord_vocab)}
idx_to_chord = {i: c for c, i in chord_to_idx.items()}
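To see how the "UNK" fallback behaves, here is a small self-contained sketch on toy data. The `toy_*` names and the `encode` helper are hypothetical and exist only for this illustration; the lookup logic mirrors the dictionaries built above.

```python
toy_melodies = [["C4", "E4", "G4"], ["E4", "Rest", "C4"]]

# Same construction as above: sorted unique tokens, then "UNK" appended last.
toy_vocab = sorted({n for mel in toy_melodies for n in mel})
toy_note_to_idx = {n: i for i, n in enumerate(toy_vocab)}
toy_note_to_idx["UNK"] = len(toy_note_to_idx)


def encode(melody, mapping):
    """Map note strings to indices, sending unseen notes to the "UNK" index."""
    unk = mapping["UNK"]
    return [mapping.get(n, unk) for n in melody]


print(toy_note_to_idx)
print(encode(["C4", "F#5", "Rest"], toy_note_to_idx))  # "F#5" is not in the vocabulary
```

Because the vocabulary in this notebook is built from every parsed file before the train/test split, the "UNK" index never appears in the training inputs, so its embedding is never updated. Predictions for genuinely unseen notes should be treated with caution.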


### Data Preparation

- Converts melody and chord sequences into index-based tensors.
- Maintains sequence alignment

In [None]:
# -----------------------------
# Prepare Data
# -----------------------------
x_data = []
y_data = []

for mel, chd in zip(all_melodies, all_chords):
    x_seq = [note_to_idx.get(n, note_to_idx["UNK"]) for n in mel]
    y_seq = [chord_to_idx[c] for c in chd]
    if len(x_seq) == len(y_seq):
        x_data.append(torch.tensor(x_seq))
        y_data.append(torch.tensor(y_seq))

In [4]:
# -----------------------------
# Train/Test Split
# -----------------------------
x_train, x_test, y_train, y_test = train_test_split(x_data, y_data, test_size=0.2, random_state=42)

NameError: name 'train_test_split' is not defined
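As a sketch of what the split does, the function below shuffles the paired sequences with a fixed seed and holds out a fraction for testing, using only the standard library. The name `split_sequences` is hypothetical, and the function will not reproduce the exact partition that scikit-learn's `train_test_split` produces for the same seed.

```python
import random


def split_sequences(xs, ys, test_size=0.2, seed=42):
    """Shuffle paired sequences and split them into train and test lists.

    Returns (x_train, x_test, y_train, y_test), the same order as scikit-learn.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)  # local RNG, does not touch global state
    n_test = max(1, round(len(xs) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(seq, idx):
        return [seq[i] for i in idx]

    return pick(xs, train_idx), pick(xs, test_idx), pick(ys, train_idx), pick(ys, test_idx)


xs = list(range(10))
ys = [x * 10 for x in xs]
x_tr, x_te, y_tr, y_te = split_sequences(xs, ys)
print(len(x_tr), len(x_te))
```

The split is made per tune, not per note, so every note of a given tune lands on the same side of the split.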

## Model Architecture

The model, Harmonizer, is an LSTM (Long Short-Term Memory) sequence labeling model: it maps a sequence of note indices to a chord prediction at every time step, so the output has the same length as the input. It is not an encoder-decoder network.

1. Embedding Layer: Converts note indices into dense vectors (64-dim) whose values are learned during training.
2. LSTM Layer: Processes the embedded sequence (128 hidden units) and captures temporal dependencies in the melody. Uses batch-first tensors: (batch, seq_len, features). The LSTM is unidirectional, so each prediction depends only on the current and earlier notes.
3. Fully Connected Layer: Maps each LSTM output to one score (logit) per chord class; CrossEntropyLoss applies the softmax internally.

Parameters:
- Embedding dim: 64
- Hidden units: 128
- Output size: Number of chord classes
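For a sense of scale, the sketch below counts the trainable parameters of an Embedding, single-layer LSTM, and Linear stack with these sizes. It assumes PyTorch's LSTM layout (four gates, each with input weights, recurrent weights, and two bias vectors). The function name `harmonizer_param_count` and the vocabulary sizes in the example call (50 notes, 20 chords) are hypothetical.

```python
def harmonizer_param_count(num_notes, num_chords, embedding_dim=64, hidden_dim=128):
    """Count trainable parameters for Embedding -> 1-layer LSTM -> Linear."""
    embed = num_notes * embedding_dim
    # Four gates, each with input weights, recurrent weights, and two biases.
    lstm = 4 * (embedding_dim * hidden_dim + hidden_dim * hidden_dim + 2 * hidden_dim)
    fc = hidden_dim * num_chords + num_chords  # weights plus bias
    return {"embed": embed, "lstm": lstm, "fc": fc, "total": embed + lstm + fc}


# Hypothetical vocabulary sizes, chosen only for illustration.
print(harmonizer_param_count(num_notes=50, num_chords=20))
```

With the default sizes, the LSTM accounts for most of the parameters; the embedding and output layers grow linearly with the note and chord vocabularies.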



In [2]:
class Harmonizer(nn.Module):
    def __init__(self, num_notes, num_chords, embedding_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(num_notes, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_chords)

    def forward(self, x):
        x = self.embed(x)
        out, _ = self.lstm(x)
        return self.fc(out)

model = Harmonizer(len(note_to_idx), len(chord_to_idx)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

NameError: name 'nn' is not defined

## Model Training
- Loss: CrossEntropyLoss, computed over every time step.
- Optimization: Adam (lr=0.001).
- Batch handling: each sequence is processed individually (batch size 1) because sequence lengths vary.
- The printed loss is the sum of per-sequence losses over the epoch, not an average.

In [None]:
# -----------------------------
# Training
# -----------------------------
epochs = 10
for epoch in range(epochs):
    model.train()
    total_loss = 0
    for x, y in zip(x_train, y_train):
        x = x.unsqueeze(0).to(device)
        y = y.unsqueeze(0).to(device)
        optimizer.zero_grad()
        out = model(x)
        loss = criterion(out.view(-1, out.size(-1)), y.view(-1))
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"Epoch {epoch+1}, Loss: {total_loss:.2f}")
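The loop above processes one sequence per step. A common alternative is to pad sequences to a shared length so that several can be stacked into one batch. The sketch below shows that idea in plain Python; `pad_batch`, `PAD_NOTE`, and `IGNORE_CHORD` are hypothetical names. In PyTorch the same effect is usually obtained with `torch.nn.utils.rnn.pad_sequence` and the `ignore_index` argument of `CrossEntropyLoss`, whose default value is -100.

```python
PAD_NOTE = 0         # hypothetical padding index for inputs
IGNORE_CHORD = -100  # hypothetical target value the loss should skip


def pad_batch(x_seqs, y_seqs):
    """Pad index sequences to the longest length in the batch.

    Returns padded inputs, padded targets, and a 0/1 mask marking real steps.
    """
    max_len = max(len(x) for x in x_seqs)
    x_pad, y_pad, mask = [], [], []
    for x, y in zip(x_seqs, y_seqs):
        n_pad = max_len - len(x)
        x_pad.append(list(x) + [PAD_NOTE] * n_pad)
        y_pad.append(list(y) + [IGNORE_CHORD] * n_pad)
        mask.append([1] * len(x) + [0] * n_pad)
    return x_pad, y_pad, mask


x_pad, y_pad, mask = pad_batch([[1, 2, 3], [4]], [[7, 8, 9], [5]])
print(x_pad)
print(y_pad)
print(mask)
```

In this notebook index 0 is already a real note, so a batched version would need a dedicated padding index added to the note vocabulary.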

## Evaluation
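- Reports overall accuracy and per-chord-class metrics (precision, recall, F1).
- Passes zero_division=0 so that precision or recall is reported as 0, without a warning, for classes that have no predicted or no true samples. This does not correct for class imbalance.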

In [None]:
def evaluate(model, x_test, y_test):
    model.eval()
    correct = 0
    total = 0

    with torch.no_grad():
        for x, y in zip(x_test, y_test):
            x = x.unsqueeze(0).to(device)
            y = y.unsqueeze(0).to(device)
            out = model(x)
            pred = out.argmax(dim=-1)
            correct += (pred == y).sum().item()
            total += y.numel()

    acc = 100 * correct / total
    print(f"\nTest Accuracy: {acc:.2f}%")

evaluate(model, x_test, y_test)
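Because the parser labels every single note and rest with the placeholder "C", one chord class is likely to dominate, and a high accuracy may simply reflect that. A useful point of comparison is the accuracy of always predicting the most frequent training label. The sketch below computes it on toy label lists; `majority_baseline_accuracy` is a hypothetical helper, and the notebook's tensors would need `.tolist()` before being passed in.

```python
from collections import Counter


def majority_baseline_accuracy(train_targets, test_targets):
    """Accuracy (%) of always predicting the most frequent training label."""
    train_labels = [c for seq in train_targets for c in seq]
    test_labels = [c for seq in test_targets for c in seq]
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for c in test_labels if c == majority_label)
    return majority_label, 100 * hits / len(test_labels)


# Toy label sequences, made up for illustration.
train = [["C", "C", "major triad"], ["C", "C"]]
test = [["C", "major triad", "C", "C"]]
print(majority_baseline_accuracy(train, test))
```

A model is only informative to the extent that its test accuracy exceeds this baseline.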

In [None]:
def evaluate_with_metrics(model, x_test, y_test):
    model.eval()
    y_true = []
    y_pred = []

    with torch.no_grad():
        for x, y in zip(x_test, y_test):
            x = x.unsqueeze(0).to(device)
            out = model(x)
            # Drop only the batch dimension so one-note sequences stay lists
            preds = out.argmax(dim=-1).squeeze(0).cpu().tolist()
            targets = y.tolist()

            y_pred.extend(preds)
            y_true.extend(targets)

    # Compute only over classes actually present in y_true and y_pred
    all_labels = sorted(set(y_true) | set(y_pred))

    print("\nClassification Report:")
    print(classification_report(
        y_true, y_pred,
        labels=all_labels,
        target_names=[idx_to_chord[i] for i in all_labels],
        zero_division=0
    ))

evaluate_with_metrics(model, x_test, y_test)
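To show what the zero-division setting affects, here is a minimal plain-Python sketch of per-class precision and recall. It is an illustration of the definitions, not scikit-learn's implementation, and `per_class_precision_recall` is a hypothetical name.

```python
def per_class_precision_recall(y_true, y_pred, zero_division=0.0):
    """Return {label: (precision, recall)} for every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    report = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        actual = sum(1 for t in y_true if t == label)
        # Precision is undefined when the class is never predicted;
        # recall is undefined when the class never occurs in y_true.
        precision = tp / predicted if predicted else zero_division
        recall = tp / actual if actual else zero_division
        report[label] = (precision, recall)
    return report


# Toy example: a model that predicts "C" for every step.
y_true = ["C", "C", "G", "D"]
y_pred = ["C", "C", "C", "C"]
print(per_class_precision_recall(y_true, y_pred))
```

In the toy example the classes that are never predicted get precision 0 through the fallback value, while the dominant class keeps perfect recall. Per-class numbers expose this pattern where overall accuracy alone would not.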

## Harmonization
- Takes a new melody (a list of note strings).
- Converts it to indices, mapping unknown notes to "UNK".
- Feeds the indices into the trained model and returns one predicted chord per note.





In [None]:
# -----------------------------
# Harmonize New Melody
# -----------------------------
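def harmonize(melody_notes):
    """Predict one chord label per note for a list of note strings."""
    model.eval()
    input_idxs = [note_to_idx.get(n, note_to_idx["UNK"]) for n in melody_notes]
    x = torch.tensor(input_idxs).unsqueeze(0).to(device)
    with torch.no_grad():
        pred = model(x)
        # squeeze(0) removes only the batch dimension, so a one-note melody still yields a list
        pred_idxs = pred.argmax(-1).squeeze(0).tolist()
    return [idx_to_chord[i] for i in pred_idxs]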

### Save to MIDI
- Uses music21 to create a MIDI file from melody and chords.
- Currently only the melody notes are written; the chords argument is not used, and every note or rest gets a quarter-note duration.

In [None]:
# -----------------------------
# Save Harmonized MIDI
# -----------------------------
def save_to_midi(melody, chords, filename="harmonized.mid"):
    s = stream.Stream()
    for m, c in zip(melody, chords):
        n = note.Note(m) if m != "Rest" else note.Rest()
        n.quarterLength = 1.0
        s.append(n)
    s.write("midi", fp=filename)
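Writing the harmony as sound would first require turning each chord label into pitches. The sketch below shows one way to do that for simple triads. It assumes labels that carry a root and a quality, which the labels in this notebook (the placeholder "C" and music21 common names) do not directly provide. The names `triad_midi_pitches`, `PITCH_CLASSES`, and `TRIAD_INTERVALS` are hypothetical, and MIDI 60 is taken to be C4.

```python
PITCH_CLASSES = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
                 "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}
TRIAD_INTERVALS = {"major": (0, 4, 7), "minor": (0, 3, 7)}  # semitones above the root


def triad_midi_pitches(root, quality="major", octave=3):
    """Return the MIDI note numbers of a root-position triad."""
    base = 12 * (octave + 1) + PITCH_CLASSES[root]  # MIDI 60 == C4
    return [base + interval for interval in TRIAD_INTERVALS[quality]]


print(triad_midi_pitches("C"))
print(triad_midi_pitches("A", "minor"))
```

The resulting note numbers could then be written as simultaneous notes alongside each melody note.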

## Example Run

In [None]:
# -----------------------------
# Example Run
# -----------------------------
input_melody = all_melodies[0][:30]
predicted_chords = harmonize(input_melody)
print("Melody:", input_melody)
print("Predicted Chords:", predicted_chords)
save_to_midi(input_melody, predicted_chords)