# 🎹 The Well-Tempered Algorithm
## Learning AI Through the Music of Bach

Welcome to this interactive workshop. We will progress through three distinct eras of AI music generation:

1.  **The Probabilistic Era:** Using **Markov Chains** to understand style as a game of dice.
2.  **The Deep Learning Era:** Using **LSTMs** (a type of recurrent neural network) to gain "memory."
3.  **The Transformer Era:** Using **GPT-2** architectures to treat music as a language.

### Prerequisites
We need a few libraries. `music21` is our toolkit for handling sheet music data.

In [None]:
!pip install music21 tensorflow transformers numpy

import music21
import numpy as np
import random
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Activation
from transformers import TFGPT2LMHeadModel, GPT2Config

print("Environment Ready!")

--- 
## Module 1: The Markov Chain (Probability)

Before we use heavy AI, let's understand the basics. A Markov chain generates music by asking: *"Given the note I just played, which notes tend to follow it?"* It then picks one of those notes at random, weighted by how often each one followed in the training piece.

It has no memory of the past beyond the current note. It captures the local **texture** of Bach well, but it has no sense of larger **structure** such as phrases or cadences.
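
To make the "game of dice" concrete, here is a minimal sketch that turns a melody into a table of transition probabilities and then samples from it. The melody `toy_melody` and the helper names `transition_probabilities` and `sample_next` are made up for illustration; they are not part of the workshop code or of `music21`.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy melody (not taken from any chorale), used only for illustration.
toy_melody = ["C4", "D4", "E4", "C4", "D4", "C4", "E4", "D4", "C4"]

def transition_probabilities(notes):
    """Count note-to-note transitions and normalise each row to probabilities."""
    counts = defaultdict(Counter)
    for current, following in zip(notes, notes[1:]):
        counts[current][following] += 1
    return {
        current: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for current, followers in counts.items()
    }

def sample_next(probs, current, rng=random):
    """Roll the weighted dice for the note that follows `current`."""
    options = list(probs[current])
    weights = [probs[current][o] for o in options]
    return rng.choices(options, weights=weights, k=1)[0]

toy_probs = transition_probabilities(toy_melody)
for note, row in sorted(toy_probs.items()):
    print(note, row)
print("After C4 ->", sample_next(toy_probs, "C4"))
```

In this toy melody, C4 is followed by D4 twice and by E4 once, so the dice land on D4 two times out of three. Storing every observed successor in a list and calling `random.choice` on it draws from the same distribution; the table only makes the odds explicit.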

In [None]:
# --- 1. Load Data ---
def get_bach_chorale():
    # We load a specific chorale from the music21 corpus
    print("Loading Bach Chorale BWV 66.6...")
    return music21.corpus.parse('bach/bwv66.6')

# --- 2. Train (Build Dictionary) ---
def train_markov_chain(score):
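    # Keep single notes only, so every element has a pitch name
    soprano = [n for n in score.parts[0].flatten().notes
               if isinstance(n, music21.note.Note)]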
    transitions = {}
    
    for i in range(len(soprano) - 1):
        curr = soprano[i].nameWithOctave
        next_n = soprano[i+1].nameWithOctave
        
        if curr not in transitions:
            transitions[curr] = []
        transitions[curr].append(next_n)
    return transitions

# --- 3. Generate ---
def generate_markov(chain, length=20):
    current = random.choice(list(chain.keys()))
    melody = [current]
    
    for _ in range(length):
        if current in chain:
            current = random.choice(chain[current])
            melody.append(current)
        else:
            break
    return melody

# --- Run Module 1 ---
score = get_bach_chorale()
brain = train_markov_chain(score)
new_melody = generate_markov(brain)
print("Markov Melody:", new_melody)

--- 
## Module 2: The LSTM (Long Short-Term Memory)

Markov chains forget instantly. An **LSTM** is a Neural Network designed to have a "conveyor belt" of memory. It looks at a *sequence* of notes (e.g., the last 10) to decide the next one.
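
The idea of "look at the last few notes, predict the next one" can be sketched without any neural network. The example below slides a window over a short phrase and pairs each window with the note that follows it. The phrase `toy_notes` and the helper `make_windows` are hypothetical names used only for illustration, and the window length is 3 rather than 10 so the output stays readable.

```python
def make_windows(notes, seq_len):
    """Pair each run of `seq_len` notes with the single note that follows it."""
    inputs, targets = [], []
    for i in range(len(notes) - seq_len):
        inputs.append(notes[i:i + seq_len])
        targets.append(notes[i + seq_len])
    return inputs, targets

# Hypothetical toy phrase; real training data would come from the chorales.
toy_notes = ["C4", "D4", "E4", "F4", "G4", "A4"]
toy_windows, toy_targets = make_windows(toy_notes, seq_len=3)

for window, target in zip(toy_windows, toy_targets):
    print(window, "->", target)
```

Each printed pair is one training example: the window is what the network sees, and the target is what it is asked to predict. A phrase of 6 notes with a window of 3 yields only 3 examples, which is why more data helps.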

**Note:** Deep learning needs a lot of data. To keep the demo fast, we load only the first 10 Bach files in the `music21` corpus. In a real scenario, you would want 300+.

In [None]:
# --- 1. Data Prep ---
def get_lstm_data():
    chorales = music21.corpus.getComposer('bach')[:10]
    all_notes = []
    for c in chorales:
        try:
            parsed = music21.corpus.parse(c)
            notes = parsed.parts[0].flatten().notes
            for n in notes:
                if isinstance(n, music21.note.Note):
                    all_notes.append(n.nameWithOctave)
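                elif isinstance(n, music21.chord.Chord):
                    # Represent a chord by its highest pitch
                    top = max(n.pitches, key=lambda p: p.ps)
                    all_notes.append(top.nameWithOctave)
        except Exception:
            # Skip corpus files that fail to parse or have no parts
            continue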
    return all_notes

raw_notes = get_lstm_data()
pitchnames = sorted(set(raw_notes))
note_to_int = {n: i for i, n in enumerate(pitchnames)}
int_to_note = {i: n for i, n in enumerate(pitchnames)}
n_vocab = len(pitchnames)

# Create Sequences
SEQ_LEN = 10
network_in = []
network_out = []

for i in range(len(raw_notes) - SEQ_LEN):
    seq_in = raw_notes[i:i + SEQ_LEN]
    seq_out = raw_notes[i + SEQ_LEN]
    network_in.append([note_to_int[c] for c in seq_in])
    network_out.append(note_to_int[seq_out])

X = np.reshape(network_in, (len(network_in), SEQ_LEN, 1)) / float(n_vocab)
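# Fix the number of classes so y always matches the Dense(n_vocab) output layer
y = tf.keras.utils.to_categorical(network_out, num_classes=n_vocab)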

# --- 2. Model ---
model_lstm = Sequential()
model_lstm.add(LSTM(128, input_shape=(X.shape[1], X.shape[2])))
model_lstm.add(Dense(n_vocab))
model_lstm.add(Activation('softmax'))
model_lstm.compile(loss='categorical_crossentropy', optimizer='rmsprop')

# --- 3. Train (Short run for demo) ---
print("Training LSTM...")
model_lstm.fit(X, y, epochs=5, batch_size=64)

# --- 4. Generate ---
start = np.random.randint(0, len(network_in))  # upper bound is exclusive
pattern = list(network_in[start])  # copy so the training data is not modified
prediction_output = []

for i in range(20):
    input_x = np.reshape(pattern, (1, len(pattern), 1)) / float(n_vocab)
    prediction = model_lstm.predict(input_x, verbose=0)
    index = np.argmax(prediction)
    result = int_to_note[index]
    prediction_output.append(result)
    pattern.append(index)
    pattern = pattern[1:]

print("LSTM Melody:", prediction_output)

--- 
## Module 3: The Transformer (GPT-2)

Modern AI treats music as a **language**. We will use Hugging Face's GPT-2 architecture. Instead of passing information step by step through a recurrent state like an LSTM, the Transformer uses **self-attention**: each position looks directly at every earlier note in its context window at once.
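
As a rough, dependency-free sketch of that idea, the example below computes causal attention weights for a four-note phrase. It is a simplification under stated assumptions: the 2-D embeddings in `toy_embeddings` are made-up numbers, each embedding is used as both query and key (a real Transformer learns separate query, key, and value projections), and the final weighted sum over values is left out. The function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention_weights(vectors):
    """For each position, weight itself and all earlier positions by scaled dot product."""
    dim = len(vectors[0])
    rows = []
    for i, query in enumerate(vectors):
        # Causal mask: position i may only look at positions 0..i.
        scores = [
            sum(q * k for q, k in zip(query, vectors[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Later positions get weight 0 so every row has the same length.
        rows.append(weights + [0.0] * (len(vectors) - i - 1))
    return rows

# Hypothetical 2-D embeddings for a four-note phrase (made-up numbers).
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]

for row in causal_attention_weights(toy_embeddings):
    print([round(w, 2) for w in row])
```

Each printed row shows how much one note "listens" to itself and to the notes before it. The zeros to the right of the diagonal are the causal mask: a GPT-style model never sees notes that come later in the phrase.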

We tokenize each note into a string of the form `pitch_duration`, for example `"C#4_1.0"` (C♯4 lasting one quarter note).
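
A minimal sketch of this token scheme, including the round trip back to pitch and duration, might look like the following. The helper names `note_to_token` and `token_to_note` and the phrase `toy_phrase` are hypothetical. Durations are parsed with `Fraction` on the assumption that triplet lengths can appear as strings like `"1/3"`.

```python
from fractions import Fraction

def note_to_token(pitch, quarter_length):
    """Join pitch name and duration (in quarter notes) into one token string."""
    return f"{pitch}_{quarter_length}"

def token_to_note(token):
    """Split a token back into (pitch, duration in quarter notes)."""
    pitch, duration = token.rsplit("_", 1)
    # Fraction parses both "1.0" and "1/3" without rounding.
    return pitch, Fraction(duration)

# Hypothetical phrase as (pitch, quarter length) pairs, for illustration only.
toy_phrase = [("C#4", 1.0), ("D4", 0.5), ("E4", 0.5), ("C#4", 1.0)]
toy_tokens = [note_to_token(p, d) for p, d in toy_phrase]

# Map each distinct token to an integer id, which is what the model consumes.
toy_vocab = sorted(set(toy_tokens))
toy_token_to_id = {tok: i for i, tok in enumerate(toy_vocab)}
toy_ids = [toy_token_to_id[tok] for tok in toy_tokens]

print(toy_tokens)
print(toy_ids)
```

Because pitch and duration are fused into one token, the same pitch with two different lengths counts as two different "words". That keeps the scheme simple but makes the vocabulary grow quickly as more rhythms appear.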

In [None]:
# --- 1. Tokenization ---
dataset_strings = []
vocab = set()

# We reuse the raw data logic but include duration
for c in music21.corpus.getComposer('bach')[:5]:
    try:
        s = music21.corpus.parse(c)
        flat = s.parts[0].flatten().notes
        tokens = []
        for n in flat:
            if isinstance(n, music21.note.Note):
                tok = f"{n.nameWithOctave}_{n.quarterLength}"
                tokens.append(tok)
                vocab.add(tok)
        dataset_strings.append(tokens)
    except Exception:
        continue

vocab_list = sorted(vocab)
t2i = {t: i for i, t in enumerate(vocab_list)}
i2t = {i: t for i, t in enumerate(vocab_list)}

# --- 2. GPT Model Config ---
config = GPT2Config(
    vocab_size=len(vocab),
    n_positions=32,
    n_ctx=32,
    n_embd=128,
    n_layer=2,
    n_head=4
)
gpt_model = TFGPT2LMHeadModel(config)
gpt_model.build(input_shape=(None, 32))
gpt_model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
print("GPT Model initialized.")