### Part 1 — Functions for Clarity

**Goal: Modularize the program, avoid long scripts, and make it easy to follow.**

#### Functions:

1. parse_melody_string(melody_str) — Convert raw melody strings to (note, duration) tuples.

2. build_bigram_model(melodies) — Build a bigram table: current note → next-note counts.

3. choose_next_note(current_note, bigram) — Randomly pick the next note weighted by counts.

4. generate_melody(bigram, max_length, start_note) — Generate a melody using the bigram.

5. print_bigram(bigram) — Print the bigram table in a readable way.

6. flatten_melodies(melodies) — Combine multiple melodies into one list (optional).

7. add_start_end_tokens(melodies) — Add ^ and $ to control melody start/end (optional).
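The two optional helpers are listed here but not implemented in the cells that follow. A minimal sketch of what they might look like is given below, assuming melodies are lists of (note, duration) tuples as produced by parse_melody_string. The default arguments and the duration of 0 for the tokens are assumptions chosen to match build_bigram_model.

```python
def flatten_melodies(melodies):
    """Concatenate several melodies into one flat list of (note, duration) tuples."""
    return [pair for melody in melodies for pair in melody]


def add_start_end_tokens(melodies, start="^", end="$"):
    """Return new melodies wrapped in start/end tokens (duration 0 is an assumption)."""
    return [[(start, 0)] + list(melody) + [(end, 0)] for melody in melodies]


melodies = [[("C4", 1.0), ("D4", 1.0)], [("E4", 2.0)]]
print(flatten_melodies(melodies))
print(add_start_end_tokens(melodies))
```

Two caveats on the design: flattening discards melody boundaries, so a model built from the flat list would also count the transition from the last note of one melody to the first note of the next. And since build_bigram_model already adds ^ and $ itself, passing it melodies wrapped by add_start_end_tokens would add the tokens twice.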

In [None]:
# parse_melody_string(melody_str)
def parse_melody_string(melody_str):
    notes = []
    for token in melody_str.split():
        if "_" in token:
            pitch, duration = token.split("_")
            notes.append((pitch, float(duration)))
        else:
            notes.append((token, 1.0))  # default duration if not specified
    return notes

# build_bigram_model
from collections import defaultdict, Counter

def build_bigram_model(melodies):
    bigram = defaultdict(Counter)
    for melody in melodies:
        melody = [("^", 0)] + melody + [("$", 0)]  # start and end tokens
        for i in range(len(melody) - 1):
            current_note = melody[i][0]
            next_note = melody[i+1][0]
            bigram[current_note][next_note] += 1
    return bigram

# choose_next_note(current_note, bigram)
import random

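def choose_next_note(current_note, bigram):
    counts = bigram.get(current_note)
    # Unknown note, or a note with no recorded successors: end the melody.
    # (random.choices raises an error on an empty population.)
    if not counts:
        return "$"
    next_notes = list(counts.keys())
    weights = list(counts.values())
    # Weighted draw: a successor seen 3 times is 3x as likely as one seen once.
    return random.choices(next_notes, weights=weights, k=1)[0]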

# generate_melody(bigram, max_length=10, start_note=None)
def generate_melody(bigram, max_length=10, start_note=None):
    melody = []
    current_note = start_note if start_note else "^"
    if start_note and max_length > 0:
        melody.append((start_note, 1.0))  # include the requested start note in the output
    # The length check bounds the loop even if "$" is never drawn.
    while len(melody) < max_length:
        next_note = choose_next_note(current_note, bigram)
        if next_note == "$":
            break
        melody.append((next_note, 1.0))  # default duration
        current_note = next_note
    return melody

# print_bigram(bigram)
def print_bigram(bigram):
    for note, counter in bigram.items():
        print(f"{note} -> {dict(counter)}")

### Part 2 — Build the bigram model

**Goal: Learn note-to-note relationships to avoid random-sounding melodies.**

#### Steps:

Add start (^) and end ($) tokens to each melody.

For each pair of consecutive notes (current, next), update a counter:

bigram[current_note][next_note] += 1


The result is a dictionary of counters:

{
    "C4": {"D4": 3, "E4": 1},
    "D4": {"E4": 2, "G4": 1},
    ...
}
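To make the link between counts and sampling explicit, the sketch below normalizes each row of counts into probabilities. The helper name transition_probabilities is hypothetical and not one of the functions listed in Part 1. The weighted draw in choose_next_note works directly on the raw counts, so this step is only for inspection.

```python
from collections import Counter


def transition_probabilities(bigram):
    """Convert next-note counts into probabilities for each current note."""
    probs = {}
    for note, counter in bigram.items():
        total = sum(counter.values())
        if total == 0:
            continue  # skip notes with no recorded successors
        probs[note] = {nxt: count / total for nxt, count in counter.items()}
    return probs


example = {
    "C4": Counter({"D4": 3, "E4": 1}),
    "D4": Counter({"E4": 2, "G4": 1}),
}
probs = transition_probabilities(example)
print(probs["C4"])  # {'D4': 0.75, 'E4': 0.25}
```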

In [None]:
# Example
dataset = [
    parse_melody_string("C4_1 D4_1 E4_1 F4_1 G4_2"),
    parse_melody_string("E4_0.5 F4_0.5 G4_1 A4_2 B4_1")
]

bigram = build_bigram_model(dataset)
print_bigram(bigram)

### Part 3 — Generate new melodies

**Goal: Use the bigram to create coherent melodies.**

#### Algorithm:

1. Start with ^ or a random note.

2. While melody length < desired length:

    - Look up possible next notes using bigram[current_note].

    - Pick one randomly, weighted by counts.

    - Append it to the melody.

3. Stop when $ is drawn or the maximum length is reached.

4. Ending rule (optional): force the melody to end on the tonic.
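The optional ending rule in step 4 is not implemented by generate_melody. One possible way to apply it as a post-processing step is sketched below. The helper name end_on_tonic, the default tonic of C4, and the 1.0 duration for an appended note are all assumptions for illustration.

```python
def end_on_tonic(melody, tonic="C4", max_length=None):
    """Return a copy of the melody whose final note is the tonic."""
    result = list(melody)
    if result and result[-1][0] == tonic:
        return result  # already ends on the tonic
    if max_length is not None and result and len(result) >= max_length:
        # No room to append: overwrite the last pitch but keep its duration.
        result[-1] = (tonic, result[-1][1])
    else:
        result.append((tonic, 1.0))  # assumed default duration
    return result


print(end_on_tonic([("D4", 1.0), ("E4", 0.5)]))
print(end_on_tonic([("D4", 1.0), ("E4", 0.5)], max_length=2))
```

Overwriting or appending a note ignores the bigram counts, so the final transition may be one that never occurs in the dataset. A gentler alternative would be to keep sampling until the tonic comes up naturally.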

In [None]:
# Example
for i in range(3):
    new_melody = generate_melody(bigram, max_length=8)
    print(f"Generated Melody {i+1}: {new_melody}")

### Part 4 — Show your results

**Goal: Demonstrate the model and generated melodies.**

#### Outputs:

1. Bigram model printed:

    - C4 -> {'D4': 3, 'E4': 1}
    - D4 -> {'E4': 2, 'G4': 1}


2. Generated melodies:

    1. C D E G C D
    2. G F E D C E
    3. E F G A C D

### Part 5 — Clean up and enhancements

**Goal: Avoid infinite loops and improve musicality.**

- Add start/end tokens ^ and $ to terminate melodies naturally.

- Optional constraints:

    - No note repeated three or more times in a row.

    - Prefer stepwise motion: avoid leaps larger than 5 semitones.

    - Include duration variation using data from the dataset.

- Find most common transitions in the dataset to inform starting notes or endings.

- Test with longer melodies to see if they sound musical.
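The repetition and leap constraints above could be applied by filtering candidates before the weighted draw. The sketch below is one possible approach rather than part of the functions in Part 1: pitch_number, allowed, choose_constrained, and most_common_transitions are hypothetical helpers. It assumes natural note names such as C4 (no sharps or flats) and a history list that holds pitches only, without the ^ token.

```python
import random
from collections import Counter

# Semitone offsets of the natural notes within one octave.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_number(note):
    """Map a natural note name like 'C4' to a semitone index."""
    return 12 * int(note[1:]) + SEMITONES[note[0]]


def allowed(candidate, history, max_leap=5, max_repeat=2):
    """Check a candidate pitch against the repetition and leap constraints."""
    if candidate == "$":
        return True  # ending is always permitted
    # Reject a note that would extend a run beyond max_repeat identical notes.
    if len(history) >= max_repeat and all(n == candidate for n in history[-max_repeat:]):
        return False
    # Reject leaps larger than max_leap semitones from the previous note.
    if history and abs(pitch_number(candidate) - pitch_number(history[-1])) > max_leap:
        return False
    return True


def choose_constrained(current_note, bigram, history):
    """Weighted choice among candidates that pass allowed(); '$' if none pass."""
    counts = bigram.get(current_note, {})
    options = {n: c for n, c in counts.items() if allowed(n, history)}
    if not options:
        return "$"
    return random.choices(list(options), weights=list(options.values()), k=1)[0]


def most_common_transitions(bigram, n=3):
    """Return the n most frequent (current, next, count) triples."""
    triples = [(cur, nxt, c) for cur, counter in bigram.items() for nxt, c in counter.items()]
    return sorted(triples, key=lambda t: t[2], reverse=True)[:n]


demo = {"C4": Counter({"C4": 5, "D4": 1, "B4": 4})}
print(choose_constrained("C4", demo, ["C4", "C4"]))  # D4
print(most_common_transitions(demo, n=1))
```

In the demo, C4 is rejected because it would be the third repeat and B4 because it lies 11 semitones away, so D4 is the only candidate left. Filtering before sampling keeps the relative weights of the surviving candidates, at the cost of ending the melody early when every candidate is rejected.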