# MP3 to Musical Notes Extractor

This Jupyter notebook demonstrates a **minimal but fully functional** pipeline that:

1. **Accepts an MP3 file as input**  
2. **Extracts the fundamental pitches across the audio** using `librosa`’s `pyin` algorithm  
3. **Maps those pitches to Western musical note names**  
4. **Outputs** the ordered sequence of detected notes and a simple piano-roll-style plot.

The workflow is intentionally lightweight and uses only open-source Python libraries:
* `librosa` for audio I/O & pitch detection  
* `music21` for music-theory helpers (note spelling)  
* `matplotlib` & `pandas` for visualisation & tabular display

> ℹ️  Pitch detection is a hard problem: the algorithm used here works best on **monophonic** recordings or recordings with a clearly dominant melody. Polyphonic, noisy, or heavily accompanied tracks will yield only approximate results.

Feel free to experiment, tweak parameters, and build on this foundation for more sophisticated analysis.



In [1]:
# --- 1. Install deps (Colab / Binder safety) ------------------------------
# (Skip if already installed in your environment)
%pip -q install librosa music21 pandas matplotlib ipywidgets tqdm



Note: you may need to restart the kernel to use updated packages.





In [2]:
from pathlib import Path
import librosa as lr
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import Audio, display
from music21.pitch import Pitch
from tqdm.notebook import tqdm

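# Utility -------------------------------------------------------------------
# Note names for MIDI 0-127, spelled by music21. music21 writes flats with
# "-", so MIDI 63 is "E-4" (E-flat 4), not the "Eb4" form librosa expects.
SEMITONE_NAMES = [Pitch(midi=i).nameWithOctave for i in range(128)]

def hz_to_note_name(hz: float) -> str:
    """Convert a frequency in Hz to the nearest MIDI note name (e.g. 'C4').

    NaN (an unvoiced frame from pyin) maps to 'rest'.
    """
    if np.isnan(hz):
        return "rest"
    midi = int(np.round(lr.hz_to_midi(hz)))
    # Clamp so an out-of-range frequency cannot index outside the table.
    return SEMITONE_NAMES[min(max(midi, 0), 127)]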



Matplotlib is building the font cache; this may take a moment.
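
The conversion that `hz_to_note_name` relies on can also be written out by hand. The sketch below is a dependency-free illustration, assuming 12-tone equal temperament with A4 = 440 Hz (MIDI 69). The names `hz_to_midi_number` and `midi_to_name` are hypothetical helpers, and the sharps-only spelling is a simplification that differs from music21's spelling of some accidentals.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def hz_to_midi_number(hz: float) -> int:
    """Nearest MIDI note number, assuming A4 = 440 Hz = MIDI 69."""
    return round(69 + 12 * math.log2(hz / 440.0))

def midi_to_name(midi: int) -> str:
    """Sharps-only name with octave; MIDI 60 is C4 (octave = midi // 12 - 1)."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"

print(midi_to_name(hz_to_midi_number(440.0)))   # A4
print(midi_to_name(hz_to_midi_number(261.63)))  # C4
```

Each semitone is a frequency ratio of 2^(1/12), which is why the logarithm is scaled by 12 before rounding.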


In [3]:
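# --- 2b. (Optional) Upload an MP3 via the notebook UI -----------------------
# Skip this cell if you prefer to hard-code the path in the next cell.
# Requires ipywidgets (`%pip install ipywidgets`).
from ipywidgets import FileUpload

uploader = FileUpload(accept='.mp3', multiple=False)

def save_upload(change):
    """Write the uploaded bytes to disk and set the global `mp3_path`."""
    global mp3_path
    value = change["new"]
    if not value:
        return
    # ipywidgets 7 exposes a dict keyed by filename; ipywidgets 8 a tuple of dicts.
    if isinstance(value, dict):
        name, item = next(iter(value.items()))
    else:
        item = value[0]
        name = item["name"]
    mp3_path = Path(name)
    mp3_path.write_bytes(bytes(item["content"]))
    print(f"Saved upload to {mp3_path}")

# A busy-wait loop would block the kernel, so the widget value would never
# arrive; react to the upload with an observer instead.
uploader.observe(save_upload, names="value")
display(uploader)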




In [None]:
# --- 2. Choose an MP3 file --------------------------------------------------
# Skip the assignment below if the upload cell above already set `mp3_path`.
mp3_path = Path("your_audio.mp3")  # <-- replace with your path
assert mp3_path.exists(), f"File not found: {mp3_path}"

# Listen to verify
display(Audio(str(mp3_path)))




In [None]:
# --- 3. Load audio & run pitch detection -----------------------------------
SAMPLE_RATE = 22050  # librosa default

audio, sr = lr.load(mp3_path, sr=SAMPLE_RATE, mono=True)
print(f"Loaded {len(audio)/sr:.1f} seconds, sr={sr}")

# Use probabilistic YIN (pyin) for F0 estimation
f0, voiced_flag, voiced_prob = lr.pyin(
    audio,
    fmin=lr.note_to_hz("C2"),
    fmax=lr.note_to_hz("C7"),
    sr=sr,
    frame_length=2048,
    hop_length=512,
)

# Map to note names
note_seq = [hz_to_note_name(f) for f in f0]
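
`pyin` also returns a per-frame voicing probability (`voiced_prob`), which the cell above does not use. One optional refinement is to treat low-confidence frames as rests before naming notes. The following is a sketch on plain Python lists; `mask_low_confidence` and the 0.5 threshold are illustrative assumptions, not librosa defaults.

```python
import math

def mask_low_confidence(f0, voiced_prob, threshold=0.5):
    """Return a copy of f0 with frames below `threshold` replaced by NaN."""
    return [
        f if (not math.isnan(f) and p >= threshold) else math.nan
        for f, p in zip(f0, voiced_prob)
    ]

# Toy data standing in for pyin output (not real measurements)
f0_demo = [440.0, 442.0, float("nan"), 330.0]
prob_demo = [0.95, 0.20, 0.01, 0.80]
print(mask_low_confidence(f0_demo, prob_demo))  # [440.0, nan, nan, 330.0]
```

With the notebook's arrays this could be applied as `note_seq = [hz_to_note_name(f) for f in mask_low_confidence(f0, voiced_prob)]`. A higher threshold trades missed notes for fewer false detections.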



In [None]:
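# --- 4. Aggregate consecutive identical notes ------------------------------
HOP_LENGTH = 512  # must match the hop_length passed to lr.pyin above

def frame_to_time(idx):
    """Convert a frame index to seconds."""
    return idx * HOP_LENGTH / sr

notes = []
current = None
start_idx = 0

for i, n in enumerate(note_seq):
    if n != current:
        # The note changed: close the previous run (if any) and start a new one.
        if current is not None:
            notes.append({
                "note": current,
                "start_sec": frame_to_time(start_idx),
                "end_sec": frame_to_time(i),
            })
        current = n
        start_idx = i
# Close the final run (skipped when no frames were analysed)
if current is not None:
    notes.append({
        "note": current,
        "start_sec": frame_to_time(start_idx),
        "end_sec": frame_to_time(len(note_seq)),
    })

df = pd.DataFrame(notes, columns=["note", "start_sec", "end_sec"])
print(df.head())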
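
Frame-level pitch tracks often contain spurious one- or two-frame notes at transitions. A possible post-processing step is to discard segments shorter than some minimum duration and then merge neighbours that end up with the same name. The sketch below works on a list of dicts shaped like `notes`; `drop_short_notes` and the 0.05 s default are hypothetical choices to tune by ear.

```python
def drop_short_notes(notes, min_dur=0.05):
    """Remove segments shorter than `min_dur` seconds, then merge equal neighbours."""
    cleaned = []
    for seg in notes:
        if seg["end_sec"] - seg["start_sec"] < min_dur:
            continue  # too short to be a deliberate note
        if cleaned and cleaned[-1]["note"] == seg["note"]:
            # Same note on both sides of a dropped blip: extend the previous one
            cleaned[-1]["end_sec"] = seg["end_sec"]
        else:
            cleaned.append(dict(seg))  # copy so the input list is left untouched
    return cleaned

demo = [
    {"note": "A4", "start_sec": 0.00, "end_sec": 0.50},
    {"note": "B4", "start_sec": 0.50, "end_sec": 0.52},  # 20 ms blip
    {"note": "A4", "start_sec": 0.52, "end_sec": 1.00},
    {"note": "C5", "start_sec": 1.00, "end_sec": 1.40},
]
print([(n["note"], n["start_sec"], n["end_sec"]) for n in drop_short_notes(demo)])
# [('A4', 0.0, 1.0), ('C5', 1.0, 1.4)]
```

In the notebook this would slot in as `df = pd.DataFrame(drop_short_notes(notes))`. Note that rests are filtered by the same rule, so very short silences are absorbed into the surrounding note.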



In [None]:
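# --- 5. Visualise as a piano-roll-style plot -------------------------------
voiced = df[df["note"] != "rest"].copy()  # drop rests (unvoiced spans)
# Use music21 for the MIDI numbers: librosa's note_to_midi would misread
# music21 flat spellings such as "E-4" as E in octave -4.
voiced["midi"] = [Pitch(n).midi for n in voiced["note"]]

plt.figure(figsize=(12, 4))
plt.hlines(voiced["midi"], voiced["start_sec"], voiced["end_sec"], lw=4)
if not voiced.empty:
    # Tick every second semitone across the range actually detected
    plt.yticks(range(int(voiced["midi"].min()), int(voiced["midi"].max()) + 1, 2))
plt.xlabel("Time (s)")
plt.ylabel("MIDI note number")
plt.title("Detected notes (approx.)")
plt.grid(True)
plt.show()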
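
The overview lists the ordered sequence of detected notes as an output, while the cells above only print the first rows of the table. One simple way to print the full sequence is sketched here, assuming segments shaped like the `notes` list; `note_sequence` is a hypothetical helper name.

```python
def note_sequence(notes):
    """Ordered note names with rests removed."""
    return [seg["note"] for seg in notes if seg["note"] != "rest"]

# Toy segments standing in for the notebook's `notes` list
demo = [{"note": "C4"}, {"note": "rest"}, {"note": "E4"}, {"note": "G4"}]
print(" ".join(note_sequence(demo)))  # C4 E4 G4
```

In the notebook, `print(" ".join(note_sequence(notes)))` would give the melody as one line of note names.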

