# Final Team Project: Music Genre and Composer Classification Using Deep Learning

### Introduction

Music is a ubiquitous art form with a rich history. Different composers have imbued their work with unique styles, making identifying the creator of a piece challenging for novices. This project leverages deep learning to accurately identify the composer of a given musical score.

### Objective

The primary objective is developing a deep learning model to predict the composer of a musical score. We will utilize two techniques:

- Long Short-Term Memory (LSTM)
- Convolutional Neural Network (CNN)

### Project Timeline

- **Module 2 (End of Week 2):** Team formation (2-3 members). Utilize Canvas, USD Email, or Slack.
- **Module 4 (End of Week 4):** Team representative submits "Team Project Status Update Form."
- **Module 7 (End of Week 7):** Final project deliverables due. **No extensions will be granted.**
    - Project Report (PDF)
    - Project Notebook (.ipynb exported as PDF or HTML)

### Dataset

The project utilizes a [Kaggle dataset](link_to_dataset) of MIDI files from classical composers.  Focus will be on:

1. Bach
2. Beethoven
3. Chopin
4. Mozart

### Methodology

1. **Data Collection:** Dataset provided.
2. **Data Pre-processing:**  Convert scores to MIDI format and apply data augmentation.
3. **Feature Extraction:**  Extract features like notes, chords, and tempo using music analysis tools.
4. **Model Building:** Develop LSTM and CNN models for composer classification.
5. **Model Training:** Train models using pre-processed and feature-extracted data.
6. **Model Evaluation:** Evaluate performance using accuracy, precision, and recall metrics.
7. **Model Optimization:** Fine-tune hyperparameters for optimal performance.

### Deliverables

1. **Project Report (PDF):** Comprehensive documentation in APA 7 style ([Sample Professional Paper](link_to_sample)).  
    - File Naming: `DeliverableName-TeamNumber.pdf` (e.g., `Project_Report-Team1.pdf`)
    - Include:
        - Methodology
        - Data pre-processing steps
        - Feature extraction techniques
        - Model architecture
        - Training process
        - Reference list with citations
        - Concluding section with findings and future improvements
2. **Project Notebook (PDF or HTML):** Jupyter Notebook containing the complete project code.
    - Data pre-processing
    - Feature extraction
    - Model building
    - Training
    - Evaluation
    - Additional analysis/visualizations

### Conclusion

This project aims to accurately predict composers of musical scores using LSTM and CNN models. The final model can benefit musicians, enthusiasts, and listeners alike.

### Power Usage

- Utilize Google Colab GPU/TPU for increased computational power.
- Consider subscribing to [Google Colab Pro+](https://colab.research.google.com/signup) if needed.

**Note:** Team member grades may vary based on individual contribution levels.

This assignment uses **Turnitin** for plagiarism detection. Review your work using the **Draft Coach** extension in Google Docs before submission.

### Rubric

#### Final Team Project Scoring Rubric

| Criteria | Ratings | Pts |
|---|---|---|
| **Project Report (25%)** | Meets or Exceeds Expectations: Report thoroughly describes methodology, data preprocessing, feature extraction, model architecture, and training process for reproducibility. | 75 pts |
|  | Approaches Expectations: Report generally describes the above, but needs minor revisions. | 61.5 pts |
|  | Below Expectations: Report minimally describes the above, requiring major revisions. | 52.5 pts |
|  | Inadequate Attempt: Report misses one or more key elements. | 0 pts |
|  | Non-Performance | 0 pts |
| **Project Notebook (65%)** | Meets or Exceeds Expectations: High-quality notebook with complete code, including data preprocessing, feature extraction, model building, training, evaluation, and additional analysis. | 175.5 pts |
|  | Approaches Expectations: Notebook contains all elements but needs minor revisions. | 159.9 pts |
|  | Below Expectations: Notebook contains all elements but needs major revisions. | 136.5 pts |
|  | Inadequate Attempt: Notebook missing one or more key elements. | 0 pts |
|  | Non-Performance | 0 pts |
| **References and Citations (5%)** | Meets or Exceeds Expectations: Complete reference list with proper APA citations. | 13.5 pts |
|  | Approaches Expectations: Reference list needs minor APA formatting revisions. | 12.3 pts |
|  | Below Expectations: Reference list needs major APA formatting revisions. | 10.5 pts |
|  | Inadequate Attempt: Incomplete or missing reference list. | 0 pts |
|  | Non-Performance | 0 pts |
| **Conclusion (5%)** | Meets or Exceeds Expectations: Thorough conclusion summarizing the project, highlighting key findings, and suggesting future improvements. | 13.5 pts |
|  | Approaches Expectations: Adequate conclusion, but could benefit from further elaboration. | 12.3 pts |
|  | Below Expectations: Vague conclusion, lacking key elements. | 10.5 pts |
|  | Inadequate Attempt: Conclusion missing one or more key elements. | 0 pts |
|  | Non-Performance | 0 pts |
| **Total Points:** | | **300** |


In [None]:
%matplotlib inline

def bootstrap():
    # @title Bootstrap Google Colab {display-mode:"form"}

    # CONFIGURE: Parameters
    GOOGLE_DRIVE_FOLDER = "aai-511" # @param {type:"string"}
    GitHub = True  # @param {type:"boolean"}
    OpenAI = True  # @param {type:"boolean"}
    HuggingFace = True  # @param {type:"boolean"}
    Kaggle = True  # @param {type:"boolean"}

    # ENSURE: Secrets
    from google.colab import userdata
    SECRETS = [
        ("GH_TOKEN", GitHub), # https://github.com/settings/personal-access-tokens/new
        ("GITHUB_USERNAME", GitHub), # git config --global user.name
        ("GITHUB_EMAIL", GitHub), # git config --global user.email
        ("OPENAI_API_KEY", OpenAI), # https://platform.openai.com/api-keys
        ("HF_TOKEN", HuggingFace), # https://huggingface.co/settings/tokens?new_token=true
        ("KAGGLE_USERNAME", Kaggle),
        ("KAGGLE_KEY", Kaggle), # https://www.kaggle.com/settings#:~:text=Create%20New%20Token
    ]
    for secret, enabled in SECRETS:
        if enabled:
            try:
                userdata.get(secret)
            except userdata.SecretNotFoundError:
                raise ValueError(f"Must set Google Colab secret: {secret}.")

    # CONFIGURE: Environment
    import os
    os.environ['PIP_QUIET'] = '3'
    os.environ['PIP_PROGRESS_BAR'] = 'off'
    os.environ['PIP_ROOT_USER_ACTION'] = 'ignore'
    os.environ['DEBIAN_FRONTEND'] = 'noninteractive'

    # CONFIGURE: matplotlib
    import matplotlib.pyplot as plt
    plt.rcParams['figure.dpi'] = 300
    plt.rcParams['savefig.dpi'] = 300

    # DISABLE: Telemetry
    os.environ['HF_HUB_DISABLE_TELEMETRY'] = '1'
    os.environ['GRADIO_ANALYTICS_ENABLED'] = 'False'

    # CONFIGURE: apt
    # https://manpages.ubuntu.com/manpages/bionic/man5/apt.conf.5.html
    APT_CONFIG = [
        'APT::Acquire::Retries "20";',
        'APT::Clean-Installed "true";',
        'APT::Get::Assume-Yes "true";',
        'APT::Get::Clean "always";',
        'APT::Get::Fix-Broken "true";',
        'APT::Install-Recommends "0";',
        'APT::Install-Suggests "0";',
        'APT::Sources::List::Disable-Auto-Refresh "true";',
        'Dpkg::Options "--force-confnew";',
        'Dpkg::Use-Pty "0";',
        'Quiet "2";',
    ]
    with open('/etc/apt/apt.conf.d/01apt.conf', 'w') as file:
        for setting in APT_CONFIG:
            file.write(setting + '\n')

    # INSTALL: uv
    # https://github.com/astral-sh/uv
    !pip install uv

    # AUTHENTICATE: GitHub
    # https://github.com/cli/cli/blob/trunk/docs/install_linux.md
    if GitHub:
        !apt-get remove --purge gh > /dev/null
        !mkdir -p -m 755 /etc/apt/keyrings
        !wget -qO- https://cli.github.com/packages/githubcli-archive-keyring.gpg | tee /etc/apt/keyrings/githubcli-archive-keyring.gpg > /dev/null
        !chmod go+r /etc/apt/keyrings/githubcli-archive-keyring.gpg
        !echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | tee /etc/apt/sources.list.d/github-cli.list > /dev/null
        !apt-get update > /dev/null
        !apt-get install gh > /dev/null
        !gh auth login --hostname "github.com" --git-protocol https --with-token <<< {userdata.get("GH_TOKEN")}
        !git config --global user.name {userdata.get("GITHUB_USERNAME")}
        !git config --global user.email {userdata.get("GITHUB_EMAIL")}
        !git config --global pull.rebase false
        !git config --global credential.helper store

    # AUTHENTICATE: OpenAI
    # https://www.kaggle.com/settings
    if OpenAI:
        os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
        !uv pip install --system --quiet openai
        from openai import OpenAI
        client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

    # AUTHENTICATE: Hugging Face
    # https://huggingface.co/docs/huggingface_hub/en/quick-start#authentication
    if HuggingFace:
        !uv pip install --system --quiet huggingface_hub[cli]
        !huggingface-cli login --add-to-git-credential --token {userdata.get("HF_TOKEN")} > /dev/null

    # AUTHENTICATE: Kaggle
    # https://www.kaggle.com/settings
    if Kaggle:
        os.environ["KAGGLE_USERNAME"] = userdata.get("KAGGLE_USERNAME")
        os.environ["KAGGLE_KEY"] = userdata.get("KAGGLE_KEY")
        !uv pip install --system --quiet kaggle
        from kaggle.api.kaggle_api_extended import KaggleApi
        api = KaggleApi()
        api.authenticate()

    # MOUNT: Google Drive
    import contextlib
    with contextlib.redirect_stdout(open(os.devnull, 'w')):
        import google.colab
        google.colab.drive.mount("/content/drive", force_remount=True)

    # SYMLINK: Google Drive folder to Files Pane (Top Level)
    import pathlib
    drive_path = pathlib.Path("/content/drive/MyDrive")
    colab_notebooks_path = drive_path / "Colab Notebooks"
    project_path = colab_notebooks_path / GOOGLE_DRIVE_FOLDER
    project_path.mkdir(parents=True, exist_ok=True)
    shortcut = pathlib.Path(f"/content/{GOOGLE_DRIVE_FOLDER}")
    shortcut.parent.mkdir(parents=True, exist_ok=True)
    if not shortcut.exists():
        shortcut.symlink_to(project_path)

    # REMOVE: Sample Folder
    !rm -rf /content/sample_data

    # ENSURE: apt packages
    !apt-get install -qq \
        tree > /dev/null

    # ENSURE: pip packages
    !pip install --upgrade pip
    !uv pip install --system --quiet \
        black[jupyter] \
        isort

    # IMPORT: Python Libraries
    import tqdm.notebook

    # OUTPUTS
    print(f"SHORTCUT: {shortcut} --> {project_path}")
    return str(shortcut)

SHORTCUT = bootstrap()


SHORTCUT: /content/aai-511 --> /content/drive/MyDrive/Colab Notebooks/aai-511


In [None]:
%%time
!echo "CPU Model: $(lscpu | grep 'Model name:' | cut -d ':' -f 2- | xargs)"
!echo "CPU Sockets: $(lscpu | grep 'Socket(s):' | awk '{print $2}')"
!echo "Cores per Socket: $(lscpu | grep 'Core(s) per socket:' | awk '{print $4}')"
!echo "Threads per Core: $(lscpu | grep 'Thread(s) per core:' | awk '{print $4}')"
!echo "Allocated RAM: $(free -h --si | awk '/Mem:/{print $2}')"

CPU Model: Intel(R) Xeon(R) CPU @ 2.00GHz
CPU Sockets: 2
Cores per Socket: 24
Threads per Core: 2
Allocated RAM: 342G
CPU times: user 34.7 ms, sys: 13.9 ms, total: 48.6 ms
Wall time: 635 ms


In [60]:
%%time

# INSTALL: OS-LEVEL PACKAGES
!apt-get install -qq --no-install-recommends \
    fluidsynth \
    fluid-soundfont-gm \
    graphviz \
    libfluidsynth3 \
    libgraphviz-dev \

# UNINSTALL: PYTHON PACKAGES
!uv pip uninstall --system --quiet \
    bokeh \
    mkl

# UPGRADE: PYTHON PACKAGES
!uv pip install --system --quiet --upgrade \
    bokeh \
    datasets \
    pygraphviz \
    setuptools \
    wheel

# INSTALL: PYTHON PACKAGES
!uv pip install --system --quiet \
    autogluon \
    autokeras \
    black[jupyter] \
    cookiecutter-data-science \
    databricks-sdk \
    dvclive[all] \
    gensim \
    gradio \
    isort \
    kaggle \
    matplotlib \
    midi2audio \
    mido \
    mlflow \
    music21 \
    numpy==1.24.4 \
    pandas \
    pretty_midi \
    pyngrok \
    py_midicsv \
    scikit-learn \
    seaborn \
    shap


CPU times: user 1.02 s, sys: 42.8 ms, total: 1.06 s
Wall time: 17.5 s


In [40]:
import pathlib

print("Cloning Repository...")
!git clone https://github.com/aai511-groupX/project.git
!cd /content/project && git submodule update --init --recursive
!cd /content/project && git lfs pull

REPO = pathlib.Path(f"/content/project")
print(f"REPO --> {REPO}")

Cloning Repository...
Cloning into 'project'...
remote: Enumerating objects: 38, done.[K
remote: Counting objects:   2% (1/38)[Kremote: Counting objects:   5% (2/38)[Kremote: Counting objects:   7% (3/38)[Kremote: Counting objects:  10% (4/38)[Kremote: Counting objects:  13% (5/38)[Kremote: Counting objects:  15% (6/38)[Kremote: Counting objects:  18% (7/38)[Kremote: Counting objects:  21% (8/38)[Kremote: Counting objects:  23% (9/38)[Kremote: Counting objects:  26% (10/38)[Kremote: Counting objects:  28% (11/38)[Kremote: Counting objects:  31% (12/38)[Kremote: Counting objects:  34% (13/38)[Kremote: Counting objects:  36% (14/38)[Kremote: Counting objects:  39% (15/38)[Kremote: Counting objects:  42% (16/38)[Kremote: Counting objects:  44% (17/38)[Kremote: Counting objects:  47% (18/38)[Kremote: Counting objects:  50% (19/38)[Kremote: Counting objects:  52% (20/38)[Kremote: Counting objects:  55% (21/38)[Kremote: Counting objects:  57% (22/38

In [42]:
# DEFINE: TARGET COMPOSERS
COMPOSERS = [
    "bach",
    "beethoven",
    "chopin",
    "mozart"
]

# Move the data to the interim folder and rename files
for composer in COMPOSERS:
    !cp -r {REPO}/data/raw/midi-classical-music/{composer} {REPO}/data/interim

In [43]:
# Move the data to the interim folder and rename files
!mkdir -p {REPO}/data/interim/midi

for composer in COMPOSERS:
    interim_composer_dir = REPO / "data" / "interim" / composer
    midi_dir = REPO / "data" / "interim" / "midi"

    for filename in os.listdir(interim_composer_dir):
        if filename.endswith(".mid"):
            source_path = interim_composer_dir / filename
            new_filename = f"{composer}-{filename}"
            destination_path = midi_dir / new_filename
            os.rename(source_path, destination_path)

MIDI_FOLDER = f"{REPO}/data/interim/midi"
print(f"MIDI_FOLDER --> {MIDI_FOLDER}")

MIDI_FOLDER --> /content/project/data/interim/midi


In [None]:
import gradio as gr
from music21 import midi
import os
from midi2audio import FluidSynth

def play_midi(midi_directory, file_path):
    full_path = os.path.join(midi_directory, file_path)
    fs = FluidSynth()
    wav_path = full_path.replace('.mid', '.wav')
    fs.midi_to_audio(full_path, wav_path)
    return wav_path

def find_midi_files(directory):
    midi_files = []
    for root, _, files in os.walk(directory):
        for file in files:
            if file.endswith(".mid"):
                midi_files.append(os.path.join(root, file))
    return midi_files

midi_files = find_midi_files(MIDI_FOLDER)

file_map = {os.path.basename(file): file for file in midi_files}
dropdown = gr.Dropdown(choices=list(file_map.keys()), label="Select MIDI file")

def play_midi_from_filename(filename):
    file_path = file_map[filename]
    return play_midi(MIDI_FOLDER, file_path)

iface = gr.Interface(
    fn=play_midi_from_filename,
    inputs=dropdown,
    outputs=gr.Audio(label="Play MIDI"),
    title="MIDI Player",
    analytics_enabled=False,
    allow_flagging=None
)

iface.launch(debug=True)

In [57]:
import os
import py_midicsv as pm
from tqdm.notebook import tqdm

print("Processing MIDI files...")
MIDI_FOLDER = f"{REPO}/data/interim/midi"
CSV_FOLDER = f"{REPO}/data/interim/csv"

if not os.path.exists(CSV_FOLDER):
    os.makedirs(CSV_FOLDER)

midi_files = [f for f in os.listdir(MIDI_FOLDER) if f.endswith(".mid")]

def midi_to_csv(midi_path, csv_path):
    try:
        # Load the MIDI file and parse it into CSV format
        csv_string_list = pm.midi_to_csv(midi_path)

        # Write the CSV data to a file
        with open(csv_path, "w") as f:
            f.writelines(csv_string_list)

    except Exception as e:
        print(f"Error processing {midi_path}: {e}")

for filename in tqdm(midi_files, desc="Processing MIDI files"):
    midi_path = os.path.join(MIDI_FOLDER, filename)
    csv_path = os.path.join(CSV_FOLDER, filename.replace(".mid", ".csv"))
    midi_to_csv(midi_path, csv_path)

Processing MIDI files...


Processing MIDI files:   0%|          | 0/499 [00:00<?, ?it/s]

In [122]:
import os
import warnings
import pretty_midi
import numpy as np
import pandas as pd
from tqdm import tqdm
from scipy.stats import skew, kurtosis

# Suppress warnings
warnings.filterwarnings("ignore", category=RuntimeWarning)

MIDI_FOLDER = f"{REPO}/data/interim/midi"
CSV_FOLDER = f"{REPO}/data/interim/csv"

def extract_features(midi_path):
    try:
        midi_data = pretty_midi.PrettyMIDI(midi_path)
    except Exception as e:
        print(f"Error processing {midi_path}: {e}")
        return None

    # Basic MIDI Information
    total_duration = midi_data.get_end_time()
    num_instruments = len(midi_data.instruments)

    # Note-Level Features
    notes = []
    pitches = []
    durations = []
    velocities = []
    harmony = []
    polyphony = []

    for instrument in midi_data.instruments:
        instrument_notes = instrument.notes
        notes.extend(instrument_notes)
        pitches.extend([note.pitch for note in instrument_notes])
        durations.extend([note.end - note.start for note in instrument_notes])
        velocities.extend([note.velocity for note in instrument_notes])
        harmony.extend([note.pitch for note in instrument_notes])
        polyphony.extend([(note.start, note.end) for note in instrument_notes])

    avg_pitch = np.mean(pitches) if pitches else 0
    pitch_range = np.ptp(pitches) if pitches else 0
    pitch_std = np.std(pitches) if pitches else 0
    avg_duration = np.mean(durations) if durations else 0
    duration_range = np.ptp(durations) if durations else 0
    duration_std = np.std(durations) if durations else 0
    avg_velocity = np.mean(velocities) if velocities else 0
    velocity_variance = np.var(velocities) if velocities else 0

    # Time Signature and Key
    time_signature = midi_data.time_signature_changes[0].numerator if midi_data.time_signature_changes else 4
    try:
        key_signature = midi_data.key_signature_changes[0] if midi_data.key_signature_changes else None
        if key_signature:
            key = key_signature.key_number % 12
            mode = 'major' if key_signature.key_number < 12 else 'minor'
        else:
            key = 0
            mode = 'major'
    except Exception as e:
        print(f"Error processing key signature for {midi_path}: {e}")
        key = 0
        mode = 'major'

    # Harmonic Features
    harmony_complexity = len(set(harmony))

    # Rhythmic Features
    note_density = len(pitches) / total_duration if total_duration > 0 else 0

    # Dynamic Features
    dynamic_range = np.ptp(velocities) if velocities else 0

    # Articulation Features
    staccato_ratio = sum(1 for d in durations if d < 0.1) / len(durations) if durations else 0
    legato_ratio = sum(1 for d in durations if d > 0.5) / len(durations) if durations else 0

    # Polyphony Features
    polyphony_density = np.mean([end - start for start, end in polyphony]) if polyphony else 0

    # Tempo Features
    tempo_changes = midi_data.get_tempo_changes()[1]
    tempo_variability = np.std(tempo_changes) if len(tempo_changes) > 1 else 0
    avg_tempo = np.mean(tempo_changes) if len(tempo_changes) > 0 else 0

    # Orchestration Features
    instrument_families = [instrument.program // 8 for instrument in midi_data.instruments]
    instrument_diversity = len(set(instrument_families))

    # Expressive Features
    articulation_variability = np.std([note.end - note.start for note in notes]) if notes else 0
    dynamic_variability = np.std(velocities) if velocities else 0

    # Pitch Class Features
    pitch_classes = [note.pitch % 12 for note in notes]
    pitch_class_histogram = np.histogram(pitch_classes, bins=12, range=(0, 12))[0]

    # Interval Features
    intervals = np.diff(pitches) if len(pitches) > 1 else [0]
    interval_histogram = np.histogram(intervals, bins=12, range=(-6, 6))[0]

    # Additional Features
    pitch_entropy = -np.sum((np.histogram(pitches, bins=128, range=(0, 128))[0] / len(pitches)) * np.log2(np.histogram(pitches, bins=128, range=(0, 128))[0] / len(pitches) + 1e-9)) if pitches else 0
    duration_entropy = -np.sum((np.histogram(durations, bins=100)[0] / len(durations)) * np.log2(np.histogram(durations, bins=100)[0] / len(durations) + 1e-9)) if durations else 0
    velocity_entropy = -np.sum((np.histogram(velocities, bins=128, range=(0, 128))[0] / len(velocities)) * np.log2(np.histogram(velocities, bins=128, range=(0, 128))[0] / len(velocities) + 1e-9)) if velocities else 0
    chord_entropy = -np.sum((np.histogram(harmony, bins=128, range=(0, 128))[0] / len(harmony)) * np.log2(np.histogram(harmony, bins=128, range=(0, 128))[0] / len(harmony) + 1e-9)) if harmony else 0
    key_changes = len(midi_data.key_signature_changes)
    tempo_changes_count = len(tempo_changes)

    # New features
    pitch_skewness = skew(pitches) if pitches else 0
    pitch_kurtosis = kurtosis(pitches) if pitches else 0

    # Melodic contour (simplified)
    melodic_contour = np.diff(pitches) if len(pitches) > 1 else [0]
    contour_direction = np.sum(np.sign(melodic_contour))

    # Rhythmic complexity (simplified)
    ioi = np.diff([note.start for note in notes]) if len(notes) > 1 else [0]
    rhythmic_complexity = np.std(ioi) if len(ioi) > 0 else 0

    features = {
        'duration': total_duration,
        'num_instruments': num_instruments,
        'notes': len(notes),
        'avg_pitch': avg_pitch,
        'pitch_range': pitch_range,
        'pitch_std': pitch_std,
        'avg_duration': avg_duration,
        'duration_range': duration_range,
        'duration_std': duration_std,
        'time_signature': time_signature,
        'key': key,
        'mode': mode,
        'harmony_complexity': harmony_complexity,
        'note_density': note_density,
        'avg_velocity': avg_velocity,
        'velocity_variance': velocity_variance,
        'dynamic_range': dynamic_range,
        'staccato_ratio': staccato_ratio,
        'legato_ratio': legato_ratio,
        'polyphony_density': polyphony_density,
        'tempo_variability': tempo_variability,
        'avg_tempo': avg_tempo,
        'instrument_diversity': instrument_diversity,
        'articulation_variability': articulation_variability,
        'dynamic_variability': dynamic_variability,
        'pitch_class_histogram': pitch_class_histogram.tolist(),
        'interval_histogram': interval_histogram.tolist(),
        'pitch_entropy': pitch_entropy,
        'duration_entropy': duration_entropy,
        'velocity_entropy': velocity_entropy,
        'chord_entropy': chord_entropy,
        'key_changes': key_changes,
        'tempo_changes_count': tempo_changes_count,
        'pitch_skewness': pitch_skewness,
        'pitch_kurtosis': pitch_kurtosis,
        'contour_direction': contour_direction,
        'rhythmic_complexity': rhythmic_complexity,
    }

    return features

def process_file(filename):
    midi_path = os.path.join(MIDI_FOLDER, filename)
    try:
        features = extract_features(midi_path)
        if features is not None:
            features['filename'] = filename
             # Assuming filename format is composer-filename.mid
            features['composer'] = filename.split('-')[0]
        return features
    except Exception as e:
        print(f"Error processing file {filename}: {e}")
        return None

# Main execution
midi_files = [f for f in os.listdir(MIDI_FOLDER) if f.endswith(".mid")]

# Process files
all_features = []
for filename in tqdm(midi_files, desc="Extracting features"):
    features = process_file(filename)
    if features is not None:
        all_features.append(features)

# Convert to DataFrame
features_df = pd.DataFrame(all_features)

# Save to CSV
csv_path = os.path.join(CSV_FOLDER, 'midi_features.csv')
features_df.to_csv(csv_path, index=False)
print(f"Features saved to {csv_path}")

Extracting features:  55%|█████▍    | 274/499 [01:16<01:16,  2.92it/s]

Error processing /content/project/data/interim/midi/beethoven-anhang_14_3.mid: Could not decode key with 3 flats and mode 255


Extracting features: 100%|██████████| 499/499 [02:13<00:00,  3.75it/s]

Features saved to /content/project/data/interim/csv/midi_features.csv





In [129]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split
import ast

# Load the features
csv_path = os.path.join(CSV_FOLDER, 'midi_features.csv')
features_df = pd.read_csv(csv_path)

# Encode categorical features
label_encoder = LabelEncoder()
features_df['composer'] = label_encoder.fit_transform(features_df['composer'])
features_df['mode'] = label_encoder.fit_transform(features_df['mode'])

# Convert list-like strings to numerical arrays
features_df['pitch_class_histogram'] = features_df['pitch_class_histogram'].apply(ast.literal_eval)
features_df['interval_histogram'] = features_df['interval_histogram'].apply(ast.literal_eval)

# Flatten list-like columns
pitch_class_histogram_df = pd.DataFrame(features_df['pitch_class_histogram'].tolist(), index=features_df.index)
interval_histogram_df = pd.DataFrame(features_df['interval_histogram'].tolist(), index=features_df.index)

# Rename columns to avoid conflicts
pitch_class_histogram_df.columns = [f'pitch_class_histogram_{i}' for i in range(pitch_class_histogram_df.shape[1])]
interval_histogram_df.columns = [f'interval_histogram_{i}' for i in range(interval_histogram_df.shape[1])]

# Concatenate the flattened columns back to the original DataFrame
features_df = pd.concat([features_df, pitch_class_histogram_df, interval_histogram_df], axis=1)
features_df.drop(columns=['pitch_class_histogram', 'interval_histogram', 'filename'], inplace=True)

# Normalize numerical features
numerical_features = features_df.select_dtypes(include=[np.number]).columns
scaler = StandardScaler()
features_df[numerical_features] = scaler.fit_transform(features_df[numerical_features])

# Ensure all features are numeric
for col in features_df.columns:
    if features_df[col].dtype == 'object':
        features_df[col] = features_df[col].apply(lambda x: np.nan if x == '' else x).astype(float)

# Drop rows with NaN values (if any)
features_df.dropna(inplace=True)

# Split the data into training and testing sets
X = features_df.drop(columns=['composer'])
y = features_df['composer']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Ensure that the labels are within the valid range
num_classes = len(label_encoder.classes_)
y_train = np.clip(y_train, 0, num_classes - 1)
y_test = np.clip(y_test, 0, num_classes - 1)

In [130]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, Input

# Define the LSTM model
def create_lstm_model(input_shape):
    model = Sequential()
    model.add(Input(shape=input_shape))
    model.add(LSTM(128, return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(64, return_sequences=False))
    model.add(Dropout(0.2))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    return model

# Reshape the data for LSTM
X_train_lstm = X_train.values.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test_lstm = X_test.values.reshape((X_test.shape[0], X_test.shape[1], 1))

# Create and compile the LSTM model
lstm_model = create_lstm_model((X_train_lstm.shape[1], 1))
lstm_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the LSTM model
lstm_model.fit(X_train_lstm, y_train, epochs=50, batch_size=32, validation_data=(X_test_lstm, y_test))

Epoch 1/50
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 278ms/step - accuracy: 0.5522 - loss: 0.5939 - val_accuracy: 0.5000 - val_loss: 0.6067
Epoch 2/50
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 376ms/step - accuracy: 0.5668 - loss: 0.5199 - val_accuracy: 0.5000 - val_loss: 0.5142
Epoch 3/50
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 101ms/step - accuracy: 0.5668 - loss: 0.4588 - val_accuracy: 0.5000 - val_loss: 0.5193
Epoch 4/50
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 376ms/step - accuracy: 0.5668 - loss: 0.4486 - val_accuracy: 0.5000 - val_loss: 0.5165
Epoch 5/50
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 108ms/step - accuracy: 0.5668 - loss: 0.4516 - val_accuracy: 0.5000 - val_loss: 0.5141
Epoch 6/50
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 370ms/step - accuracy: 0.5668 - loss: 0.4602 - val_accuracy: 0.5000 - val_loss: 0.5106
Epoch 7/50
[1m13/13[0m [3

<keras.src.callbacks.history.History at 0x7cd77c4905b0>

In [131]:
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten

# Define the CNN model
def create_cnn_model(input_shape):
    model = Sequential()
    model.add(Input(shape=input_shape))
    model.add(Conv1D(64, kernel_size=3, activation='relu'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Dropout(0.2))
    model.add(Conv1D(32, kernel_size=3, activation='relu'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(32, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    return model

# Reshape the data for CNN
X_train_cnn = X_train.values.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test_cnn = X_test.values.reshape((X_test.shape[0], X_test.shape[1], 1))

# Create and compile the CNN model
cnn_model = create_cnn_model((X_train_cnn.shape[1], 1))
cnn_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the CNN model
cnn_model.fit(X_train_cnn, y_train, epochs=50, batch_size=32, validation_data=(X_test_cnn, y_test))

Epoch 1/50
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 85ms/step - accuracy: 0.5595 - loss: 0.4963 - val_accuracy: 0.5000 - val_loss: 0.4832
Epoch 2/50
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - accuracy: 0.5668 - loss: 0.4343 - val_accuracy: 0.5000 - val_loss: 0.4564
Epoch 3/50
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - accuracy: 0.5668 - loss: 0.4271 - val_accuracy: 0.5000 - val_loss: 0.4338
Epoch 4/50
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 9ms/step - accuracy: 0.5668 - loss: 0.4093 - val_accuracy: 0.5000 - val_loss: 0.4121
Epoch 5/50
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 124ms/step - accuracy: 0.5668 - loss: 0.3869 - val_accuracy: 0.5000 - val_loss: 0.3885
Epoch 6/50
[1m13/13[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - accuracy: 0.5767 - loss: 0.3616 - val_accuracy: 0.5100 - val_loss: 0.3683
Epoch 7/50
[1m13/13[0m [32m━━━━━━━

<keras.src.callbacks.history.History at 0x7cd71c50b0a0>

In [132]:
# Evaluate the LSTM model
lstm_loss, lstm_accuracy = lstm_model.evaluate(X_test_lstm, y_test)
print(f"LSTM Model Accuracy: {lstm_accuracy * 100:.2f}%")

[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 124ms/step - accuracy: 0.5830 - loss: 0.3769
LSTM Model Accuracy: 58.00%


In [133]:
# Evaluate the CNN model
cnn_loss, cnn_accuracy = cnn_model.evaluate(X_test_cnn, y_test)
print(f"CNN Model Accuracy: {cnn_accuracy * 100:.2f}%")

[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step - accuracy: 0.6140 - loss: 0.3994 
CNN Model Accuracy: 60.00%
