# Introduction

Music is a form of art that is ubiquitous and has a rich history. Different composers have created music with their unique styles and compositions. However, identifying the composer of a particular piece of music can be a challenging task, especially for novice musicians or listeners. The proposed project aims to use deep learning techniques to identify the composer of a given piece of music accurately.

# Objective

The primary objective of this project is to develop a deep learning model that can predict the composer of a given musical score accurately. The project aims to accomplish this objective by using two deep learning techniques: Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN).

## Code

In [3]:
"""Imports"""
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

from miditok import REMI, TokenizerConfig
from miditok.data_augmentation import augment_dataset
from miditok.pytorch_data import DatasetMIDI, DataCollator
from miditok.utils import split_files_for_training
from torch.utils.data import DataLoader
from pathlib import Path
from random import sample, shuffle, seed as random_seed

In [4]:
"""Set seeds and device"""
random_seed(73)
np.random.seed(73)
torch.manual_seed(73)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(73)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

Using device: cpu




### Data Collection: Data is collected and provided to you.

In [3]:
"""Include data EDA from Anitra's branch here"""

"Include data EDA from Anitra's branch here"

### Data Pre-processing: Convert the musical scores into a format suitable for deep learning models. This involves converting the musical scores into MIDI files and applying data augmentation techniques.

In [None]:
"""Train a tokenizer on each composer, split the data by number of tokens and augment the data"""
config = TokenizerConfig(use_chords=True, use_rests=True, use_tempos=True, use_programs=True)
tokenizer = REMI(config)

DATA_PATH = Path(Path.cwd().parent, "Data")
RETAIN = 10000

for composer in ["Bach", "Beethoven", "Chopin", "Mozart"]:
	print(f"{composer}:")
	midi_path = list(Path(DATA_PATH).glob(f"{composer}/*.mid"))
	tokenizer.train(vocab_size=30000, files_paths=midi_path)
	subset_chunks_dir = Path(DATA_PATH, f"{composer}_augmented")
	
	split_files_for_training(
        files_paths=midi_path,
        tokenizer=tokenizer,
        save_dir=subset_chunks_dir,
        max_seq_len=1024,
        num_overlap_bars=2,
    )

	augment_dataset(
		subset_chunks_dir,
		pitch_offsets=[-6, 6],
		velocity_offsets=[-4, 4],
		duration_offsets=[-0.2, 0.2],
		save_data_aug_report=False,
		all_offset_combinations=composer == "Chopin" # perform more augmentation if chopin (low token count)
	)

	# use RETAIN as upper bound of song count and remove excess at random to balance composer data
	all_files = list(Path(subset_chunks_dir).glob("*.mid"))
	composer_sample = sample(all_files, len(all_files) - RETAIN)
	for file_name in composer_sample:
		file_name.unlink()

Bach:
Splitting music files: 100%|██████████| 1024/1024 [00:10<00:00, 79.55it/s]
Performing data augmentation: 100%|██████████| 2263/2263 [00:24<00:00, 500.35it/s]
Beethoven:
Splitting music files: 100%|██████████| 220/220 [00:02<00:00, 98.92it/s]
Performing data augmentation: 100%|██████████| 3004/3004 [00:27<00:00, 460.36it/s]
Chopin:
Splitting music files: 100%|██████████| 135/135 [00:01<00:00, 131.18it/s]
Performing data augmentation: 100%|██████████| 11754/11754 [01:38<00:00, 119.72it/s]
Mozart:
Splitting music files: 100%|██████████| 257/257 [00:02<00:00, 104.82it/s]
Performing data augmentation: 100%|██████████| 2588/2588 [00:04<00:00, 527.91it/s]


In [None]:
"""Combine all MIDIs and train new tokenizer using combined data"""
combined_midi_path = list(Path(DATA_PATH).glob(f"*_augmented/*.mid"))

tokenizer = REMI(config)
tokenizer.train(vocab_size=30000, files_paths=combined_midi_path)

In [None]:
"""Split into train/valid/test datasets using roughly 15% of the data for each of valid and test"""
total_num_files = len(combined_midi_path)
num_files_valid = round(total_num_files * 0.15)
num_files_test = round(total_num_files * 0.15)
shuffle(combined_midi_path)
midi_paths_valid = combined_midi_path[:num_files_valid]
midi_paths_test = combined_midi_path[num_files_valid:num_files_valid + num_files_test]
midi_paths_train = combined_midi_path[num_files_valid + num_files_test:]

for files_paths, subset_name in (
    (midi_paths_train, "Train"), (midi_paths_valid, "Validate"), (midi_paths_test, "Test")
):
    subset_chunks_dir = Path(DATA_PATH, subset_name)
    split_files_for_training(
        files_paths=files_paths,
        tokenizer=tokenizer,
        save_dir=subset_chunks_dir,
        max_seq_len=1024,
        num_overlap_bars=2,
    )

Splitting music files: 100%|██████████| 28000/28000 [00:45<00:00, 613.39it/s]
Splitting music files: 100%|██████████| 6000/6000 [00:09<00:00, 606.37it/s]
Splitting music files: 100%|██████████| 6000/6000 [00:09<00:00, 614.17it/s]


In [None]:
"""Create torch compatable data loaders using the datasets created above"""
def lable_composer(score, tok_sequence, file_path):
    composer = file_path.parts[-2:-1]
    if "Bach" in composer:
        return 0
    elif "Beethoven" in composer:
        return 1
    elif "Chopin" in composer:
        return 2
    elif "Mozart" in composer:
        return 3
    
collator = DataCollator(tokenizer.pad_token_id)

dataset_train = DatasetMIDI(
    files_paths=list(Path(DATA_PATH, "Train").glob("**/*.mid")),
    tokenizer=tokenizer,
    max_seq_len=1024,
    bos_token_id=tokenizer["BOS_None"],
    eos_token_id=tokenizer["EOS_None"],
    func_to_get_labels=lable_composer
)
dataloader_train = DataLoader(dataset_train, batch_size=64, collate_fn=collator)

dataset_valid = DatasetMIDI(
    files_paths=list(Path(DATA_PATH, "Valid").glob("**/*.mid")),
    tokenizer=tokenizer,
    max_seq_len=1024,
    bos_token_id=tokenizer["BOS_None"],
    eos_token_id=tokenizer["EOS_None"],
)
dataloader_valid = DataLoader(dataset_valid, batch_size=64, collate_fn=collator)

dataset_test = DatasetMIDI(
    files_paths=list(Path(DATA_PATH, "Test").glob("**/*.mid")),
    tokenizer=tokenizer,
    max_seq_len=1024,
    bos_token_id=tokenizer["BOS_None"],
    eos_token_id=tokenizer["EOS_None"],
)
dataloader_test = DataLoader(dataset_test, batch_size=64, collate_fn=collator)

In [32]:
"""The dataloader has input ids (midi-tokens)"""
for batch in dataloader_train:
    print(batch)
    break

{'input_ids': tensor([[  1,   4, 190,  ...,  31, 115, 126],
        [  4, 190, 352,  ...,  33, 106,   0],
        [  4, 190, 332,  ..., 111, 135,   0],
        [  1,   4, 190,  ..., 131, 362,  44]]), 'labels': tensor([[3],
        [3],
        [3],
        [3]]), 'attention_mask': tensor([[1, 1, 1,  ..., 1, 1, 1],
        [1, 1, 1,  ..., 1, 1, 0],
        [1, 1, 1,  ..., 1, 1, 0],
        [1, 1, 1,  ..., 1, 1, 1]], dtype=torch.int32)}


### Feature Extraction: Extract features from the MIDI files, such as notes, chords, and tempo, using music analysis tools.

### Model Building: Develop a deep learning model using LSTM and CNN architectures to classify the musical scores according to the composer.

### Model Training: Train the deep learning model using the pre-processed and feature-extracted data.

### Model Evaluation: Evaluate the performance of the deep learning model using accuracy, precision, and recall metrics.

### Model Optimization: Optimize the deep learning model by fine-tuning hyperparameters.

## Findings