In [23]:
warnings.filterwarnings("ignore", category=UserWarning, module="librosa")
warnings.filterwarnings("ignore", category=FutureWarning, module="librosa")

## Business Understanding

This project aims to solve the problem of automatically classifying musical chords as either major or minor using audio input. Chord identification is a key task in music analysis, and automating it can save time on transcription and harmonic analysis. By using machine learning and music information retrieval (MIR) techniques, the goal is to create a tool that helps musicians, producers, and educators analyze music in real-time. The project focuses on making chord recognition more accessible and efficient, benefiting both students and professionals in the music industry.

---

## Tools/Methodologies

To handle the workflow, I'll use several Python libraries:

- [librosa](https://librosa.org/doc/latest/index.html) for extracting audio features, [numpy](https://numpy.org/doc/1.24/reference/index.html#reference) and [pandas](https://pandas.pydata.org/docs/reference/index.html#api) for data manipulation, and os and [Kaggle CLI](https://www.kaggle.com/code/donkeys/kaggle-python-api) to download the data directly into the notebook.
- [matplotlib](https://matplotlib.org/stable/api/index.html) and [seaborn](https://seaborn.pydata.org/api.html) for exploring and visualizing features like waveforms and spectrograms.
- [scikit-learn](https://scikit-learn.org/stable/api/index.html) for baseline models (e.g., logistic regression, SVM), and [tensorflow](https://www.tensorflow.org/api_docs/python/tf/all_symbols) or [keras](https://keras.io/api/) for building CNNs.

In [2]:
# Data manipulation
import numpy as np
import pandas as pd
import sklearn
import random
import time
import json
import os

# Audio feature extraction
import librosa
import librosa.display
import soundfile as sf
from scipy.signal import find_peaks

# for Kaggle CLI
from kaggle.api.kaggle_api_extended import KaggleApi

# Data visualization
import matplotlib.pyplot as plt
import seaborn as sns
import warnings

# Machine learning models and utilities
from imblearn.over_sampling import SMOTE
from sklearn.utils import class_weight
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.impute import KNNImputer

# Deep learning for CNNs
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import LSTM, Conv2D, MaxPooling2D, Flatten, Dense, Reshape, Dropout, GlobalAveragePooling2D
from keras.utils import to_categorical
from keras.callbacks import EarlyStopping


## Data Understanding
The dataset used in this project is sourced from the [Musical Instrument Chord Classification (Audio)](https://www.kaggle.com/datasets/deepcontractor/musical-instrument-chord-classification) dataset on Kaggle. It contains audio files `.wav` format of chords played on two instruments: guitar and piano. The raw data has been scraped from various sources and is already available for download on Kaggle, eliminating the need for manual data collection. The dataset is well-suited for this project, as it provides a clear distinction between major and minor chords, which is the focus of the classification task.

The features for the model will be extracted from the audio files using techniques such as Mel-frequency cepstral coefficients (MFCCs) or spectrograms, which capture important frequency and temporal information from the audio signals. Although other individuals may have used this dataset for similar chord classification tasks, this project will build upon existing work by focusing specifically on distinguishing between major and minor chords, potentially improving upon current models or exploring new machine learning techniques for this type of classification.

In [3]:
# uncomment if replicating project
# !pip install kaggle

In [4]:
# Load kaggle.json credentials
api_config_path = os.path.join(os.getcwd(), 'kaggle.json')
with open(api_config_path, 'r') as f:
    kaggle_config = json.load(f)

# Set environment variables
os.environ['KAGGLE_USERNAME'] = kaggle_config['username']
os.environ['KAGGLE_KEY'] = kaggle_config['key']

# Initialize the Kaggle API
api = KaggleApi()
api.authenticate()

# Ensure the 'dataset' folder exists
dataset_dir = os.path.join(os.getcwd(), 'dataset')
os.makedirs(dataset_dir, exist_ok=True)

# Use the Kaggle API to download the dataset
api.dataset_download_files('deepcontractor/musical-instrument-chord-classification',
                           path=dataset_dir, unzip=True)

print("Dataset downloaded and extracted to:", dataset_dir)

Dataset URL: https://www.kaggle.com/datasets/deepcontractor/musical-instrument-chord-classification
Dataset downloaded and extracted to: C:\Users\Nik\Desktop\code\Flatiron\capstone\dataset


In [5]:
# Define the base directory where the audio files are stored
base_dir = os.path.join(os.getcwd(), 'dataset', 'Audio_Files')

# Prepare to collect file details
file_details = []

# Loop through each category directory ('Major' and 'Minor')
for category in ['Major', 'Minor']:
    category_dir = os.path.join(base_dir, category)
    
    for filename in os.listdir(category_dir):
        if filename.endswith('.wav'):
            # Full path to file
            file_path = os.path.join(category_dir, filename)
            # Append the file path, filename (used as ID), and label to the list
            file_details.append({'path': file_path, 'id': filename, 'label': category})

# Save collected file details as a DataFrame
file_data = pd.DataFrame(file_details)

file_data.head()

Unnamed: 0,path,id,label
0,C:\Users\Nik\Desktop\code\Flatiron\capstone\da...,Major_0.wav,Major
1,C:\Users\Nik\Desktop\code\Flatiron\capstone\da...,Major_1.wav,Major
2,C:\Users\Nik\Desktop\code\Flatiron\capstone\da...,Major_10.wav,Major
3,C:\Users\Nik\Desktop\code\Flatiron\capstone\da...,Major_100.wav,Major
4,C:\Users\Nik\Desktop\code\Flatiron\capstone\da...,Major_101.wav,Major


## Data Preparation

### Feature Extraction Functions:
- 

In [17]:
def extract_audio_features(signal=None, sr=22050, hop_length=512, n_fft=2048):
    if signal is None or not isinstance(signal, np.ndarray):
        print("Warning: No valid audio signal provided.")
        return {
            'chroma': np.full(12, np.nan),
            'mfcc': np.full(20, np.nan),
            'spectral_centroid': np.nan,
            'zero_crossing_rate': np.nan
        }

    try:
        chroma = librosa.feature.chroma_stft(y=signal, sr=sr, hop_length=hop_length, n_fft=n_fft).mean(axis=1)
        mfccs = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=20, hop_length=hop_length, n_fft=n_fft).mean(axis=1)
        spectral_centroid = librosa.feature.spectral_centroid(y=signal, sr=sr, hop_length=hop_length).mean()
        zero_crossing_rate = librosa.feature.zero_crossing_rate(signal, hop_length=hop_length).mean()

        return {
            'chroma': chroma,
            'mfcc': mfccs,
            'spectral_centroid': spectral_centroid,
            'zero_crossing_rate': zero_crossing_rate
        }

    except Exception as e:
        print(f"Error during feature extraction: {e}")
        return {
            'chroma': np.full(12, np.nan),
            'mfcc': np.full(20, np.nan),
            'spectral_centroid': np.nan,
            'zero_crossing_rate': np.nan
        }

In [18]:
def find_harmonics(signal=None, sr=22050, n_fft=2048):
    """
    Extract harmonic frequencies and intervals from an audio signal.
    
    Parameters:
    - signal: Audio signal array (used for both original and augmented data).
    - sr: Sample rate.
    - n_fft: Number of FFT components.
    
    Returns:
    - harmonic_frequencies and harmonic_intervals
    """
    try:
        if signal is None or len(signal) == 0:
            raise ValueError("No audio signal provided.")

        # Perform STFT to get the frequency spectrum
        S = np.abs(librosa.stft(signal, n_fft=n_fft))

        # Sum over time frames to get the overall magnitude spectrum
        magnitude = np.mean(S, axis=1)

        # Frequency bins corresponding to the FFT
        frequency = np.fft.fftfreq(len(magnitude), 1/sr)

        # Only keep positive frequencies
        positive_freq_idxs = np.where(frequency >= 0)
        positive_freqs = frequency[positive_freq_idxs]
        positive_magnitude = magnitude[positive_freq_idxs]

        # Find peaks in the frequency spectrum (harmonics)
        peaks, _ = find_peaks(positive_magnitude, height=np.max(positive_magnitude) * 0.1)

        # Get the corresponding frequencies of the peaks (harmonic frequencies)
        harmonic_frequencies = positive_freqs[peaks]

        # Calculate intervals between harmonic frequencies
        harmonic_intervals = np.diff(harmonic_frequencies) if len(harmonic_frequencies) > 1 else []

        return harmonic_frequencies, harmonic_intervals

    except Exception as e:
        print(f"Error processing harmonics for augmented signal: {e}")
        return None, None


In [19]:
def extract_mel_spectrogram(signal=None, sr=22050, n_mels=128, hop_length=512, fixed_length=100):
    if signal is None or not isinstance(signal, np.ndarray):
        print("Warning: No valid audio signal provided for mel-spectrogram extraction.")
        return np.full((fixed_length, n_mels), np.nan)

    try:
        # Generate Mel-spectrogram
        mel_spectrogram = librosa.feature.melspectrogram(y=signal, sr=sr, n_mels=n_mels, hop_length=hop_length)
        log_mel_spectrogram = librosa.power_to_db(mel_spectrogram, ref=np.max)

        # Pad or truncate to the fixed length
        log_mel_spectrogram = log_mel_spectrogram.T  # Transpose to (time_steps, n_mels)
        if log_mel_spectrogram.shape[0] < fixed_length:
            pad_width = fixed_length - log_mel_spectrogram.shape[0]
            log_mel_spectrogram = np.pad(log_mel_spectrogram, ((0, pad_width), (0, 0)), mode='constant')
        else:
            log_mel_spectrogram = log_mel_spectrogram[:fixed_length, :]

        return log_mel_spectrogram

    except Exception as e:
        print(f"Error during Mel-spectrogram extraction: {e}")
        return np.full((fixed_length, n_mels), np.nan)

#### Running Feature Extraction on Origional Dataset

In [24]:
# Initialize a list to store feature data
feature_data = []

# Extract features for each file and store them in a list
for index, row in file_data.iterrows():
    try:
        # Load the original audio signal
        signal, sr = librosa.load(row['path'], sr=None)

        # Check if the signal is valid
        if signal is None or len(signal) == 0:
            print(f"Warning: Empty or invalid audio signal for {row['path']}")
            continue

        # Extract features using the updated function
        features = extract_audio_features(signal=signal, sr=sr)

        if features is not None:
            feature_data.append({
                'id': row['id'].replace('.wav', ''),
                'Label': row['label'],
                'chroma': features['chroma'],  # Store chroma features
                'mfcc': features['mfcc'],      # Store MFCC features
                'spectral_centroid': features['spectral_centroid'],  # Store spectral centroid
                'zero_crossing_rate': features['zero_crossing_rate'] # Store zero-crossing rate
            })
        else:
            print(f"Warning: Feature extraction returned None for {row['path']}")

    except Exception as e:
        print(f"Error extracting features for {row['path']}: {e}")

# Prepare a list to hold each row's dictionary for creating the DataFrame
feature_dict_list = []

# Determine the number of chroma and MFCC coefficients (12 chroma, 20 MFCCs)
n_chroma = 12
n_mfcc = 20

# Processing each item's features to create a flat dictionary
for item in feature_data:
    feature_dict = {
        'id': item['id'],
        'Label': item['Label']
    }
    
    # Store chroma features
    for i in range(n_chroma):
        feature_dict[f'chroma_{i+1}'] = item['chroma'][i] if i < len(item['chroma']) else np.nan
    
    # Store MFCC features
    for i in range(n_mfcc):
        feature_dict[f'mfcc_{i+1}'] = item['mfcc'][i] if i < len(item['mfcc']) else np.nan
    
    # Store spectral centroid and zero-crossing rate as scalar features
    feature_dict['spectral_centroid'] = item['spectral_centroid']
    feature_dict['zero_crossing_rate'] = item['zero_crossing_rate']
    
    feature_dict_list.append(feature_dict)

# Create a new DataFrame from the list of dictionaries
features_df = pd.DataFrame(feature_dict_list)

# Display the first few rows of the new DataFrame to verify
print(features_df.head())


Error extracting features for C:\Users\Nik\Desktop\code\Flatiron\capstone\dataset\Audio_Files\Major\Major_285.wav: 
          id  Label  chroma_1  chroma_2  chroma_3  chroma_4  chroma_5  \
0    Major_0  Major  0.796852  0.417257  0.299810  0.391450  0.769257   
1    Major_1  Major  0.723283  0.452638  0.262528  0.236474  0.472363   
2   Major_10  Major  0.371833  0.267660  0.125646  0.144663  0.298086   
3  Major_100  Major  0.390774  0.934246  0.763238  0.553007  0.418208   
4  Major_101  Major  0.207329  0.403518  0.400692  0.461187  0.456608   

   chroma_6  chroma_7  chroma_8  ...    mfcc_13    mfcc_14    mfcc_15  \
0  0.475731  0.173122  0.434240  ... -16.763796  -8.383060  -2.931852   
1  0.383091  0.316608  0.407608  ...  -4.015803  -3.557462  -3.183057   
2  0.462609  0.684176  0.370034  ... -11.326472  -2.769469  -0.272876   
3  0.357188  0.244438  0.371424  ... -14.805438 -11.510791 -10.704238   
4  0.576159  0.579669  0.570114  ...  -9.541916  -5.118158   0.926334   

    mf

In [25]:
warnings.simplefilter("ignore", UserWarning)

harmonics_data = []
skipped_files = []  # Initialize list to track skipped files

# Extracting harmonics for each file and storing them in a list
for index, row in file_data.iterrows():
    try:
        # Load the audio signal
        signal, sr = librosa.load(row['path'], sr=None)
        
        # Ensure the signal is valid
        if signal is None or len(signal) == 0:
            print(f"Warning: Empty or invalid audio signal for {row['path']}")
            skipped_files.append(row['path'])
            continue

        harmonic_frequencies, harmonic_intervals = find_harmonics(signal=signal, sr=sr)
        
        if harmonic_frequencies is not None:
            harmonics_data.append({
                'id': row['id'].replace('.wav', ''),
                'Label': row['label'],
                'harmonics': harmonic_frequencies,
                'intervals': harmonic_intervals
            })
        else:
            print(f"Warning: No harmonics extracted for {row['path']}")
            skipped_files.append(row['path'])

    except Exception as e:
        print(f"Error processing {row['path']}: {e}")
        skipped_files.append(row['path'])

# Ensure there are harmonics extracted
if harmonics_data:
    # Find the maximum number of harmonic frequencies across all files
    max_harmonics = max(len(item['harmonics']) for item in harmonics_data)

    # Prepare a list to hold each row's dictionary
    harmonics_dict_list = []

    for item in harmonics_data:
        harmonic_dict = {
            'id': item['id'],
            'Label': item['Label']
        }

        # Fill harmonic frequencies, and pad with NaN if there are fewer than max_harmonics
        for i in range(max_harmonics):
            harmonic_dict[f'harmonic_{i+1}'] = item['harmonics'][i] if i < len(item['harmonics']) else np.nan

        # Optionally, add intervals if you want them as well
        if item['intervals'] is not None:
            for i in range(len(item['intervals'])):
                harmonic_dict[f'interval_{i+1}'] = item['intervals'][i]

        harmonics_dict_list.append(harmonic_dict)

    # Create a new DataFrame from the list of dictionaries
    harmonics_df = pd.DataFrame(harmonics_dict_list)
    print("Harmonics DataFrame:")
    print(harmonics_df.head())
else:
    print("No harmonics were extracted from the files.")

# Print skipped files if any
if skipped_files:
    print("\nSkipped files:")
    for file in skipped_files:
        print(file)
else:
    print("\nNo files were skipped.")


Error processing C:\Users\Nik\Desktop\code\Flatiron\capstone\dataset\Audio_Files\Major\Major_285.wav: 
Harmonics DataFrame:
          id  Label  harmonic_1  harmonic_2  harmonic_3  harmonic_4  \
0    Major_0  Major  258.146341  387.219512  516.292683  645.365854   
1    Major_1  Major  387.219512  516.292683  645.365854  774.439024   
2   Major_10  Major  258.146341  387.219512  516.292683  731.414634   
3  Major_100  Major  301.170732  387.219512  602.341463  774.439024   
4  Major_101  Major  301.170732  387.219512  688.390244  946.536585   

    harmonic_5   harmonic_6   harmonic_7   harmonic_8  ...  interval_11  \
0   774.439024  1032.585366  1161.658537  1333.756098  ...   215.121951   
1   989.560976  1161.658537  1333.756098  1979.121951  ...          NaN   
2   946.536585  1118.634146  1333.756098  1419.804878  ...   301.170732   
3   946.536585  1118.634146  1376.780488  1548.878049  ...    86.048780   
4  1118.634146  1247.707317  1548.878049          NaN  ...          NaN   

In [26]:
# Initialize a list to store Mel-spectrogram data
mel_spectrogram_data = []

# Extract Mel-spectrograms for each file and store them in a list
for index, row in file_data.iterrows():
    try:
        # Load the original audio signal
        signal, sr = librosa.load(row['path'], sr=None)

        # Check if the signal is valid
        if signal is None or len(signal) == 0:
            print(f"Warning: Empty or invalid audio signal for {row['path']}")
            continue

        # Extract Mel-spectrogram using the updated function
        mel_spectrogram = extract_mel_spectrogram(signal=signal, sr=sr)

        # Store the Mel-spectrogram data
        mel_spectrogram_data.append({
            'id': row['id'].replace('.wav', ''),
            'Label': row['label'],
            'mel_spectrogram': mel_spectrogram
        })

    except Exception as e:
        print(f"Error extracting Mel-spectrogram for {row['path']}: {e}")

# Convert to DataFrame
mel_df = pd.DataFrame(mel_spectrogram_data)

# Display the first few rows of the new DataFrame to verify
print("Mel-Spectrogram DataFrame:")
print(mel_df.head())


Error extracting Mel-spectrogram for C:\Users\Nik\Desktop\code\Flatiron\capstone\dataset\Audio_Files\Major\Major_285.wav: 
Mel-Spectrogram DataFrame:
          id  Label                                    mel_spectrogram
0    Major_0  Major  [[-22.629173, -18.381105, -14.328545, -12.8387...
1    Major_1  Major  [[-42.334816, -45.12371, -50.84518, -55.982666...
2   Major_10  Major  [[-23.402384, -19.154316, -15.101755, -13.6119...
3  Major_100  Major  [[-26.984245, -23.075817, -19.678623, -15.0486...
4  Major_101  Major  [[-38.758972, -41.832405, -47.83084, -51.69927...


In [12]:
# List to hold the new harmonic ratio columns and their data
harmonic_ratio_data = []

# Get harmonic columns
harmonic_columns = [col for col in harmonics_df.columns if 'harmonic_' in col]

# Ensure there are at least 2 harmonic columns to calculate ratios
if len(harmonic_columns) > 1:
    for i in range(len(harmonic_columns)):
        for j in range(i + 1, len(harmonic_columns)):
            col_i = harmonic_columns[i]
            col_j = harmonic_columns[j]
            ratio_col_name = f'harmonic_ratio_{i}_to_{j}'  # Ensure "harmonic_ratio" is in the name
            
            # Calculate the ratio, avoiding division by zero
            harmonic_ratio = harmonics_df[col_i] / harmonics_df[col_j]
            harmonic_ratio.replace([np.inf, -np.inf], np.nan, inplace=True)  # Replace inf with NaN
            
            # Append the calculated ratios and column name to the list
            harmonic_ratio_data.append(harmonic_ratio.rename(ratio_col_name))
    
    # Concatenate all harmonic ratio columns at once to the DataFrame
    harmonic_ratio_df = pd.concat([harmonics_df] + harmonic_ratio_data, axis=1)

    # Fill NaN values with 0 or another appropriate value (depending on your analysis)
    harmonic_ratio_df.fillna(0, inplace=True)
    
    # Display the new DataFrame
    print(harmonic_ratio_df.head())
else:
    print("Not enough harmonic columns to compute ratios.")


          id  Label  harmonic_1  harmonic_2  harmonic_3  harmonic_4  \
0    Major_0  Major  258.146341  387.219512  516.292683  645.365854   
1    Major_1  Major  387.219512  516.292683  645.365854  774.439024   
2   Major_10  Major  258.146341  387.219512  516.292683  731.414634   
3  Major_100  Major  301.170732  387.219512  602.341463  774.439024   
4  Major_101  Major  301.170732  387.219512  688.390244  946.536585   

    harmonic_5   harmonic_6   harmonic_7   harmonic_8  ...  \
0   774.439024  1032.585366  1161.658537  1333.756098  ...   
1   989.560976  1161.658537  1333.756098  1979.121951  ...   
2   946.536585  1118.634146  1333.756098  1419.804878  ...   
3   946.536585  1118.634146  1376.780488  1548.878049  ...   
4  1118.634146  1247.707317  1548.878049     0.000000  ...   

   harmonic_ratio_16_to_17  harmonic_ratio_16_to_18  harmonic_ratio_16_to_19  \
0                      0.0                      0.0                      0.0   
1                      0.0              

In [27]:
# Merge features_df, harmonics_df, mel_df, and harmonic_ratio_df
raw_complete_df = features_df.merge(harmonics_df, on=['id', 'Label'], how='left')
raw_complete_df = raw_complete_df.merge(mel_df, on=['id', 'Label'], how='left')
raw_complete_df = raw_complete_df.merge(harmonic_ratio_df, on=['id', 'Label'], how='left')

# Display the DataFrame to verify
print("Raw Complete DataFrame (with all possible features):")
print(raw_complete_df.head())


Raw Complete DataFrame (with all possible features):
          id  Label  chroma_1  chroma_2  chroma_3  chroma_4  chroma_5  \
0    Major_0  Major  0.796852  0.417257  0.299810  0.391450  0.769257   
1    Major_1  Major  0.723283  0.452638  0.262528  0.236474  0.472363   
2   Major_10  Major  0.371833  0.267660  0.125646  0.144663  0.298086   
3  Major_100  Major  0.390774  0.934246  0.763238  0.553007  0.418208   
4  Major_101  Major  0.207329  0.403518  0.400692  0.461187  0.456608   

   chroma_6  chroma_7  chroma_8  ...  harmonic_ratio_16_to_17  \
0  0.475731  0.173122  0.434240  ...                      0.0   
1  0.383091  0.316608  0.407608  ...                      0.0   
2  0.462609  0.684176  0.370034  ...                      0.0   
3  0.357188  0.244438  0.371424  ...                      0.0   
4  0.576159  0.579669  0.570114  ...                      0.0   

   harmonic_ratio_16_to_18  harmonic_ratio_16_to_19  harmonic_ratio_16_to_20  \
0                      0.0           

## Augmenting Data

We will augment the audio data using techniques such as time-stretching, pitch-shifting, and adding noise. The augmented data will then have features extracted in the same way as the original data. We will apply these augmentations to our data to create synthetic data - to even the distribution of our classes.

In [13]:
# Augmentation functions
def pitch_shift(signal, sr, n_steps=4):
    return librosa.effects.pitch_shift(signal, sr=sr, n_steps=n_steps)

def add_noise(signal, noise_factor=0.005):
    noise = np.random.randn(len(signal))
    return signal + noise_factor * noise

def augment_audio(signal, sr):
    augmentations = ['time_stretch', 'pitch_shift', 'add_noise']
    augmentation = random.choice(augmentations)

    if augmentation == 'time_stretch':
        return librosa.effects.time_stretch(signal, rate=1.2)
    elif augmentation == 'pitch_shift':
        return pitch_shift(signal, sr, n_steps=4)
    elif augmentation == 'add_noise':
        return add_noise(signal)
    else:
        return signal

In [14]:
# Count existing samples in the original dataset
original_counts = file_data['label'].value_counts()
target_count = 500

# Determine how many samples to augment for each class
augmented_counts = {}
for label, count in original_counts.items():
    if count < target_count:
        augmented_counts[label] = target_count - count
    else:
        augmented_counts[label] = 0  # No augmentation needed


In [15]:
# Augmentation and Feature Extraction Workflow (including harmonics and mel-spectrogram)
augmented_data = []

# Track how many augmentations done for each class
augmentation_tracker = {label: 0 for label in augmented_counts.keys()}

for index, row in file_data.iterrows():
    try:
        # Check if we need to augment this class
        if augmentation_tracker[row['label']] >= augmented_counts[row['label']]:
            continue  # Skip if we've reached the target augmentation

        # Load the original audio signal
        signal, sr = librosa.load(row['path'], sr=None)

        # Ensure the signal is valid before proceeding
        if signal is None or len(signal) == 0:
            print(f"Warning: Empty or invalid audio signal for {row['path']}")
            continue

        # Apply augmentation
        augmented_signal = augment_audio(signal, sr)

        # Ensure the augmented signal is valid
        if augmented_signal is None or not isinstance(augmented_signal, np.ndarray):
            print(f"Warning: Augmented signal is not valid for {row['path']}")
            continue

        # Extract features from the augmented signal
        features = extract_audio_features(signal=augmented_signal, sr=sr)
        harmonics, intervals = find_harmonics(signal=augmented_signal, sr=sr)
        mel_spectrogram = extract_mel_spectrogram(signal=augmented_signal, sr=sr)

        # Check if all features were successfully extracted
        if features is None or harmonics is None or mel_spectrogram is None:
            print(f"Warning: Some features were None for {row['path']}")
            continue

        # Append the features to the augmented data list
        augmented_data.append({
            'id': row['id'].replace('.wav', '') + '_aug',
            'label': row['label'],
            'chroma': features['chroma'],
            'mfcc': features['mfcc'],
            'spectral_centroid': features['spectral_centroid'],
            'zero_crossing_rate': features['zero_crossing_rate'],
            'harmonics': harmonics,
            'intervals': intervals,
            'mel_spectrogram': mel_spectrogram
        })

        # Increment the augmentation count for the class
        augmentation_tracker[row['label']] += 1

    except Exception as e:
        print(f"Error augmenting and extracting features from {row['path']}: {e}")

# Preparing DataFrame for the augmented data
augmented_feature_dict_list = []
n_chroma = 12
n_mfcc = 20
max_harmonics = max([len(item['harmonics']) for item in augmented_data if item['harmonics'] is not None], default=0)

for item in augmented_data:
    feature_dict = {
        'id': item['id'],
        'Label': item['label']
    }
    
    # Store chroma features
    for i in range(n_chroma):
        feature_dict[f'chroma_{i+1}'] = item['chroma'][i] if item['chroma'] is not None and i < len(item['chroma']) else np.nan
    
    # Store MFCC features
    for i in range(n_mfcc):
        feature_dict[f'mfcc_{i+1}'] = item['mfcc'][i] if item['mfcc'] is not None and i < len(item['mfcc']) else np.nan
    
    # Store harmonic features
    for i in range(max_harmonics):
        feature_dict[f'harmonic_{i+1}'] = item['harmonics'][i] if item['harmonics'] is not None and i < len(item['harmonics']) else np.nan
    
    # Store interval features
    if item['intervals'] is not None:
        for i in range(len(item['intervals'])):
            feature_dict[f'interval_{i+1}'] = item['intervals'][i]

    # Store other features
    feature_dict['spectral_centroid'] = item['spectral_centroid'] if item['spectral_centroid'] is not None else np.nan
    feature_dict['zero_crossing_rate'] = item['zero_crossing_rate'] if item['zero_crossing_rate'] is not None else np.nan
    feature_dict['mel_spectrogram'] = item['mel_spectrogram'] if item['mel_spectrogram'] is not None else np.nan

    augmented_feature_dict_list.append(feature_dict)

# Create a DataFrame for augmented features
augmented_features_df = pd.DataFrame(augmented_feature_dict_list)

# Display the DataFrame to verify results
print("Augmented Features DataFrame with all features:")
print(augmented_features_df.head())

Augmented Features DataFrame with all features:
              id  Label  chroma_1  chroma_2  chroma_3  chroma_4  chroma_5  \
0    Minor_0_aug  Minor  0.694272  0.445703  0.529096  0.724197  0.410463   
1    Minor_1_aug  Minor  0.655989  0.449560  0.266899  0.242286  0.200219   
2   Minor_10_aug  Minor  0.753776  0.972707  0.434122  0.300710  0.282845   
3  Minor_100_aug  Minor  0.108470  0.074901  0.051982  0.156289  0.388980   
4  Minor_101_aug  Minor  0.820669  0.537448  0.254670  0.173032  0.232917   

   chroma_6  chroma_7  chroma_8  ...  zero_crossing_rate  \
0  0.087992  0.258958  0.908643  ...            0.044980   
1  0.190632  0.423906  0.867719  ...            0.086247   
2  0.109526  0.053868  0.224057  ...            0.048661   
3  0.226154  0.065551  0.121646  ...            0.026134   
4  0.377288  0.276959  0.184213  ...            0.007307   

                                     mel_spectrogram  interval_13  \
0  [[-22.204008347185585, -17.895482333864855, -1...       

In [16]:
# Calculate Harmonic Ratios for Augmented Data
augmented_harmonic_ratio_data = []

# Get harmonic columns from augmented_features_df
augmented_harmonic_columns = [col for col in augmented_features_df.columns if 'harmonic_' in col]

# Ensure there are at least 2 harmonic columns to calculate ratios
if len(augmented_harmonic_columns) > 1:
    for i in range(len(augmented_harmonic_columns)):
        for j in range(i + 1, len(augmented_harmonic_columns)):
            col_i = augmented_harmonic_columns[i]
            col_j = augmented_harmonic_columns[j]
            ratio_col_name = f'harmonic_ratio_{i}_to_{j}'  # Ensure "harmonic_ratio" is in the name
            
            # Calculate the ratio, avoiding division by zero
            harmonic_ratio = augmented_features_df[col_i] / augmented_features_df[col_j]
            harmonic_ratio.replace([np.inf, -np.inf], np.nan, inplace=True)  # Replace inf with NaN
            
            # Append the calculated ratios and column name to the list
            augmented_harmonic_ratio_data.append(harmonic_ratio.rename(ratio_col_name))
    
    # Concatenate all harmonic ratio columns at once to the DataFrame
    augmented_harmonic_ratio_df = pd.concat([augmented_features_df] + augmented_harmonic_ratio_data, axis=1)

    # Fill NaN values with 0 or another appropriate value (depending on your analysis)
    augmented_harmonic_ratio_df.fillna(0, inplace=True)
    
    # Display the new DataFrame
    print("Augmented Harmonic Ratios DataFrame:")
    print(augmented_harmonic_ratio_df.head())
else:
    print("Not enough harmonic columns in augmented data to compute ratios.")


Augmented Harmonic Ratios DataFrame:
              id  Label  chroma_1  chroma_2  chroma_3  chroma_4  chroma_5  \
0    Minor_0_aug  Minor  0.694272  0.445703  0.529096  0.724197  0.410463   
1    Minor_1_aug  Minor  0.655989  0.449560  0.266899  0.242286  0.200219   
2   Minor_10_aug  Minor  0.753776  0.972707  0.434122  0.300710  0.282845   
3  Minor_100_aug  Minor  0.108470  0.074901  0.051982  0.156289  0.388980   
4  Minor_101_aug  Minor  0.820669  0.537448  0.254670  0.173032  0.232917   

   chroma_6  chroma_7  chroma_8  ...  harmonic_ratio_16_to_17  \
0  0.087992  0.258958  0.908643  ...                      0.0   
1  0.190632  0.423906  0.867719  ...                      0.0   
2  0.109526  0.053868  0.224057  ...                      0.0   
3  0.226154  0.065551  0.121646  ...                      0.0   
4  0.377288  0.276959  0.184213  ...                      0.0   

   harmonic_ratio_16_to_18  harmonic_ratio_16_to_19  harmonic_ratio_16_to_20  \
0                      0.0   

In [28]:
# Merge augmented_features_df and augmented_harmonic_ratio_df
augmented_complete_df = augmented_features_df.merge(augmented_harmonic_ratio_df, on=['id', 'Label'], how='left')

# Remove the '_x' suffix from the chroma and other feature columns by renaming them
augmented_complete_df.columns = augmented_complete_df.columns.str.replace('_x', '')

# Display the cleaned DataFrame
print("augmented_complete_df (cleaned, with all possible features):")
print(augmented_complete_df.head())


augmented_complete_df (cleaned, with all possible features):
              id  Label  chroma_1  chroma_2  chroma_3  chroma_4  chroma_5  \
0    Minor_0_aug  Minor  0.694272  0.445703  0.529096  0.724197  0.410463   
1    Minor_1_aug  Minor  0.655989  0.449560  0.266899  0.242286  0.200219   
2   Minor_10_aug  Minor  0.753776  0.972707  0.434122  0.300710  0.282845   
3  Minor_100_aug  Minor  0.108470  0.074901  0.051982  0.156289  0.388980   
4  Minor_101_aug  Minor  0.820669  0.537448  0.254670  0.173032  0.232917   

   chroma_6  chroma_7  chroma_8  ...  harmonic_ratio_16_to_17  \
0  0.087992  0.258958  0.908643  ...                      0.0   
1  0.190632  0.423906  0.867719  ...                      0.0   
2  0.109526  0.053868  0.224057  ...                      0.0   
3  0.226154  0.065551  0.121646  ...                      0.0   
4  0.377288  0.276959  0.184213  ...                      0.0   

   harmonic_ratio_16_to_18  harmonic_ratio_16_to_19  harmonic_ratio_16_to_20  \
0    

## Combining Dataframes

