# Music Note Classifier
### The goal of this program is to create a useful segment of code to convert mp3 files into usable data for a machine learning model.

## Import dependencies:

The imports used to achieve this functionality are numpy, pandas, os, csv, and librosa (along with its `feature` submodule). Librosa, used in conjunction with numpy, allows for easy spectrogram generation and feature extraction. The os and csv modules are used for I/O and for saving the extracted features to a csv file that is passed to the machine learning model. Pandas is used to compute the mean clip length.

In [1]:
import numpy as np
import pandas as pd
import os
import librosa
from librosa import feature
import csv

## Feature Extraction and Spectrogram Functions:
In this section two functions are defined: trim and extract_features. The trim function cuts the loaded audio array down to a target length, or pads it with zeros if it is shorter. The extract_features function takes the audio array and its sample rate, calls a list of librosa feature extraction functions, takes the mean of each result, and stores the means in a list.

In [2]:
def trim(y, target_length):
    """Truncate or zero-pad the 1-D array y to exactly target_length samples."""
    current_length = y.shape[0]

    if current_length > target_length:
        # Too long: keep only the first target_length samples
        trimmed_spec = y[:target_length]
    else:
        # Too short (or already the right length): pad the end with zeros
        pad_width = target_length - current_length
        trimmed_spec = np.pad(y, (0, pad_width), mode='constant', constant_values=0)

    return trimmed_spec
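
To make the two branches of the function above concrete, here is a minimal sketch on tiny arrays. The name `trim_or_pad` and the sample values are illustrative stand-ins, not part of the original notebook; the logic is intended to mirror `trim`.

```python
import numpy as np

def trim_or_pad(y, target_length):
    """Hypothetical stand-in that mirrors the notebook's trim()."""
    if y.shape[0] > target_length:
        return y[:target_length]                 # truncate long input
    pad_width = target_length - y.shape[0]
    return np.pad(y, (0, pad_width), mode='constant', constant_values=0)

long_signal = np.arange(10, dtype=float)   # 10 samples, longer than the target
short_signal = np.ones(3)                  # 3 samples, shorter than the target

print(trim_or_pad(long_signal, 5))    # [0. 1. 2. 3. 4.]
print(trim_or_pad(short_signal, 5))   # [1. 1. 1. 0. 0.]
```

Either way the result has exactly `target_length` samples, which is what gives every clip the same size before feature extraction.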

In [3]:
def extract_features(y,sr):
    
    # Trim or pad every clip to 23160 samples, the mean clip length computed below
    y = trim(y, target_length=23160)

    chroma = np.mean(feature.chroma_stft(y=y, sr=sr))
    centr = np.mean(feature.spectral_centroid(y=y, sr=sr))
    band = np.mean(feature.spectral_bandwidth(y=y, sr=sr))
    roll = np.mean(feature.spectral_rolloff(y=y, sr=sr))
    contrast = np.mean(feature.spectral_contrast(y=y, sr=sr))
    mfcc = np.mean(feature.mfcc(y=y, sr=sr))
    zero = np.mean(feature.zero_crossing_rate(y=y))
    rms = np.mean(feature.rms(y=y))
    tonnetz = np.mean(feature.tonnetz(y=y, sr=sr))
    
    feat_vect = [chroma, centr, band, roll, contrast, mfcc, zero, rms, tonnetz] 
    return feat_vect
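
Each librosa feature function returns a 2-D array of shape (n_features, n_frames), so calling `np.mean` with no axis collapses the whole matrix into one number. The sketch below illustrates this with a synthetic array standing in for the output of `feature.mfcc`. The array contents and the 20 x 46 shape are assumptions made for illustration, and the per-coefficient mean is shown only as a possible alternative, not as what the notebook does.

```python
import numpy as np

# Synthetic stand-in for an MFCC matrix: 20 coefficients x 46 frames (assumed shape)
rng = np.random.default_rng(0)
fake_mfcc = rng.normal(size=(20, 46))

# What extract_features does: one scalar for the entire matrix
overall_mean = np.mean(fake_mfcc)

# Alternative: average over time only, keeping one value per coefficient
per_coeff_mean = np.mean(fake_mfcc, axis=1)

print(np.ndim(overall_mean))    # 0
print(per_coeff_mean.shape)     # (20,)
```

The scalar version keeps the feature vector at nine values per clip. The per-coefficient version would keep more detail at the cost of a wider csv.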

## Data Feature Extraction Loop:
This section loops through all of the files and runs the extraction function on each one, storing the results in the notes_feats list, which is used later to create the csv file for this project. The name of each note's folder is used as the label (the y value) for every row that comes from that folder.

In [4]:
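count = 0
lengths = []
notes_feats = []
numNotes = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    count += 1
    # Skip the first three directories visited, which are assumed to be
    # parent folders rather than note folders
    if count > 3:
        # The folder name is the label for every file inside it
        label = os.path.basename(dirname)
        numNotes += 1
        print(label)
    for filename in filenames:
        # sr=None keeps each file's native sample rate
        y, sr = librosa.load(os.path.join(dirname, filename), sr=None)
        lengths.append(len(y))
        feat_vect = extract_features(y, sr)
        feat_vect.insert(0, label)
        notes_feats.append(feat_vect)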

23. D1
34. C-sharp 2
37. E3
26. F2
14. F1
5. G-sharp
17. G-sharp 1
31. A-sharp 2
24. D-sharp 1
11. D
21. C1
10. C-sharp
20. B1
36. D-sharp 2
8. B
15. F-sharp 1
33. C2
19. A-sharp 1
16. G1
25. E2
4. G
7. A-sharp
18. A1
35. D2
9. C
6. A
29. G-sharp 2
22. C-sharp 1
28. G2
1. E
2. F
27. F-sharp 2
13. E1
32. B2
30. A2
12. D-sharp
3. F-sharp
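
The printed labels are the folder names, including their numeric prefix. The sketch below shows how such a label can be derived from a directory path and, optionally, how the prefix could be dropped. The path, the helper names `label_from_dir` and `strip_index`, and the assumption that every folder follows the "NN. Name" pattern are illustrative and not taken from the notebook.

```python
import os

def label_from_dir(dirname):
    """Return the last path component, i.e. the note folder's name."""
    return os.path.basename(os.path.normpath(dirname))

def strip_index(label):
    """Drop a leading 'NN. ' index if present (assumed naming pattern)."""
    head, sep, tail = label.partition('. ')
    return tail if sep and head.isdigit() else label

example_dir = '/kaggle/input/example-dataset/notes/23. D1'  # hypothetical path
label = label_from_dir(example_dir)

print(label)                # 23. D1
print(strip_index(label))   # D1
```

Keeping the prefix is harmless for classification, since each folder still maps to a unique label. Stripping it only makes the csv easier to read.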


## Trimming the Spectrograms:

The mean clip length is used as the target length, so every clip is trimmed or padded to the same number of samples. This gives the feature extraction step inputs of a consistent size.

In [5]:
s = pd.Series(lengths)
print(s.mean())

23160.472246696034
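
As a small illustration of how a mean like the one above becomes an integer target length, the sketch below uses three made-up clip lengths. The values are invented for the example and are not the dataset's real lengths. The min and max are included only to show how far individual clips can be from the target.

```python
import pandas as pd

lengths = [22050, 24000, 23431]   # made-up sample counts, for illustration only
s = pd.Series(lengths)

# Truncate the mean to a whole number of samples to use as the target length
target_length = int(s.mean())

print(target_length)       # 23160
print(s.min(), s.max())    # 22050 24000
```

Clips shorter than the target are zero-padded and longer ones are cut, so a wide spread between min and max means more of the data is being altered.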


## CSV File Creation:
In this section the csv is populated and saved to a file called Guitar_features.csv. First the header for each column is defined, and then the remaining rows are filled in from the 2D list stored in notes_feats.

In [6]:
guitar_output = 'Guitar_features.csv'

header =[
    'note_names',
    'chroma_stft',
    'spectral_centroid',
    'spectral_bandwidth',
    'spectral_rolloff',
    'spectral_contrast',
    'mfcc',
    'zero_crossing_rate',
    'rms',
    'tonnetz'
]

# newline='' lets the csv module control line endings (avoids blank rows on Windows)
with open(guitar_output, 'w', newline='') as f:
    csv_writer = csv.writer(f, delimiter=',')
    csv_writer.writerow(header)
    csv_writer.writerows(notes_feats)
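
Once the file is written, it can be read back and split into features and labels for a model. The sketch below is a minimal round trip under stated assumptions: the shortened header, the two rows of values, and the temporary file name are all made up for illustration and do not come from the real Guitar_features.csv.

```python
import csv
import os
import tempfile

import pandas as pd

header = ['note_names', 'chroma_stft', 'rms']   # shortened header for illustration
rows = [['23. D1', 0.41, 0.02],                 # made-up feature values
        ['1. E', 0.38, 0.03]]

# Write to a temporary location so the example does not touch real data
path = os.path.join(tempfile.mkdtemp(), 'example_features.csv')
with open(path, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(rows)

# Read the file back and separate the label column from the feature columns
df = pd.read_csv(path)
X = df.drop(columns=['note_names'])
y = df['note_names']

print(X.shape)    # (2, 2)
print(list(y))    # ['23. D1', '1. E']
```

The same split applies to the full file: `note_names` is the target column and the remaining nine columns are the inputs.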