# Extracting Audio Features for Genre Classification

In this notebook, we take the `fma_small` subset of the FMA music dataset and extract audio features from it so that we can build a model for classifying music genres (see `Genres.ipynb` for more on that).

Sources:
- [FMA: A Dataset For Music Analysis](https://github.com/mdeff/fma) by Michaël Defferrard et al. Provides the dataset used in this notebook.
- [Audio Data Analysis Using Deep Learning with Python](https://www.kdnuggets.com/2020/02/audio-data-analysis-deep-learning-python-part-1.html), by Nagesh Singh Chauhan courtesy of KDnuggets. This notebook is written using the same basic concepts and implementation presented in this article.

In [None]:
import librosa
import pandas as pd
import numpy as np
import os
import pathlib
import csv
import warnings
warnings.filterwarnings('ignore')
import utils

First, we need to get ahold of the dataset we'll be using. Info and files for the dataset can be found [here](https://github.com/mdeff/fma). In particular, we'll be using the `fma_small` dataset, as well as the metadata CSV, `tracks.csv`.

Before we can extract features, we have some preprocessing to do. First, we build the header row for the CSV file that will hold the extracted features. Then, we load the metadata for the FMA dataset and keep only the rows belonging to the small subset.
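For readers who don't have the FMA `utils` module at hand, the metadata step can be approximated with plain pandas. This is a minimal sketch, not the `utils.load` implementation: it assumes `tracks.csv` has two header rows (a category row and a field row) followed by an index-name row, and it substitutes a tiny inline CSV with made-up values for the real file.

```python
import io
import pandas as pd

# Tiny stand-in for tracks.csv (hypothetical values, same assumed layout):
# row 0 = category, row 1 = field name, row 2 = index name
csv_text = """,set,set,track,track
,split,subset,genre_top,title
track_id,,,,
2,training,small,Hip-Hop,Food
3,training,medium,Hip-Hop,Electric Ave
5,training,small,Rock,This World
"""

# Two header rows become a column MultiIndex; the first column is the track ID
demo = pd.read_csv(io.StringIO(csv_text), index_col=0, header=[0, 1])

# Keep only rows in the small subset, then only the 'track' group of columns
small = demo[demo[('set', 'subset')] == 'small']
small_tracks = small['track']
print(small_tracks)
```

The sketch filters with `==` because the subset column is read as plain strings here. The `<=` comparison used in this notebook relies on `utils.load` having converted that column to an ordered category.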

In [None]:
header = 'filename chroma_stft rmse spectral_centroid spectral_bandwidth rolloff zero_crossing_rate'
for i in range(1, 21):
    header += f' mfcc{i}'
header += ' label'
header = header.split()

# Load track metadata
tracks = utils.load('tracks.csv')
tracks = tracks[tracks['set', 'subset'] <= 'small']
tracks = tracks['track']

Now we can extract our features. In particular, we'll be extracting the following features:
- Chroma
- Root-mean-square (RMS) energy
- Spectral centroid
- Spectral bandwidth
- Spectral rolloff
- Zero crossing rate
- Mel-frequency cepstral coefficients (MFCCs)

Each of these features is stored as a mean computed over the analysis frames of a single audio file. For the MFCCs, we collect 20 different means, one per coefficient. For more information on what all these features are, take a look at the [article](https://www.kdnuggets.com/2020/02/audio-data-analysis-deep-learning-python-part-1.html) this notebook is based on.
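To make the "mean over frames" idea concrete, the sketch below computes two of the features frame by frame on a synthetic tone and then averages them into one number each. It uses plain NumPy rather than Librosa, so the values will not match Librosa's output exactly (Librosa pads and centers its frames, for example). The sample rate, frame length, and hop length are illustrative choices.

```python
import numpy as np

# Synthetic one-second 440 Hz tone standing in for a loaded audio clip
sr = 22050
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440 * t)

# Slice the signal into overlapping frames (sizes chosen for illustration)
frame_length, hop_length = 2048, 512
n_frames = 1 + (len(y) - frame_length) // hop_length
frames = np.stack([y[i * hop_length : i * hop_length + frame_length]
                   for i in range(n_frames)])

# Frame-wise zero crossing rate: fraction of adjacent samples whose sign differs
signs = np.signbit(frames)
zcr_per_frame = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

# Frame-wise spectral centroid: magnitude-weighted mean frequency
window = np.hanning(frame_length)
magnitudes = np.abs(np.fft.rfft(frames * window, axis=1))
freqs = np.fft.rfftfreq(frame_length, d=1 / sr)
centroid_per_frame = (magnitudes @ freqs) / magnitudes.sum(axis=1)

# Collapse each frame-wise series to a single value, as in the CSV columns
zcr_mean = float(np.mean(zcr_per_frame))
centroid_mean = float(np.mean(centroid_per_frame))
print(zcr_per_frame.shape, zcr_mean, centroid_mean)
```

For a pure tone the centroid mean should land close to the tone's frequency, and the zero crossing rate close to twice the frequency divided by the sample rate, which makes a handy sanity check.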

We will use the Librosa library to iterate through each audio file in the dataset and extract the desired features. Using the metadata CSV we loaded earlier, we append to our CSV the appropriate genre label for each audio file as well. 
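The label lookup depends on mapping each audio filename back to its track ID in the metadata. The sketch below shows that mapping in both directions. The helper names are hypothetical, and the folder rule (first three digits of the zero-padded six-digit ID) is an assumption about how the FMA archive is laid out.

```python
from pathlib import Path

def track_id_from_path(path):
    """Return the integer track ID encoded in a name like '000002.mp3'."""
    return int(Path(path).stem)

def audio_path(audio_dir, track_id):
    """Build the expected file path for a track ID (assumed folder layout)."""
    tid = f'{track_id:06d}'
    return Path(audio_dir) / tid[:3] / f'{tid}.mp3'

print(track_id_from_path('./fma_small/000/000002.mp3'))
print(audio_path('./fma_small', 2).as_posix())
```

Because `int()` already ignores leading zeros, no manual stripping of the filename is needed before the lookup.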

Running this may take some time; when I ran it, I had to leave my machine running overnight. To sidestep the lengthy process, you can download the final CSV of extracted features here. Additionally, some files in the dataset don't load properly, so a basic error handler skips any problem files.

In [None]:
# Create (or overwrite) the output file and write the header row
with open('fma_small.csv', 'w', newline='') as file:
    csv.writer(file).writerow(header)

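for folder in os.listdir('./fma_small/'):
    # Skip anything that is not a track folder (e.g. README or checksum files)
    if not os.path.isdir(f'./fma_small/{folder}'):
        continue
    for filename in os.listdir(f'./fma_small/{folder}'):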
        # Load file and extract features w/ Librosa
        songname = f'./fma_small/{folder}/{filename}'
        try:
            y, sr = librosa.load(songname, mono=True, duration=30)
        except Exception:  # a bare except would also swallow KeyboardInterrupt
            print(f'failed loading {filename}')
        else:
            rmse = librosa.feature.rms(y=y)
            chroma_stft = librosa.feature.chroma_stft(y=y, sr=sr)
            spec_cent = librosa.feature.spectral_centroid(y=y, sr=sr)
            spec_bw = librosa.feature.spectral_bandwidth(y=y, sr=sr)
            rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr)
            zcr = librosa.feature.zero_crossing_rate(y)
            mfcc = librosa.feature.mfcc(y=y, sr=sr)
        
            # Collect the mean of each feature into one row for fma_small.csv
            row = [filename, np.mean(chroma_stft), np.mean(rmse), np.mean(spec_cent),
                   np.mean(spec_bw), np.mean(rolloff), np.mean(zcr)]
            row.extend(np.mean(e) for e in mfcc)

            # Find and add genre label (track ID is the filename without its extension)
            genre = tracks.loc[int(os.path.splitext(filename)[0]), 'genre_top']
            row.append(genre)

            # Append the row as a list so labels containing spaces stay in one field
            with open('fma_small.csv', 'a', newline='') as file:
                csv.writer(file).writerow(row)

Now that the lengthy feature extraction process has finished, let's check the CSV to verify our results:

In [None]:
data = pd.read_csv('fma_small.csv') # Loading data to manipulate for our purposes
data.head()
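
Beyond eyeballing the first rows, a few automated checks can catch a truncated or malformed run. The sketch below is one possible set of checks, not part of the original workflow. The function name is hypothetical, and the default of 28 columns assumes the header built earlier (filename, six summary features, 20 MFCC means, and the label). The demo frame uses made-up values and only three columns.

```python
import pandas as pd

def summarize_features(df, expected_columns=28):
    """Return a few basic sanity checks for an extracted-features table."""
    return {
        'n_rows': len(df),
        'column_count_ok': df.shape[1] == expected_columns,
        'rows_with_missing': int(df.isna().any(axis=1).sum()),
        'duplicate_filenames': int(df['filename'].duplicated().sum()),
        'label_counts': df['label'].value_counts().to_dict(),
    }

# Tiny stand-in table with one missing feature value (hypothetical data)
demo = pd.DataFrame({
    'filename': ['000002.mp3', '000005.mp3', '000010.mp3'],
    'chroma_stft': [0.42, 0.38, None],
    'label': ['Hip-Hop', 'Hip-Hop', 'Pop'],
})
report = summarize_features(demo, expected_columns=3)
print(report)
```

On the real file this would be called as `summarize_features(data)`. A nonzero count of rows with missing values or an unexpected column count would suggest that some rows were written with the wrong number of fields.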