# Preprocessing 

In this script, I would like to do the following:
1. Loop through all subdirectories in the music folder (not on Github) and get features for each song
2. Features to look at:
    a) Zero-crossings (possibly do rate?)
    b) Spectral centroids
    c) Spectral rolloff
    d) Mel-frequency cestral coefficients (multiple columns)
    e) Chroma frequencies (multiple columns)
    f) Tempograms
3. Compile CSV file that stores all this data for each song

Also, I found a really good website for more information of feature extraction: https://musicinformationretrieval.com

In [3]:
# Libraries

## Music
import librosa

## Data analysis
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

## File handling
import os
import pathlib

## Preprocessing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

Let's start off by reading in the files and displaying all the file names.

In [5]:
years = ['2010', '2011', '2012', '2013', '2014', 
        '2015', '2016', '2017', '2018']

for year in years:
    subdir = f'../music/english/{year}'
    for filename in os.listdir(subdir):
        songname = f'../music/english/{year}/{filename}'
        x, sr = librosa.load(songname)
        
        # Extracting features
        chroma_stft = librosa.feature.chroma_stft(x, sr=sr)
        spec_cent = librosa.feature.spectral_centroid(x, sr=sr)
        spec_bw = librosa.feature.spectral_bandwidth(x, sr=sr)
        rolloff = librosa.feature.spectral_rolloff(x, sr=sr)
        zcr = librosa.feature.zero_crossing_rate(x)
        mfcc = librosa.feature.mfcc(x, sr=sr)
        tempo = librosa.beat.tempo(x, sr = sr)
        
        # Note: We can possibly look at separating the harmonic and percussive parts of the song
        
        # Putting this into a file

../music/english/.DS_Store
../music/english/2013/ho_hey.mp3
../music/english/2013/wrecking_ball.mp3
../music/english/2013/.DS_Store
../music/english/2013/locked_out_of_heaven.mp3
../music/english/2013/wake_me_up.mp3
../music/english/2013/blurred_lines.mp3
../music/english/2013/mirros.mp3
../music/english/2013/suit_tie.mp3
../music/english/2013/radioactive.mp3
../music/english/2013/when_your_man.mp3
../music/english/2013/cant_stop.mp3
../music/english/2013/harlem_shake.mp3
../music/english/2013/i_knew_trouble.mp3
../music/english/2013/roar.mp3
../music/english/2013/give_me_reason.mp3
../music/english/2013/cant_hold.mp3
../music/english/2013/stay.mp3
../music/english/2013/line_cruise.mp3
../music/english/2013/thrift_shop.mp3
../music/english/2013/get_lucky.mp3
../music/english/2014/talk_dirty.mp3
../music/english/2014/timber.mp3
../music/english/2014/problem.mp3
../music/english/2014/pompeii.mp3
../music/english/2014/monster.mp3
../music/english/2014/.DS_Store
../music/english/2014/say_s