<a href="https://colab.research.google.com/gist/parulnith/7f8c174e6ac099e86f0495d3d9a4c01e/untitled9.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Music genre classification notebook

## Importing Libraries

In [1]:
# Feature extraction and data preprocessing
import librosa
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import os
from PIL import Image
import pathlib
import csv

# Preprocessing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

#Keras
import keras

import warnings
warnings.filterwarnings('ignore')

Using TensorFlow backend.


## Extracting music and features

### Dataset

We use the [GTZAN genre collection](http://marsyasweb.appspot.com/download/data_sets/) dataset for classification.

The dataset consists of 10 genres:
 * Blues
 * Classical
 * Country
 * Disco
 * Hiphop
 * Jazz
 * Metal
 * Pop
 * Reggae
 * Rock
 
Each genre contains 100 songs, for a total of 1000 songs.

## Extracting the Spectrogram for every Audio

In [2]:
cmap = plt.get_cmap('inferno')

plt.figure(figsize=(10,10))
genres = 'blues classical country disco hiphop jazz metal pop reggae rock'.split()
for g in genres:
    pathlib.Path(f'img_data/{g}').mkdir(parents=True, exist_ok=True)     
    for filename in os.listdir(f'./Music_GTZAN/genres/{g}'):
        songname = f'./Music_GTZAN/genres/{g}/{filename}'
        y, sr = librosa.load(songname, mono=True, duration=5)
        plt.specgram(y, NFFT=2048, Fs=2, Fc=0, noverlap=128, cmap=cmap, sides='default', mode='default', scale='dB');
        plt.axis('off');
        plt.savefig(f'img_data/{g}/{filename[:-3].replace(".", "")}.png')
        plt.clf()
 

<Figure size 720x720 with 0 Axes>

All the audio files are converted into their respective spectrograms. We can now easily extract features from them.
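
To make the idea behind a spectrogram concrete, here is a minimal NumPy sketch of a short-time Fourier transform applied to a synthetic 440 Hz tone. The helper name `stft_magnitude_db`, the Hann window, and the hop length of 512 are illustrative assumptions; this is not a reproduction of what `plt.specgram` does internally.

```python
import numpy as np

def stft_magnitude_db(signal, n_fft=2048, hop=512):
    """Return a (n_fft // 2 + 1, n_frames) magnitude spectrogram in dB."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    # One FFT per frame; transpose so that rows are frequency bins
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return 20 * np.log10(magnitude + 1e-10)

sr = 22050                          # assumed sampling rate for the synthetic tone
t = np.arange(sr) / sr              # one second of samples
tone = np.sin(2 * np.pi * 440 * t)  # synthetic 440 Hz sine wave

spec = stft_magnitude_db(tone)
peak_bin = int(np.argmax(spec.mean(axis=1)))
peak_hz = peak_bin * sr / 2048      # centre frequency of the strongest bin
print(spec.shape, peak_hz)
```

Each column of `spec` describes the frequency content of one short time window, which is what the saved images visualise.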

## Extracting features from Spectrogram


We will extract:

* Mel-frequency cepstral coefficients (MFCC), 20 in number
* Spectral Centroid
* Spectral Bandwidth
* Zero Crossing Rate
* Chroma Frequencies
* Spectral Roll-off
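
As a rough illustration of what two of these features measure, the sketch below computes a zero crossing rate and a spectral centroid for two synthetic tones using NumPy only. The function names are hypothetical, and both values are computed over the whole signal, whereas librosa works frame by frame; treat it as a conceptual sketch rather than an equivalent of the librosa calls.

```python
import numpy as np

def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose sign differs."""
    signs = np.signbit(signal)
    return np.mean(signs[1:] != signs[:-1])

def spectral_centroid(signal, sr):
    """Magnitude-weighted mean frequency of the whole signal, in Hz."""
    magnitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    return np.sum(freqs * magnitude) / np.sum(magnitude)

sr = 22050                            # assumed sampling rate
t = np.arange(sr) / sr                # one second of samples
low = np.sin(2 * np.pi * 220 * t)     # synthetic low-pitched tone
high = np.sin(2 * np.pi * 3520 * t)   # synthetic high-pitched tone

zcr_low, zcr_high = zero_crossing_rate(low), zero_crossing_rate(high)
centroid_low, centroid_high = spectral_centroid(low, sr), spectral_centroid(high, sr)
print(zcr_low, zcr_high, centroid_low, centroid_high)
```

The higher-pitched tone crosses zero more often and has a higher centroid, which is why both features help separate bright, noisy genres from mellow ones.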

In [3]:
header = 'filename chroma_stft rmse spectral_centroid spectral_bandwidth rolloff zero_crossing_rate'
for i in range(1, 21):
    header += f' mfcc{i}'
header += ' label'
header = header.split()

## Writing data to csv file

We write the extracted features to a CSV file, one row per track.

In [6]:
# Create the CSV file and write the header row
with open('data_MYMUSIC.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(header)
genres = 'blues classical country disco hiphop jazz metal pop reggae rock'.split()
for g in genres:
    for filename in os.listdir(f'./Music_GTZAN/genres/{g}'):
        songname = f'./Music_GTZAN/genres/{g}/{filename}'
        y, sr = librosa.load(songname, mono=True, duration=30)
        chroma_stft = librosa.feature.chroma_stft(y=y, sr=sr)
        spec_cent = librosa.feature.spectral_centroid(y=y, sr=sr)
        spec_bw = librosa.feature.spectral_bandwidth(y=y, sr=sr)
        rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr)
        zcr = librosa.feature.zero_crossing_rate(y)
        mfcc = librosa.feature.mfcc(y=y, sr=sr)
        # rmse is listed in the header but is not computed here, so each row has one
        # field fewer than the header and the values after chroma_stft shift one column left
        to_append = f'{filename} {np.mean(chroma_stft)} {np.mean(spec_cent)} {np.mean(spec_bw)} {np.mean(rolloff)} {np.mean(zcr)}'
        for e in mfcc:
            to_append += f' {np.mean(e)}'
        to_append += f' {g}'
        # Append this track's row; the file is reopened for every track
        with open('data_MYMUSIC.csv', 'a', newline='') as file:
            writer = csv.writer(file)
            writer.writerow(to_append.split())

The data has been extracted into `data_MYMUSIC.csv`; a previously extracted [data.csv](https://github.com/parulnith/Music-Genre-Classification-with-Python/blob/master/data.csv) file is also available.
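
When writing feature rows like this, it can help to check that every row has as many fields as the header. The sketch below shows one possible way to do that with the standard `csv` module; the helper `write_feature_rows` and the shortened header and rows are placeholders for illustration, not values from the dataset.

```python
import csv
import io

def write_feature_rows(handle, header, rows):
    """Write the header and rows, refusing rows whose length differs from the header."""
    writer = csv.writer(handle)
    writer.writerow(header)
    for row in rows:
        if len(row) != len(header):
            raise ValueError(f'expected {len(header)} fields, got {len(row)}')
        writer.writerow(row)

# Placeholder header and rows, shortened for the example
header = ['filename', 'chroma_stft', 'label']
rows = [['track_a.au', 0.35, 'blues'],
        ['track_b.au', 0.41, 'rock']]

buffer = io.StringIO()   # stands in for an open file
write_feature_rows(buffer, header, rows)
lines = buffer.getvalue().splitlines()
print(lines)
```

Keeping a single handle open for all rows also avoids reopening the file once per track.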

# Analysing the Data in Pandas

In [7]:
data = pd.read_csv('data_MYMUSIC.csv')
data.head()

| | filename | chroma_stft | rmse | spectral_centroid | spectral_bandwidth | rolloff | zero_crossing_rate | mfcc1 | mfcc2 | mfcc3 | ... | mfcc12 | mfcc13 | mfcc14 | mfcc15 | mfcc16 | mfcc17 | mfcc18 | mfcc19 | mfcc20 | label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | blues.00000.au | 0.349943 | 1784.420446 | 2002.650192 | 3806.485316 | 0.083066 | -113.596748 | 121.557297 | -19.158825 | 42.351032 | ... | -3.667369 | 5.75169 | -5.162763 | 0.750948 | -1.691938 | -0.409953 | -2.300209 | 1.219929 | blues | |
| 1 | blues.00001.au | 0.340983 | 1529.835316 | 2038.617579 | 3548.820207 | 0.056044 | -207.556793 | 124.006721 | 8.93056 | 35.874687 | ... | -2.23912 | 4.216963 | -6.012273 | 0.93611 | -0.716537 | 0.293876 | -0.287431 | 0.531574 | blues | |
| 2 | blues.00002.au | 0.363603 | 1552.481958 | 1747.165985 | 3040.514948 | 0.076301 | -90.754387 | 140.459915 | -29.109968 | 31.689013 | ... | -8.905224 | -1.08372 | -9.21836 | 2.455806 | -7.726901 | -1.815723 | -3.433434 | -2.226821 | blues | |
| 3 | blues.00003.au | 0.404779 | 1070.119953 | 1596.333948 | 2185.028454 | 0.033309 | -199.431152 | 150.099213 | 5.647593 | 26.871927 | ... | -2.476421 | -1.07389 | -2.874778 | 0.780977 | -3.316932 | 0.637981 | -0.61969 | -3.408233 | blues | |
| 4 | blues.00004.au | 0.30859 | 1835.494603 | 1748.362448 | 3580.945013 | 0.1015 | -160.266037 | 126.198807 | -35.60545 | 22.153301 | ... | -6.934123 | -7.558618 | -9.173553 | -4.512165 | -5.453538 | -0.924161 | -4.409333 | -11.70378 | blues | |


In [8]:
data.shape

(960, 28)

In [9]:
# Dropping unnecessary columns
data = data.drop(['filename'],axis=1)

## Encoding the Labels

In [24]:
genre_list = data.iloc[:, -2]  # genre names sit in the second-to-last column; the last column ('label') is empty
encoder = LabelEncoder()
y = encoder.fit_transform(genre_list)

In [25]:
print(y)

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4
 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
 4 4 4 4 4 4 4 4 4 4 4 4 
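
For intuition, the mapping that `LabelEncoder` produces can be mimicked with `np.unique`: class names are sorted and each label is replaced by the index of its class. The short label array below is made up for the example.

```python
import numpy as np

# Made-up labels, deliberately not in alphabetical order
labels = np.array(['rock', 'blues', 'jazz', 'blues', 'rock'])

# classes holds the sorted unique names; encoded holds each label's index into classes
classes, encoded = np.unique(labels, return_inverse=True)

# Indexing the class array with the codes recovers the original names
decoded = classes[encoded]
print(classes, encoded, decoded)
```

Because the classes are sorted, `blues` maps to 0 and `rock` to the highest index, which matches the ordering visible in the printed `y` array.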

## Scaling the Feature columns

In [12]:
scaler = StandardScaler()
X = scaler.fit_transform(np.array(data.iloc[:, :-2], dtype=float))  # all columns except the last two (genre names and the empty 'label')

In [20]:
print(X)

[[-0.34809486 -0.57181952 -0.45041864 ... -0.22257026 -0.03193404
   0.59571367]
 [-0.45620808 -0.92220568 -0.38308858 ... -0.04126695  0.5142383
   0.41833093]
 [-0.18327533 -0.89103705 -0.92867878 ... -0.58469081 -0.3394376
  -0.29248134]
 ...
 [ 0.13631436  1.7742102   1.86913117 ... -0.41816317 -0.03417004
  -0.28903263]
 [-0.93754092  0.32687711  1.02314642 ...  1.19576383  0.62602022
   0.71284742]
 [-0.47455628  0.30923015  1.28053478 ... -1.40288114 -0.09057725
  -0.82636867]]
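
Standardisation subtracts each column's mean and divides by its standard deviation, so every feature ends up with zero mean and unit variance. A minimal NumPy sketch of that computation follows; the `standardize` helper and the random input matrix are assumptions for illustration only.

```python
import numpy as np

def standardize(features):
    """Scale each column to zero mean and unit (population) standard deviation."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0          # avoid dividing by zero for constant columns
    return (features - mean) / std

# Random stand-in for a feature matrix with two columns on very different scales
rng = np.random.default_rng(0)
raw = rng.normal(loc=[0.3, 1800.0], scale=[0.05, 400.0], size=(100, 2))

scaled = standardize(raw)
print(scaled.mean(axis=0), scaled.std(axis=0))
```

In a stricter setup, the mean and standard deviation would be computed on the training split only and then applied to the test split.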


## Dividing data into training and testing sets

In [26]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

In [27]:
len(y_train)

768

In [28]:
len(y_test)

192
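
The 768/192 sizes follow directly from holding out 20% of the 960 rows. The sketch below reproduces that arithmetic with a plain shuffled index split; `shuffle_split` is a hypothetical helper and is not how `train_test_split` is implemented.

```python
import numpy as np

def shuffle_split(n_samples, test_size=0.2, seed=0):
    """Return disjoint (train_indices, test_indices) after shuffling."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_size))
    return order[n_test:], order[:n_test]

train_idx, test_idx = shuffle_split(960)
print(len(train_idx), len(test_idx))
```

Fixing the seed makes the split repeatable between runs, which makes accuracy figures easier to compare.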

In [29]:
X_train[10]

array([-0.3551725 ,  1.02746961,  1.513174  ,  1.30759659,  0.42681933,
        0.97835895, -0.77945382,  1.13787334, -1.25884186,  0.80444821,
       -0.51097263,  1.06165849, -0.43246988,  1.47391853, -0.64281763,
        0.97109856, -0.46536059,  0.35890297, -0.92720801,  0.23061455,
       -0.82189098,  0.50869884,  0.28551781,  1.03134359,  0.27973686])

In [30]:
X_train.shape[1]

25

In [31]:
y_test

array([4, 7, 0, 4, 7, 2, 8, 6, 7, 2, 3, 4, 4, 4, 5, 4, 4, 0, 0, 8, 8, 0,
       7, 7, 5, 0, 3, 4, 6, 9, 6, 8, 9, 2, 0, 8, 4, 8, 3, 0, 4, 1, 5, 1,
       7, 5, 5, 6, 1, 1, 6, 1, 2, 3, 1, 4, 9, 7, 8, 7, 6, 8, 5, 4, 1, 2,
       2, 6, 6, 2, 0, 6, 3, 8, 7, 7, 7, 7, 4, 4, 4, 8, 7, 8, 7, 0, 5, 6,
       5, 6, 3, 1, 6, 1, 6, 4, 9, 2, 6, 4, 4, 8, 8, 7, 4, 0, 5, 1, 7, 1,
       8, 2, 6, 5, 5, 6, 2, 8, 5, 5, 0, 1, 5, 3, 5, 4, 2, 4, 4, 8, 2, 3,
       5, 4, 1, 4, 3, 6, 2, 4, 6, 2, 0, 2, 2, 2, 4, 6, 4, 7, 8, 0, 1, 4,
       0, 6, 3, 9, 7, 3, 4, 7, 1, 5, 0, 7, 0, 4, 3, 8, 3, 1, 2, 7, 2, 3,
       4, 3, 7, 4, 4, 2, 7, 1, 8, 3, 5, 2, 3, 7, 8, 5])

# Classification with Keras

## Building our Network

In [32]:
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(256, activation='relu', input_shape=(X_train.shape[1],)))

model.add(layers.Dense(128, activation='relu'))

model.add(layers.Dense(64, activation='relu'))

model.add(layers.Dense(10, activation='softmax'))
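
To get a feel for the size of this network, the number of trainable parameters of a stack of dense layers can be counted by hand: each layer has `inputs * units` weights plus `units` biases. The helper below is a small sketch of that arithmetic, assuming 25 input features as reported by `X_train.shape[1]`.

```python
def dense_param_count(n_inputs, layer_units):
    """Total weights and biases for a stack of fully connected layers."""
    total = 0
    for units in layer_units:
        total += n_inputs * units + units   # weight matrix plus bias vector
        n_inputs = units                    # this layer's output feeds the next one
    return total

n_params = dense_param_count(25, [256, 128, 64, 10])
print(n_params)
```

With several tens of thousands of parameters and only 768 training samples, the model has ample capacity to memorise the training set.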

In [33]:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
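
The `sparse_categorical_crossentropy` loss takes integer class labels and penalises a low predicted probability for the true class. The following NumPy sketch shows the underlying computation on made-up logits; the function names mirror the Keras loss name for readability, but this is an illustration and not the Keras implementation.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the true integer label."""
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(picked)))

# Made-up scores for two samples and three classes
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])
labels = np.array([0, 2])            # integer class ids, as produced by LabelEncoder

probs = softmax(logits)
loss = sparse_categorical_crossentropy(probs, labels)
print(probs.sum(axis=1), loss)
```

Because the labels stay as integers, no one-hot encoding step is needed before training.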

In [34]:
history = model.fit(X_train,
                    y_train,
                    epochs=20,
                    batch_size=128)
                   

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [35]:
test_loss, test_acc = model.evaluate(X_test,y_test)



In [36]:
print('test_acc: ',test_acc)

test_acc:  0.6927083134651184


Test accuracy is lower than training accuracy. This hints at overfitting.

## Validating our approach
Let's set apart 200 samples in our training data to use as a validation set:

In [37]:
x_val = X_train[:200]
partial_x_train = X_train[200:]

y_val = y_train[:200]
partial_y_train = y_train[200:]

Now let's train our network for 30 epochs:

In [38]:
model = models.Sequential()
model.add(layers.Dense(512, activation='relu', input_shape=(X_train.shape[1],)))
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(partial_x_train,
          partial_y_train,
          epochs=30,
          batch_size=512,
          validation_data=(x_val, y_val))
results = model.evaluate(X_test, y_test)

Train on 568 samples, validate on 200 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


In [39]:
results

[1.1176357169946034, 0.625]

## Predictions on Test Data

In [40]:
predictions = model.predict(X_test)

In [41]:
predictions[0].shape

(10,)

In [42]:
np.sum(predictions[0])

1.0

In [43]:
np.argmax(predictions[0])

4
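
The predicted index can be turned back into a genre name. Assuming the encoder was fitted on all ten genre names, its classes are in sorted order, so the lookup can be sketched as below; with the fitted encoder, `encoder.inverse_transform` serves the same purpose. The probability vector here is hypothetical and only stands in for one row of `predictions`.

```python
import numpy as np

genres = 'blues classical country disco hiphop jazz metal pop reggae rock'.split()
classes = np.array(sorted(genres))   # label encoding follows sorted class order

# Hypothetical softmax output for one track (sums to 1)
probabilities = np.array([0.02, 0.01, 0.05, 0.10, 0.60, 0.02, 0.05, 0.05, 0.05, 0.05])

predicted_index = int(np.argmax(probabilities))
predicted_genre = classes[predicted_index]
print(predicted_index, predicted_genre)
```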