# Music Genre  Classification using Machine Learnin
In this tutorial, we will develop a machine learning project to automatically classify different musical genres from audio files. We will classify these audio files using low-level features of frequency and time domain.

For this project, we need a dataset of audio tracks that have the same size and similar frequency range. The GTZAN genre classification dataset is the recommended dataset for a music genre classification project and was collected for this task only.

The GTZAN dataset was collected in 2000-2001. It consists of 1,000 audio files of 30 seconds each. There are 10 categories (10 music genres) each containing 100 audio tracks. Each track in .wav format contains audio files of the following 10 genres:
-  Blues
- Classical
- Country
- Disco
- Hiphop
- Jazz
- Metal
- Pop
- Reggae
- Rock

There are different ways to perform classification on this dataset. We will use the Neighbors Nearest-K algorithm because it has shown in many studies the best results for this problem.

MFCC:

These are the newest features used in speech recognition and automatic speech studies. There are a set of steps to create these features
:- • Because audio signals are constantly changing, we first divide these signals into smaller frames. Each frame is about 20-40 ms lon
- Then we try to identify the different frequencies present in each frame.
- Now, separate the linguistic frequencies from the noise.
- To get rid of the noise, it takes a discrete cosine transform (DCT) for these frequencies. With the DCT technique, we only keep a specific m sequence of frequencies that have a high probability of information.

**Please note that my project draws inspiration from 'Machine Learning through Examples' by Dr. Alaa Tuaima, as I explore the concepts and techniques outlined in the book to create innovative solutions.**  .tion.ae
• Rockg

### Importing the needed libraries

In [1]:
# This library provides functions for extracting various features from audio signals, primarily for speech processing applications.
from python_speech_features import mfcc
#This is part of Python's built-in tempfile module, which provides functions for creating temporary files and directories.
#TemporaryFile is a class that creates a temporary file in a secure manner. 
import scipy.io.wavfile as wav
import numpy as np
from tempfile import TemporaryFile
import os
#The pickle module is used for serializing and deserializing Python objects. It's commonly used to save and load objects to and from files.
import pickle
import random
#The operator module provides a set of efficient functions that are often used for performing common operations in a more concise and readable way.
import operator
import math
import numpy as np

### Define a function to get the distance between feature vectors and find neighbors
This code defines a function called getNeighbors that is part of the k-Nearest Neighbors (k-NN) algorithm. The function is responsible for finding the nearest neighbors of a given instance within a training set.

In [2]:
def getNeighbors(trainingSet, instance, k):
    
# trainingSet: A list of training instances, where each instance is represented by a list containing features and a class label.
# instance: The instance for which we want to find the nearest neighbors.
# k: The number of neighbors to find.
    
    distances = []  # Initializes an empty list to store the distances between the instance and all instances in the trainingSet.
    
    for x in range(len(trainingSet)):  # The first loop iterates through each instance in the trainingSet
        
        dist = distance(trainingSet[x], instance, k ) + distance(instance, trainingSet[x], k) 
        # dist calculates the combined distance between the instance and the current instance in the trainingSet using a distance function twice.
        # This is a common practice in k-NN where you calculate the distance in both directions to ensure symmetry.
        
        distances.append((trainingSet[x][2], dist))
        # In the context of k-Nearest Neighbors (k-NN), 
        # the third element [2] is likely representing the class label associated with the x-th instance in the trainingSet.
        
    distances.sort(key=operator.itemgetter(1))
    # In the context of sorting a list of tuples or dictionaries, 
    # the key parameter allows you to specify a function that calculates a value for each element in the list.
    # The sorting is then based on the values returned by that function.
    # --------------------------------------------------------------------------------------------------------------------------------------
    # operator.itemgetter(1): This is a function provided by the operator module. 
    # It returns a function that gets the value at index 1 from a sequence (like a tuple or list). 
    # We have a list of tuples where each tuple has two elements: (trainingSet[x][2], dist). Index 1 corresponds to the dist value in each tuple.
    
    neighbors = []   # Initializes an empty list to store the class labels of the nearest neighbors.
    
    for x in range(k):   # The second loop iterates through the distances list to extract the class labels of the k nearest neighbors
        neighbors.append(distances[x][0])   #  appends the class label of the current neighbor to the neighbors list
        return neighbors

### Identify nearest neighbors
This code defines a function called nearestClass that helps determine the class label for a given instance based on its nearest neighbors. This is a key step in the k-Nearest Neighbors (k-NN) algorithm, where the class label with the most votes from the neighbors is assigned to the instance.

In [3]:
def nearestClass(neighbors):   # neighbors: A list of class labels from the nearest neighbors of the target instance.
    
    classVote = {}   #  Initializes an empty dictionary to store the count of votes for each class label.
    
    for x in range(len(neighbors)):   # The loop iterates through each neighbor's class label
        
        response = neighbors[x]   # retrieves the class label of the current neighbor
        
        if response in classVote:
            classVote[response] += 1   # If the class label is already in the dictionary, the vote count for that class label is incremented by 1
        else:
            classVote[response] = 1   # If the class label is not in the dictionary, a new entry is created with a vote count of 1
            
    sorter = sorted(classVote.items(), key = operator.itemgetter(1), reverse=True)
    # the dictionary classVote is sorted in descending order based on the vote counts using the sorted() function with the key parameter as the vote count
    return sorter[0][0]
    # The function returns the class label with the highest number of votes

### Define a function to evaluate the model
This code defines a function called getAccuracy that calculates the accuracy of a classification model's predictions. It's a common evaluation metric used to measure the percentage of correct predictions made by the model.

In [4]:
def getAccuracy(testSet, predictions):
    
    correct = 0   #  Initializes a counter variable to keep track of the number of correct predictions made by the model.
    
    for x in range (len(testSet)):
        if testSet[x][-1]==predictions[x]:   # it checks whether the true class label (testSet[x][-1]) matches the predicted class label (predictions[x])
            
            correct += 1   # If the labels match, the correct counter is incremented by 1
            
    return 1.0*correct/len(testSet)   # the function calculates the accuracy by dividing the number of correct predictions by the total number of instances

### Extract features from the dataset and dump these features into a .dat binary file.
This code is an example of how to extract audio features from audio files using the MFCC (Mel-frequency cepstral coefficients) technique and save the extracted features to a binary file using the Pickle module.

In [5]:
# the path where the audio files are stored.
directory = "GTZAN"

# The code opens a binary file named "mydataset.dat" in write mode ("wb" mode) for storing the extracted features.
f = open("mydataset.dat", "wb")

i = 0

# The code iterates through each item (folder) in the specified directory. 
# It increments the i counter and checks if it has reached a value of 11. If it has, the loop breaks. 
# This suggests that the code is processing a maximum of 10 classes.
for folder in os.listdir(directory):
    #print(folder)
    i += 1
    if i == 11:
        break
        
    # For each file in the current folder, the code attempts to read the audio file using the wav.read, 
    # It extracts the sampling rate rate and the audio signal sig.
    # directory+"/"+folder: This expression creates the complete path to a subdirectory within the main directory.
    for file in os.listdir(directory+"/"+folder):
        #print(file)
        try:
            if file.startswith('.'):
                continue  # Skip hidden files
                
            # directory+"/"+folder+"/"+file: This expression creates the complete path to a file in a subdirectory within the main directory.
            (rate, sig) = wav.read(directory+"/"+folder+"/"+file)
            
            # The code calculates MFCC features (mfcc_feat) for the audio signal using the mfcc function.
            # It computes the covariance matrix (covariance) of the transpose of the MFCC features.
            # And calculates the mean matrix (mean_matrix) of the MFCC features along the first axis.
            mfcc_feat = mfcc(sig, rate, winlen = 0.020, appendEnergy=False)
            covariance = np.cov(np.matrix.transpose(mfcc_feat))
            mean_matrix = mfcc_feat.mean(0)
            
            #The calculated mean matrix, covariance matrix, and the class label i are packed into a tuple named feature.
            # This tuple is then serialized and written to the binary file using pickle.dump.
            feature = (mean_matrix, covariance, i)
            pickle.dump(feature, f)
            
        except Exception as e:
            print("Got an exception: ", e, 'in folder: ', folder, ' filename: ', file)
f.close()

### Split training and testing on the dataset
 This code is a part of a dataset loading and splitting function. It loads data from a binary file "mydataset.dat", which contains previously serialized data, and then it divides this dataset into training and testing sets based on a given split ratio. 

In [6]:
dataset = []

# filename: The name of the binary file containing the dataset.
# split: The split ratio between the training set and the testing set.
# trSet: An empty list that will store the training set data.
# teSet: An empty list that will store the testing set data.
def loadDataset(filename , split , trSet , teSet):
    with open("mydataset.dat" , 'rb') as f:  # This opens the binary file "mydataset.dat" in read-binary mode, and the file will be closed after processing.
        
        while True:

            # This loop continuously reads data from the binary file using pickle.load(f). 
            # The loop keeps appending the loaded data to the dataset list until EOFError occurs.
            try:
                dataset.append(pickle.load(f))
            except EOFError:
                f.close()
                break

    # This loop iterates through the dataset list that was populated with data from the binary file.
    # it randomly decides whether to add the item to the training set (trSet) or the testing set (teSet) based on the given split ratio.
    for x in range(len(dataset)):
        if random.random() <split :
            trSet.append(dataset[x])
        else:
            teSet.append(dataset[x])
            
trainingSet = []
testSet = []
loadDataset("mydataset.dat" , 0.66, trainingSet, testSet)

### Calculate the distance between two instance
This code defines a function named distance that calculates the Mahalanobis distance between two instances based on their mean vectors (mm1 and mm2) and covariance matrices (cm1 and cm2). 

In [7]:
# instance1 and instance2 are tuples where the first element ([0]) is the mean vector (mm1 and mm2) 
# and the second element ([1]) is the covariance matrix (cm1 and cm2).
# k is a constant that will be subtracted from the final calculated distance.
def distance(instance1, instance2, k):
    distance = 0
    mm1 = instance1[0]
    cm1 = instance1[1]
    mm2 = instance2[0]
    cm2 = instance2[1]

    # The first part of the distance calculation involves computing the trace of the product of the inverse of cm2 and cm1. 
    # The trace of a matrix is the sum of its diagonal elements.
    distance = np.trace(np.dot(np.linalg.inv(cm2), cm1))

    # The second part of the distance calculation involves calculating the quadratic term (mm2 - mm1).transpose() * inv(cm2) * (mm2 - mm1). 
    # This term represents the squared Mahalanobis distance between the mean vectors of the two instances.
    distance += (np.dot(np.dot((mm2-mm1).transpose(), np.linalg.inv(cm2)), mm2-mm1))

    # The third part involves adding the difference of the logarithms of the determinants of cm2 and cm1. 
    # The determinant is used to capture the volume scaling effect of the covariance matrices.
    distance += np.log(np.linalg.det(cm2)) - np.log(np.linalg.det(cm1))
    
    distance -= k
    return distance

### Training the Model and making predictions
This code is performing k-nearest neighbors (KNN) classification on a test set and then calculating and printing the accuracy of the KNN predictions. 

In [8]:
length = len(testSet) # calculates the length of the testSet, which presumably contains the instances that I want to classify using KNN.

predictions = []

# This loop iterates through each instance in the testSet.
# It calls the getNeighbors function to find the k nearest neighbors in the trainingSet for the current instance. 
# The third argument 5 indicates that the algorithm should find 5 nearest neighbors.
# Then, it calls the nearestClass function on the obtained neighbors to predict the class label for the current instance.
# The predicted class label is added to the predictions list.
for x in range(length):
    predictions.append(nearestClass(getNeighbors(trainingSet, testSet[x], 5)))

accuracy1 = getAccuracy(testSet, predictions)
print(accuracy1)

0.7198795180722891
