#### Step 1: Preprocessing

This notebook does the following:
- Loads and processes the audio files from the GTZAN dataset
- Extracts the audio waveform using Librosa
- Saves the preprocessed data for further feature extraction and model training


In [22]:
import os
import librosa
import tqdm
import numpy as np


#### Dataset Folder Structure
The dataset is organized into folders, one per genre, each containing 100 `.wav` files.
The genres are: Blues, Classical, Country, Disco, Hip-Hop, Jazz, Metal, Pop, Reggae, and Rock.


In [23]:
audio_folder = '/Users/aravpatel/iml final/genres' 

Listing the folder to see which entries exist

In [24]:
for file in os.listdir(audio_folder):
    print(file)

pop
.DS_Store
metal
disco
blues
reggae
classical
rock
hiphop
country
jazz


Creating a dictionary that maps each genre label to an integer index (hidden entries such as `.DS_Store` are skipped)

In [6]:
label = { 
    folder: idx
    for idx, folder in enumerate(
        folder for folder in os.listdir(audio_folder) if not folder.startswith(".")
    )
}
print(label)

{'pop': 0, 'metal': 1, 'disco': 2, 'blues': 3, 'reggae': 4, 'classical': 5, 'rock': 6, 'hiphop': 7, 'country': 8, 'jazz': 9}
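Because `os.listdir` returns entries in arbitrary order, the integer assigned to each genre may differ from one machine to another. A minimal sketch of an order-stable variant is shown here; `build_label_map` is a hypothetical helper, not part of the original notebook, and it simply sorts the folder names before enumerating them:

```python
def build_label_map(folder_names):
    """Map each visible folder name to an integer, in alphabetical order."""
    # Hidden entries such as .DS_Store are dropped before sorting
    visible = sorted(name for name in folder_names if not name.startswith("."))
    return {name: idx for idx, name in enumerate(visible)}


# Example using the folder names listed earlier in this notebook
example_label = build_label_map(
    ["pop", ".DS_Store", "metal", "disco", "blues", "reggae",
     "classical", "rock", "hiphop", "country", "jazz"]
)
print(example_label)
```

With this variant, the mapping depends only on the genre names, so saved labels stay comparable across runs.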


When loaded with librosa, a `.wav` file becomes a sequence of floating-point values that represent the sound wave. Each value is the amplitude of the wave at a specific moment, sampled at a fixed rate (e.g. 22050 Hz, meaning 22050 values per second of audio).
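The relationship between sample count, sampling rate, and duration can be sketched as follows. The names `example_sr` and `example_n_samples` are illustrative; the sample count is the length reported for the first file later in this notebook.

```python
import numpy as np

example_sr = 22050          # sampling rate in Hz (samples per second)
example_n_samples = 661504  # number of samples in one clip

# Duration in seconds is the sample count divided by the sampling rate
duration_seconds = example_n_samples / example_sr
print(f"{duration_seconds:.2f} s")

# Time stamp (in seconds) of every sample in the clip
times = np.arange(example_n_samples) / example_sr
```

At this rate, each clip covers roughly 30 seconds of audio.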

First, we load each file with librosa and store the waveform, genre, and file name in separate lists.


In [25]:
X = []           # audio time series of each song
y = []           # genre of each song
file_names = []  # file name of each song

# Loop through each entry in the dataset folder
for folder in os.listdir(audio_folder):
    print(f'Processing folder: {folder}')

    # Only descend into directories; entries such as .DS_Store are skipped
    if os.path.isdir(os.path.join(audio_folder, folder)):
        for file in tqdm.tqdm(os.listdir(os.path.join(audio_folder, folder))):
            file_path = os.path.join(audio_folder, folder, file)

            # Load at 22050 Hz, the sampling rate of the dataset
            data, sr = librosa.load(file_path, sr=22050)

            X.append(data)
            y.append(folder)
            file_names.append(file)

print("Processing complete")




Processing folder: pop


100%|██████████| 100/100 [00:00<00:00, 278.52it/s]


Processing folder: .DS_Store
Processing folder: metal


100%|██████████| 100/100 [00:00<00:00, 224.52it/s]


Processing folder: disco


100%|██████████| 100/100 [00:00<00:00, 187.47it/s]


Processing folder: blues


100%|██████████| 100/100 [00:00<00:00, 246.08it/s]


Processing folder: reggae


100%|██████████| 100/100 [00:00<00:00, 237.60it/s]


Processing folder: classical


100%|██████████| 100/100 [00:00<00:00, 224.39it/s]


Processing folder: rock


100%|██████████| 100/100 [00:00<00:00, 234.22it/s]


Processing folder: hiphop


100%|██████████| 100/100 [00:00<00:00, 245.57it/s]


Processing folder: country


100%|██████████| 100/100 [00:00<00:00, 217.37it/s]


Processing folder: jazz


100%|██████████| 100/100 [00:00<00:00, 191.44it/s]

Processing complete
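The log above announces `.DS_Store` as a folder before the directory check skips it. One way to avoid this is to collect the file paths up front and filter out hidden entries and non-`.wav` files. This is only a sketch: `list_genre_files` is a hypothetical helper, and it assumes the one-folder-per-genre layout described earlier.

```python
import os


def list_genre_files(root):
    """Return (genre, file_path) pairs for every .wav file under root."""
    pairs = []
    for genre in sorted(os.listdir(root)):
        genre_dir = os.path.join(root, genre)
        # Skip hidden entries and anything that is not a directory
        if genre.startswith(".") or not os.path.isdir(genre_dir):
            continue
        for name in sorted(os.listdir(genre_dir)):
            # Keep only .wav files, ignoring stray files inside genre folders
            if name.lower().endswith(".wav"):
                pairs.append((genre, os.path.join(genre_dir, name)))
    return pairs
```

The returned pairs could then be passed one by one to `librosa.load`, keeping the filtering logic separate from the loading loop.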





Checking the number of loaded items and converting X into a NumPy array


In [27]:
# Check the lengths of X, y and file_names
print(len(X))
print(len(y))
print(len(file_names))

X = np.array(X, dtype=object)

1000
1000
1000


Checking if the audio files are of the same length

In [28]:
# Reference length (number of samples in the first file)
reference_length = len(X[0])

# Number of audio files
num_files = len(X)
print(reference_length)

# Report every file whose length differs from the reference
for i in range(num_files):
    if len(X[i]) != reference_length:
        print(f"File {i} has a different length: {len(X[i])}")

661504
File 121 has a different length: 661794
File 123 has a different length: 661794
File 124 has a different length: 661794
File 125 has a different length: 661794
File 126 has a different length: 661794
File 128 has a different length: 661794
File 130 has a different length: 661794
File 132 has a different length: 661794
File 133 has a different length: 661794
File 134 has a different length: 661794
File 135 has a different length: 661794
File 140 has a different length: 661794
File 141 has a different length: 661794
File 142 has a different length: 661794
File 143 has a different length: 661794
File 148 has a different length: 661794
File 149 has a different length: 661794
File 152 has a different length: 661794
File 153 has a different length: 661794
File 156 has a different length: 661794
File 157 has a different length: 661794
File 160 has a different length: 661794
File 161 has a different length: 661794
File 164 has a different length: 661794
File 165 has a different length: 
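Printing one line per mismatching file produces long output. A more compact check is to count how many files share each length. The sketch below uses small synthetic arrays in place of the real audio, and the name `clips` is illustrative.

```python
from collections import Counter

import numpy as np

# Synthetic stand-ins for loaded audio clips of slightly different lengths
clips = [np.zeros(10), np.zeros(12), np.zeros(12), np.zeros(10), np.zeros(11)]

# Count how many clips have each length
length_counts = Counter(len(c) for c in clips)
for length, count in sorted(length_counts.items()):
    print(f"{count} file(s) with {length} samples")
```

Applied to the real `X`, the same pattern would give one line per distinct length rather than one line per file.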

To ensure consistency, we standardise the lengths of the audio by truncating each file to the length of the shortest one.


In [29]:
shortest_length = min(len(x) for x in X) #finding the shortest audio

X = [x[:shortest_length] for x in X] #slicing each file to match the length
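Once every clip has the same length, the list can be stacked into a regular 2-D array with one row per file. A minimal sketch on synthetic data (the arrays and the name `clips` are illustrative, not taken from the dataset):

```python
import numpy as np

# Synthetic stand-ins for audio clips of slightly different lengths
clips = [np.zeros(10), np.ones(12), np.full(11, 0.5)]

# Truncate every clip to the length of the shortest one
shortest = min(len(c) for c in clips)
trimmed = np.stack([c[:shortest] for c in clips])

print(trimmed.shape)  # (number of clips, samples per clip)
```

Truncation discards a small number of trailing samples from the longer clips; zero-padding the shorter clips would be an alternative that keeps all samples.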

Finally, we save the processed lists as NumPy `.npy` files.

In [30]:
np.save('X_Data.npy', X)
np.save('y_Genres.npy', y)
np.save('file_names.npy', file_names)
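For the later steps, the saved files can be read back with `np.load`, and the genre names can be converted to integers. The sketch below is a round trip on small synthetic data written to a temporary directory; `X_demo`, `y_demo`, and `demo_label` are illustrative names, and the mapping is built on the spot rather than taken from the `label` dictionary above.

```python
import os
import tempfile

import numpy as np

# Small synthetic arrays standing in for the real data
X_demo = np.zeros((4, 8), dtype=np.float32)
y_demo = ["pop", "jazz", "pop", "rock"]

with tempfile.TemporaryDirectory() as tmp:
    np.save(os.path.join(tmp, "X_Data.npy"), X_demo)
    np.save(os.path.join(tmp, "y_Genres.npy"), y_demo)

    # Read the arrays back from disk
    X_loaded = np.load(os.path.join(tmp, "X_Data.npy"))
    y_loaded = np.load(os.path.join(tmp, "y_Genres.npy"))

# Encode genre names as integers (alphabetical order, for illustration only)
demo_label = {name: idx for idx, name in enumerate(sorted(set(y_demo)))}
y_encoded = np.array([demo_label[g] for g in y_loaded])

print(X_loaded.shape, y_encoded)
```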