# Audio Feature Extraction and Preprocessing Documentation

## Introduction

This document explains how audio features are extracted from a set of recorded audio commands and how the results are preprocessed. The code uses the `librosa` library for feature extraction and organizes the workflow into three classes: `Audiofile`, `Feature_Extraction`, and `Data_Preprocessing`.

## Libraries Installation

Install the required packages from a terminal (pip skips any package that is already installed):

    pip install librosa pandas numpy scikit-learn

In [1]:
# pip install pydub

In [2]:
# import os
# os.getcwd()

In [3]:
# from pydub import AudioSegment
# import os
# from pydub.exceptions import CouldntDecodeError

# def convert_m4a_to_wav(input_path, output_path):
#     try:
#         # Load M4A or AAC file
#         audio = AudioSegment.from_file(input_path, format="m4a")

#         # Export as WAV synchronously
#         audio.export(output_path, format="wav", codec="pcm_s16le", bitrate="192k")
#     except CouldntDecodeError as e:
#         print(f"Error decoding audio file: {input_path}")
#         print(e)
#     except Exception as e:
#         print(f"Error converting to WAV: {input_path}")
#         print(e)

# def convert_audio_files(directory):
#     for filename in os.listdir(directory):
#         file_path = os.path.join(directory, filename)

#         if os.path.isfile(file_path):
#             try:
#                 _, file_extension = os.path.splitext(filename)

#                 if file_extension.lower() in (".m4a", ".aac"):
#                     # Replace spaces with underscores in the filename
#                     filename_no_spaces = filename.replace(" ", "_")

#                     # Convert M4A or AAC to WAV
#                     wav_filename = os.path.splitext(filename_no_spaces)[0] + ".wav"
#                     wav_path = os.path.join(directory, wav_filename)

#                     convert_m4a_to_wav(file_path, wav_path)

#                     # Delete the original file after conversion
#                     os.remove(file_path)
#                     print(f"Converted and deleted: {filename}")
#                 elif file_extension.lower() == ".wav":
#                     print(f"File is already in WAV format: {filename}")
#                 else:
#                     print(f"File is not a supported format: {filename}")
#             except Exception as e:
#                 print(f"Error processing file: {filename}")
#                 print(e)

# # Specify the directory containing audio files
# input_directory = r"C:\Users\Retr0\Desktop\Recordings for FYP\Start Menu\Open Start Menu"

# # Call the function to convert audio files in the specified directory
# convert_audio_files(input_directory)


In [4]:
# def list_files(directory):
#     for filename in os.listdir(directory):
#         print(filename)

# # Call the function to list files in the specified directory
# list_files(input_directory)


In [5]:
# import os
# import magic

# def get_file_format(file_path):
#     mime = magic.Magic()
#     file_format = mime.from_file(file_path)
#     return file_format

# def check_file_formats(directory):
#     for filename in os.listdir(directory):
#         file_path = os.path.join(directory, filename)

#         if os.path.isfile(file_path):
#             file_format = get_file_format(file_path)
#             print(f"File: {filename}, Format: {file_format}")

# # Specify the directory containing files
# input_directory = r"C:\Users\Retr0\Desktop\Recordings for FYP\Browser\Open Google.com"

# # Call the function to check file formats in the specified directory
# check_file_formats(input_directory)


In [6]:
# !pip install python-magic

In [7]:
# !pip install python-magic-bin

In [8]:
# !pip install librosa

In [9]:
# !pip install essentia

In [10]:
#!pip install opensmile

In [11]:
# !pip show librosa

In [12]:
import librosa
import pandas as pd
import numpy as np
import os
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# import seaborn as sns
# import matplotlib.pyplot as plt

## 1. Folder Structure
The `Audiofile` class manages folder paths corresponding to different audio commands. Each folder represents a specific action or command, and the paths are stored in a dictionary.
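A minimal sketch of how a label-to-folder dictionary like this can be walked to collect one `(label, file path)` pair per recording. The helper name `list_labelled_files`, the `.wav` filter, and the temporary demo folders are assumptions made for illustration; they are not part of the `Audiofile` class.

```python
import os
import tempfile


def list_labelled_files(folder_paths, extensions=(".wav",)):
    """Return (label, file_path) pairs for every matching file (hypothetical helper)."""
    pairs = []
    for label, folder in folder_paths.items():
        for filename in sorted(os.listdir(folder)):
            file_path = os.path.join(folder, filename)
            # Skip sub-folders and files that are not audio recordings.
            if os.path.isfile(file_path) and filename.lower().endswith(extensions):
                pairs.append((label, file_path))
    return pairs


# Demo with throwaway folders standing in for the real recording folders.
with tempfile.TemporaryDirectory() as root:
    demo_paths = {}
    for label in ("volume up", "volume down"):
        folder = os.path.join(root, label)
        os.makedirs(folder)
        for name in ("a.wav", "b.wav", "notes.txt"):
            open(os.path.join(folder, name), "w").close()
        demo_paths[label] = folder

    labelled_files = list_labelled_files(demo_paths)
    labels_found = [label for label, _ in labelled_files]
    print(labels_found)
```

Filtering by extension keeps stray non-audio files in a recording folder from reaching the audio loader.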

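## 2. Feature Extraction
The `Feature_Extraction` class extracts 20 features per audio file using the `librosa` library: 13 computed on the raw signal (chroma variants, mel spectrogram, MFCCs, RMS energy, spectral descriptors, polynomial coefficients, and zero-crossing rate) and 7 harmonic-related features.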

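## 3. Mean Calculation
Each extracted feature is a matrix with one row per coefficient and one column per time frame. The feature is averaged across time and then across coefficients, giving a single representative value per feature for each audio file.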
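The two-step averaging can be illustrated on a small hand-made matrix. This is only a sketch: the array values are invented, and the variable names are assumptions rather than names taken from the notebook.

```python
import numpy as np

# Toy feature matrix: 3 coefficients (rows) x 4 time frames (columns).
feature = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 10.0, 10.0, 10.0],
    [0.0, 0.0, 2.0, 2.0],
])

# Step 1: average over time frames -> one value per coefficient.
per_coefficient_mean = feature.mean(axis=1)

# Step 2: average over coefficients -> one scalar for the whole feature.
feature_mean = per_coefficient_mean.mean(axis=0)

print(per_coefficient_mean)
print(feature_mean)
```

Because every row has the same number of frames, the result equals the mean over all entries of the matrix. Collapsing a multi-coefficient feature such as the MFCCs to one number discards the differences between coefficients.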

## 4. Data Preprocessing
The `Data_Preprocessing` class combines feature extraction and preprocessing. It uses the `Feature_Extraction` class to obtain mean feature values and performs additional steps, including label encoding and feature scaling.
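The label encoding and feature scaling steps could look roughly like the sketch below, which uses scikit-learn's `LabelEncoder` and `StandardScaler`. The toy rows and all variable names are assumptions for illustration; this is not the `Data_Preprocessing` implementation itself.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy rows in the same layout as the extracted data:
# numeric feature means first, class label last (values are made up).
rows = [
    [0.30, 2000.0, "hello"],
    [0.40, 2400.0, "hello"],
    [0.35, 1600.0, "zoom in"],
    [0.45, 2000.0, "zoom in"],
]

features = np.array([row[:-1] for row in rows], dtype=float)
labels = [row[-1] for row in rows]

# Label encoding: map each class name to an integer (classes are sorted).
encoder = LabelEncoder()
encoded_labels = encoder.fit_transform(labels)

# Feature scaling: give every column zero mean and unit variance.
scaler = StandardScaler()
scaled_features = scaler.fit_transform(features)

print(encoder.classes_)
print(encoded_labels)
print(scaled_features.mean(axis=0))
```

Scaling matters here because the features live on very different ranges (for example, spectral centroid in the thousands versus RMS below 1). If the data is later split into training and test sets, fitting the scaler on the training portion only avoids leaking test statistics.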

## 5. Dataset Creation
A Pandas DataFrame is created, where each row corresponds to an audio file, and columns represent mean values of different audio features. The last column contains labels.
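With the labels in the last column, the feature table and the label column can be separated by position. The sketch below assumes a tiny made-up dataset with two feature columns; the names `feature_table` and `label_column` are hypothetical.

```python
import pandas as pd

# Toy rows: two feature means followed by the class label (values are made up).
rows = [
    [0.31, 2055.7, "assistance off"],
    [0.33, 2203.3, "assistance off"],
    [0.32, 1980.0, "hello"],
]
dataset = pd.DataFrame(rows, columns=["chroma_stft", "spectral_centroid", "class"])

# Everything except the last column is a feature; the last column is the label.
feature_table = dataset.iloc[:, :-1]
label_column = dataset.iloc[:, -1]

print(feature_table.shape)
print(label_column.value_counts().to_dict())
```

Counting the labels this way is also a quick check that every command folder contributed the expected number of recordings.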

## Execution Steps
1. **Import Libraries:** Ensure required libraries are installed.

2. **Class Initialization:**
   - Create an instance of `Audiofile` to manage folder paths.
   - Create an instance of `Feature_Extraction` for feature extraction.

3. **Feature Extraction:**
   - Use the `compute_features` method of `Feature_Extraction` to extract features.
   - Calculate mean values for each audio file.

4. **Dataset Creation:**
   - Store resulting data in a Pandas DataFrame.
   - Columns represent audio features, and each row corresponds to an audio file.

5. **Dataset Export (Optional):**
   - Export dataset to a CSV file for further analysis.
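The optional export step can be sketched as a CSV round trip. To keep the example self-contained it writes to an in-memory buffer rather than a file on disk; the two-row dataset is invented for illustration.

```python
import io

import pandas as pd

dataset = pd.DataFrame(
    {"rms": [0.05, 0.04], "class": ["hello", "zoom in"]}
)

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
dataset.to_csv(buffer, index=False)  # index=False: do not write the row index

# Rewind the buffer and read the CSV back in.
buffer.seek(0)
restored = pd.read_csv(buffer)

print(list(restored.columns))
print(restored.shape)
```

Without `index=False`, the row index is written as an extra unnamed column, which pandas reads back as `Unnamed: 0`.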

## Conclusion
This process provides a structured way to extract features from audio files and prepare them for machine learning. The resulting dataset can be used to train and evaluate models for audio classification or related tasks.

In [13]:
class Audiofile:
    def __init__(self):
        self._folder_paths = {
            'assistance off': r"C:\Users\Retr0\Desktop\Recordings for FYP\General\assistance off",
            'assistance on': r"C:\Users\Retr0\Desktop\Recordings for FYP\General\assistance on",
            'turn off wi-fi.': r"C:\Users\Retr0\Desktop\Recordings for FYP\General\turn off wifi",
            'turn off bluetooth.': r"C:\Users\Retr0\Desktop\Recordings for FYP\General\turn off bluetooth",
            'turn on wi-fi.': r"C:\Users\Retr0\Desktop\Recordings for FYP\General\turn on wifi",
            'turn on bluetooth.': r"C:\Users\Retr0\Desktop\Recordings for FYP\General\turn on bluetooth",
            'open control panel': r"C:\Users\Retr0\Desktop\Recordings for FYP\General\open control panel",
            'stop movie': r"C:\Users\Retr0\Desktop\Recordings for FYP\Movie\Stop Movie",
            'play movie': r"C:\Users\Retr0\Desktop\Recordings for FYP\Movie\Play Movie",
            'next movie': r"C:\Users\Retr0\Desktop\Recordings for FYP\Movie\Next Movie",
            'unmute volume': r"C:\Users\Retr0\Desktop\Recordings for FYP\Speakers\Unmute Volume",
            'volume down': r"C:\Users\Retr0\Desktop\Recordings for FYP\Speakers\Volume Down",
            'volume up': r"C:\Users\Retr0\Desktop\Recordings for FYP\Speakers\Volume Up",
            'open start menu': r"C:\Users\Retr0\Desktop\Recordings for FYP\Start Menu\Open Start Menu",
            'zoom in': r"C:\Users\Retr0\Desktop\Recordings for FYP\Window\Zoom in",
            'zoom out': r"C:\Users\Retr0\Desktop\Recordings for FYP\Window\Zoom out",
            'search for a specific file': r"C:\Users\Retr0\Desktop\Recordings for FYP\Window\Search for a specific file",
            'open google.com': r"C:\Users\Retr0\Desktop\Recordings for FYP\Browser\Open Google.com",
            'create new folder': r"C:\Users\Retr0\Desktop\Recordings for FYP\General\create new folder",
            'dont listen while you speak': r"C:\Users\Retr0\Desktop\Recordings for FYP\General\dont listen while you speak",
            'hello': r"C:\Users\Retr0\Desktop\Recordings for FYP\General\hello",

        }
    def get_folder_names(self):
        # Return the command labels (the dictionary keys).
        return list(self._folder_paths.keys())

    def get_folderpaths(self):
        # Return the full mapping of command label -> folder path.
        return self._folder_paths

In [14]:
audio = Audiofile()
folders = audio.get_folderpaths()
print(folders)

{'assistance off': 'C:\\Users\\Retr0\\Desktop\\Recordings for FYP\\General\\assistance off', 'assistance on': 'C:\\Users\\Retr0\\Desktop\\Recordings for FYP\\General\\assistance on', 'turn off wi-fi.': 'C:\\Users\\Retr0\\Desktop\\Recordings for FYP\\General\\turn off wifi', 'turn off bluetooth.': 'C:\\Users\\Retr0\\Desktop\\Recordings for FYP\\General\\turn off bluetooth', 'turn on wi-fi.': 'C:\\Users\\Retr0\\Desktop\\Recordings for FYP\\General\\turn on wifi', 'turn on bluetooth.': 'C:\\Users\\Retr0\\Desktop\\Recordings for FYP\\General\\turn on bluetooth', 'open control panel': 'C:\\Users\\Retr0\\Desktop\\Recordings for FYP\\General\\open control panel', 'stop movie': 'C:\\Users\\Retr0\\Desktop\\Recordings for FYP\\Movie\\Stop Movie', 'play movie': 'C:\\Users\\Retr0\\Desktop\\Recordings for FYP\\Movie\\Play Movie', 'next movie': 'C:\\Users\\Retr0\\Desktop\\Recordings for FYP\\Movie\\Next Movie', 'unmute volume': 'C:\\Users\\Retr0\\Desktop\\Recordings for FYP\\Speakers\\Unmute Volume'

In [15]:
class Feature_Extraction:
    
    def __init__(self):
        # One row per audio file: feature means followed by the class label.
        self.X = []

    def extract_features(self,audio_data, sr):
        
        features = {}
        # Existing Librosa Features
        features["chroma_stft"] = librosa.feature.chroma_stft(y=audio_data, sr=sr)
        features["chroma_cqt"] = librosa.feature.chroma_cqt(y=audio_data, sr=sr)
        features["chroma_cens"] = librosa.feature.chroma_cens(y=audio_data, sr=sr)
        features["melspectrogram"] = librosa.feature.melspectrogram(y=audio_data, sr=sr)
        features["mfccs"] = librosa.feature.mfcc(y=audio_data, sr=sr)
        features["rms"] = librosa.feature.rms(y=audio_data)
        features["spectral_centroid"] = librosa.feature.spectral_centroid(y=audio_data, sr=sr)
        features["spectral_bandwidth"] = librosa.feature.spectral_bandwidth(y=audio_data, sr=sr)
        features["spectral_contrast"] = librosa.feature.spectral_contrast(y=audio_data, sr=sr)
        features["spectral_flatness"] = librosa.feature.spectral_flatness(y=audio_data)
        features["spectral_rolloff"] = librosa.feature.spectral_rolloff(y=audio_data, sr=sr)
        features["poly_features"] = librosa.feature.poly_features(y=audio_data, sr=sr)
        features["zero_crossing_rate"] = librosa.feature.zero_crossing_rate(y=audio_data)

        # Additional Librosa Features (harmonic component)
        # Separate the harmonic component once and reuse it below.
        harmonic = librosa.effects.harmonic(audio_data)
        features["harmonic_centroid"] = librosa.feature.spectral_centroid(y=harmonic, sr=sr)
        # NOTE: unlike the other entries, this one applies harmonic separation to the
        # tonnetz matrix itself rather than computing tonnetz on the harmonic signal.
        features["harmonic_tonnetz"] = librosa.effects.harmonic(librosa.feature.tonnetz(y=audio_data, sr=sr))
        features["harmonic_rms"] = librosa.feature.rms(y=harmonic)
        features["harmonic_spectral_flatness"] = librosa.feature.spectral_flatness(y=harmonic)
        features["harmonic_spectral_contrast"] = librosa.feature.spectral_contrast(y=harmonic, sr=sr)
        features["harmonic_spectral_rolloff"] = librosa.feature.spectral_rolloff(y=harmonic, sr=sr)
        features["harmonic_zero_crossing_rate"] = librosa.feature.zero_crossing_rate(y=harmonic)
        
        return features
    
    def calculate_mean(self, features):
        # Collapse each (coefficients, frames) feature matrix to one scalar.
        means = []
        for feature_values in features.values():
            # Average over time frames (axis=1), then over coefficients (axis=0).
            per_coefficient_mean = np.mean(feature_values, axis=1)
            means.append(np.mean(per_coefficient_mean, axis=0))
        return means
    
    def compute_features(self):
        audios = Audiofile()
        paths = audios.get_folderpaths()  # folder name (label) -> folder path
        for folder, path in paths.items():
            for filename in os.listdir(path):
                # Build the full path instead of changing the working directory.
                audio_data, sr = librosa.load(os.path.join(path, filename))
                features = self.extract_features(audio_data, sr)
                mean = self.calculate_mean(features)
                mean.append(folder)  # the label goes in the last position
                self.X.append(mean)

In [16]:
# feature_extractor=Feature_Extraction()
# audio_data, sr = librosa.load('Recording 112.wav')
# features = feature_extractor.extract_features(audio_data, sr)  
# features
# # # print(os.getcwd(),"\n",os.listdir())
data=Feature_Extraction()
data.compute_features()





In [17]:
data=Data_Preprocessing()
data.preprocessing()
data.X

[[0.3148534,
  0.44938722,
  0.26962763,
  0.8683516,
  -8.460384,
  0.050471116,
  2055.7444239247898,
  2122.464977730524,
  18.175618172740023,
  0.028133204,
  3970.535411005435,
  0.5573058706897414,
  0.09693444293478261,
  1435.764889594859,
  -0.005153828283788475,
  0.019528124,
  0.002812034,
  19.81496190812022,
  3142.599354619565,
  0.04577813632246377,
  'assistance off'],
 [0.33324137,
  0.41040543,
  0.2567056,
  0.84791887,
  -8.759901,
  0.05153707,
  2203.2884889012907,
  2101.4055690099285,
  18.54136076076312,
  0.042438973,
  4163.392304211128,
  0.5502008411438529,
  0.12242163681402439,
  1441.0557945357477,
  0.006465257966899159,
  0.028310101,
  0.003471975,
  20.93137693230367,
  2942.696027057927,
  0.05154344512195122,
  'assistance off'],
 [0.36579457,
  0.491572,
  0.27052754,
  0.8524674,
  -10.320987,
  0.04388995,
  1978.270510787401,
  2104.966466907474,
  17.458287384861617,
  0.03826897,
  3900.349308894231,
  0.47507490202940483,
  0.0961323832417

In [18]:
# One row per audio file: 20 mean feature values followed by the class label.
columns = [
    'chroma_stft', 'chroma_cqt', 'chroma_cens', 'melspectrogram', 'mfccs', 'rms',
    'spectral_centroid', 'spectral_bandwidth', 'spectral_contrast',
    'spectral_flatness', 'spectral_rolloff', 'poly_features', 'zero_crossing_rate',
    'harmonic_centroid', 'harmonic_tonnetz', 'harmonic_rms',
    'harmonic_spectral_flatness', 'harmonic_spectral_contrast',
    'harmonic_spectral_rolloff', 'harmonic_zero_crossing_rate', 'class',
]
XX = pd.DataFrame(data.X, columns=columns)
XX

| | chroma_stft | chroma_cqt | chroma_cens | melspectrogram | mfccs | rms | spectral_centroid | spectral_bandwidth | spectral_contrast | spectral_flatness | ... | poly_features | zero_crossing_rate | harmonic_centroid | harmonic_tonnetz | harmonic_rms | harmonic_spectral_flatness | harmonic_spectral_contrast | harmonic_spectral_rolloff | harmonic_zero_crossing_rate | class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.314853 | 0.449387 | 0.269628 | 0.868352 | -8.460384 | 0.050471 | 2055.744424 | 2122.464978 | 18.175618 | 0.028133 | ... | 0.557306 | 0.096934 | 1435.764890 | -0.005154 | 0.019528 | 0.002812 | 19.814962 | 3142.599355 | 0.045778 | assistance off |
| 1 | 0.333241 | 0.410405 | 0.256706 | 0.847919 | -8.759901 | 0.051537 | 2203.288489 | 2101.405569 | 18.541361 | 0.042439 | ... | 0.550201 | 0.122422 | 1441.055795 | 0.006465 | 0.028310 | 0.003472 | 20.931377 | 2942.696027 | 0.051543 | assistance off |
| 2 | 0.365795 | 0.491572 | 0.270528 | 0.852467 | -10.320987 | 0.043890 | 1978.270511 | 2104.966467 | 17.458287 | 0.038269 | ... | 0.475075 | 0.096132 | 1404.367754 | -0.004612 | 0.015012 | 0.004198 | 18.908135 | 2962.471830 | 0.051135 | assistance off |
| 3 | 0.366682 | 0.449307 | 0.263398 | 0.392838 | -11.028899 | 0.033162 | 2010.991748 | 2108.957274 | 17.270739 | 0.034354 | ... | 0.368664 | 0.095388 | 1257.770677 | 0.002987 | 0.013581 | 0.002509 | 19.202183 | 2729.596096 | 0.041403 | assistance off |
| 4 | 0.382542 | 0.491022 | 0.267234 | 0.501142 | -10.479601 | 0.035843 | 1670.864122 | 1918.899049 | 17.294483 | 0.025254 | ... | 0.392079 | 0.077822 | 1137.549262 | -0.006427 | 0.015291 | 0.001925 | 18.965096 | 2195.396686 | 0.039714 | assistance off |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1085 | 0.320585 | 0.360972 | 0.214530 | 1.029822 | -16.066755 | 0.050752 | 1980.002852 | 2158.087737 | 21.337520 | 0.017429 | ... | 0.337920 | 0.111399 | 1347.192344 | -0.007183 | 0.043869 | 0.005473 | 24.802369 | 2525.889587 | 0.063995 | hello |
| 1086 | 0.406721 | 0.443279 | 0.238275 | 0.781018 | -16.405331 | 0.039501 | 2419.303388 | 2325.301043 | 20.082895 | 0.028263 | ... | 0.300632 | 0.136433 | 1416.064868 | 0.000448 | 0.034590 | 0.007167 | 23.309083 | 2732.729117 | 0.062222 | hello |
| 1087 | 0.365377 | 0.409115 | 0.225390 | 0.821256 | -17.146938 | 0.043208 | 1930.901676 | 2044.480949 | 20.104520 | 0.018393 | ... | 0.310116 | 0.104292 | 1203.158622 | -0.004164 | 0.033083 | 0.004618 | 22.891717 | 2082.013640 | 0.051918 | hello |
| 1088 | 0.388034 | 0.443211 | 0.235701 | 0.738131 | -16.789186 | 0.039363 | 1990.073375 | 2274.138626 | 19.599763 | 0.020506 | ... | 0.279512 | 0.102178 | 1320.318022 | -0.003163 | 0.030124 | 0.006198 | 22.296483 | 2497.189002 | 0.056122 | hello |


In [19]:
# Switch to the Desktop so that data.csv is written there.
path = r'C:\Users\Retr0\Desktop'
os.chdir(path)

In [22]:
XX.to_csv("data.csv", index=False)


In [23]:
# Read the exported file back and display it to verify the export.
data = pd.read_csv("data.csv")
data

| | chroma_stft | chroma_cqt | chroma_cens | melspectrogram | mfccs | rms | spectral_centroid | spectral_bandwidth | spectral_contrast | spectral_flatness | ... | poly_features | zero_crossing_rate | harmonic_centroid | harmonic_tonnetz | harmonic_rms | harmonic_spectral_flatness | harmonic_spectral_contrast | harmonic_spectral_rolloff | harmonic_zero_crossing_rate | class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.314853 | 0.449387 | 0.269628 | 0.868352 | -8.460384 | 0.050471 | 2055.744424 | 2122.464978 | 18.175618 | 0.028133 | ... | 0.557306 | 0.096934 | 1435.764890 | -0.005154 | 0.019528 | 0.002812 | 19.814962 | 3142.599355 | 0.045778 | assistance off |
| 1 | 0.333241 | 0.410405 | 0.256706 | 0.847919 | -8.759901 | 0.051537 | 2203.288489 | 2101.405569 | 18.541361 | 0.042439 | ... | 0.550201 | 0.122422 | 1441.055795 | 0.006465 | 0.028310 | 0.003472 | 20.931377 | 2942.696027 | 0.051543 | assistance off |
| 2 | 0.365795 | 0.491572 | 0.270528 | 0.852467 | -10.320987 | 0.043890 | 1978.270511 | 2104.966467 | 17.458287 | 0.038269 | ... | 0.475075 | 0.096132 | 1404.367754 | -0.004612 | 0.015012 | 0.004198 | 18.908135 | 2962.471830 | 0.051135 | assistance off |
| 3 | 0.366682 | 0.449307 | 0.263398 | 0.392838 | -11.028899 | 0.033162 | 2010.991748 | 2108.957274 | 17.270739 | 0.034354 | ... | 0.368664 | 0.095388 | 1257.770677 | 0.002987 | 0.013581 | 0.002509 | 19.202183 | 2729.596096 | 0.041403 | assistance off |
| 4 | 0.382542 | 0.491022 | 0.267234 | 0.501142 | -10.479601 | 0.035843 | 1670.864122 | 1918.899049 | 17.294483 | 0.025254 | ... | 0.392079 | 0.077822 | 1137.549262 | -0.006427 | 0.015291 | 0.001925 | 18.965096 | 2195.396686 | 0.039714 | assistance off |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1085 | 0.320585 | 0.360972 | 0.214530 | 1.029822 | -16.066755 | 0.050752 | 1980.002852 | 2158.087737 | 21.337520 | 0.017429 | ... | 0.337920 | 0.111399 | 1347.192344 | -0.007183 | 0.043869 | 0.005473 | 24.802369 | 2525.889587 | 0.063995 | hello |
| 1086 | 0.406721 | 0.443279 | 0.238275 | 0.781018 | -16.405331 | 0.039501 | 2419.303388 | 2325.301043 | 20.082895 | 0.028263 | ... | 0.300632 | 0.136433 | 1416.064868 | 0.000448 | 0.034590 | 0.007167 | 23.309083 | 2732.729117 | 0.062222 | hello |
| 1087 | 0.365377 | 0.409115 | 0.225390 | 0.821256 | -17.146938 | 0.043208 | 1930.901676 | 2044.480949 | 20.104520 | 0.018393 | ... | 0.310116 | 0.104292 | 1203.158622 | -0.004164 | 0.033083 | 0.004618 | 22.891717 | 2082.013640 | 0.051918 | hello |
| 1088 | 0.388034 | 0.443211 | 0.235701 | 0.738131 | -16.789186 | 0.039363 | 1990.073375 | 2274.138626 | 19.599763 | 0.020506 | ... | 0.279512 | 0.102178 | 1320.318022 | -0.003163 | 0.030124 | 0.006198 | 22.296483 | 2497.189002 | 0.056122 | hello |
