Audio Classification using TensorFlow

UrbanSound8K Audio Classifier: TensorFlow model

Overview

This repository contains a TensorFlow model for classifying audio samples from the UrbanSound8K dataset. Each clip is converted to MFCC features with Librosa and classified by a fully connected neural network, reaching roughly 82% accuracy on the held-out test set.

Dataset

The UrbanSound8K dataset consists of 8,732 audio samples across 10 classes. Each sample is labeled with the corresponding sound class, making it suitable for supervised learning tasks.
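
For reference, here is a minimal sketch of how the dataset's metadata can be loaded with pandas, assuming the standard UrbanSound8K layout (a metadata/UrbanSound8K.csv file plus audio/fold1 … audio/fold10 directories); the local path is a placeholder, not taken from this repository:

import os
import pandas as pd

# Placeholder path to the extracted UrbanSound8K dataset (assumption)
DATASET_DIR = "UrbanSound8K"

# The metadata CSV maps each clip to its fold and sound class
metadata = pd.read_csv(os.path.join(DATASET_DIR, "metadata", "UrbanSound8K.csv"))

print(metadata.shape)                    # (8732, 8)
print(metadata["class"].value_counts())  # samples per sound class

# Full path to one audio clip, built from its fold and file name
row = metadata.iloc[0]
clip_path = os.path.join(DATASET_DIR, "audio", f"fold{row['fold']}", row["slice_file_name"])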

Features

  • Mel-Frequency Cepstral Coefficients (MFCCs), extracted with Librosa, as input features.
  • Fully connected TensorFlow/Keras architecture with ReLU activations and dropout regularization.
  • Training with the Adam optimizer, early stopping, and model checkpointing.

Usage of Librosa for Audio Processing

This project relies heavily on the Librosa library for audio processing, including:

  • Loading audio files in a variety of formats.
  • Extracting MFCC features with librosa.feature.mfcc (a short sketch follows this list).
  • Visualizing audio waveforms and spectrograms.
  • Inspecting properties of the loaded audio, such as its sampling rate.
  • Applying manipulations such as time-stretching and pitch-shifting with Librosa's built-in functions, plus simple noise injection.
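
A minimal feature-extraction sketch along these lines, assuming MFCCs are averaged over time into one fixed-length vector per clip; the 40-coefficient setting, the helper name, and the file path are illustrative assumptions rather than values taken from the repository:

import librosa
import numpy as np

def extract_mfcc_features(file_path, n_mfcc=40):
    # Load the clip (resampled to librosa's default 22,050 Hz)
    signal, sample_rate = librosa.load(file_path)
    # Compute an (n_mfcc, frames) matrix of MFCCs
    mfccs = librosa.feature.mfcc(y=signal, sr=sample_rate, n_mfcc=n_mfcc)
    # Average over time so every clip yields a fixed-length feature vector
    return np.mean(mfccs.T, axis=0)

# Example: one 40-dimensional feature vector per audio file (placeholder path)
features = extract_mfcc_features("UrbanSound8K/audio/fold1/example.wav")
print(features.shape)  # (40,)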

Model Architecture

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Input

# Define the model: three hidden Dense layers with ReLU and dropout,
# followed by a softmax output over the sound classes
model = Sequential(
    layers=[
        Input(shape=X[0].shape),

        # Layer 1
        Dense(100),
        Activation('relu'),
        Dropout(0.5),

        # Layer 2
        Dense(200),
        Activation('relu'),
        Dropout(0.5),

        # Layer 3
        Dense(100),
        Activation('relu'),
        Dropout(0.5),

        # Output layer: one unit per class
        Dense(num_labels),
        Activation('softmax')
    ]
)

# Compile with Adam and sparse categorical cross-entropy
# (labels are integer class indices)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
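
The architecture above references X (the feature matrix) and num_labels, and the training snippet below references X_train, X_test, y_train, and y_test. A hedged preparation sketch that connects the two, building on the metadata and extract_mfcc_features sketches above and assuming scikit-learn for label encoding and the split (the 80/20 split ratio is an assumption):

import os
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

# Build the feature matrix from the metadata table and the
# extract_mfcc_features helper sketched earlier (illustrative names)
X = np.array([
    extract_mfcc_features(
        os.path.join(DATASET_DIR, "audio", f"fold{row.fold}", row.slice_file_name))
    for row in metadata.itertuples()
])

# Encode class names as integer labels for the sparse loss
encoder = LabelEncoder()
y = encoder.fit_transform(metadata["class"])
num_labels = len(encoder.classes_)  # 10 classes in UrbanSound8K

# Hold out 20% of the clips for testing (split ratio is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)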

Training Configuration

from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from datetime import datetime

EPOCHS = 1000
BATCH_SIZE = 32

# Save the best model (by validation loss) and stop once validation loss
# stops improving (the early-stopping patience value here is an assumption)
checkpoint = ModelCheckpoint(filepath='checkpoints/model.h5', verbose=1, save_best_only=True)
early_stop = EarlyStopping(monitor='val_loss', patience=100, restore_best_weights=True)

start = datetime.now()

model.fit(X_train, y_train,
          batch_size=BATCH_SIZE,
          epochs=EPOCHS,
          validation_data=(X_test, y_test),
          callbacks=[checkpoint, early_stop])

duration = datetime.now() - start
print("Elapsed Time: ", duration)

Results

  • Test accuracy: approximately 82% (actual: 81.96%).
  • Model checkpoint available for further experimentation.
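
To reuse the saved checkpoint, a short sketch along the following lines should work; the checkpoint path matches the training snippet above, while the evaluation call itself is illustrative:

from tensorflow.keras.models import load_model

# Reload the best checkpoint saved during training
model = load_model('checkpoints/model.h5')

# Re-compute accuracy on the held-out test set
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f"Test accuracy: {test_accuracy:.2%}")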