This repository has been archived by the owner on Jun 24, 2023. It is now read-only.

Classify 10 different sounds from the city using a CNN

gallo-json/urban-sounds-classification

Urban Sounds Classification

Final project for the Machine Learning and AI ID tech camp.

High Level Overview

The dataset contains 8732 .wav files covering 10 different urban sounds, such as dog barks, car horns, and gunshots. It is divided into 10 folds (folders) to make training and testing easier. I used folds 1-9 to train the model and fold 10 to test it. A custom CNN classifies the sounds.
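
The fold-based split described above can be sketched with Pandas. This is a minimal illustration, not the project's actual code: it assumes a metadata table with a `fold` column (values 1 to 10), and the helper name `split_by_fold`, the other column names, and the toy rows are hypothetical stand-ins for the real metadata.

```python
import pandas as pd


def split_by_fold(metadata: pd.DataFrame, test_fold: int = 10):
    """Return (train, test) frames, holding out one fold for testing.

    Assumes `metadata` has an integer `fold` column (hypothetical layout).
    """
    is_test = metadata["fold"] == test_fold
    train = metadata[~is_test].reset_index(drop=True)
    test = metadata[is_test].reset_index(drop=True)
    return train, test


# Toy metadata standing in for the real file list: 20 clips, 2 per fold.
metadata = pd.DataFrame({
    "slice_file_name": [f"clip_{i}.wav" for i in range(20)],
    "fold": [i % 10 + 1 for i in range(20)],
    "classID": [i % 10 for i in range(20)],
})

# Folds 1-9 go to training, fold 10 is held out for testing.
train_df, test_df = split_by_fold(metadata, test_fold=10)
print(len(train_df), len(test_df))
```

Splitting on the predefined folds, rather than shuffling all clips together, keeps every clip from the held-out fold out of the training set.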

The sound features used in the CNN are:

  • MFCC: Mel-frequency cepstral coefficients, computed on a quasi-logarithmically spaced frequency scale that is closer to how the human auditory system processes sound.
  • Melspectrogram: A Mel-scaled power spectrogram. The Mel scale is modeled on human hearing.
  • chroma_stft: A chromagram computed from a waveform or power spectrogram. Based on pitch.
  • chroma_cqt: A constant-Q chromagram. Based on pitch.
  • chroma_cens: Chroma Energy Normalized Statistics (CENS). Based on pitch.

Tech Stack

  • Python 3
  • Keras
  • Pandas
  • Librosa

Results

Test accuracy: 70%

Validation accuracy: 90%

Reflection

As the results above show, validation accuracy (90%) is well above test accuracy (70%), so the model is overfitting. See more in my reflection on this project.

Useful Links

Dataset

Vlog I used as reference and inspiration
