Urban Sounds Classification

Date

Final project for the Machine Learning and AI ID tech camp.

High Level Overview

The dataset contains 8732 .wav files covering 10 urban sound classes, such as dog barks, car horns, and gunshots. It is divided into 10 folds (folders) to make training and testing easier. I used folds 1-9 to train the model and fold 10 to test it. A custom CNN classifies the sounds.
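The fold-based split described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the column names (`slice_file_name`, `fold`, `class`), the helper `split_by_fold`, and the sample rows are all assumptions made for the example, and it uses only the standard library even though the project lists Pandas.

```python
import csv
import io


def split_by_fold(rows, test_fold=10):
    """Return (train_rows, test_rows), holding out one fold for testing."""
    train, test = [], []
    for row in rows:
        # Fold numbers arrive as strings when read with csv.DictReader.
        (test if int(row["fold"]) == test_fold else train).append(row)
    return train, test


# Tiny made-up sample shaped like a metadata table (not real dataset rows).
SAMPLE_CSV = """slice_file_name,fold,class
a.wav,1,dog_bark
b.wav,5,car_horn
c.wav,10,gun_shot
d.wav,10,dog_bark
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
train_rows, test_rows = split_by_fold(rows)
print(len(train_rows), len(test_rows))  # prints: 2 2
```

Splitting on the predefined folds, rather than shuffling all files together, keeps every clip in fold 10 out of training.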

The sound features used in the CNN are:

MFCC: Mel-frequency cepstral coefficients. They use a quasi-logarithmically spaced frequency scale, which more closely resembles how the human auditory system processes sound.
Melspectrogram: A Mel-scaled power spectrogram. The Mel scale approximates how the human ear perceives pitch.
chroma_stft: A chromagram computed from a waveform or power spectrogram. Describes energy per pitch class.
chroma_cqt: A constant-Q chromagram. Describes energy per pitch class.
chroma_cens: Chroma Energy Normalized Statistics (CENS). Describes energy per pitch class.
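The five features listed above map onto functions in `librosa.feature`. The sketch below shows one way to turn a clip into a fixed-length vector. It makes several assumptions: averaging each feature over time, `n_mfcc=40`, and the helper names `summarize` and `extract_features` are illustrative choices, not necessarily what this project does before feeding the CNN.

```python
import numpy as np


def summarize(feature_matrices):
    """Average each (n_bins, n_frames) matrix over time and concatenate."""
    return np.concatenate([m.mean(axis=1) for m in feature_matrices])


def extract_features(path, n_mfcc=40):
    """Load one clip and return a fixed-length feature vector."""
    # Imported lazily so summarize() can be used without librosa installed.
    import librosa

    y, sr = librosa.load(path)  # resamples to 22050 Hz mono by default
    matrices = [
        librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc),
        librosa.feature.melspectrogram(y=y, sr=sr),
        librosa.feature.chroma_stft(y=y, sr=sr),
        librosa.feature.chroma_cqt(y=y, sr=sr),
        librosa.feature.chroma_cens(y=y, sr=sr),
    ]
    return summarize(matrices)
```

With librosa's defaults (128 Mel bands, 12 chroma bins) and `n_mfcc=40`, the resulting vector has 40 + 128 + 3 × 12 = 204 values per clip. Averaging over time discards temporal structure, so a model that should see how a sound evolves would keep the full matrices instead.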

Tech Stack

Python 3
Keras
Pandas
Librosa

Results

Test accuracy: 70%

Validation accuracy: 90%

Reflection

As the results above show, the model is clearly overfitting: validation accuracy (90%) is well above test accuracy (70%). See more in my reflection on this project.

Useful Links

Dataset

Vlog I used as reference and inspiration

gallo-json/urban-sounds-classification
