🦜Rainforest Connection Species Audio Detection🐸

Summary

A modified ResNet18 model classifies each audio clip as one of 24 species. Each audio sample was trimmed to the duration of the species' call, and a band-pass filter removed frequencies outside the range of the call. With this preprocessing, the highest validation accuracy achieved was 0.72. There is clear room to improve the model, for example by introducing other data preprocessing techniques.
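The two preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the code from the notebook: it assumes each labelled call comes with a start time, an end time, and lower and upper frequency bounds, and the function names `trim_to_call` and `band_pass` are hypothetical. The filter shown is a single second-order (biquad) band-pass section, which is gentler than the filter a real pipeline would probably use.

```python
import math


def trim_to_call(samples, sample_rate, t_start, t_end):
    """Keep only the samples between t_start and t_end (in seconds)."""
    first = max(0, int(t_start * sample_rate))
    last = min(len(samples), int(t_end * sample_rate))
    return samples[first:last]


def band_pass(samples, sample_rate, f_low, f_high):
    """Second-order band-pass filter with unity gain at the band centre.

    Assumption: one biquad section is enough for illustration; a steeper
    filter would need several sections or a higher-order design.
    """
    f_centre = math.sqrt(f_low * f_high)      # geometric centre of the band
    q = f_centre / (f_high - f_low)           # quality factor from bandwidth
    w0 = 2.0 * math.pi * f_centre / sample_rate
    alpha = math.sin(w0) / (2.0 * q)

    # Coefficients normalised by a0 (the middle feed-forward term b1 is zero).
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0

    out = []
    x1 = x2 = y1 = y2 = 0.0                   # previous inputs and outputs
    for x in samples:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out
```

A clip would be trimmed first and then filtered, for example `band_pass(trim_to_call(clip, 48000, 1.5, 3.0), 48000, 2000.0, 4000.0)`, where the sample rate, times, and frequencies are placeholder values.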

The dataset is available on the official Kaggle page.

Future Steps for Preprocessing the Dataset

Pitch shift: Shifting all the audio to a common centre frequency could improve generalisation.
Set dB level: Normalising every clip to the same level (for example, equal RMS) could encourage the model to infer mainly from frequency content.
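Both ideas reduce to a small calculation per clip, sketched below. None of this is implemented in the repository: the function names and the target RMS of 0.1 are assumptions chosen for illustration. The semitone value is the shift amount that a pitch-shifting routine (for example `librosa.effects.pitch_shift`, via its `n_steps` argument) would need in order to move a call's centre frequency to the shared target.

```python
import math


def rms(samples):
    """Root-mean-square level of a clip."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalise_rms(samples, target_rms=0.1):
    """Scale a clip so its RMS equals target_rms (hypothetical default)."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)  # a silent clip cannot be rescaled
    gain = target_rms / current
    return [s * gain for s in samples]


def semitones_to_target(f_centre, f_target):
    """Semitone shift that moves f_centre onto f_target (12 per octave)."""
    return 12.0 * math.log2(f_target / f_centre)
```

Normalising after the band-pass filter, rather than before it, would make the level depend only on energy inside the call's frequency range; which order works better is an open question for this model.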


dilne/Rainforest-Connection-Species-Audio-Detection
