Audio Classification

This repository contains code and research papers on speech and audio classification. It includes a basic classifier for spoken Gujarati digits.

Getting Started

To get started, run all the cells in the MultilingualAudioClassification.ipynb notebook to reproduce the results. The fine-tuned model is available for inference at huggingface.co/manthan40/wav2vec2-base-finetuned-manthan_base.
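As a minimal sketch of how the hosted model might be queried outside the notebook, the snippet below uses the `audio-classification` pipeline from the Hugging Face `transformers` library. Only the model id comes from this README. The helper names `top_label` and `classify` are hypothetical, and the sketch assumes the checkpoint was saved with an audio-classification head and that `transformers`, a PyTorch backend, and `ffmpeg` (for decoding audio files) are installed.

```python
# Model id as given in this README; everything else is an illustrative assumption.
MODEL_ID = "manthan40/wav2vec2-base-finetuned-manthan_base"


def top_label(predictions):
    """Return the label with the highest score from pipeline-style output.

    `predictions` is a list of dicts with "label" and "score" keys, which is
    the shape returned by the transformers audio-classification pipeline.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"]


def classify(audio_path, model_id=MODEL_ID):
    """Classify a single audio file and return the most likely label."""
    # Imported lazily so top_label() stays usable without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    return top_label(classifier(audio_path))


# Hypothetical usage (the file name is a placeholder, not part of the repository):
# print(classify("sample_digit.wav"))
```

The first call to `classify` downloads the model weights, so it needs network access. Building the pipeline once and reusing it would be preferable when classifying many files.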

Prerequisites

You will need the following packages installed to run the code in this repository:

References

The code in this repository uses the following research papers as references:

wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations by Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, and Michael Auli.
Development of a Novel Database in Gujarati Language for Spoken Digits Classification by Nikunj Dalsaniya, Sapan H. Mankad, Sanjay Garg, and Dhuri Shrivastava.

Contact

If you have any questions or suggestions, please open an issue in this repository or contact the repository owner.
