therealmolf/audaio
audaio

A practical compilation of libraries, case studies, resources, datasets, and research papers revolving around deep learning/machine learning for audio. 🎶🎶🎶 Reasonable resources you will actually use!

Audio ML Landscape Map

  • I will add this very soon!

Datasets

Libraries

End-to-end Toolkits

  • DeepSpeech: an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
  • PaddleSpeech: Easy-to-use Speech Toolkit including SOTA/Streaming ASR with punctuation, influential TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting.
  • NeMo: a toolkit for conversational AI
  • SpeechBrain: an open-source and all-in-one conversational AI toolkit based on PyTorch.

Data Transformation and Manipulation

  • torchaudio: Data manipulation and transformation for audio signal processing, powered by PyTorch
  • nlpaug: Data augmentation for NLP. This has spectrogram and audio input support. Check this and this
  • pedalboard: Spotify's Python library for working with audio. Internally, Spotify uses this for data augmentation and improving machine learning models.
  • Spleeter: Deezer's source separation library, including pretrained models.
  • Basic Pitch: A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
  • librosa: Python library for audio and music analysis

Comparisons

  • Which library is best for which task? Where should you focus when learning these tools?

Getting Started

  • How to use librosa, one end-to-end toolkit, and torchaudio
  • Fundamental papers related to audio deep learning
  • Guided walkthroughs from data preparation to deployment
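
As a rough starting point for the topics above, here is a minimal sketch of the step most audio pipelines begin with: turning a waveform into a magnitude spectrogram. It uses only NumPy so it runs without extra installs. The function name `magnitude_spectrogram`, the synthetic 440 Hz test tone, and the frame sizes are illustrative assumptions, not part of any library listed here. In practice, `librosa.stft` or `torchaudio.transforms.Spectrogram` would do this job.

```python
import numpy as np


def magnitude_spectrogram(signal, n_fft=1024, hop_length=256):
    """Hypothetical helper: return |STFT| with shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    # Number of full frames that fit in the signal (no padding, for simplicity).
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    frames = np.stack([
        signal[i * hop_length: i * hop_length + n_fft] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Assumed test input: one second of a 440 Hz sine tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

spec = magnitude_spectrogram(tone)
freqs = np.fft.rfftfreq(1024, d=1.0 / sr)
# Frequency bin with the most energy, averaged over time.
peak_hz = freqs[spec.mean(axis=1).argmax()]
print(spec.shape, peak_hz)
```

The strongest bin should land within one bin width (sr / n_fft, about 15.6 Hz here) of the tone's true frequency. A larger `n_fft` gives finer frequency resolution at the cost of coarser time resolution.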

Audio Generation

Learning Resource

Research Papers

Music Source Separation

Genre Recognition

Automatic Speech Recognition

Learning Resource

Research Papers

Music Information Retrieval

Music Recommendation

Podcast Summarization

Other Research Papers

