audio

Nexa SDK is a comprehensive toolkit for running GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), audio language models, automatic speech recognition (ASR), and text-to-speech (TTS).

audio sdk transformers tts language-model whisper asr vlm sdk-python edge-computing on-device-ml on-device-ai llm stable-diffusion

Updated Feb 18, 2025
Python

metabrainz / picard

MusicBrainz Picard audio file tagger

audio python music picard musicbrainz id3 tagger musicbrainz-picard music-tagger acoustid

Updated Feb 17, 2025
Python

huggingface / distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

audio speech-recognition whisper

Updated Jan 8, 2025
Python

spotify / basic-pitch

A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

audio python music lightweight machine-learning typescript midi transcription pitch-detection polyphonic

Updated Jan 17, 2025
Python

riffusion / riffusion-hobby

Stable diffusion for real-time music generation

audio music ai diffusion stable-diffusion diffusers

Updated Jul 22, 2024
Python

WyattBlue / auto-editor

Auto-Editor: Efficient media analysis and rendering

audio video python3 audio-editing video-processing automatic video-editing audio-processing

Updated Feb 13, 2025
Python

zzw922cn / Automatic_Speech_Recognition

End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow

audio deep-learning tensorflow paper end-to-end evaluation cnn lstm speech-recognition rnn automatic-speech-recognition feature-vector data-preprocessing phonemes timit-dataset layer-normalization rnn-encoder-decoder chinese-speech-recognition

Updated Mar 24, 2023
Python

Rikorose / DeepFilterNet

Noise suppression using deep filtering

audio rust deep-learning speech pytorch speech-enhancement noise-suppression

Updated Oct 17, 2024
Python

pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

audio python machine-learning speech pytorch io audio-processing

Updated Feb 18, 2025
Python

readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

Updated Jun 22, 2024
Python

muammar / mkchromecast

Cast macOS and Linux Audio/Video to your Google Cast and Sonos Devices

audio python macos linux node video debian chromecast sonos python3 sample-rate alsa chromecast-audio cast-videos tray-menu pychromecast cast-audio sonos-speakers soundflower

Updated Jul 1, 2024
Python

modelscope / ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

audio deep-learning speech pytorch speech-separation speech-enhancement noise-suppression speaker-extraction bandwidth-extension speech-super-resolution

Updated Feb 14, 2025
Python

Here are 2,133 public repositories matching this topic...

Anjok07 / ultimatevocalremovergui

AIGC-Audio / AudioGPT

speechbrain / speechbrain

Uberi / speech_recognition

openai / jukebox

librosa / librosa

smacke / ffsubsync

tyiannak / pyAudioAnalysis

NexaAI / nexa-sdk
