# wav2vec2

Here are 55 public repositories matching this topic...

akash13s / audio-to-image

Pipeline for generating images conditioned on input audio

pytorch u-net diffusion-models hubert wav2vec2

Updated Jul 25, 2024
Python

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Updated Jul 23, 2024
Python

JingleCate / SpeechEmotionRecog

A simple Speech Emotion Recognition (SER) project based on Wav2Vec2.

audio classification wav2vec2

Updated Jul 20, 2024
Python

s3prl / s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Updated Jun 18, 2024
Python

jp1924 / wav2vec2

A repo reimplementing an ASR project carried out previously

Updated Jun 16, 2024
Python

inboxpraveen / LLM-Minutes-of-Meeting

🎤📄 An innovative tool that transforms audio or video files into text transcripts and generates concise meeting minutes. Stay organized and efficient in your meetings, and get ready for Phase 2 where we'll be open for contributions to enable real-time meeting transcription! 🚀

python nlp natural-language-processing web translation transformers web-application speech-recognition speech-to-text whisper meeting-minutes webapplication minutes-of-meeting huggingface huggingface-transformers wav2vec2 llm whisper-ai llm-inference

Updated Jun 10, 2024
Python

sebinbenjamin / wav2vec_demo

A Python tool for transcribing speech from audio files using the Wav2Vec 2.0 model. Supports multilingual transcription, automatic audio chunking, and easy setup

transformers pytorch speech-recognition hugging-face wav2vec2

Updated May 15, 2024
Python

seanghay / kfa

A fast Khmer Forced Aligner powered by Wav2Vec2CTC and Phonetisaurus

alignment cambodia khmer forced-alignment wav2vec2

Updated May 2, 2024
Python

seb5433 / wav2vec2-speaker-recognition

Speaker recognition task using wav2vec2 model.

speaker-recognition fine-tuning speaker-recognition-systems wav2vec2

Updated Apr 25, 2024
Python

ECNU-Cross-Innovation-Lab / ENT

[ICASSP 2024] Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition

automatic-speech-recognition speech-emotion-recognition wav2vec2

Updated Apr 11, 2024
Python

egorsmkv / asr-corpus-creator

This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.

audio speech-recognition automatic-speech-recognition nemo whisper audio-processing asr wav2vec2

Updated Feb 15, 2024
Python

oliverguhr / wav2vec2-live

Live speech recognition using Facebook's wav2vec 2.0 model.

pyaudio speech speech-recognition speech-to-text asr wav2vec wav2vec2

Updated Feb 4, 2024
Python

louisbrulenaudet / balena

BALanced Execution through Natural Activation: a human-computer interaction methodology for code running.

terminal transformers python3 speech-recognition execution speech-to-text sentence-similarity speech-to-function sentence-transformers wav2vec2

Updated Jan 29, 2024
Python

aitor-alvarez / large-speech-models

Fine-tuning Multilingual Large Speech Recognition Models: Wav2vec and Whisper

whisper asr asr-model speech-recognition-model wav2vec2 arabic-speech-recognition large-speech-models finetuning-wav2vec finetuning-whisper

Updated Jan 23, 2024
Python

ECNU-Cross-Innovation-Lab / ShiftSER

[ICASSP 2023] Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations

speech-emotion-recognition hubert wav2vec2

Updated Dec 18, 2023
Python

Dhruv16S / Transcribing-Video-to-Text

This repository uses the Wav2Vec2 model to transcribe the speech in a video file to text through a pipeline of noise removal and speech-to-text (STT) recognition.

speech-recognition speech-to-text whisper video-to-text wav2vec2

Updated Dec 18, 2023
Python

agustyawan-arif / wav2vec2-large-xlsr-53-id

Audio transcription using the Wav2Vec2 model trained on the Common Voice 13 dataset for Indonesian.

deep-learning speech-recognition speech-to-text wav2vec2

Updated Dec 16, 2023
Python

appledora / wav2vec2_scripts

A modular codebase to process an audio dataset, generate a custom tokenizer, and fine-tune and run inference with a wav2vec2 model on a custom dataset.

end-to-end inference speech-to-text fine-tuning huggingface wav2vec2

Updated Nov 12, 2023
Python

Msparihar / Transcriber

An AI tool that automatically generates captions and transcripts for YouTube videos in 67 languages and summarized text in 133 languages.

nlp deep-neural-networks audio-processing kenlm wav2vec2

Updated Nov 10, 2023
Python

khanld / ASR-Wav2vec-Finetune

⚡ Fine-tune Wav2vec 2.0 for Speech Recognition

pytorch speech-recognition speech-to-text asr huggingface vietnamese-speech-recognition wav2vec2 finetune-wav2vec

Updated Nov 7, 2023
Python
