wav2vec2_hugging_face

The objective of this task is to take an audio file as user input and generate text from it. The generated text can then be used for different downstream tasks, depending on the situation.

My repo contains 2 notebooks and 3 sets of audio files. To run them, you’ll need:

- Transformers ≥ 4.3
- Librosa (to manage the audio files)

I’m sticking with the wav2vec2-base-960h base model. A larger model can be used instead for better transcription accuracy.
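For orientation, here is a minimal sketch of how an audio file can be turned into text with this model. It is not the notebooks' exact code. It assumes the checkpoint is loaded from the Hugging Face Hub as `facebook/wav2vec2-base-960h`, that PyTorch is installed, and that your Transformers version ships `Wav2Vec2Processor` (as far as I know it arrived shortly after 4.3, so you may need a slightly newer release). The function name `transcribe` is hypothetical.

```python
MODEL_NAME = "facebook/wav2vec2-base-960h"  # Hub id of the base checkpoint
TARGET_SR = 16_000  # this checkpoint expects 16 kHz audio


def transcribe(audio_path, model_name=MODEL_NAME):
    """Return a greedy CTC transcription of one audio file (hypothetical helper)."""
    # Heavy dependencies are imported inside the function so that merely
    # importing this module does not load them.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_name)
    model = Wav2Vec2ForCTC.from_pretrained(model_name)
    model.eval()

    # librosa resamples to TARGET_SR and returns a mono float array.
    speech, _ = librosa.load(audio_path, sr=TARGET_SR)

    inputs = processor(speech, sampling_rate=TARGET_SR, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Greedy decoding: take the most likely token at every frame,
    # then let the processor collapse repeats and blanks into text.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe("sample.wav")` (where `sample.wav` is a placeholder path) should return the transcription as a single string. Passing a different `model_name` is one way to try a larger checkpoint.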

Drive link to the recorded audio file (15+ minutes): https://drive.google.com/file/d/1BqdcrslUPP8JC5ym7qy4_6urYjtsYnEG/view?usp=sharing
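A recording this long can be heavy to push through the model in one pass, so one option is to split it into fixed-length chunks and transcribe them one by one. The sketch below only computes the chunk boundaries. The helper name `chunk_bounds` and the 30-second default are assumptions for illustration, not something taken from the notebooks.

```python
def chunk_bounds(num_samples, sample_rate=16_000, chunk_seconds=30):
    """Yield (start, end) sample indices that cover the whole recording."""
    chunk_size = int(sample_rate * chunk_seconds)
    if chunk_size < 1:
        raise ValueError("sample_rate * chunk_seconds must be at least 1 sample")
    for start in range(0, num_samples, chunk_size):
        # The last chunk may be shorter than chunk_size.
        yield start, min(start + chunk_size, num_samples)


# Example: a 15-minute recording at 16 kHz splits into 30 chunks of 30 s.
bounds = list(chunk_bounds(15 * 60 * 16_000))
print(len(bounds))
```

Each `(start, end)` pair can be used to slice the loaded waveform before passing it to the model. Cutting at fixed positions can split a word in half, so the transcript may contain small errors at chunk borders.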