Fine-tuned-Multilingual-whisper-based-RAG-on-hindi-dataset

A real-time Retrieval-Augmented Generation (RAG) model for question answering on Hindi audio data. The fine-tuned OpenAI whisper-tiny model reduced the word error rate (WER) to 74.24 on the Hindi dataset.
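For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the evaluation code used in the notebook. The function name `word_error_rate` is hypothetical, and scaling the result to a percentage is an assumption, since the README does not state the unit of the 74.24 figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage (assumed unit), using word-level edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, WER can exceed 100 when the hypothesis is much longer than the reference.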

Deployment Code:

Follow these steps to run the prototype on your system:

git clone https://github.com/system-reboot/Multilingual-whisper-based-RAG.git
cd Multilingual-whisper-based-RAG
jupyter execute fine-tuning-whisper.ipynb
python3 inference.py

Files:

fine-tuning-whisper.ipynb - Fine-tunes the Whisper-tiny model on the Hindi dataset.
rag.py - QA-BERT model that answers questions about the supplied audio.
inference.py - Launches the Gradio-based interface that displays inference results.
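To illustrate the retrieval step that a RAG pipeline places between transcription and question answering, here is a minimal sketch. It is not the logic in rag.py: it swaps in plain word-overlap scoring for whatever retriever the repository uses, and the names `split_passages` and `retrieve` are hypothetical. In the real pipeline the transcript would come from the fine-tuned Whisper model and the selected passage would be passed to the QA model.

```python
from typing import List


def split_passages(transcript: str, max_words: int = 40) -> List[str]:
    """Split a transcript into passages of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = transcript.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def retrieve(question: str, passages: List[str], top_k: int = 1) -> List[str]:
    """Rank passages by how many distinct question words they contain."""
    question_words = set(question.lower().split())

    def overlap(passage: str) -> int:
        return len(question_words & set(passage.lower().split()))

    # sorted() is stable, so ties keep their original transcript order.
    ranked = sorted(passages, key=overlap, reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    passages = ["the capital of india is delhi", "mango is a fruit"]
    print(retrieve("what is the capital of india", passages))
```

Word overlap is only a stand-in that keeps the example dependency-free; an embedding-based retriever would slot into `retrieve` without changing the surrounding flow.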

Note:

Use shorter audio clips for more efficient results.
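One way to follow this advice with a long recording is to split it into short segments before transcription. The sketch below assumes the audio is already loaded as a sequence of samples; the function name `chunk_audio` is hypothetical and is not part of this repository. The defaults of 16000 Hz and 30 seconds match the sample rate and window length that Whisper models operate on.

```python
from typing import List, Sequence


def chunk_audio(
    samples: Sequence[float],
    sample_rate: int = 16000,
    chunk_seconds: float = 30.0,
) -> List[Sequence[float]]:
    """Split samples into consecutive chunks of at most chunk_seconds each."""
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("sample_rate * chunk_seconds must be at least 1 sample")
    # The final chunk may be shorter than chunk_len.
    return [
        samples[start:start + chunk_len]
        for start in range(0, len(samples), chunk_len)
    ]


if __name__ == "__main__":
    # Ten samples at 2 Hz, split into 2-second (4-sample) chunks.
    for chunk in chunk_audio(list(range(10)), sample_rate=2, chunk_seconds=2.0):
        print(chunk)
```

Fixed-length cuts can split a word in half; cutting at silences instead would avoid that at the cost of extra code.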
