BaltiVoice v1.0.0
We are excited to announce the first public release of BaltiVoice, the first Automatic Speech Recognition (ASR) system for the Balti language (bft).
What's New?
- Dataset: Released 10,060 validated Balti audio clips (~16.8 hours) on the HuggingFace Hub.
- Model: Fine-tuned Whisper-small, achieving a 30.07% word error rate (WER) on unseen validation data.
- Live Demo: Deployed a real-time transcription demo on HuggingFace Spaces.
- Documentation: Comprehensive README, Model Card, and CONTRIBUTING guide.
Quick Start
Try the Live Demo
The real-time transcription demo is hosted on HuggingFace Spaces.
Use the Model in Python
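```python
from transformers import pipeline

# Load the fine-tuned Whisper checkpoint from the HuggingFace Hub.
asr = pipeline(
    "automatic-speech-recognition",
    model="mohdali1/whisper-small-balti",
    # Whisper has no dedicated Balti language token; "urdu" is the setting given for this model.
    generate_kwargs={"language": "urdu", "task": "transcribe"},
)

# Transcribe a local audio file (replace with the path to your own recording).
result = asr("your_balti_audio.wav")
print(result["text"])
```
Use the Dataset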
```python
from datasets import load_dataset

# Download the BaltiVoice dataset from the HuggingFace Hub.
dataset = load_dataset("mohdali1/baltivoice-asr")
```
Performance
| Metric | Value |
|---|---|
| Word Error Rate (WER) | 30.07% |
| Training Steps | 1000 |
| Base Model | openai/whisper-small |
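
For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) between the model output and the reference transcript, divided by the number of reference words. The sketch below is a minimal illustration under assumptions: `word_error_rate` is a hypothetical helper, it splits on whitespace with no text normalization, and it is not necessarily the exact procedure behind the 30.07% figure. The example strings are English placeholders, not Balti data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration only: whitespace tokenization,
    no casing or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder strings: one substitution and one deletion over four reference words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

On this definition, a WER of 30.07% corresponds to roughly 30 word-level errors per 100 reference words, averaged over the validation set.
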
Contributing
We welcome contributions from Balti speakers and ML engineers! Please see our CONTRIBUTING.md for details on how to help validate data or improve the model.
License
This project is licensed under the MIT License. See LICENSE for more information.
Built with ❤️ for language preservation.