v1.0.0 - First Public Release of BaltiVoice ASR

mohdali-dev released this on 28 May at 13:40 · commit d24cd78

πŸŽ™οΈ BaltiVoice v1.0.0

We are excited to announce the first public release of BaltiVoice, the first Automatic Speech Recognition (ASR) system for the Balti language (ISO 639-3 code: bft).

✨ What's New?

  • 📊 Dataset: Released 10,060 validated Balti audio clips (~16.8 hours) on HuggingFace Hub.
  • 🤖 Model: Fine-tuned Whisper-small achieving 30.07% WER on unseen validation data.
  • 🌐 Live Demo: Deployed real-time transcription demo on HuggingFace Spaces.
  • 📚 Documentation: Comprehensive README, Model Card, and CONTRIBUTING guide.
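
As a quick sanity check on the dataset figures above, the average clip length follows from the clip count and the total duration. This is only back-of-the-envelope arithmetic on the rounded numbers in this release note (the variable names are illustrative), so the result is approximate:

```python
# Figures quoted in the release notes; the duration is rounded (~16.8 hours).
clip_count = 10_060
total_hours = 16.8

# Convert hours to seconds, then divide by the number of clips.
average_clip_seconds = total_hours * 3600 / clip_count
print(f"Average clip length: about {average_clip_seconds:.1f} s")
```

Clips of roughly six seconds each fit comfortably inside the 30-second window that Whisper models process at a time.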

🚀 Quick Start

Try the Live Demo

Click here to try the live demo

Use the Model in Python

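from transformers import pipeline

# Load the fine-tuned checkpoint from the HuggingFace Hub.
asr = pipeline(
    "automatic-speech-recognition",
    model="mohdali1/whisper-small-balti",
    # Whisper has no dedicated Balti language token; this release passes "urdu".
    generate_kwargs={"language": "urdu", "task": "transcribe"},
)

# Transcribe a local audio file; the pipeline returns a dict with a "text" key.
result = asr("your_balti_audio.wav")
print(result["text"])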
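
To transcribe several recordings at once, the same pipeline object can be reused in a loop. The following is a minimal sketch and not part of the release: `transcribe_folder` is a hypothetical helper, and it assumes `asr` behaves like the pipeline created above (called with a file path, returning a dict with a "text" key).

```python
from pathlib import Path
from typing import Callable, Dict


def transcribe_folder(asr: Callable[[str], dict], folder: str,
                      pattern: str = "*.wav") -> Dict[str, str]:
    """Run `asr` on every file in `folder` that matches `pattern`.

    `asr` is assumed to accept a file path and return a dict with a
    "text" key, as the transformers ASR pipeline does.
    """
    transcripts = {}
    # Sort the paths so the output order is stable across runs.
    for path in sorted(Path(folder).glob(pattern)):
        transcripts[path.name] = asr(str(path))["text"]
    return transcripts
```

Calling `transcribe_folder(asr, "recordings")` would return a mapping from file name to transcript, where `recordings` is a placeholder directory name. Any callable with the same interface can stand in for `asr`, which makes the helper easy to try without downloading the model.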

Use the Dataset

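from datasets import load_dataset

# Download the BaltiVoice clips and transcripts from the HuggingFace Hub.
dataset = load_dataset("mohdali1/baltivoice-asr")
print(dataset)  # shows the available splits and columns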

📈 Performance

| Metric                | Value                |
|-----------------------|----------------------|
| Word Error Rate (WER) | 30.07%               |
| Training Steps        | 1000                 |
| Base Model            | openai/whisper-small |
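
For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model output (substitutions, deletions, and insertions), divided by the number of reference words. The release does not state which tool or text normalization produced the 30.07% figure, so the following is only a minimal sketch of the standard definition; `word_error_rate` is an illustrative name, and a library such as `jiwer` would normally be used in practice.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref_words)


# One substitution (b -> x) and one deletion (d) over four reference words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

A corpus-level score is usually obtained by summing the errors and the reference word counts over all utterances before dividing, rather than by averaging per-utterance rates.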

🤝 Contributing

We welcome contributions from Balti speakers and ML engineers! Please see our CONTRIBUTING.md for details on how to help validate data or improve the model.

📄 License

This project is licensed under the MIT License. See LICENSE for more information.


Built with ❤️ for language preservation.