Releases: mohdali-dev/BaltiVoice-ASR

v1.0.0 - First Public Release of BaltiVoice ASR

28 May 13:40
d24cd78

🎙️ BaltiVoice v1.0.0

We are excited to announce the first public release of BaltiVoice, the first Automatic Speech Recognition (ASR) system for the Balti language (bft).

✨ What's New?

  • 📊 Dataset: Released 10,060 validated Balti audio clips (~16.8 hours) on HuggingFace Hub.
  • 🤖 Model: Fine-tuned Whisper-small achieving 30.07% WER on unseen validation data.
  • 🌐 Live Demo: Deployed real-time transcription demo on HuggingFace Spaces.
  • 📚 Documentation: Comprehensive README, Model Card, and CONTRIBUTING guide.
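As a rough sanity check on the dataset figures above, the average clip length can be worked out from the two published numbers. This is only back-of-the-envelope arithmetic: the 16.8 hours is an approximate total, and the variable names are illustrative.

```python
# Figures taken from the release notes; total_hours is approximate (~16.8 h).
clips = 10_060
total_hours = 16.8

# Convert hours to seconds, then divide by the number of clips.
avg_seconds = total_hours * 3600 / clips
print(f"About {avg_seconds:.1f} s per clip on average")
```

An average of roughly six seconds per clip sits well inside the 30-second window that Whisper models process at a time, although individual clips may of course be longer or shorter.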

🚀 Quick Start

Try the Live Demo

Click here to try the live demo

Use the Model in Python

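from transformers import pipeline

# Load the fine-tuned checkpoint from the HuggingFace Hub.
asr = pipeline(
    "automatic-speech-recognition",
    model="mohdali1/whisper-small-balti",
    # Whisper's language list does not include Balti; this release decodes
    # with the "urdu" language token.
    generate_kwargs={"language": "urdu", "task": "transcribe"},
)

# Transcribe a local audio file; the result is a dict with a "text" key.
result = asr("your_balti_audio.wav")
print(result["text"])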

Use the Dataset

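from datasets import load_dataset

# Downloads the dataset from the HuggingFace Hub on first use and caches it locally.
dataset = load_dataset("mohdali1/baltivoice-asr")
print(dataset)  # shows the available splits and columns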

📈 Performance

Metric                   Value
Word Error Rate (WER)    30.07%
Training Steps           1000
Base Model               openai/whisper-small
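For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below implements that standard definition in plain Python. It is not necessarily how the reported 30.07% was computed: the release does not state which tool or text normalization was used, and the function name and placeholder tokens here are illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not real Balti text: one substitution and one deletion
# against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference. A score near 30% means roughly three word-level edits per ten reference words.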

🤝 Contributing

We welcome contributions from Balti speakers and ML engineers! Please see our CONTRIBUTING.md for details on how to help validate data or improve the model.

📄 License

This project is licensed under the MIT License. See LICENSE for more information.


Built with ❤️ for language preservation.