WoC: Hinglish transformer
Gautam edited this page Jan 26, 2023 · 5 revisions
Build a Hugging Face pipeline to train and fine-tune a transformer that translates Hinglish sentences (text containing Hindi and English words in Roman/Latin script) into English. A typical input is a Hindi sentence transliterated into the Roman script with a few nouns in English. The task involves generating synthetic Hinglish-English sentence pairs using the 'Bhashini' Hindi<->English translation models. The transformer should be available both as a hosted API service for streaming translation tasks and as a downloadable model for batch translation of tabular data.
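The batch use case for tabular data could look roughly like the sketch below. It is only an illustration: `translate_column` and the `translate_batch` callable are hypothetical names, not part of this project or of any library. In practice `translate_batch` might wrap the fine-tuned model; here it is left pluggable so the batching logic can be read on its own.

```python
from typing import Callable, Dict, List, Sequence


def translate_column(
    rows: Sequence[Dict[str, object]],
    column: str,
    translate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 32,
    out_column: str = "english",
) -> List[Dict[str, object]]:
    """Return copies of `rows` with a translated column added.

    `translate_batch` is an assumed interface: it takes a list of Hinglish
    strings and returns a list of English strings of the same length.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Copy each row so the caller's data is not modified.
    result = [dict(row) for row in rows]

    for start in range(0, len(result), batch_size):
        chunk = result[start:start + batch_size]
        texts = [str(row[column]) for row in chunk]
        outputs = translate_batch(texts)
        if len(outputs) != len(texts):
            raise ValueError("translate_batch must return one output per input")
        for row, english in zip(chunk, outputs):
            row[out_column] = english

    return result
```

Sending rows in fixed-size batches keeps memory use predictable and lets the same function serve both a local model and a remote API behind `translate_batch`.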
End goal: Feedback data received in Hinglish should be translated into English.
- Create a corpus of Hinglish-English sentence pairs
- Create a pipeline for training a transformer on the corpus
- Create a fine-tuning pipeline for a pre-trained model
- Create the deployment setup
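The first step, building the synthetic corpus, might be sketched as follows. This assumes a two-stage approach (translate English to Hindi, then romanize the Hindi output), which is one possible reading of the task rather than a confirmed design. `build_pairs`, `write_jsonl`, `translate_en_to_hi`, and `romanize` are hypothetical names; the two callables stand in for a Bhashini translation call and a transliteration step, neither of which is implemented here.

```python
import io
import json
from typing import Callable, Dict, Iterable, Iterator, TextIO


def build_pairs(
    english_sentences: Iterable[str],
    translate_en_to_hi: Callable[[str], str],  # assumed stand-in for a Bhashini model
    romanize: Callable[[str], str],            # assumed Devanagari-to-Roman step
) -> Iterator[Dict[str, str]]:
    """Yield deduplicated {"hinglish", "english"} pairs."""
    seen = set()
    for sentence in english_sentences:
        english = " ".join(sentence.split())  # collapse stray whitespace
        if not english:
            continue
        hinglish = " ".join(romanize(translate_en_to_hi(english)).split())
        key = hinglish.lower()
        if not hinglish or key in seen:
            continue  # skip empty outputs and repeated source sentences
        seen.add(key)
        yield {"hinglish": hinglish, "english": english}


def write_jsonl(pairs: Iterable[Dict[str, str]], handle: TextIO) -> int:
    """Write one JSON object per line and return the number of pairs written."""
    count = 0
    for pair in pairs:
        handle.write(json.dumps(pair, ensure_ascii=False) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Toy stand-ins only; real translation and romanization would replace these.
    toy_translate = lambda text: "HI:" + text
    toy_romanize = lambda text: text.replace("HI:", "roman ")
    buffer = io.StringIO()
    written = write_jsonl(
        build_pairs(["The service was good"], toy_translate, toy_romanize), buffer
    )
    print(written, buffer.getvalue().strip())
```

A JSON Lines file with one pair per line is a common input shape for training and fine-tuning scripts, which is why it is used here; any other tabular format would work equally well.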
Issues can be raised here
| Category | Rating |
|---|---|
| Difficulty | Medium |
| Risk/Exploratory | High |
| Core Development | Python, PyTorch |
| Skills | NLP |
| Mentors | Gautam Rajeev |
| Project size | |
Copyright © 2022 | All Rights Reserved