I'm a Machine Learning Engineer focused on building impactful AI solutions for low-resource African languages, specialising in:
- Speech Technologies (ASR, TTS, STS)
- Natural Language Processing
- Geospatial & Computer Vision ML
- Health, Agriculture & Climate ML
🎧 Currently building Shona & Twi Speech-to-Speech models
🤝 Collaborating with Duke Congo (CodeJoe)
📬 Email: nashaa182@gmail.com
🔗 LinkedIn: https://www.linkedin.com/in/nyashadzaishe-masvongo-b67698254/
A curated collection of links to high-quality, open-source datasets for Bantu languages.
👉 Repo: https://github.com/SeViVI-Tese/voxAfrica
Consistently in the top 1–10% across NLP, CV, forecasting, and health ML.
| Competition | Rank | Medal | Summary |
|---|---|---|---|
| Kenya Clinical Reasoning | 3/387 | 🥇 | NLP reasoning model |
| Barbados Plot Automation | 8/154 | — | OCR + Geospatial CV |
| MPEG-G Microbiome | 6/73 | 🥇 | Deep learning on federated data |
| CGIAR Root Volume | 15/245 | 🥇 | CV underground scanning |
| IBM Hydropower | 23/444 | 🥇 | Climate & energy forecasting |
| Amini Soil | 29/308 | 🥇 | EO nutrient modelling |
| Côte d'Ivoire Agriculture | 28/143 | 🥈 | Crop pixel classification |
- Speech: TTS, ASR, STS
- NLP: tokenisation, transformers, fine-tuning
- CV & Geospatial ML
- Bioinformatics & climate modelling
I build AI for people and languages often overlooked by the global AI ecosystem.
Impact > hype.
Last updated: 2025-11-21 12:39 UTC
Dataset Distillation for Pre-Trained Self-Supervised Vision Models
The task of dataset distillation aims to find a small set of synthetic images such that training a model on them reproduces the performance of the same model trained on a much larger dataset of real s...