DhivehiML 📋

Curating and organizing Dhivehi ML projects and experiments

The goal of this repo is to curate resources related to machine learning and language tools for Dhivehi, so that any newcomer can quickly understand the current state of the field and use the prebuilt tools 😄

Current State

Language models

@mapmeid has done some amazing work training and fine-tuning language models for Dhivehi:

dv-wave: ELECTRA model trained from scratch on Dhivehi text
dv-muril: Experiment in inserting equivalent Dhivehi words into MuRIL
dv-labse: Inserting Dhivehi WordPiece tokens into Google's LaBSE model

Text to Speech

Tacotron2 trained on Common Voice data up to about 300k (demo)

Synthesized audio from Tacotron2 (Griffin-Lim)

Current best model: Tacotron2 ~300k

Speech to Text

No work has been done in this area in a way that benefits the public.

Notebooks

Text to speech experiments

Training Mozilla's Tacotron2 implementation with data from Mozilla Common Voice (Griffin-Lim)
Also training Mozilla's Tacotron2 implementation with data from Mozilla Common Voice (Griffin-Lim)
Training Multi-Band MelGAN on Mozilla Common Voice data (single speaker), ~10k model
Process Common Voice data into LJSpeech-1.1 format (also allows generating audio only from specified speakers)
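
To give a rough idea of what the Common Voice to LJSpeech-1.1 conversion involves, here is a minimal sketch. It is not the notebook's code. It assumes the usual Common Voice TSV columns (`client_id`, `path`, `sentence`) and the pipe-separated LJSpeech `metadata.csv` layout (ID, transcription, normalized transcription). The function name `commonvoice_to_ljspeech` and the sample rows are made up for illustration, and audio conversion (MP3 to WAV) is left out.

```python
import csv
import io


def commonvoice_to_ljspeech(tsv_file, speakers=None):
    """Return LJSpeech-style metadata lines from a Common Voice TSV file object.

    speakers: optional set of client_id values to keep; None keeps everyone.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    lines = []
    for row in reader:
        if speakers is not None and row["client_id"] not in speakers:
            continue
        # LJSpeech IDs are file names without the extension.
        clip_id = row["path"].rsplit(".", 1)[0]
        # The pipe is the LJSpeech field separator, so it cannot appear in text.
        text = row["sentence"].replace("|", " ").strip()
        # No text normalization is attempted: both text columns are identical.
        lines.append(f"{clip_id}|{text}|{text}")
    return lines


# Tiny in-memory sample standing in for validated.tsv (made-up rows).
sample = io.StringIO(
    "client_id\tpath\tsentence\n"
    "spk_a\tclip_0001.mp3\tfirst sentence\n"
    "spk_b\tclip_0002.mp3\tsecond sentence\n"
    "spk_a\tclip_0003.mp3\tthird sentence\n"
)
metadata = commonvoice_to_ljspeech(sample, speakers={"spk_a"})
print("\n".join(metadata))
```

Passing `speakers=None` keeps every clip; passing a set of `client_id` values restricts the output to those speakers, which is the single-speaker case a TTS model usually needs.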

Speech to text

[WIP]

Transliteration

Training a Seq2Seq model to transliterate Dhivehi to Latin, based on div-transliteration
Inference for the div-transliteration model
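
A character-level Seq2Seq transliteration model typically needs its input and output strings turned into padded integer sequences first. The sketch below shows one common way to do that; it is an assumption about the general approach, not the div-transliteration code. The helper names (`build_vocab`, `encode`), the special tokens, and `max_len=10` are all hypothetical choices.

```python
PAD, SOS, EOS = "<pad>", "<sos>", "<eos>"


def build_vocab(texts):
    """Map special tokens and every character seen in texts to an integer id."""
    chars = sorted({ch for text in texts for ch in text})
    return {tok: i for i, tok in enumerate([PAD, SOS, EOS] + chars)}


def encode(text, vocab, max_len):
    """Wrap text in SOS/EOS, convert to ids, and right-pad to max_len."""
    ids = [vocab[SOS]] + [vocab[ch] for ch in text] + [vocab[EOS]]
    if len(ids) > max_len:
        raise ValueError("text longer than max_len")
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Illustrative pair: the word "Dhivehi" in Thaana and in Latin script.
pairs = [("ދިވެހި", "dhivehi")]
src_vocab = build_vocab(src for src, _ in pairs)
tgt_vocab = build_vocab(tgt for _, tgt in pairs)

src_ids = encode(pairs[0][0], src_vocab, max_len=10)
tgt_ids = encode(pairs[0][1], tgt_vocab, max_len=10)
print(src_ids)
print(tgt_ids)
```

Separate source and target vocabularies keep the Thaana and Latin character sets apart, so the encoder and decoder embedding tables stay small.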

Named Entity Recognition

spaCy NER

Datasets

DhivehiDatasets: Curated Dhivehi datasets of many types from many sources (news and others)
Common Voice: Crowdsourced voice dataset
opendatamv: Effort by the open source community to collect various types of data

Tasks For Evaluation

Needs to be decided and created.

Dharisd/DhivehiML
