
paper: compare ga_bert against xml-r #68

Closed
jowagner opened this issue Apr 9, 2021 · 1 comment
Labels: idea (Future work idea), next step (This issue should be addressed in Summer 2022)

Comments

jowagner (Collaborator) commented Apr 9, 2021

https://peltarion.com/blog/data-science/a-deep-dive-into-multilingual-nlp-models suggests "that training monolingual models for small languages is unnecessary" as "XLM-R achieved ~80% accuracy whereas the Swedish BERT models reached ~79% accuracy".

Check whether off-the-shelf xlm-roberta performs better on our downstream tasks than the Irish-specific ga_bert. (XLM-R is more or less RoBERTa trained on the much larger XLM training data covering 100 languages, or possibly more, since the automatic language filter will have classified some data in other languages as belonging to one of the 100.)

There are two XLM-R models: base and large.
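A minimal sketch of how the candidate encoders could be loaded side by side with the Hugging Face transformers library. The ga_bert checkpoint ID below is an assumed placeholder, not a confirmed path, and the forward pass is only a sanity check; the real comparison would fine-tune each encoder on our downstream tasks and compare scores.

```python
# Sketch: load XLM-R and a gaBERT checkpoint and run one forward pass each.
# The "DCU-NLP/bert-base-irish-cased-v1" ID is an assumption; substitute the
# actual ga_bert checkpoint path or hub ID used in our experiments.
import torch
from transformers import AutoModel, AutoTokenizer

MODELS = {
    "xlm-roberta-base": "xlm-roberta-base",          # off-the-shelf multilingual baseline
    "xlm-roberta-large": "xlm-roberta-large",        # larger multilingual variant
    "ga_bert": "DCU-NLP/bert-base-irish-cased-v1",   # assumed gaBERT checkpoint ID
}

sentence = "Is teanga Cheilteach í an Ghaeilge."

for name, checkpoint in MODELS.items():
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)
    model.eval()
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Shape check only; downstream evaluation (e.g. parsing, NER) would
    # fine-tune each encoder with a task head and compare test scores.
    print(name, outputs.last_hidden_state.shape)
```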

jowagner added the idea (Future work idea) label on Apr 9, 2021
jowagner added the next step (This issue should be addressed in Summer 2022) label on Sep 16, 2021
jowagner (Collaborator, Author) commented
Figures ready for xlm-roberta-base.
