This repository will contain the results from "Are Character-level Translations Worth the Wait? Comparing ByT5 and mT5 for Machine Translation", presented at EACL 2024 and to be published in the next edition of TACL.
The models used in this work can be found at: https://huggingface.co/leukas
All of the models are named as follows:

{model_name}-{dataset}-{amount_of_data}-{lang_pair}

model_name = byt5 or mt5
dataset = wmt14 or nc16
amount_of_data = 400, 2k, 10k, 50k, 250k, 1250k, or not specified for the full wmt14
lang_pair = ende, deen, enru, ruen, enes, ptes
Not all combinations exist; see the paper for details.
To use the models properly, be sure to prefix the input with the prompt "translate X to Y: ", e.g. "translate English to German: ".
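As a minimal sketch of how a checkpoint might be used with the `transformers` library: the prompt prefix above is prepended to the source sentence before generation. The model id below is a hypothetical example assembled from the naming scheme; check https://huggingface.co/leukas for the checkpoints that actually exist.

```python
def make_prompt(source_lang: str, target_lang: str, text: str) -> str:
    """Build the task prefix the models expect, e.g.
    "translate English to German: Hello."
    """
    return f"translate {source_lang} to {target_lang}: {text}"


def translate(model_id: str, source_lang: str, target_lang: str, text: str) -> str:
    """Translate one sentence with a seq2seq checkpoint.

    model_id follows {model_name}-{dataset}-{amount_of_data}-{lang_pair},
    e.g. "leukas/byt5-wmt14-deen" (illustrative; not all combinations exist).
    """
    # Imported here so the helper above can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(
        make_prompt(source_lang, target_lang, text), return_tensors="pt"
    )
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

For example, `translate("leukas/byt5-wmt14-deen", "German", "English", "Das Haus ist klein.")` would download the checkpoint and return the English translation.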
Outputs and scores will be added soon.
If you have any questions, don't hesitate to contact me (preferably via email).