- dataset: opusTCv20210807
- model: transformer-big
- source language(s): heb
- target language(s): bel bel_Latn bul ces pol rus slv ukr
- raw source language(s): heb
- raw target language(s): bel bul ces pol rus slv ukr
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- valid language labels:
- download: opusTCv20210807_transformer-big_2022-09-15.zip
- test set translations: opusTCv20210807_transformer-big_2022-09-15.test.txt
- test set scores: opusTCv20210807_transformer-big_2022-09-15.eval.txt

testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
Tatoeba-test-v2021-08-07.heb-bel | 34.8 | 0.52606 | 52 | 302 | 0.966 |
Tatoeba-test-v2021-08-07.heb-bul | 100.0 | 1.00000 | 1 | 4 | 1.000 |
Tatoeba-test-v2021-08-07.heb-ces | 36.5 | 0.66213 | 34 | 181 | 1.000 |
Tatoeba-test-v2021-08-07.heb-multi | 41.6 | 0.62078 | 8555 | 53756 | 0.956 |
Tatoeba-test-v2021-08-07.heb-pol | 42.7 | 0.63458 | 5000 | 31462 | 0.953 |
Tatoeba-test-v2021-08-07.heb-rus | 40.4 | 0.60154 | 2500 | 16481 | 0.962 |
Tatoeba-test-v2021-08-07.heb-slv | 61.8 | 0.76894 | 2 | 8 | 1.000 |
Tatoeba-test-v2021-08-07.heb-ukr | 37.6 | 0.58660 | 966 | 5175 | 0.972 |
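
The sentence-initial language token described above can be added with a small helper before the text is passed to the model. This is a minimal sketch, not part of the released tooling: the function name `add_language_token` is hypothetical, and the set of accepted labels is an assumption taken from the "target language(s)" entry, since the card does not spell out the valid labels.

```python
# Assumption: valid labels mirror the card's "target language(s)" entry.
TARGET_LANGS = ("bel", "bel_Latn", "bul", "ces", "pol", "rus", "slv", "ukr")


def add_language_token(sentence: str, target_lang: str) -> str:
    """Prefix a Hebrew source sentence with the >>id<< target-language token."""
    if target_lang not in TARGET_LANGS:
        raise ValueError(
            f"unknown target language {target_lang!r}; "
            f"expected one of {', '.join(TARGET_LANGS)}"
        )
    # The token goes first, separated from the sentence by a single space.
    return f">>{target_lang}<< {sentence.strip()}"


if __name__ == "__main__":
    # Request a Russian translation of a Hebrew sentence.
    print(add_language_token("שלום עולם", "rus"))
```

Validating the label up front is a design choice for the sketch: a multilingual model given an unknown token may still produce output, just not in the intended language, so failing early is easier to debug.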