opusTCv20210807_transformer-big_2022-09-15.zip

dataset: opusTCv20210807
model: transformer-big
source language(s): heb
target language(s): bel bel_Latn bul ces pol rus slv ukr
raw source language(s): heb
raw target language(s): bel bul ces pol rus slv ukr
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
valid language labels:
download: opusTCv20210807_transformer-big_2022-09-15.zip
test set translations: opusTCv20210807_transformer-big_2022-09-15.test.txt
test set scores: opusTCv20210807_transformer-big_2022-09-15.eval.txt
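
The sentence-initial token requirement can be illustrated with a minimal Python sketch. It assumes the valid labels are exactly the IDs on the "target language(s)" line of this card; the helper name add_language_token and the TARGET_LANGS set are hypothetical and not part of the released model or any library. The sketch only prepares the raw input string; normalization and SentencePiece segmentation are still applied afterwards by whatever toolkit loads the model.

```python
# Assumption: the valid >>id<< labels are the target language IDs listed in this card.
TARGET_LANGS = {"bel", "bel_Latn", "bul", "ces", "pol", "rus", "slv", "ukr"}


def add_language_token(sentence: str, target_lang: str) -> str:
    """Prefix a Hebrew source sentence with the >>id<< target-language token."""
    if target_lang not in TARGET_LANGS:
        raise ValueError(f"unsupported target language ID: {target_lang!r}")
    # The token comes first, followed by a single space and the source sentence.
    return f">>{target_lang}<< {sentence.strip()}"


if __name__ == "__main__":
    # Request a Russian translation of a Hebrew sentence.
    print(add_language_token("שלום עולם", "rus"))
```

Rejecting unknown IDs up front is a design choice for the sketch: a multilingual model given an unrecognized token will typically still produce output, just not reliably in the intended language.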

Benchmarks

testset	BLEU	chr-F	#sent	#words	BP
Tatoeba-test-v2021-08-07.heb-bel	34.8	0.52606	52	302	0.966
Tatoeba-test-v2021-08-07.heb-bul	100.0	1.00000	1	4	1.000
Tatoeba-test-v2021-08-07.heb-ces	36.5	0.66213	34	181	1.000
Tatoeba-test-v2021-08-07.heb-multi	41.6	0.62078	8555	53756	0.956
Tatoeba-test-v2021-08-07.heb-pol	42.7	0.63458	5000	31462	0.953
Tatoeba-test-v2021-08-07.heb-rus	40.4	0.60154	2500	16481	0.962
Tatoeba-test-v2021-08-07.heb-slv	61.8	0.76894	2	8	1.000
Tatoeba-test-v2021-08-07.heb-ukr	37.6	0.58660	966	5175	0.972