gem-eng

opus-2020-07-04.zip

  • dataset: opus
  • model: transformer
  • source language(s): afr ang_Latn dan deu enm_Latn fao frr fry gos got_Goth gsw isl ksh ltz nds nld non_Latn pdc sco stq swe swg yid
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k); see the tokenization sketch after this list
  • download: opus-2020-07-04.zip
  • test set translations: opus-2020-07-04.test.txt
  • test set scores: opus-2020-07-04.eval.txt
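
The pre-processing listed above means each input sentence is normalized and then segmented with the release's 32k SentencePiece model before it reaches the transformer. Below is a minimal Python sketch of that step, assuming the downloaded zip ships source- and target-side SentencePiece models; the file names source.spm and target.spm are assumptions about the archive layout, not something stated in this list.

```python
# A minimal sketch of the "normalization + SentencePiece (spm32k,spm32k)" step,
# assuming the downloaded archive contains the source-side SentencePiece model.
# The file names source.spm / target.spm are assumptions about the zip layout.
import sentencepiece as spm

sp_src = spm.SentencePieceProcessor(model_file="source.spm")

sentence = "Dit is een voorbeeldzin."              # Dutch (nld), one of the listed source languages
pieces = sp_src.encode(sentence, out_type=str)     # subword pieces, e.g. ['▁Dit', '▁is', ...]
print(" ".join(pieces))                            # space-joined pieces are what the Marian decoder consumes

# Model output is detokenized with the target-side model:
# sp_tgt = spm.SentencePieceProcessor(model_file="target.spm")
# text = sp_tgt.decode(output_pieces)
```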

Benchmarks

testset BLEU chr-F
Tatoeba-test.afr-eng.afr.eng 61.5 0.749
Tatoeba-test.ang-eng.ang.eng 7.0 0.241
Tatoeba-test.dan-eng.dan.eng 60.3 0.744
Tatoeba-test.deu-eng.deu.eng 48.7 0.659
Tatoeba-test.enm-eng.enm.eng 23.0 0.461
Tatoeba-test.fao-eng.fao.eng 21.6 0.411
Tatoeba-test.frr-eng.frr.eng 14.1 0.167
Tatoeba-test.fry-eng.fry.eng 27.5 0.474
Tatoeba-test.gos-eng.gos.eng 17.0 0.336
Tatoeba-test.got-eng.got.eng 0.0 0.004
Tatoeba-test.gsw-eng.gsw.eng 13.4 0.327
Tatoeba-test.isl-eng.isl.eng 50.2 0.663
Tatoeba-test.ksh-eng.ksh.eng 5.3 0.242
Tatoeba-test.ltz-eng.ltz.eng 34.1 0.497
Tatoeba-test.multi.eng 53.0 0.683
Tatoeba-test.nds-eng.nds.eng 31.5 0.515
Tatoeba-test.nld-eng.nld.eng 58.0 0.729
Tatoeba-test.non-eng.non.eng 36.6 0.550
Tatoeba-test.pdc-eng.pdc.eng 26.7 0.424
Tatoeba-test.sco-eng.sco.eng 44.9 0.589
Tatoeba-test.stq-eng.stq.eng 5.0 0.363
Tatoeba-test.swe-eng.swe.eng 61.5 0.742
Tatoeba-test.swg-eng.swg.eng 20.7 0.334
Tatoeba-test.yid-eng.yid.eng 17.2 0.380
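
The released *.test.txt and *.eval.txt files contain the system translations and the scores summarized in the table above. As a hedged sketch, the same BLEU and chr-F numbers can in principle be recomputed with sacrebleu once hypotheses and references are available as plain-text files; the file names below are placeholders, since the exact layout of opus-2020-07-04.test.txt is not described here.

```python
# A hedged sketch of recomputing the BLEU / chr-F figures above with sacrebleu.
# "hypotheses.txt" and "references.txt" are placeholder names: the exact layout of
# opus-2020-07-04.test.txt is not described here, so the hypotheses and references
# may first need to be split out per test set.
import sacrebleu

with open("hypotheses.txt", encoding="utf-8") as f:
    hyps = [line.rstrip("\n") for line in f]
with open("references.txt", encoding="utf-8") as f:
    refs = [line.rstrip("\n") for line in f]

bleu = sacrebleu.corpus_bleu(hyps, [refs])    # BLEU column
chrf = sacrebleu.corpus_chrf(hyps, [refs])    # chr-F column
print(f"BLEU  = {bleu.score:.1f}")
print(f"chr-F = {chrf.score / 100:.3f}")      # sacrebleu reports chrF on 0-100; the table appears to use 0-1
```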

opus-2020-07-14.zip

  • dataset: opus
  • model: transformer
  • source language(s): afr ang_Latn dan deu enm_Latn fao frr fry gos got_Goth gsw isl ksh ltz nds nld nno nob nob_Hebr non_Latn pdc sco stq swe swg yid
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus-2020-07-14.zip
  • test set translations: opus-2020-07-14.test.txt
  • test set scores: opus-2020-07-14.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.afr-eng.afr.eng 62.4 0.755
Tatoeba-test.ang-eng.ang.eng 9.7 0.250
Tatoeba-test.dan-eng.dan.eng 60.5 0.745
Tatoeba-test.deu-eng.deu.eng 48.4 0.656
Tatoeba-test.enm-eng.enm.eng 23.2 0.495
Tatoeba-test.fao-eng.fao.eng 25.0 0.456
Tatoeba-test.frr-eng.frr.eng 16.8 0.233
Tatoeba-test.fry-eng.fry.eng 25.9 0.456
Tatoeba-test.gos-eng.gos.eng 17.2 0.339
Tatoeba-test.got-eng.got.eng 0.1 0.013
Tatoeba-test.gsw-eng.gsw.eng 12.7 0.320
Tatoeba-test.isl-eng.isl.eng 50.1 0.659
Tatoeba-test.ksh-eng.ksh.eng 3.6 0.247
Tatoeba-test.ltz-eng.ltz.eng 32.8 0.473
Tatoeba-test.multi.eng 53.2 0.681
Tatoeba-test.nds-eng.nds.eng 31.2 0.510
Tatoeba-test.nld-eng.nld.eng 58.0 0.727
Tatoeba-test.non-eng.non.eng 38.8 0.581
Tatoeba-test.pdc-eng.pdc.eng 31.7 0.456
Tatoeba-test.sco-eng.sco.eng 38.8 0.578
Tatoeba-test.stq-eng.stq.eng 20.2 0.421
Tatoeba-test.swe-eng.swe.eng 61.5 0.742
Tatoeba-test.swg-eng.swg.eng 17.5 0.316
Tatoeba-test.yid-eng.yid.eng 18.3 0.386

opus-2020-07-26.zip

  • dataset: opus
  • model: transformer
  • source language(s): afr ang_Latn dan deu enm_Latn fao frr fry gos got_Goth gsw isl ksh ltz nds nld nno nob nob_Hebr non_Latn pdc sco stq swe swg yid
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus-2020-07-26.zip
  • test set translations: opus-2020-07-26.test.txt
  • test set scores: opus-2020-07-26.eval.txt

Benchmarks

testset BLEU chr-F
newssyscomb2009-deueng.deu.eng 26.8 0.537
news-test2008-deueng.deu.eng 25.6 0.531
newstest2009-deueng.deu.eng 24.6 0.527
newstest2010-deueng.deu.eng 27.8 0.564
newstest2011-deueng.deu.eng 25.1 0.536
newstest2012-deueng.deu.eng 26.5 0.548
newstest2013-deueng.deu.eng 29.5 0.564
newstest2014-deen-deueng.deu.eng 29.9 0.569
newstest2015-ende-deueng.deu.eng 31.1 0.574
newstest2016-ende-deueng.deu.eng 36.3 0.619
newstest2017-ende-deueng.deu.eng 32.4 0.583
newstest2018-ende-deueng.deu.eng 39.4 0.636
newstest2019-deen-deueng.deu.eng 35.5 0.604
Tatoeba-test.afr-eng.afr.eng 62.8 0.756
Tatoeba-test.ang-eng.ang.eng 9.7 0.258
Tatoeba-test.dan-eng.dan.eng 60.4 0.744
Tatoeba-test.deu-eng.deu.eng 48.7 0.658
Tatoeba-test.enm-eng.enm.eng 22.4 0.478
Tatoeba-test.fao-eng.fao.eng 23.8 0.451
Tatoeba-test.frr-eng.frr.eng 16.8 0.241
Tatoeba-test.fry-eng.fry.eng 24.8 0.449
Tatoeba-test.gos-eng.gos.eng 17.3 0.341
Tatoeba-test.got-eng.got.eng 0.1 0.022
Tatoeba-test.gsw-eng.gsw.eng 13.3 0.324
Tatoeba-test.isl-eng.isl.eng 49.9 0.658
Tatoeba-test.ksh-eng.ksh.eng 3.2 0.240
Tatoeba-test.ltz-eng.ltz.eng 34.4 0.485
Tatoeba-test.multi.eng 53.4 0.682
Tatoeba-test.nds-eng.nds.eng 31.5 0.511
Tatoeba-test.nld-eng.nld.eng 58.2 0.727
Tatoeba-test.non-eng.non.eng 35.0 0.577
Tatoeba-test.nor-eng.nor.eng 54.1 0.692
Tatoeba-test.pdc-eng.pdc.eng 32.5 0.466
Tatoeba-test.sco-eng.sco.eng 35.9 0.569
Tatoeba-test.stq-eng.stq.eng 4.8 0.349
Tatoeba-test.swe-eng.swe.eng 61.3 0.741
Tatoeba-test.swg-eng.swg.eng 14.8 0.294
Tatoeba-test.yid-eng.yid.eng 18.6 0.389

opus2m-2020-08-01.zip

  • dataset: opus2m
  • model: transformer
  • source language(s): afr ang_Latn dan deu enm_Latn fao frr fry gos got_Goth gsw isl ksh ltz nds nld nno nob nob_Hebr non_Latn pdc sco stq swe swg yid
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus2m-2020-08-01.zip
  • test set translations: opus2m-2020-08-01.test.txt
  • test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

testset BLEU chr-F
newssyscomb2009-deueng.deu.eng 27.2 0.542
news-test2008-deueng.deu.eng 26.3 0.536
newstest2009-deueng.deu.eng 25.1 0.531
newstest2010-deueng.deu.eng 28.3 0.569
newstest2011-deueng.deu.eng 26.0 0.543
newstest2012-deueng.deu.eng 26.8 0.550
newstest2013-deueng.deu.eng 30.2 0.570
newstest2014-deen-deueng.deu.eng 30.7 0.574
newstest2015-ende-deueng.deu.eng 32.1 0.581
newstest2016-ende-deueng.deu.eng 36.9 0.624
newstest2017-ende-deueng.deu.eng 32.8 0.588
newstest2018-ende-deueng.deu.eng 40.2 0.640
newstest2019-deen-deueng.deu.eng 36.8 0.614
Tatoeba-test.afr-eng.afr.eng 62.8 0.758
Tatoeba-test.ang-eng.ang.eng 10.5 0.262
Tatoeba-test.dan-eng.dan.eng 61.6 0.754
Tatoeba-test.deu-eng.deu.eng 49.7 0.665
Tatoeba-test.enm-eng.enm.eng 23.9 0.491
Tatoeba-test.fao-eng.fao.eng 23.4 0.446
Tatoeba-test.frr-eng.frr.eng 10.2 0.184
Tatoeba-test.fry-eng.fry.eng 29.6 0.486
Tatoeba-test.gos-eng.gos.eng 17.8 0.352
Tatoeba-test.got-eng.got.eng 0.1 0.058
Tatoeba-test.gsw-eng.gsw.eng 15.3 0.333
Tatoeba-test.isl-eng.isl.eng 51.0 0.669
Tatoeba-test.ksh-eng.ksh.eng 6.7 0.266
Tatoeba-test.ltz-eng.ltz.eng 33.0 0.505
Tatoeba-test.multi.eng 54.0 0.687
Tatoeba-test.nds-eng.nds.eng 33.6 0.529
Tatoeba-test.nld-eng.nld.eng 58.9 0.733
Tatoeba-test.non-eng.non.eng 37.3 0.546
Tatoeba-test.nor-eng.nor.eng 54.9 0.696
Tatoeba-test.pdc-eng.pdc.eng 29.6 0.446
Tatoeba-test.sco-eng.sco.eng 40.5 0.581
Tatoeba-test.stq-eng.stq.eng 14.5 0.361
Tatoeba-test.swe-eng.swe.eng 62.0 0.745
Tatoeba-test.swg-eng.swg.eng 17.1 0.334
Tatoeba-test.yid-eng.yid.eng 19.4 0.400

opus4m-2020-08-12.zip

  • dataset: opus4m
  • model: transformer
  • source language(s): afr ang_Latn dan deu enm_Latn fao frr fry gos got_Goth gsw isl ksh ltz nds nld nno nob nob_Hebr non_Latn pdc sco stq swe swg yid
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus4m-2020-08-12.zip
  • test set translations: opus4m-2020-08-12.test.txt
  • test set scores: opus4m-2020-08-12.eval.txt

Benchmarks

testset BLEU chr-F
newssyscomb2009-deueng.deu.eng 27.8 0.545
news-test2008-deueng.deu.eng 26.3 0.538
newstest2009-deueng.deu.eng 25.4 0.533
newstest2010-deueng.deu.eng 28.4 0.569
newstest2011-deueng.deu.eng 25.9 0.543
newstest2012-deueng.deu.eng 27.2 0.554
newstest2013-deueng.deu.eng 30.3 0.570
newstest2014-deen-deueng.deu.eng 31.0 0.577
newstest2015-ende-deueng.deu.eng 32.2 0.582
newstest2016-ende-deueng.deu.eng 37.6 0.629
newstest2017-ende-deueng.deu.eng 33.5 0.592
newstest2018-ende-deueng.deu.eng 40.8 0.645
newstest2019-deen-deueng.deu.eng 37.2 0.618
Tatoeba-test.afr-eng.afr.eng 63.2 0.762
Tatoeba-test.ang-eng.ang.eng 10.4 0.273
Tatoeba-test.dan-eng.dan.eng 61.9 0.756
Tatoeba-test.deu-eng.deu.eng 50.3 0.670
Tatoeba-test.enm-eng.enm.eng 22.8 0.487
Tatoeba-test.fao-eng.fao.eng 27.4 0.462
Tatoeba-test.frr-eng.frr.eng 11.8 0.230
Tatoeba-test.fry-eng.fry.eng 30.1 0.490
Tatoeba-test.gos-eng.gos.eng 18.3 0.355
Tatoeba-test.got-eng.got.eng 0.1 0.069
Tatoeba-test.gsw-eng.gsw.eng 16.3 0.345
Tatoeba-test.isl-eng.isl.eng 50.9 0.669
Tatoeba-test.ksh-eng.ksh.eng 6.8 0.258
Tatoeba-test.ltz-eng.ltz.eng 34.0 0.504
Tatoeba-test.multi.eng 54.6 0.692
Tatoeba-test.nds-eng.nds.eng 34.1 0.534
Tatoeba-test.nld-eng.nld.eng 59.6 0.738
Tatoeba-test.non-eng.non.eng 37.3 0.562
Tatoeba-test.nor-eng.nor.eng 55.4 0.700
Tatoeba-test.pdc-eng.pdc.eng 28.1 0.427
Tatoeba-test.sco-eng.sco.eng 35.8 0.558
Tatoeba-test.stq-eng.stq.eng 21.6 0.430
Tatoeba-test.swe-eng.swe.eng 62.5 0.748
Tatoeba-test.swg-eng.swg.eng 17.8 0.315
Tatoeba-test.yid-eng.yid.eng 20.8 0.409
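
For quick experiments, checkpoints from this model family are also commonly used through the Hugging Face transformers MarianMT classes rather than the raw Marian zip. The sketch below assumes a converted checkpoint published under the usual OPUS-MT naming scheme; the model id Helsinki-NLP/opus-mt-gem-en is an assumption, not a link given on this page.

```python
# A minimal sketch of translating with a converted checkpoint via the Hugging Face
# transformers MarianMT classes. The model id below follows the usual OPUS-MT naming
# scheme ("Helsinki-NLP/opus-mt-<src>-<tgt>") but is an assumption, not a link from this page.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-gem-en"        # assumed id for the Germanic-to-English model
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

src = ["Jeg har aldrig set noget lignende."]      # Danish (dan), one of the listed source languages
batch = tokenizer(src, return_tensors="pt", padding=True)
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```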