opus-2020-07-14.zip

dataset: opus
model: transformer
source language(s): eng
target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lat_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (where id is a valid target language ID)
download: opus-2020-07-14.zip
test set translations: opus-2020-07-14.test.txt
test set scores: opus-2020-07-14.eval.txt
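
The language-token requirement above can be illustrated with a small helper. This is a minimal sketch, not part of the released tooling: the function name `add_language_token` and the `TARGET_IDS` subset are assumptions made for illustration, and the sketch does not cover how the token interacts with the normalization and SentencePiece steps.

```python
# Hypothetical helper: prepend the >>id<< target-language token to a sentence.
# TARGET_IDS is a small illustrative subset of the target languages listed above.
TARGET_IDS = {"cat", "fra", "glg", "ita", "por", "ron", "spa"}


def add_language_token(sentence: str, target_id: str) -> str:
    """Return the sentence prefixed with '>>target_id<< '."""
    if target_id not in TARGET_IDS:
        raise ValueError(f"unknown target language ID: {target_id!r}")
    return f">>{target_id}<< {sentence.strip()}"


print(add_language_token("How are you?", "fra"))
```

Rejecting unknown IDs early is a design choice of this sketch; without a valid token the multilingual model has no signal for which target language to produce.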

Benchmarks

testset	BLEU	chr-F
Tatoeba-test.eng-arg.eng.arg	1.5	0.120
Tatoeba-test.eng-ast.eng.ast	17.1	0.384
Tatoeba-test.eng-cat.eng.cat	47.1	0.666
Tatoeba-test.eng-cos.eng.cos	3.1	0.274
Tatoeba-test.eng-egl.eng.egl	0.2	0.105
Tatoeba-test.eng-ext.eng.ext	4.9	0.243
Tatoeba-test.eng-fra.eng.fra	44.1	0.629
Tatoeba-test.eng-frm.eng.frm	1.2	0.207
Tatoeba-test.eng-gcf.eng.gcf	0.3	0.092
Tatoeba-test.eng-glg.eng.glg	43.1	0.635
Tatoeba-test.eng-hat.eng.hat	28.3	0.509
Tatoeba-test.eng-ita.eng.ita	44.8	0.669
Tatoeba-test.eng-lad.eng.lad	5.2	0.276
Tatoeba-test.eng-lat.eng.lat	11.9	0.376
Tatoeba-test.eng-lij.eng.lij	1.3	0.172
Tatoeba-test.eng-lld.eng.lld	0.9	0.211
Tatoeba-test.eng-lmo.eng.lmo	0.3	0.150
Tatoeba-test.eng-mfe.eng.mfe	68.0	0.848
Tatoeba-test.eng.multi	37.2	0.583
Tatoeba-test.eng-mwl.eng.mwl	2.7	0.356
Tatoeba-test.eng-oci.eng.oci	7.7	0.286
Tatoeba-test.eng-pap.eng.pap	43.9	0.641
Tatoeba-test.eng-pms.eng.pms	1.8	0.177
Tatoeba-test.eng-por.eng.por	40.7	0.632
Tatoeba-test.eng-roh.eng.roh	2.2	0.247
Tatoeba-test.eng-ron.eng.ron	39.7	0.626
Tatoeba-test.eng-scn.eng.scn	0.7	0.132
Tatoeba-test.eng-spa.eng.spa	48.8	0.679
Tatoeba-test.eng-vec.eng.vec	2.2	0.222
Tatoeba-test.eng-wln.eng.wln	6.2	0.213

opus-2020-07-20.zip

dataset: opus
model: transformer
source language(s): eng
target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lat_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (where id is a valid target language ID)
download: opus-2020-07-20.zip
test set translations: opus-2020-07-20.test.txt
test set scores: opus-2020-07-20.eval.txt

Benchmarks

testset	BLEU	chr-F
Tatoeba-test.eng-arg.eng.arg	1.5	0.117
Tatoeba-test.eng-ast.eng.ast	17.7	0.382
Tatoeba-test.eng-cat.eng.cat	47.4	0.665
Tatoeba-test.eng-cos.eng.cos	3.1	0.297
Tatoeba-test.eng-egl.eng.egl	0.9	0.113
Tatoeba-test.eng-ext.eng.ext	7.9	0.277
Tatoeba-test.eng-fra.eng.fra	44.6	0.632
Tatoeba-test.eng-frm.eng.frm	1.1	0.214
Tatoeba-test.eng-gcf.eng.gcf	0.4	0.101
Tatoeba-test.eng-glg.eng.glg	43.1	0.638
Tatoeba-test.eng-hat.eng.hat	30.0	0.528
Tatoeba-test.eng-ita.eng.ita	45.0	0.670
Tatoeba-test.eng-lad.eng.lad	6.2	0.285
Tatoeba-test.eng-lat.eng.lat	11.9	0.376
Tatoeba-test.eng-lij.eng.lij	1.7	0.189
Tatoeba-test.eng-lld.eng.lld	0.5	0.201
Tatoeba-test.eng-lmo.eng.lmo	0.8	0.192
Tatoeba-test.eng-mfe.eng.mfe	83.6	0.909
Tatoeba-test.eng-msa.eng.msa	30.9	0.546
Tatoeba-test.eng.multi	37.6	0.585
Tatoeba-test.eng-mwl.eng.mwl	3.2	0.327
Tatoeba-test.eng-oci.eng.oci	7.8	0.286
Tatoeba-test.eng-pap.eng.pap	41.4	0.613
Tatoeba-test.eng-pms.eng.pms	2.0	0.182
Tatoeba-test.eng-por.eng.por	40.8	0.633
Tatoeba-test.eng-roh.eng.roh	4.0	0.262
Tatoeba-test.eng-ron.eng.ron	40.1	0.628
Tatoeba-test.eng-scn.eng.scn	1.6	0.175
Tatoeba-test.eng-spa.eng.spa	48.8	0.680
Tatoeba-test.eng-vec.eng.vec	2.6	0.237
Tatoeba-test.eng-wln.eng.wln	6.8	0.228

opus-2020-07-27.zip

dataset: opus
model: transformer
source language(s): eng
target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lat_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (where id is a valid target language ID)
download: opus-2020-07-27.zip
test set translations: opus-2020-07-27.test.txt
test set scores: opus-2020-07-27.eval.txt

Benchmarks

testset	BLEU	chr-F
newsdev2016-enro-engron.eng.ron	26.9	0.562
newsdiscussdev2015-enfr-engfra.eng.fra	29.7	0.572
newsdiscusstest2015-enfr-engfra.eng.fra	34.9	0.607
newssyscomb2009-engfra.eng.fra	27.6	0.565
newssyscomb2009-engita.eng.ita	28.7	0.586
newssyscomb2009-engspa.eng.spa	29.3	0.567
news-test2008-engfra.eng.fra	25.0	0.535
news-test2008-engspa.eng.spa	26.9	0.546
newstest2009-engfra.eng.fra	26.3	0.555
newstest2009-engita.eng.ita	28.4	0.581
newstest2009-engspa.eng.spa	28.6	0.566
newstest2010-engfra.eng.fra	29.2	0.572
newstest2010-engspa.eng.spa	33.5	0.597
newstest2011-engfra.eng.fra	30.7	0.589
newstest2011-engspa.eng.spa	34.6	0.597
newstest2012-engfra.eng.fra	29.0	0.572
newstest2012-engspa.eng.spa	34.6	0.598
newstest2013-engfra.eng.fra	29.6	0.563
newstest2013-engspa.eng.spa	31.5	0.574
newstest2016-enro-engron.eng.ron	25.4	0.544
Tatoeba-test.eng-arg.eng.arg	1.6	0.126
Tatoeba-test.eng-ast.eng.ast	18.0	0.399
Tatoeba-test.eng-cat.eng.cat	47.7	0.669
Tatoeba-test.eng-cos.eng.cos	2.9	0.284
Tatoeba-test.eng-egl.eng.egl	0.2	0.076
Tatoeba-test.eng-ext.eng.ext	11.0	0.280
Tatoeba-test.eng-fra.eng.fra	44.5	0.632
Tatoeba-test.eng-frm.eng.frm	0.8	0.214
Tatoeba-test.eng-gcf.eng.gcf	0.4	0.108
Tatoeba-test.eng-glg.eng.glg	43.7	0.641
Tatoeba-test.eng-hat.eng.hat	29.6	0.525
Tatoeba-test.eng-ita.eng.ita	45.0	0.670
Tatoeba-test.eng-lad.eng.lad	6.2	0.286
Tatoeba-test.eng-lat.eng.lat	11.9	0.377
Tatoeba-test.eng-lij.eng.lij	1.7	0.178
Tatoeba-test.eng-lld.eng.lld	0.8	0.201
Tatoeba-test.eng-lmo.eng.lmo	1.1	0.201
Tatoeba-test.eng-mfe.eng.mfe	91.9	0.956
Tatoeba-test.eng-msa.eng.msa	30.9	0.546
Tatoeba-test.eng.multi	37.5	0.585
Tatoeba-test.eng-mwl.eng.mwl	3.8	0.339
Tatoeba-test.eng-oci.eng.oci	7.7	0.290
Tatoeba-test.eng-pap.eng.pap	42.0	0.626
Tatoeba-test.eng-pms.eng.pms	2.0	0.184
Tatoeba-test.eng-por.eng.por	41.0	0.634
Tatoeba-test.eng-roh.eng.roh	3.8	0.245
Tatoeba-test.eng-ron.eng.ron	40.4	0.630
Tatoeba-test.eng-scn.eng.scn	1.6	0.177
Tatoeba-test.eng-spa.eng.spa	48.9	0.681
Tatoeba-test.eng-vec.eng.vec	3.1	0.232
Tatoeba-test.eng-wln.eng.wln	5.1	0.218

opus2m-2020-08-01.zip

dataset: opus2m
model: transformer
source language(s): eng
target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lat_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (where id is a valid target language ID)
download: opus2m-2020-08-01.zip
test set translations: opus2m-2020-08-01.test.txt
test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

testset	BLEU	chr-F
newsdev2016-enro-engron.eng.ron	27.1	0.565
newsdiscussdev2015-enfr-engfra.eng.fra	29.9	0.574
newsdiscusstest2015-enfr-engfra.eng.fra	35.3	0.609
newssyscomb2009-engfra.eng.fra	27.7	0.567
newssyscomb2009-engita.eng.ita	28.6	0.586
newssyscomb2009-engspa.eng.spa	29.8	0.569
news-test2008-engfra.eng.fra	25.0	0.536
news-test2008-engspa.eng.spa	27.1	0.548
newstest2009-engfra.eng.fra	26.7	0.557
newstest2009-engita.eng.ita	28.9	0.583
newstest2009-engspa.eng.spa	28.9	0.567
newstest2010-engfra.eng.fra	29.6	0.574
newstest2010-engspa.eng.spa	33.8	0.598
newstest2011-engfra.eng.fra	30.9	0.590
newstest2011-engspa.eng.spa	34.8	0.598
newstest2012-engfra.eng.fra	29.1	0.574
newstest2012-engspa.eng.spa	34.9	0.600
newstest2013-engfra.eng.fra	30.1	0.567
newstest2013-engspa.eng.spa	31.8	0.576
newstest2016-enro-engron.eng.ron	25.9	0.548
Tatoeba-test.eng-arg.eng.arg	1.6	0.120
Tatoeba-test.eng-ast.eng.ast	17.2	0.389
Tatoeba-test.eng-cat.eng.cat	47.6	0.668
Tatoeba-test.eng-cos.eng.cos	4.3	0.287
Tatoeba-test.eng-egl.eng.egl	0.9	0.101
Tatoeba-test.eng-ext.eng.ext	8.7	0.287
Tatoeba-test.eng-fra.eng.fra	44.9	0.635
Tatoeba-test.eng-frm.eng.frm	1.0	0.225
Tatoeba-test.eng-gcf.eng.gcf	0.7	0.115
Tatoeba-test.eng-glg.eng.glg	44.9	0.648
Tatoeba-test.eng-hat.eng.hat	30.9	0.533
Tatoeba-test.eng-ita.eng.ita	45.4	0.673
Tatoeba-test.eng-lad.eng.lad	5.6	0.279
Tatoeba-test.eng-lat.eng.lat	12.1	0.380
Tatoeba-test.eng-lij.eng.lij	1.4	0.183
Tatoeba-test.eng-lld.eng.lld	0.5	0.199
Tatoeba-test.eng-lmo.eng.lmo	0.7	0.187
Tatoeba-test.eng-mfe.eng.mfe	83.6	0.909
Tatoeba-test.eng-msa.eng.msa	31.3	0.549
Tatoeba-test.eng.multi	38.0	0.588
Tatoeba-test.eng-mwl.eng.mwl	2.7	0.322
Tatoeba-test.eng-oci.eng.oci	8.2	0.293
Tatoeba-test.eng-pap.eng.pap	46.7	0.663
Tatoeba-test.eng-pms.eng.pms	2.1	0.194
Tatoeba-test.eng-por.eng.por	41.2	0.635
Tatoeba-test.eng-roh.eng.roh	2.6	0.237
Tatoeba-test.eng-ron.eng.ron	40.6	0.632
Tatoeba-test.eng-scn.eng.scn	1.6	0.181
Tatoeba-test.eng-spa.eng.spa	49.5	0.685
Tatoeba-test.eng-vec.eng.vec	1.6	0.223
Tatoeba-test.eng-wln.eng.wln	7.1	0.250

opus1m+bt-2021-04-10.zip

dataset: opus1m+bt
model: transformer-align
source language(s): eng
target language(s): arg ast cat cbk cos egl ext fra frm gcf glg hat ita lad lat lij lld lmo mfe mol mwl oci osp pap pms pob por roh ron scn spa vec wln
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (where id is a valid target language ID)
valid language labels: >>acf<< >>aoa<< >>arg<< >>ast<< >>cat<< >>cbk<< >>cbk_Latn<< >>ccd<< >>cks<< >>cos<< >>cri<< >>crs<< >>dlm<< >>drc<< >>egl<< >>ext<< >>fab<< >>fax<< >>fra<< >>frc<< >>frm<< >>frm_Latn<< >>fro<< >>frp<< >>fur<< >>gcf<< >>gcf_Latn<< >>gcr<< >>glg<< >>hat<< >>idb<< >>ist<< >>ita<< >>itk<< >>kea<< >>kmv<< >>lad<< >>lad_Latn<< >>lat<< >>lat_Latn<< >>lij<< >>lld<< >>lld_Latn<< >>lmo<< >>lou<< >>mcm<< >>mfe<< >>mol<< >>mwl<< >>mxi<< >>mzs<< >>nap<< >>nrf<< >>oci<< >>osc<< >>osp<< >>osp_Latn<< >>pap<< >>pcd<< >>pln<< >>pms<< >>pob<< >>por<< >>pov<< >>pre<< >>pro<< >>qbb<< >>qhr<< >>rcf<< >>rgn<< >>roh<< >>ron<< >>ruo<< >>rup<< >>ruq<< >>scf<< >>scn<< >>sdc<< >>sdn<< >>spa<< >>spq<< >>spx<< >>src<< >>srd<< >>sro<< >>tmg<< >>tvy<< >>vec<< >>vkp<< >>wln<< >>xfa<< >>xum<<
download: opus1m+bt-2021-04-10.zip
test set translations: opus1m+bt-2021-04-10.test.txt
test set scores: opus1m+bt-2021-04-10.eval.txt
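
Because this release lists its valid language labels explicitly, input can be checked against that list before translation. The following is a rough sketch under stated assumptions: `LABELS_LINE` is a shortened excerpt of the label line above, and `parse_labels` and `has_valid_token` are hypothetical names, not functions shipped with the model.

```python
import re

# Shortened excerpt of the "valid language labels" line above (illustration only).
LABELS_LINE = "valid language labels: >>cat<< >>fra<< >>frm_Latn<< >>ita<< >>spa<<"


def parse_labels(line: str) -> set:
    """Extract the IDs found between '>>' and '<<'."""
    return set(re.findall(r">>([A-Za-z_]+)<<", line))


def has_valid_token(sentence: str, labels: set) -> bool:
    """Check that the sentence starts with a >>id<< token whose id is in labels."""
    match = re.match(r">>([A-Za-z_]+)<<\s", sentence)
    return bool(match) and match.group(1) in labels


labels = parse_labels(LABELS_LINE)
print(has_valid_token(">>fra<< Hello", labels))
print(has_valid_token("Hello", labels))
```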

Benchmarks

testset	BLEU	chr-F	#sent	#words	BP
newsdev2016-enro.eng-ron	21.4	0.524	1999	51566	0.971
newsdiscussdev2015-enfr.eng-fra	27.7	0.556	1500	27986	1.000
newsdiscusstest2015-enfr.eng-fra	32.1	0.588	1500	28027	0.994
newssyscomb2009.eng-fra	26.6	0.558	502	12334	1.000
newssyscomb2009.eng-ita	27.4	0.578	502	11551	1.000
newssyscomb2009.eng-spa	28.8	0.565	502	12506	0.983
news-test2008.eng-fra	23.8	0.527	2051	52685	0.995
news-test2008.eng-spa	26.3	0.541	2051	52596	0.997
newstest2009.eng-fra	24.9	0.544	2525	69278	0.976
newstest2009.eng-ita	27.3	0.572	2525	63474	1.000
newstest2009.eng-spa	27.8	0.560	2525	68114	0.998
newstest2010.eng-fra	27.1	0.559	2489	66043	0.985
newstest2010.eng-spa	32.2	0.588	2489	65522	0.993
newstest2011.eng-fra	29.2	0.576	3003	80626	0.969
newstest2011.eng-spa	33.8	0.591	3003	79476	0.978
newstest2012.eng-fra	27.3	0.560	3003	78011	0.984
newstest2012.eng-spa	33.5	0.590	3003	79006	0.962
newstest2013.eng-fra	27.7	0.549	3000	70037	0.972
newstest2013.eng-spa	30.3	0.566	3000	70528	0.948
newstest2016-enro.eng-ron	20.8	0.510	1999	49094	0.984
Tatoeba-test.eng-arg	12.4	0.328	105	405	1.000
Tatoeba-test.eng-ast	24.4	0.476	99	720	0.980
Tatoeba-test.eng-cat	44.5	0.648	1631	12342	0.989
Tatoeba-test.eng-cbk	4.4	0.253	1498	10591	0.968
Tatoeba-test.eng-cos	39.5	0.680	5	45	0.931
Tatoeba-test.eng-egl	0.4	0.118	84	438	1.000
Tatoeba-test.eng-ext	11.4	0.345	69	353	1.000
Tatoeba-test.eng-fra	39.8	0.605	10000	80759	0.974
Tatoeba-test.eng-frm	2.1	0.221	18	211	1.000
Tatoeba-test.eng-gcf	0.8	0.118	99	560	0.989
Tatoeba-test.eng-glg	41.5	0.627	1008	7828	0.986
Tatoeba-test.eng-hat	33.1	0.549	64	416	0.978
Tatoeba-test.eng-ita	42.5	0.651	10000	65498	0.953
Tatoeba-test.eng-lad	7.5	0.288	629	3354	1.000
Tatoeba-test.eng-lad_Latn	8.0	0.314	582	3097	1.000
Tatoeba-test.eng-lat	10.4	0.371	10000	74902	0.930
Tatoeba-test.eng-lij	4.0	0.278	94	711	0.983
Tatoeba-test.eng-lld	1.0	0.213	21	228	0.973
Tatoeba-test.eng-lmo	8.8	0.317	17	124	1.000
Tatoeba-test.eng-mfe	83.6	0.905	7	36	1.000
Tatoeba-test.eng-multi	35.1	0.564	10000	74243	0.964
Tatoeba-test.eng-mwl	7.8	0.505	4	21	1.000
Tatoeba-test.eng-oci	9.9	0.330	841	5219	0.910
Tatoeba-test.eng-osp	13.9	0.331	3	20	1.000
Tatoeba-test.eng-pap	49.0	0.673	70	376	1.000
Tatoeba-test.eng-pms	14.3	0.359	268	2244	0.944
Tatoeba-test.eng-por	41.6	0.640	10000	75353	0.971
Tatoeba-test.eng-roh	22.5	0.476	16	198	1.000
Tatoeba-test.eng-ron	33.6	0.580	5000	36833	0.970
Tatoeba-test.eng-scn	38.9	0.482	4	42	1.000
Tatoeba-test.eng-spa	45.4	0.657	10000	77291	0.974
Tatoeba-test.eng-vec	5.6	0.315	19	127	0.927
Tatoeba-test.eng-wln	11.9	0.299	89	520	0.951
tico19-test.eng-fra	33.2	0.588	2100	64655	0.978
tico19-test.eng-pob	41.4	0.686	2100	62729	0.947
tico19-test.eng-por	40.7	0.683	2100	62729	0.959
tico19-test.eng-spa	42.4	0.681	2100	66591	0.950
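
The benchmark tables above use two layouts: three tab-separated columns (testset, BLEU, chr-F) in the 2020 releases and six columns (adding #sent, #words, BP) in the 2021 release. As a hedged sketch, the hypothetical function `parse_row` below reads either layout. The two sample rows are copied from the tables above, and the dictionary keys are naming choices made for this example.

```python
# Two rows copied from the tables above: the 3-column and the 6-column layout.
ROWS = [
    "Tatoeba-test.eng-fra.eng.fra\t44.1\t0.629",
    "Tatoeba-test.eng-fra\t39.8\t0.605\t10000\t80759\t0.974",
]


def parse_row(row: str) -> dict:
    """Split one tab-separated benchmark row into typed fields."""
    fields = row.split("\t")
    record = {"testset": fields[0], "bleu": float(fields[1]), "chrf": float(fields[2])}
    if len(fields) == 6:
        # Extended layout: sentence count, word count, brevity penalty.
        record.update(sentences=int(fields[3]), words=int(fields[4]), bp=float(fields[5]))
    return record


for row in ROWS:
    print(parse_row(row))
```

The sentence count is worth checking before comparing scores: several Tatoeba test sets in the last table contain fewer than ten sentences (for example, eng-mfe has 7), so their BLEU values may not be reliable.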
