Initial results

(correspond to Tables 1, 2, A7 and A8 in the NMTScore paper)

Monolingual Paraphrase Identification

| | en | ru | fi | sv | de | es | fr | ja | zh | pawsx-avg. | macro-avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| score_direct (prism) | 72.6 | 84.1 | 72.4 | 70.6 | 73.9 | 73.5 | 75.7 | 66.4 | 68.9 | 71.7 | 74.3 |
| score_pivot (prism) | 72.1 | 84.9 | 70.3 | 70.9 | 77.4 | 76.2 | 76.9 | 68.4 | 70.8 | 74.0 | 74.4 |
| score_cross_likelihood (prism) | 71.7 | 86.6 | 71.2 | 72.4 | 76.6 | 75.1 | 75.6 | 65.8 | 70.5 | 72.7 | 74.9 |
| score_direct (m2m100_418M) | 72.0 | 83.2 | 71.1 | 71.1 | 71.1 | 69.0 | 72.3 | 61.9 | 65.4 | 67.9 | 73.1 |
| score_pivot (m2m100_418M) | 72.4 | 84.2 | 68.2 | 70.3 | 73.2 | 72.0 | 72.2 | 64.0 | 67.9 | 69.9 | 73.0 |
| score_cross_likelihood (m2m100_418M) | 72.1 | 85.1 | 69.7 | 71.5 | 71.6 | 72.6 | 72.2 | 63.1 | 66.7 | 69.2 | 73.5 |
| score_direct (m2m100_1.2B) | 72.9 | 84.0 | 71.4 | 71.2 | 73.0 | 70.2 | 72.4 | 62.4 | 66.4 | 68.9 | 73.7 |
| score_pivot (m2m100_1.2B) | 74.1 | 84.5 | 69.1 | 69.6 | 75.1 | 73.0 | 73.3 | 65.8 | 70.2 | 71.5 | 73.8 |
| score_cross_likelihood (m2m100_1.2B) | 72.8 | 85.0 | 70.0 | 71.0 | 74.1 | 73.0 | 73.3 | 66.2 | 69.5 | 71.2 | 74.0 |
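
The two average columns are not defined in this file. The following sketch shows one reading that reproduces the reported values for the `score_direct (prism)` row: `pawsx-avg.` as the mean over de, es, fr, ja and zh, and `macro-avg.` as the mean over en, ru, fi, sv and `pawsx-avg.`. The grouping of languages is an assumption inferred from the numbers, and the helper names are hypothetical.

```python
# Values copied from the "score_direct (prism)" row of the monolingual table.
row = {
    "en": 72.6, "ru": 84.1, "fi": 72.4, "sv": 70.6,
    "de": 73.9, "es": 73.5, "fr": 75.7, "ja": 66.4, "zh": 68.9,
}

# Assumption: these five languages form the group behind "pawsx-avg.".
PAWSX_LANGS = ["de", "es", "fr", "ja", "zh"]
# Assumption: each remaining language counts as one entry in "macro-avg.".
SINGLE_LANGS = ["en", "ru", "fi", "sv"]


def pawsx_avg(scores):
    """Mean over the assumed PAWS-X language group."""
    return sum(scores[lang] for lang in PAWSX_LANGS) / len(PAWSX_LANGS)


def macro_avg(scores):
    """Mean over the single-language scores plus the PAWS-X group average."""
    entries = [scores[lang] for lang in SINGLE_LANGS] + [pawsx_avg(scores)]
    return sum(entries) / len(entries)


print(round(pawsx_avg(row), 1))  # reported in the table: 71.7
print(round(macro_avg(row), 1))  # reported in the table: 74.3
```

Under this reading, the five PAWS-X languages together carry the same weight in `macro-avg.` as each of en, ru, fi and sv alone.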

Cross-lingual Paraphrase Identification

| | en-es | en-fr | en-ja | en-zh | de-en | de-es | de-fr | de-ja | de-zh | es-fr | es-ja | es-zh | fr-ja | fr-zh | ja-zh | avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| score_direct (prism) | 76.4 | 76.1 | 68.6 | 68.8 | 76.4 | 73.3 | 74.5 | 66.0 | 66.9 | 74.3 | 66.7 | 66.8 | 66.8 | 67.4 | 64.4 | 70.2 |
| score_pivot (prism) | 76.9 | 77.3 | 68.9 | 70.7 | 77.4 | 75.0 | 76.0 | 67.0 | 69.5 | 75.5 | 67.6 | 69.5 | 67.5 | 69.9 | 66.5 | 71.7 |
| score_cross_likelihood (prism) | 75.9 | 75.9 | 65.2 | 66.0 | 76.0 | 74.5 | 75.2 | 64.8 | 65.8 | 74.2 | 64.6 | 66.2 | 64.4 | 65.7 | 65.3 | 69.3 |
| score_direct (m2m100_418M) | 70.9 | 72.2 | 63.3 | 65.0 | 72.5 | 67.9 | 69.0 | 61.0 | 63.1 | 68.4 | 60.6 | 62.0 | 61.4 | 63.2 | 60.6 | 65.4 |
| score_pivot (m2m100_418M) | 73.1 | 73.6 | 63.9 | 65.4 | 73.8 | 71.9 | 70.5 | 63.0 | 63.8 | 70.1 | 62.8 | 64.3 | 62.7 | 63.5 | 62.0 | 67.0 |
| score_cross_likelihood (m2m100_418M) | 72.3 | 72.2 | 61.9 | 62.3 | 72.8 | 69.7 | 69.0 | 60.2 | 63.2 | 69.8 | 61.5 | 61.5 | 60.5 | 61.7 | 61.9 | 65.4 |
| score_direct (m2m100_1.2B) | 72.4 | 73.0 | 64.8 | 67.0 | 75.0 | 71.4 | 71.7 | 62.7 | 65.1 | 69.8 | 61.5 | 63.4 | 62.7 | 65.1 | 62.6 | 67.2 |
| score_pivot (m2m100_1.2B) | 74.0 | 74.4 | 66.4 | 67.5 | 75.9 | 72.3 | 72.4 | 64.4 | 66.5 | 70.9 | 63.4 | 65.6 | 64.1 | 64.9 | 63.4 | 68.4 |
| score_cross_likelihood (m2m100_1.2B) | 74.1 | 73.8 | 63.0 | 63.6 | 74.8 | 70.9 | 70.5 | 61.4 | 63.5 | 71.3 | 61.6 | 62.5 | 61.8 | 63.8 | 63.3 | 66.7 |
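
The cross-lingual columns cover every unordered pair of the six languages de, en, es, fr, ja and zh, and the `avg.` column is consistent with a plain mean over those 15 pairs. A minimal sketch that checks both points for the `score_direct (prism)` row; the variable names are hypothetical, and reading `avg.` as an unweighted mean is an assumption inferred from the numbers.

```python
from itertools import combinations

# Column headers and values copied from the "score_direct (prism)" row.
pairs = ["en-es", "en-fr", "en-ja", "en-zh", "de-en", "de-es", "de-fr", "de-ja",
         "de-zh", "es-fr", "es-ja", "es-zh", "fr-ja", "fr-zh", "ja-zh"]
scores = [76.4, 76.1, 68.6, 68.8, 76.4, 73.3, 74.5, 66.0,
          66.9, 74.3, 66.7, 66.8, 66.8, 67.4, 64.4]

# Every unordered pair of the six languages, with codes in alphabetical order.
languages = sorted(["en", "de", "es", "fr", "ja", "zh"])
expected_pairs = {f"{a}-{b}" for a, b in combinations(languages, 2)}

# The table columns match the 15 possible pairs exactly.
print(set(pairs) == expected_pairs)

# Assumption: "avg." is the unweighted mean over all language pairs.
average = sum(scores) / len(scores)
print(round(average, 1))  # reported in the table: 70.2
```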

Additional results (v0.3.0)

Monolingual Paraphrase Identification

| | en | ru | fi | sv | de | es | fr | ja | zh | pawsx-avg. | macro-avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| score_direct (small100, fp16) | 72.7 | 81.4 | 71.0 | 71.2 | 70.4 | 67.7 | 70.5 | 61.4 | 65.6 | 67.1 | 72.7 |
| score_pivot (small100, fp16) | 72.7 | 83.1 | 68.1 | 70.2 | 72.6 | 74.7 | 73.2 | 63.1 | 69.0 | 70.5 | 72.9 |
| score_cross_likelihood (small100, fp16) | 72.3 | 82.5 | 69.6 | 72.3 | 73.2 | 72.9 | 72.8 | 62.4 | 66.2 | 69.5 | 73.2 |
| score_direct (m2m100_418M, fp16) | 72.0 | 83.2 | 71.1 | 71.1 | 70.8 | 68.9 | 72.2 | 61.9 | 65.4 | 67.8 | 73.1 |
| score_pivot (m2m100_418M, fp16) | 72.5 | 84.1 | 68.2 | 70.3 | 73.1 | 72.2 | 72.1 | 64.0 | 67.8 | 69.8 | 73.0 |
| score_cross_likelihood (m2m100_418M, fp16) | 72.1 | 85.1 | 69.7 | 71.5 | 71.6 | 72.5 | 72.2 | 63.6 | 65.9 | 69.1 | 73.5 |

Cross-lingual Paraphrase Identification

| | en-es | en-fr | en-ja | en-zh | de-en | de-es | de-fr | de-ja | de-zh | es-fr | es-ja | es-zh | fr-ja | fr-zh | ja-zh | avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| score_direct (small100, fp16) | 72.0 | 72.4 | 62.6 | 64.3 | 72.8 | 69.0 | 69.0 | 61.4 | 63.6 | 68.7 | 60.5 | 62.9 | 60.9 | 63.6 | 61.4 | 65.7 |
| score_pivot (small100, fp16) | 73.2 | 73.6 | 64.5 | 66.1 | 74.5 | 71.2 | 70.9 | 63.2 | 64.8 | 72.3 | 62.3 | 63.6 | 62.7 | 64.6 | 62.5 | 67.3 |
| score_cross_likelihood (small100, fp16) | 71.8 | 72.0 | 60.3 | 62.9 | 72.6 | 69.5 | 69.5 | 61.3 | 61.9 | 71.6 | 59.5 | 62.8 | 61.4 | 62.3 | 62.2 | 65.4 |
| score_direct (m2m100_418M, fp16) | 71.1 | 72.2 | 63.3 | 65.0 | 72.5 | 67.9 | 69.0 | 61.0 | 63.1 | 68.4 | 60.7 | 61.9 | 61.3 | 63.4 | 60.6 | 65.4 |
| score_pivot (m2m100_418M, fp16) | 73.2 | 73.7 | 63.9 | 65.5 | 73.7 | 71.8 | 70.3 | 63.1 | 63.9 | 70.1 | 62.9 | 64.1 | 62.7 | 63.5 | 62.0 | 67.0 |
| score_cross_likelihood (m2m100_418M, fp16) | 72.1 | 72.2 | 61.9 | 62.4 | 72.8 | 69.7 | 68.9 | 60.2 | 63.3 | 69.7 | 61.5 | 61.7 | 60.5 | 61.8 | 61.9 | 65.4 |

Inference Time (ms)

| | score_direct | score_pivot | score_cross_likelihood |
| --- | --- | --- | --- |
| prism | 21 | 138 | 69 |
| m2m100_418M | 21 | 1309 | 663 |
| – fp16 | 8 | 1100 | 690 |
| small100 | 15 | 762 | 342 |
| – fp16 | 6 | 439 | 247 |
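
To make the effect of fp16 easier to read, the timings can be turned into speedup factors (default time divided by fp16 time). The sketch below does this for the two models that have an fp16 row. It assumes that each `– fp16` row refers to the model listed directly above it; the dictionary layout and names are hypothetical.

```python
# Inference times in ms, copied from the table above.
# Assumption: each "– fp16" row belongs to the model named in the row before it.
MEASURES = ("score_direct", "score_pivot", "score_cross_likelihood")
times_ms = {
    "m2m100_418M": {"default": (21, 1309, 663), "fp16": (8, 1100, 690)},
    "small100": {"default": (15, 762, 342), "fp16": (6, 439, 247)},
}

# Speedup factor per model and measure; values below 1 mean fp16 was slower.
speedups = {
    model: {
        measure: default / half
        for measure, default, half in zip(MEASURES, t["default"], t["fp16"])
    }
    for model, t in times_ms.items()
}

for model, per_measure in speedups.items():
    for measure, factor in per_measure.items():
        print(f"{model:12s} {measure:24s} {factor:.2f}x")
```

By these numbers, fp16 helps `score_direct` the most for both models (about 2.5x or more), while `score_cross_likelihood` with m2m100_418M is slightly slower in fp16 than in the default setting.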