How #54

Zierzzz · 2019-01-13T14:40:25Z

No description provided.

Zierzzz · 2019-01-13T14:42:32Z

Hello, I'd like to ask how to use this to evaluate Chinese text.

temporaer · 2019-01-14T14:27:58Z

This is something we haven't tested. However, assuming you take care of the input/output tokenization, word overlap metrics should work in a similar way.
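To illustrate why tokenization matters here, below is a minimal sketch, not nlg-eval's own code. The `unigram_precision` helper is hypothetical and the segmented strings were split by hand for illustration. It assumes a metric that compares whitespace-separated tokens, so an unsegmented Chinese sentence counts as one long token and matches nothing.

```python
from collections import Counter


def unigram_precision(hypothesis: str, reference: str) -> float:
    """Clipped unigram precision over whitespace-separated tokens.

    Hypothetical helper for illustration only; not part of nlg-eval.
    """
    hyp_tokens = hypothesis.split()
    if not hyp_tokens:
        return 0.0
    ref_counts = Counter(reference.split())
    # Each hypothesis token is credited at most as often as it occurs in the reference.
    matched = sum(
        min(count, ref_counts[token])
        for token, count in Counter(hyp_tokens).items()
    )
    return matched / len(hyp_tokens)


# Unsegmented text: each sentence is a single whitespace-delimited token.
raw_hyp = "小女孩奔跑着"
raw_ref = "小女孩在奔跑"

# The same sentences, segmented by hand (assumed segmentation).
seg_hyp = "小 女孩 奔跑 着"
seg_ref = "小 女孩 在 奔跑"

print(unigram_precision(raw_hyp, raw_ref))  # 0.0
print(unigram_precision(seg_hyp, seg_ref))  # 0.75
```

Once the text is segmented, three of the four hypothesis tokens match the reference, which is the behaviour you would expect from a word overlap metric.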

ringsaturn · 2019-03-02T14:30:40Z

That will be easy if you use something like jieba. Example:

import jieba

# text from http://li-xirong.github.io/pub/icmr2016_chinese_caption.pdf

hyp = """一个亚洲女子走的是一条白色的圆柱状建筑外的照片
这个小女孩奔跑着，欢笑着
一个毛茸茸的黑色和白色的狗跳在一个酒吧的敏捷性测试中
A组篮球运动员身穿黄色和绿色端起一球
"""

ref = """一个亚洲女人是白色圆柱的大楼外拍照
小女孩在奔跑和欢笑
一个毛茸茸的黑色和白色的敏捷测试过程中在一个酒吧的狗跳
一群穿着黄色和绿色的篮球运动员
"""

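# Segment each line with jieba and join the tokens with spaces,
# so the metrics see space-separated words.
hyp = [' '.join(jieba.cut(item)) for item in hyp.split('\n')]
ref = [' '.join(jieba.cut(item)) for item in ref.split('\n')]

# Write one sentence per line. UTF-8 is set explicitly so the Chinese text
# is encoded the same way regardless of the platform default.
with open('hyp_cn.txt', 'w', encoding='utf-8') as f:
    f.write('\n'.join(hyp))

with open('ref_cn.txt', 'w', encoding='utf-8') as f:
    f.write('\n'.join(ref))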
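Since the hypothesis and reference files are matched line by line, it may be worth checking that both files have the same number of lines before scoring. A minimal sketch follows; `count_lines` and `files_aligned` are hypothetical helpers, not part of nlg-eval, and the temporary files only stand in for `hyp_cn.txt` and `ref_cn.txt`.

```python
import os
import tempfile


def count_lines(path: str) -> int:
    """Count the lines in a UTF-8 text file."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def files_aligned(hyp_path: str, ref_path: str) -> bool:
    """Return True when both files contain the same number of lines."""
    return count_lines(hyp_path) == count_lines(ref_path)


# Demo with throwaway files standing in for hyp_cn.txt and ref_cn.txt.
with tempfile.TemporaryDirectory() as tmp:
    hyp_path = os.path.join(tmp, "hyp.txt")
    ref_path = os.path.join(tmp, "ref.txt")
    with open(hyp_path, "w", encoding="utf-8") as f:
        f.write("小 女孩 奔跑 着\n一 只 狗")
    with open(ref_path, "w", encoding="utf-8") as f:
        f.write("小 女孩 在 奔跑\n一 只 狗 跳")
    print(files_aligned(hyp_path, ref_path))  # True
```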

Then use nlg-eval to get the scores in a terminal:

$ nlg-eval --hypothesis=hyp_cn.txt --references=ref_cn.txt --no-skipthoughts --no-glove
Bleu_1: 0.617021
Bleu_2: 0.463940
Bleu_3: 0.367593
Bleu_4: 0.290236
METEOR: 0.420750
ROUGE_L: 0.508545
CIDEr: 2.743118

juharris closed this as completed May 8, 2019
