How #54
Comments
Hello, how can I use this to evaluate Chinese text?
This is something we haven't tested. However, as long as you handle the input/output tokenization yourself, the word-overlap metrics should work in a similar way.
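To illustrate why the tokenization step matters, here is a minimal sketch, not taken from nlg-eval itself. It assumes the overlap metrics see tokens by splitting on whitespace, so an unsegmented Chinese sentence counts as a single token. The helper names `unigram_precision` and `char_tokenize` are hypothetical, and per-character splitting is only a dependency-free stand-in for a real word segmenter.

```python
from collections import Counter


def unigram_precision(hyp_tokens, ref_tokens):
    """Clipped unigram precision, the basic ingredient of BLEU-1."""
    if not hyp_tokens:
        return 0.0
    ref_counts = Counter(ref_tokens)
    hyp_counts = Counter(hyp_tokens)
    # Each hypothesis token is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[tok]) for tok, count in hyp_counts.items())
    return matched / len(hyp_tokens)


def char_tokenize(text):
    # Hypothetical fallback: one token per non-whitespace character.
    return [ch for ch in text if not ch.isspace()]


hyp = "这个小女孩奔跑着,欢笑着"
ref = "小女孩在奔跑和欢笑"

# Without segmentation, whitespace splitting yields one token per sentence.
raw = unigram_precision(hyp.split(), ref.split())
# With per-character tokens, shared characters can be matched.
seg = unigram_precision(char_tokenize(hyp), char_tokenize(ref))
print(raw, seg)
```

The unsegmented pair shares no tokens at all, while the segmented pair does, which is why the text needs to be split into space-separated tokens before scoring.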
That is easy if you use a word segmenter such as jieba. Example:
import jieba
# text from http://li-xirong.github.io/pub/icmr2016_chinese_caption.pdf
hyp = """一个亚洲女子走的是一条白色的圆柱状建筑外的照片
这个小女孩奔跑着,欢笑着
一个毛茸茸的黑色和白色的狗跳在一个酒吧的敏捷性测试中
A组篮球运动员身穿黄色和绿色端起一球
"""
ref = """一个亚洲女人是白色圆柱的大楼外拍照
小女孩在奔跑和欢笑
一个毛茸茸的黑色和白色的敏捷测试过程中在一个酒吧的狗跳
一群穿着黄色和绿色的篮球运动员
"""
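# Segment each sentence with jieba and join the tokens with spaces
hyp = [' '.join(jieba.cut(item)) for item in hyp.split('\n')]
ref = [' '.join(jieba.cut(item)) for item in ref.split('\n')]
# Write UTF-8 explicitly so the files do not depend on the platform default (Python 3)
with open('hyp_cn.txt', 'w', encoding='utf-8') as f:
    f.write('\n'.join(hyp))
with open('ref_cn.txt', 'w', encoding='utf-8') as f:
    f.write('\n'.join(ref))
Then use nlg-eval to get the scores in a terminal:
$ nlg-eval --hypothesis=hyp_cn.txt --references=ref_cn.txt --no-skipthoughts --no-glove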
Bleu_1: 0.617021
Bleu_2: 0.463940
Bleu_3: 0.367593
Bleu_4: 0.290236
METEOR: 0.420750
ROUGE_L: 0.508545
CIDEr: 2.743118
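Since the hypothesis and reference files are matched line by line, a quick sanity check before scoring can catch a misaligned pair. This is only a sketch: `count_lines` and `check_aligned` are hypothetical helpers, not part of nlg-eval, and the two short sentences written to the temporary files are made-up placeholders.

```python
import os
import tempfile


def count_lines(path):
    # Count lines the way a line-by-line reader would see them.
    with open(path, encoding='utf-8') as f:
        return sum(1 for _ in f)


def check_aligned(hyp_path, ref_path):
    """Return True when both files contain the same number of lines."""
    return count_lines(hyp_path) == count_lines(ref_path)


# Demo with placeholder, already-segmented sentences in a temporary directory.
tmp_dir = tempfile.mkdtemp()
hyp_path = os.path.join(tmp_dir, 'hyp_cn.txt')
ref_path = os.path.join(tmp_dir, 'ref_cn.txt')
with open(hyp_path, 'w', encoding='utf-8') as f:
    f.write('小 女孩 奔跑\n一只 狗 跳')
with open(ref_path, 'w', encoding='utf-8') as f:
    f.write('小 女孩 在 奔跑\n一只 狗 在 跳')

aligned = check_aligned(hyp_path, ref_path)
print(aligned)
```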