missing WMT 2017 word_level/test.tags required for predictor-estimator evaluation #24
Comments
Hi @ninalopatina, thanks for experimenting with OpenKiwi! The WMT 2017 test data is available on their website: here (there is a purple link with gold-standard labels). We should probably also include a link to these in the Quickstart document to make them more visible. The test tags for 2018, on the other hand, are not public, because they are the same as the 2019 ones and 2019 is currently accepting submissions. I'm closing this issue, as it is solved and not really a bug in OpenKiwi. Feel free to re-open it if you have any other questions! Edit: I had wrongly stated that the 2017 gold files were also not available.
Thanks for looking into this so quickly, @captainvera. I had attempted to run this with the test data for 2017 and 2018, which I had obtained from the same site you linked. For both years, the test data includes only a .mt, a .src, and an .align file. There is no .tags file for either year, nor for 2016. Should I replace the test set links with the dev set, so that I have a .tags file to evaluate with?
Hey @ninalopatina! The .tags file is downloaded from a different location than the other test files. It's pointed out here: You can download the .tags from there and evaluate your model! You could also replace it with the dev set, but (if you trained using the dev set as validation) that would just give you your validation scores, which you should already be familiar with, and not a "real" evaluation.
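Since the .tags file comes from a different download than the .mt file, a quick sanity check that the two line up can save a confusing evaluation run. The sketch below is not part of OpenKiwi; the helper name `mismatched_lines` is hypothetical, and it assumes the wmt17 word-level format carries exactly one OK/BAD tag per MT token (formats that also tag gaps would not satisfy this check). It also assumes both files have the same number of lines.

```python
def mismatched_lines(mt_lines, tag_lines):
    """Return 1-based line numbers where the token and tag counts differ.

    Assumption: one whitespace-separated tag per whitespace-separated MT token.
    """
    bad = []
    for number, (mt, tags) in enumerate(zip(mt_lines, tag_lines), start=1):
        if len(mt.split()) != len(tags.split()):
            bad.append(number)
    return bad
```

To use it on real files, read both with `open(path, encoding="utf-8").read().splitlines()` and pass the two lists in; an empty result means every line has matching counts.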
Thanks so much, @captainvera, I missed those links! I was thinking of testing the pipeline with the dev data until the test data becomes available, but this will work much better.
Describe the bug
The WMT 2017 test data set is missing the word_level/test.tags file that is required for predictor-estimator evaluation.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I expected the evaluation to run. I also expected to find the WMT 2017 word_level/test.tags file, but it was not in the download from the WMT test website.
Screenshots
$ kiwi evaluate --config experiments/evaluate_estimator.yaml
usage: kiwi evaluate [--config CONFIG] [--save-config SAVE_CONFIG] [-d] [-q] [--type {probs,tags}]
[--format {wmt17,wmt18}] [--pred-format {wmt17,wmt18}] [--sents-avg {probs,tags}]
[--gold-sents GOLD_SENTS] [--gold-target GOLD_TARGET] [--gold-source GOLD_SOURCE]
[--gold-cal GOLD_CAL] [--input-dir INPUT_DIR [INPUT_DIR ...]]
[--pred-sents PRED_SENTS [PRED_SENTS ...]] [--pred-target PRED_TARGET [PRED_TARGET ...]]
[--pred-gaps PRED_GAPS [PRED_GAPS ...]] [--pred-source PRED_SOURCE [PRED_SOURCE ...]]
[--pred-cal PRED_CAL]
kiwi evaluate: error: argument --gold-target: path must exist: data/WMT17/word_level/test.tags
The error is correct in that the file does not exist. I don't know where to find this file.
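For anyone hitting the same error, a small pre-flight check can list which files are absent before `kiwi evaluate` is invoked. This is only a sketch: the helper `missing_files` is hypothetical and not part of OpenKiwi, and of the file names below only `test.tags` and the `data/WMT17/word_level` directory are confirmed by the error message; `test.src` and `test.mt` are assumed names.

```python
from pathlib import Path


def missing_files(data_dir, names):
    """Return the entries of `names` that are not regular files under `data_dir`."""
    root = Path(data_dir)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    # Only test.tags is confirmed by the error above; the other names are assumptions.
    required = ["test.src", "test.mt", "test.tags"]
    for name in missing_files("data/WMT17/word_level", required):
        print(f"missing: {name}")
```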
Environment (please complete the following information):
Additional context
The 2018 test data doesn't have a .tags file either.