Failed to conduct Predictor-Estimator predicting #22
Comments
Hey @VictorLi2017, I also realised that this isn't addressed in the documentation and will update it 👍 Note: As a friendly reminder, to use a predictor-estimator you need to first pre-train the predictor on a large parallel corpus and then train the estimator on QE data (with tags). I'm closing the issue, feel free to re-open it if the problem persists!
Hello @captainvera, following your guide, I added Would you please check it for me, and give me further instructions? Thank you very much!
Hey @VictorLi2017, I believe the issue is that the model you are loading is a Does this solve your issue? A bit more detail about what happened: And thanks for reporting these problems, you are pointing out some important flaws in our handling of flags and incorrect inputs. This should have generated an informative error message. Improving the parameter parsing and validation is one of our main priorities moving forward.
Hello @trenous, yes, I think what I have trained is with By the way, I checked the training output.log; the records in most epochs are as follows:
Hey,
The pretraining step allows you to make use of any parallel corpus in your target language. This can make a significant difference, as public QE corpora are usually very small. Indeed, it seems something went wrong with your training. Would you mind sharing the config file and data you used?
Hello @trenous, I trained sentence-level QE with Attached is the train_estimator.yaml config file for your reference. Quite strangely, with exactly the same config file, my colleague ran it successfully on his machine. The data is the official data used for QE by the China Workshop on Machine Translation (CWMT), so I think the data should be good. Thank you!
@trenous PS: the train and dev data each include 4 files: train.source, train.target, train.pe and train.hter, with name formats similar to those in the WMT sentence-level data.
Hello @VictorLi2017, it is indeed extremely weird that your colleague can run it successfully on his machine. From the error messages it seems there was an error with data loading. This issue is hard to diagnose based on the error message since the only information we're getting is that there was an error in data loading. As @trenous mentioned earlier, our handling of flags and inputs is not the safest. As such, it is hard to conclude the exact problem solely from the error message. If you are sure there is no issue in your path to the files, would you mind running with the
Hello @captainvera, I'm sure that the path to the data is correct. I've already finished the train_predictor step once again, but
@VictorLi2017 Can you run
Hello @trenous, the repo I used yesterday is already the latest version. I tried the zh-en and en-zh pairs, and even the official sentence-level data of WMT18; all resulted in the same error TypeError: 'NoneType' object is not subscriptable in
Hello VictorLi, sorry for the long response time; our team was working on a deadline. The line numbers in your log file don't match the current version, e.g.:
If you look at the changes introduced in this commit, which addresses the bug you encountered, you'll see that that line was No Can you just do a fresh checkout of the repo? That should solve your problem.
I am closing this as it seems to be solved.
After training on zh-en data with the predictor model, I ran the predict step with the following command:
kiwi predict --model estimator --test-source /home/hzli/work/MTQE/CWMT_2018/zh-en/zh-en/dev/dev.source --test-target /home/hzli/work/MTQE/CWMT_2018/zh-en/zh-en/dev/dev.target --sentence-level True --gpu-id 0 --output-dir /home/hzli/work/MTQE/CWMT_2018/zh-en/zh-en/
I got the following errors:
[kiwi.lib.predict setup:159] {'batch_size': 64,
'config': None,
'debug': False,
'experiment_name': None,
'gpu_id': 0,
'load_data': None,
'load_model': None,
'load_vocab': None,
'log_interval': 100,
'mlflow_always_log_artifacts': False,
'mlflow_tracking_uri': 'mlruns/',
'model': 'estimator',
'output_dir': '/home/hzli/work/MTQE/CWMT_2018/zh-en/zh-en/',
'quiet': False,
'run_uuid': None,
'save_config': None,
'save_data': None,
'seed': 42}
Traceback (most recent call last):
File "/home/hzli/anaconda3/bin/kiwi", line 11, in <module>
sys.exit(main())
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/main.py", line 22, in main
return kiwi.cli.main.cli()
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/cli/main.py", line 73, in cli
predict.main(extra_args)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/cli/pipelines/predict.py", line 56, in main
predict.predict_from_options(options)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/lib/predict.py", line 54, in predict_from_options
run(options.model_api, output_dir, options.pipeline, options.model)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/lib/predict.py", line 113, in run
model = Model.create_from_file(pipeline_opts.load_model)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/models/model.py", line 210, in create_from_file
str(path), map_location=lambda storage, loc: storage
File "/home/hzli/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 356, in load
f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'None'
I spent a lot of time trying to track down this error, but nothing I tried has worked.
Would you please give me some advice on solving it? Thank you very much!
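For anyone reading the traceback above: the options dump shows 'load_model': None, and the last frames show that value being passed through str(path) into open(), which is presumably why the missing file is reported as the literal string 'None'. The snippet below is only a minimal sketch of that failure mode plus the kind of early check that would give a clearer message. `resolve_model_path` is a hypothetical helper written for illustration; it is not part of kiwi.

```python
import os
from typing import Optional


def resolve_model_path(load_model: Optional[str]) -> str:
    """Hypothetical helper (not a kiwi function): check the model path early."""
    if load_model is None:
        # Mirrors the situation in the options dump: 'load_model': None
        raise ValueError(
            "No model file was given; point load_model at a trained model file."
        )
    if not os.path.isfile(load_model):
        raise FileNotFoundError(f"Model file not found: {load_model}")
    return load_model


# Reproduce the confusing message: str(None) is the literal string 'None',
# so open() looks for a file with that name in the current directory.
try:
    open(str(None), "rb")
except FileNotFoundError as err:
    print(err)

# With the check in place, the missing option is reported directly.
try:
    resolve_model_path(None)
except ValueError as err:
    print(err)
```

If this reading is right, the predict command needs to be told which trained model file to load, rather than only which model type to use.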