
Failed to conduct Predictor-Estimator predicting #22

Closed
lihongzheng-nlp opened this issue Apr 16, 2019 · 13 comments
Labels
bug Something isn't working

@lihongzheng-nlp
After training zh-en data with a predictor model, I continued with the predict step using the following command:
kiwi predict --model estimator --test-source /home/hzli/work/MTQE/CWMT_2018/zh-en/zh-en/dev/dev.source --test-target /home/hzli/work/MTQE/CWMT_2018/zh-en/zh-en/dev/dev.target --sentence-level True --gpu-id 0 --output-dir /home/hzli/work/MTQE/CWMT_2018/zh-en/zh-en/
I got the following error:
[kiwi.lib.predict setup:159] {'batch_size': 64,
'config': None,
'debug': False,
'experiment_name': None,
'gpu_id': 0,
'load_data': None,
'load_model': None,
'load_vocab': None,
'log_interval': 100,
'mlflow_always_log_artifacts': False,
'mlflow_tracking_uri': 'mlruns/',
'model': 'estimator',
'output_dir': '/home/hzli/work/MTQE/CWMT_2018/zh-en/zh-en/',
'quiet': False,
'run_uuid': None,
'save_config': None,
'save_data': None,
'seed': 42}

Traceback (most recent call last):
File "/home/hzli/anaconda3/bin/kiwi", line 11, in
sys.exit(main())
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/main.py", line 22, in main
return kiwi.cli.main.cli()
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/cli/main.py", line 73, in cli
predict.main(extra_args)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/cli/pipelines/predict.py", line 56, in main
predict.predict_from_options(options)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/lib/predict.py", line 54, in predict_from_options
run(options.model_api, output_dir, options.pipeline, options.model)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/lib/predict.py", line 113, in run
model = Model.create_from_file(pipeline_opts.load_model)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/models/model.py", line 210, in create_from_file
str(path), map_location=lambda storage, loc: storage
File "/home/hzli/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 356, in load
f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'None'
I spent a lot of time trying to track down the error, but without success.
Could you please give me some advice on solving it? Thank you very much!

@captainvera
Contributor

Hey @VictorLi2017,
The way the predict pipeline works is by loading a pre-trained model and creating predictions for data where you don't have tags. Here, you forgot the step of loading the pre-trained model.
You need to pass a --load-model [Path to model] flag to the predict pipeline.
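
For illustration, a full invocation combining the flags from the original command with --load-model might look like this (all paths here are placeholders for your own files and trained estimator checkpoint):

```bash
kiwi predict --model estimator \
    --load-model /path/to/runs/<run-id>/best_model.torch \
    --test-source dev.source \
    --test-target dev.target \
    --sentence-level True \
    --gpu-id 0 \
    --output-dir predictions/
```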

I also realised that this isn't addressed in the documentation and will update it 👍

Note: As a friendly reminder, to use a predictor-estimator you need to first pre-train the predictor on a large parallel corpus and then the estimator on QE data (with tags).

I'm closing the issue, feel free to re-open it if the problem persists!

@lihongzheng-nlp
Author

Hello @captainvera, following your guide, I added --load-model to the above full command:
kiwi predict --model estimator --test-source /home/hzli/work/MTQE/CWMT_2018/zh-en/zh-en/dev/dev.source --test-target /home/hzli/work/MTQE/CWMT_2018/zh-en/zh-en/dev/dev.target --sentence-level True --gpu-id 0 --output-dir /home/hzli/work/MTQE/CWMT_2018/zh-en/zh-en/ **--load-model /home/hzli/work/MTQE/CWMT_2018/zh-en/zh-en/runs/0/de596b315f7a4428bd881224376158bc/best_model.torch**
After a while, there were no prediction results, only an output.log file under the output dir, as attached. I guess there should be some prediction results, right?
output.log

Would you please check it for me, and give me further instructions? Thank you very much!

@captainvera captainvera reopened this Apr 17, 2019
@trenous
Contributor

trenous commented Apr 17, 2019

Hey @VictorLi2017 ,

I believe the issue is that the model you are loading is a Predictor, not an Estimator. Is that possible?
If so, train an Estimator with your pretrained Predictor (see this section in the docs, example config) and then run the prediction pipeline again.

Does this solve your issue?

A bit more detail about what happened:
The Predictor model itself does not do quality estimation; it is a conditional language model that predicts words in the target given the source.
Now, when calling the Model.predict method, only QE predictions are generated, which explains why you did not see any outputs.

And thanks for reporting these problems; you are pointing out some important flaws in our handling of flags and incorrect inputs. This should have generated an informative error message. Improving parameter parsing and validation is one of our main priorities moving forward.
Best,
Sony

@lihongzheng-nlp
Author

Hello @trenous, yes, I think what I have trained is a Predictor. If I'm not mistaken, the QE pipeline includes three main stages: training, predicting, and evaluation, right?
I want to try the Predictor-Estimator model with official Chinese-English data. In the training step, I used kiwi train --model predictor with the corresponding parameters, and after 50 epochs I got the best_model.torch in the output dir.
Then, in the predicting step, I ran kiwi predict --model estimator --load-model best_model.torch and got the problems described above: only an output.log, with no prediction results at all. I'm not sure whether I used the correct model name in the two steps.

By the way, I checked the training output.log; the records in most epochs are as follows:
target_PERP: nan, target_CORRECT: 0.0000, target_ExpErr: nan
target_PERP: nan, target_CORRECT: 0.0000, target_ExpErr: nan
EVAL_target_PERP: nan, EVAL_target_CORRECT: 0.0415, EVAL_target_ExpErr: nan
I guess there must be some problem with the data, right?
I'll retry the whole pipeline with the WMT18 data once again and will update you soon. Thank you!

@trenous
Contributor

trenous commented Apr 18, 2019

Hey,
The predictor-estimator model relies on pretraining of its component model, the predictor. This is what you did with the command kiwi train --model predictor. The resulting best_model.torch is not a QE model, but it can be used to initialize an estimator model like so:

kiwi train --model estimator --load-pred-target best_model.torch

The pretraining step allows you to make use of any parallel corpus in your target language. This can make a significant difference, as public QE corpora are usually very small.
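
Equivalently, these options can go in a config file. A minimal sketch of what a train_estimator.yaml could look like is below; the key names for the training data are assumptions based on the CLI flags seen in this thread, so double-check them against the example estimator config:

```yaml
model: estimator
load-pred-target: /path/to/predictor/best_model.torch  # pre-trained Predictor checkpoint
sentence-level: true
gpu-id: 0
output-dir: runs/estimator

# Training data (key names assumed by analogy with --test-source/--test-target):
train-source: train.source
train-target: train.target
train-sentence-scores: train.hter
```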

Indeed, it seems something went wrong with your training. Would you mind sharing the config file and the data you used?

@lihongzheng-nlp
Author

lihongzheng-nlp commented Apr 20, 2019

Hello @trenous, I am training sentence-level QE with the predictor-estimator. Following your last guide, I ran
kiwi train --config experiments/train_predictor.yaml
successfully and got the best_model.torch.
Then I ran kiwi train --config experiments/train_estimator.yaml, but it failed once again.
Here is the error:
Traceback (most recent call last):
File "/home/hzli/anaconda3/bin/kiwi", line 11, in
sys.exit(main())
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/main.py", line 22, in main
return kiwi.cli.main.cli()
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/cli/main.py", line 71, in cli
train.main(extra_args)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/cli/pipelines/train.py", line 141, in main
train.train_from_options(options)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/lib/train.py", line 123, in train_from_options
trainer = run(ModelClass, output_dir, pipeline_options, model_options)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/lib/train.py", line 204, in run
trainer.run(train_iter, valid_iter, epochs=pipeline_options.epochs)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/trainers/trainer.py", line 75, in run
self.train_epoch(train_iterator, valid_iterator)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/trainers/trainer.py", line 95, in train_epoch
outputs = self.train_step(batch)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/trainers/trainer.py", line 139, in train_step
model_out = self.model(batch)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/models/predictor_estimator.py", line 349, in forward
sentence_input = self.make_sentence_input(h_tgt, h_src)
File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/models/predictor_estimator.py", line 418, in make_sentence_input
h = h_tgt[0] if h_tgt else h_src[0]
TypeError: 'NoneType' object is not subscriptable

Attached is the train_estimator.yaml config file for your reference. Quite strangely, with exactly the same config file, my colleague ran it successfully on his machine. The data is the official data used for QE by the China Workshop on Machine Translation (CWMT), so I think the data should be fine. Thank you!

train_estimator_yaml.txt

@lihongzheng-nlp
Author

@trenous PS: the train/dev data each consist of 4 files: train.source, train.target, train.pe, and train.hter, with naming similar to the WMT sentence-level data.

@captainvera
Contributor

captainvera commented Apr 22, 2019

Hello @VictorLi2017, it is indeed extremely weird that your colleague can run it successfully on his machine. From the error message, it seems there was a problem with data loading.
As a first step I would make sure the path to your data is correct and that there is no typo.

This issue is hard to diagnose based on the error message since the only information we're getting is that there was an error in data loading. As @trenous mentioned earlier, our handling of flags and inputs is not the safest. As such, it is hard to conclude the exact problem solely from the error message.

If you are sure there is no issue in your path to the files, would you mind running with the --debug flag and posting the output log here (or the console output with timestamps if possible)?
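
For example, keeping the rest of your setup unchanged, something like this should produce a more verbose log:

```bash
kiwi train --config experiments/train_estimator.yaml --debug
```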

@lihongzheng-nlp
Author

Hello @captainvera, I'm sure that the path to the data is correct. I've already finished the train_predictor step once again, but the train_estimator step always fails with the same error: TypeError: 'NoneType' object is not subscriptable.
Attached is the train_estimator.log produced with --debug; please check it. Thank you!
train_estimator.log

@trenous
Contributor

trenous commented Apr 23, 2019

@VictorLi2017 Can you run git pull and let us know if the error persists? We recently fixed a bug related to training sentence-level-only models.

@lihongzheng-nlp
Author

Hello @trenous, the repo I used yesterday was already the latest version. I tried the zh-en and en-zh pairs, and even the official WMT18 sentence-level data; all resulted in the same error TypeError: 'NoneType' object is not subscriptable in the train_estimator step.

@captainvera captainvera added bug Something isn't working and removed good first issue Good for newcomers labels Apr 26, 2019
@trenous
Contributor

trenous commented May 25, 2019

Hello VictorLi,

Sorry for the long response time; our team was working on a deadline.

The line numbers in your log file don't match the current version, e.g.:

File "/home/hzli/anaconda3/lib/python3.6/site-packages/kiwi/models/predictor_estimator.py", line 349, in forward:
    sentence_input = self.make_sentence_input(h_tgt, h_src)

If you look at the changes introduced in this commit - which addresses the bug you encountered - you'll see that that line was number 349 before the fix and 357 after.

Can you do a fresh checkout of the repo? That should solve your problem.
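
For example, one way to start from a clean copy (the install step is just one option and may differ depending on how you set up your environment):

```bash
git clone https://github.com/Unbabel/OpenKiwi.git
cd OpenKiwi
pip install .
```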

@trenous
Contributor

trenous commented Jun 18, 2019

I am closing this as it seems to be solved.

@trenous trenous closed this as completed Jun 18, 2019