
Error occurred while using sentence-level Predictor-Estimator to predict #23

Closed
Zachary-YL opened this issue Apr 22, 2019 · 5 comments
Labels
bug Something isn't working

Comments

@Zachary-YL

After successfully training the sentence-level Predictor and Estimator models, an error occurred while using the Estimator model to predict on sentence-level data.

The command is:
kiwi predict --config experiments_sl/predict_estimator.yaml

And the error is:
2019-04-22 07:19:37.521 [kiwi.lib.predict setup:159] {'batch_size': 64,
'config': 'experiments_sl/predict_estimator.yaml',
'debug': False,
'experiment_name': 'EN-ZH Pretrain Predictor',
'gpu_id': None,
'load_data': None,
'load_model': 'runs/0/464dc10bfc174ac79ca082eae0dea352/best_model.torch',
'load_vocab': None,
'log_interval': 100,
'mlflow_always_log_artifacts': False,
'mlflow_tracking_uri': 'mlruns/',
'model': 'estimator',
'output_dir': 'predictions/predest/ccmt/en_zh',
'quiet': False,
'run_uuid': None,
'save_config': None,
'save_data': None,
'seed': 42}
2019-04-22 07:19:37.521 [kiwi.lib.predict setup:160] Local output directory is: predictions/predest/ccmt/en_zh
2019-04-22 07:19:37.521 [kiwi.lib.predict run:100] Predict with the PredEst (Predictor-Estimator) model
Traceback (most recent call last):
File "/home2/zyl/anaconda3/envs/openkiwi/bin/kiwi", line 10, in <module>
sys.exit(main())
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/kiwi/main.py", line 22, in main
return kiwi.cli.main.cli()
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/kiwi/cli/main.py", line 73, in cli
predict.main(extra_args)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/kiwi/cli/pipelines/predict.py", line 56, in main
predict.predict_from_options(options)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/kiwi/lib/predict.py", line 54, in predict_from_options
run(options.model_api, output_dir, options.pipeline, options.model)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/kiwi/lib/predict.py", line 113, in run
model = Model.create_from_file(pipeline_opts.load_model)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/kiwi/models/model.py", line 214, in create_from_file
model = Model.subclasses[model_name].from_dict(model_dict)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/kiwi/models/model.py", line 235, in from_dict
model.load_state_dict(class_dict[const.STATE_DICT])
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Estimator:
Unexpected key(s) in state_dict: "predictor_tgt.W2", "predictor_tgt.V", "predictor_tgt.C", "predictor_tgt.S", "predictor_tgt.attention.scorer.layers.0.0.weight", "predictor_tgt.attention.scorer.layers.0.0.bias", "predictor_tgt.attention.scorer.layers.1.0.weight", "predictor_tgt.attention.scorer.layers.1.0.bias", "predictor_tgt.embedding_source.weight", "predictor_tgt.embedding_target.weight", "predictor_tgt.lstm_source.weight_ih_l0", "predictor_tgt.lstm_source.weight_hh_l0", "predictor_tgt.lstm_source.bias_ih_l0", "predictor_tgt.lstm_source.bias_hh_l0", "predictor_tgt.lstm_source.weight_ih_l0_reverse", "predictor_tgt.lstm_source.weight_hh_l0_reverse", "predictor_tgt.lstm_source.bias_ih_l0_reverse", "predictor_tgt.lstm_source.bias_hh_l0_reverse", "predictor_tgt.lstm_source.weight_ih_l1", "predictor_tgt.lstm_source.weight_hh_l1", "predictor_tgt.lstm_source.bias_ih_l1", "predictor_tgt.lstm_source.bias_hh_l1", "predictor_tgt.lstm_source.weight_ih_l1_reverse", "predictor_tgt.lstm_source.weight_hh_l1_reverse", "predictor_tgt.lstm_source.bias_ih_l1_reverse", "predictor_tgt.lstm_source.bias_hh_l1_reverse", "predictor_tgt.forward_target.weight_ih_l0", "predictor_tgt.forward_target.weight_hh_l0", "predictor_tgt.forward_target.bias_ih_l0", "predictor_tgt.forward_target.bias_hh_l0", "predictor_tgt.forward_target.weight_ih_l1", "predictor_tgt.forward_target.weight_hh_l1", "predictor_tgt.forward_target.bias_ih_l1", "predictor_tgt.forward_target.bias_hh_l1", "predictor_tgt.backward_target.weight_ih_l0", "predictor_tgt.backward_target.weight_hh_l0", "predictor_tgt.backward_target.bias_ih_l0", "predictor_tgt.backward_target.bias_hh_l0", "predictor_tgt.backward_target.weight_ih_l1", "predictor_tgt.backward_target.weight_hh_l1", "predictor_tgt.backward_target.bias_ih_l1", "predictor_tgt.backward_target.bias_hh_l1", "predictor_tgt.W1.weight".

Could you give some advice for solving this error? Thanks a lot!
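(For reference, not an official OpenKiwi fix: a generic PyTorch-style workaround for "Unexpected key(s) in state_dict" is either to pass `strict=False` to `load_state_dict`, or to filter the unexpected `predictor_tgt.*` entries out of the checkpoint dict before loading. A minimal sketch of the filtering with a toy dict; the helper name `strip_prefixed_keys` is hypothetical, and real values would be tensors:)

```python
# Sketch: drop the extra "predictor_tgt.*" entries from a checkpoint's
# state_dict before calling load_state_dict. The key names mirror the
# error message above; the filtering itself is plain dict manipulation.

def strip_prefixed_keys(state_dict, prefix):
    """Return a copy of state_dict without keys starting with prefix."""
    return {k: v for k, v in state_dict.items() if not k.startswith(prefix)}

# Toy stand-in for the checkpoint's state_dict (values would be tensors).
checkpoint = {
    "predictor_tgt.W1.weight": "tensor",
    "predictor_tgt.embedding_source.weight": "tensor",
    "mlp.0.weight": "tensor",
    "lstm.weight_ih_l0": "tensor",
}

filtered = strip_prefixed_keys(checkpoint, "predictor_tgt.")
print(sorted(filtered))  # ['lstm.weight_ih_l0', 'mlp.0.weight']
```

Note that silently dropping keys only papers over the mismatch; it is still worth finding out why the checkpoint carries weights the model class does not expect.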

@Zachary-YL Zachary-YL added the bug label Apr 22, 2019
@trenous
Contributor

trenous commented Apr 23, 2019

Hello Zachary, I was not able to reproduce your bug. Can you verify that you are using the newest version of the repository? If you already are, or updating does not solve your problem, could you provide the trained model so we can analyze what the problem is?

Best

@Zachary-YL
Author

Thank you for your reply.
Here is the trained Estimator model.

https://www.dropbox.com/s/ce9akwcvhs4tcbo/best_model.torch?dl=0

Here are my config file for training the Estimator model and my config file for predicting:

train_estimator_yaml.txt
predict_estimator_yaml.txt

It's worth noting that I trained the Estimator model on a CPU, because training on a GPU fails with the following error:

2019-04-24 04:03:26.789 [root setup:380] This is run ID: 62e6dc469e3a4971bbce19bc119487c5
2019-04-24 04:03:26.790 [root setup:383] Inside experiment ID: 0 (None)
2019-04-24 04:03:26.790 [root setup:386] Local output directory is: runs/0/62e6dc469e3a4971bbce19bc119487c5
2019-04-24 04:03:26.790 [root setup:389] Logging execution to MLflow at: None
2019-04-24 04:03:26.872 [root setup:395] Using GPU: 0
2019-04-24 04:03:26.873 [root setup:400] Artifacts location: None
2019-04-24 04:03:26.886 [kiwi.lib.train run:154] Training the PredEst (Predictor-Estimator) model
2019-04-24 04:03:27.666 [kiwi.data.utils load_vocabularies_to_fields:126] Loaded vocabularies from runs/predictor/best_model.torch
2019-04-24 04:03:38.657 [kiwi.lib.train run:187] Estimator(
(predictor_tgt): Predictor(
(attention): Attention(
(scorer): MLPScorer(
(layers): ModuleList(
(0): Sequential(
(0): Linear(in_features=1600, out_features=800, bias=True)
(1): Tanh()
)
(1): Sequential(
(0): Linear(in_features=800, out_features=1, bias=True)
(1): Tanh()
)
)
)
)
(embedding_source): Embedding(9300, 200, padding_idx=1)
(embedding_target): Embedding(3845, 200, padding_idx=1)
(lstm_source): LSTM(200, 400, num_layers=2, batch_first=True, dropout=0.5, bidirectional=True)
(forward_target): LSTM(200, 400, num_layers=2, batch_first=True, dropout=0.5)
(backward_target): LSTM(200, 400, num_layers=2, batch_first=True, dropout=0.5)
(W1): Embedding(3845, 200, padding_idx=1)
(_loss): CrossEntropyLoss()
)
(mlp): Sequential(
(0): Linear(in_features=1000, out_features=125, bias=True)
(1): Tanh()
)
(lstm): LSTM(125, 125, batch_first=True, bidirectional=True)
(embedding_out): Linear(in_features=250, out_features=2, bias=True)
(sentence_pred): Sequential(
(0): Linear(in_features=250, out_features=125, bias=True)
(1): Sigmoid()
(2): Linear(in_features=125, out_features=62, bias=True)
(3): Sigmoid()
(4): Linear(in_features=62, out_features=1, bias=True)
)
(xents): ModuleDict(
(tags): CrossEntropyLoss()
)
(mse_loss): MSELoss()
)
2019-04-24 04:03:38.658 [kiwi.lib.train run:188] 16202078 parameters
2019-04-24 04:03:38.670 [kiwi.trainers.trainer run:74] Epoch 1 of 10
Batches: 0%| | 1/232 [00:02<09:19, 2.42s/ batches]
Traceback (most recent call last):
File "estimator_train_sl.py", line 4, in <module>
kiwi.train(estimator_config)
File "/home2/zyl/code/OpenKiwi-master/kiwi/lib/train.py", line 79, in train_from_file
return train_from_options(options)
File "/home2/zyl/code/OpenKiwi-master/kiwi/lib/train.py", line 123, in train_from_options
trainer = run(ModelClass, output_dir, pipeline_options, model_options)
File "/home2/zyl/code/OpenKiwi-master/kiwi/lib/train.py", line 204, in run
trainer.run(train_iter, valid_iter, epochs=pipeline_options.epochs)
File "/home2/zyl/code/OpenKiwi-master/kiwi/trainers/trainer.py", line 75, in run
self.train_epoch(train_iterator, valid_iterator)
File "/home2/zyl/code/OpenKiwi-master/kiwi/trainers/trainer.py", line 95, in train_epoch
outputs = self.train_step(batch)
File "/home2/zyl/code/OpenKiwi-master/kiwi/trainers/trainer.py", line 139, in train_step
model_out = self.model(batch)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home2/zyl/code/OpenKiwi-master/kiwi/models/predictor_estimator.py", line 324, in forward
model_out_tgt = self.predictor_tgt(batch)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home2/zyl/code/OpenKiwi-master/kiwi/models/predictor.py", line 275, in forward
for i in range(target_len - 2)
File "/home2/zyl/code/OpenKiwi-master/kiwi/models/predictor.py", line 275, in
for i in range(target_len - 2)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home2/zyl/code/OpenKiwi-master/kiwi/models/modules/attention.py", line 36, in forward
scores = self.scorer(query, keys)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home2/zyl/code/OpenKiwi-master/kiwi/models/modules/scorer.py", line 60, in forward
layer_in = layer(layer_in)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/activation.py", line 292, in forward
return torch.tanh(input)
RuntimeError: CUDA out of memory. Tried to allocate 57.62 MiB (GPU 0; 10.92 GiB total capacity; 6.78 GiB already allocated; 31.50 MiB free; 109.37 MiB cached)

Thanks a lot!

@captainvera
Contributor

Hi @Zachary-YL, we will look into what is happening in the predict pipeline.

Meanwhile, the error you're getting when training on a GPU just means that OpenKiwi is trying to allocate more memory than is available on your GPU. This happens when the combination of batch size and number of tokens per sentence is too large.

You can easily train on the GPU if you do one of two things (or both):

  • Reduce the batch size
  • Set the source-max-length and target-max-length flags in the training yaml
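The two tweaks above might look like this in the training yaml (only `source-max-length` and `target-max-length` are quoted from the advice; the batch-size key names and all values are assumptions that may differ across OpenKiwi versions):

```yaml
# Sketch of the relevant training-yaml entries.
train-batch-size: 32        # reduced batch size (assumed key name)
valid-batch-size: 32        # assumed key name
source-max-length: 50       # cap source sentence length in tokens
target-max-length: 50       # cap target sentence length in tokens
```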

@trenous
Contributor

trenous commented May 25, 2019

Hello Zachary,

Sorry for the long delay in response, our team was busy with the WMT shared task.

I ran your predict yaml with the model you provided (changing source and target to a toy file) and it worked fine, without error.
Are you sure it is not a version issue? The first release of OpenKiwi broke when training for sentence level only.

@trenous
Contributor

trenous commented Jun 18, 2019

I am closing this as it seems to be solved.

@trenous trenous closed this as completed Jun 18, 2019