Output model files compatible with Official Bert's pre-trained models? #18

Closed

1e0ng opened this issue Sep 17, 2019 · 9 comments

@1e0ng

1e0ng commented Sep 17, 2019

Hi, I tried to pre-train a BERT model with this project, but I found that the resulting model files are not compatible with the official BERT pre-trained models. Is it easy to make them compatible?

For example, I can use pytorch_transformers to read the official BERT pre-trained models, but when I do the same for a model trained by this project, I get errors about mismatched tensor shapes.

RuntimeError: Error(s) in loading state_dict for BertForMultiLabelSequenceClassification:
	size mismatch for bert.encoder.layer.0.intermediate.dense.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for bert.encoder.layer.0.output.dense.weight: copying a param with shape torch.Size([256, 128]) from checkpoint, the shape in current model is torch.Size([128, 256]).
	size mismatch for bert.encoder.layer.1.intermediate.dense.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for bert.encoder.layer.1.output.dense.weight: copying a param with shape torch.Size([256, 128]) from checkpoint, the shape in current model is torch.Size([128, 256]).
	size mismatch for bert.encoder.layer.2.intermediate.dense.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for bert.encoder.layer.2.output.dense.weight: copying a param with shape torch.Size([256, 128]) from checkpoint, the shape in current model is torch.Size([128, 256]).
	size mismatch for bert.encoder.layer.3.intermediate.dense.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for bert.encoder.layer.3.output.dense.weight: copying a param with shape torch.Size([256, 128]) from checkpoint, the shape in current model is torch.Size([128, 256]).
	size mismatch for bert.encoder.layer.4.intermediate.dense.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for bert.encoder.layer.4.output.dense.weight: copying a param with shape torch.Size([256, 128]) from checkpoint, the shape in current model is torch.Size([128, 256]).
	size mismatch for bert.encoder.layer.5.intermediate.dense.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([256, 128]).
	size mismatch for bert.encoder.layer.5.output.dense.weight: copying a param with shape torch.Size([256, 128]) from checkpoint, the shape in current model is torch.Size([128, 256]).
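For context, every mismatched pair above is the transpose of the other ([128, 256] vs. [256, 128]), which suggests the two versions of modeling.py lay out these dense kernels in opposite orientations. The failing call is the usual pytorch_transformers loading path; a minimal sketch is below, assuming the TensorFlow checkpoint was already converted to a pytorch_model.bin in a hypothetical my-pretrained-bert/ directory:

from pytorch_transformers import BertModel

# Hypothetical path: a directory holding config.json and pytorch_model.bin
# converted from the TensorFlow checkpoint produced by this repo.
# from_pretrained() raises the RuntimeError above whenever a checkpoint
# tensor's shape differs from the freshly built model's parameter shape.
model = BertModel.from_pretrained("my-pretrained-bert/")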
@guotong1988
Owner

guotong1988 commented Sep 17, 2019

@1e0ng
Author

1e0ng commented Sep 17, 2019

Hi @guotong1988, thanks for the reply. Actually, I've been using the run_pretraining_gpu_v2.py script from the beginning.

@guotong1988
Owner

Could you locate the tensor from the error message in the code?

@1e0ng
Author

1e0ng commented Sep 17, 2019

Hi @guotong1988, I don't know how to find that tensor...

The error above was raised from this code, if it helps:


/usr/local/lib/python3.6/dist-packages/pytorch_transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    592         if len(error_msgs) > 0:
    593             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
--> 594                                model.__class__.__name__, "\n\t".join(error_msgs)))
    595 
    596         if hasattr(model, 'tie_weights'):

RuntimeError: Error(s) in loading state_dict for BertForMultiLabelSequenceClassification:
	size mismatch for bert.encoder.layer.0.intermediate.dense.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([256, 128]).
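Not part of the original thread, but one way to confirm the transposed layout is to diff the checkpoint's tensor shapes against a freshly constructed model. A hedged sketch, assuming the same hypothetical my-pretrained-bert/ directory as above and that the saved state_dict keys line up with BertModel's parameter names:

import torch
from pytorch_transformers import BertConfig, BertModel

# Hypothetical diagnostic: report checkpoint tensors whose shape is the
# transpose of what a freshly constructed model expects.
config = BertConfig.from_pretrained("my-pretrained-bert/")
model = BertModel(config)
expected = {name: tuple(p.shape) for name, p in model.named_parameters()}
state_dict = torch.load("my-pretrained-bert/pytorch_model.bin", map_location="cpu")
for name, tensor in state_dict.items():
    want = expected.get(name)
    have = tuple(tensor.shape)
    if want is not None and have != want and have == tuple(reversed(want)):
        print("%s: checkpoint %s vs model %s (transposed)" % (name, have, want))

For the error reported above, this would flag intermediate.dense.weight and output.dense.weight in every encoder layer.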

@guotong1988
Owner

guotong1988 commented Sep 17, 2019

Try https://github.com/guotong1988/BERT-multi-gpu/blob/master/modeling_lastest.py and edit the code to import it.
I copied it from https://github.com/google-research/bert ten minutes ago.
I hope you can give me feedback.

@1e0ng
Author

1e0ng commented Sep 17, 2019

Hi, I assume you mean changing the run_pretraining_gpu_v2.py script so that the line

import modeling

becomes

import modeling_lastest as modeling

I'll give it a try and let you know.

@guotong1988
Owner

Yes

@1e0ng
Author

1e0ng commented Sep 19, 2019

Hi @guotong1988, it works!
Thanks so much!
