
Could you provide a BART version of run_gen.py? #7

Closed

6666ev opened this issue Oct 31, 2021 · 3 comments

Comments


6666ev commented Oct 31, 2021

The script at CPT/blob/master/finetune/generation/run_gen.py is the CPT version.
I adapted a BART version from it myself, but many layers are reported as not used or not initialized:
Some weights of the model checkpoint at model/bart-base-chinese were not used when initializing BartForConditionalGeneration
Some weights of BartForConditionalGeneration were not initialized
Do these warnings have any effect, or could you provide a BART version of run_gen.py?

The full details are shown below:

loading weights file model/bart-base-chinese/pytorch_model.bin
Some weights of the model checkpoint at model/bart-base-chinese were not used when initializing BartForConditionalGeneration: ['encoder.layers.4.fc1.bias',
 'encoder.layers.0.self_attn.k_proj.bias',
 'encoder.layers.3.fc1.bias',
 'encoder.layers.4.fc1.weight',
 'encoder.layers.1.final_layer_norm.bias',
 'encoder.layers.0.fc2.weight',
 'encoder.layers.0.self_attn.out_proj.bias',
 'encoder.layers.1.self_attn.out_proj.weight',
 'encoder.layers.3.self_attn.k_proj.bias',
 'encoder.layernorm_embedding.weight',
 'encoder.layers.1.fc2.weight',
 'encoder.layers.5.self_attn.q_proj.weight',
 'encoder.layers.5.self_attn.q_proj.bias',
 'encoder.layers.0.final_layer_norm.weight',
 'encoder.layers.1.self_attn.v_proj.weight',
 'encoder.layers.4.self_attn.out_proj.weight',
 'encoder.layers.5.self_attn_layer_norm.bias',
 'encoder.layers.0.self_attn_layer_norm.bias',
 'encoder.layers.3.self_attn.k_proj.weight',
 'encoder.embed_tokens.weight',
 'encoder.layers.1.self_attn.v_proj.bias',
 'encoder.layers.5.final_layer_norm.bias',
 'encoder.layers.1.fc1.weight',
 'encoder.layers.5.self_attn_layer_norm.weight',
 'encoder.layers.2.fc1.weight',
 'encoder.layers.0.final_layer_norm.bias',
 'encoder.layers.1.fc2.bias',
 'encoder.layers.3.self_attn.v_proj.weight',
 'encoder.layers.3.final_layer_norm.bias',
 'encoder.layers.2.fc1.bias',
 'encoder.layers.3.self_attn.q_proj.weight',
 'encoder.layers.1.final_layer_norm.weight',
 'encoder.layers.4.fc2.bias',
 'encoder.layers.4.self_attn.out_proj.bias',
 'encoder.layers.2.self_attn.q_proj.weight',
 'encoder.layers.2.final_layer_norm.weight',
 'encoder.embed_positions.weight',
 'encoder.layers.3.self_attn.out_proj.bias',
 'encoder.layers.3.fc1.weight',
 'encoder.layers.1.fc1.bias',
 'encoder.layers.0.self_attn.k_proj.weight',
 'encoder.layers.1.self_attn.k_proj.bias',
 'encoder.layers.0.fc2.bias',
 'encoder.layers.1.self_attn.k_proj.weight',
 'encoder.layers.5.self_attn.v_proj.bias',
 'encoder.layers.1.self_attn.q_proj.weight',
 'encoder.layers.2.final_layer_norm.bias',
 'encoder.layers.4.self_attn_layer_norm.weight',
 'encoder.layers.4.self_attn.v_proj.bias',
 'encoder.layers.2.self_attn_layer_norm.weight',
 'encoder.layers.0.fc1.weight',
 'encoder.layers.4.self_attn.k_proj.bias',
 'encoder.layers.0.self_attn.q_proj.bias',
 'encoder.layers.4.final_layer_norm.bias',
 'encoder.layers.0.self_attn.v_proj.weight',
 'encoder.layers.3.final_layer_norm.weight',
 'encoder.layers.5.self_attn.out_proj.weight',
 'encoder.layers.4.self_attn.q_proj.weight',
 'encoder.layers.0.self_attn_layer_norm.weight',
 'encoder.layers.5.self_attn.v_proj.weight',
 'encoder.layers.2.self_attn.v_proj.weight',
 'encoder.layers.1.self_attn.out_proj.bias',
 'encoder.layers.2.self_attn.k_proj.bias',
 'encoder.layers.2.self_attn.out_proj.weight',
 'encoder.layers.3.self_attn.v_proj.bias',
 'encoder.layers.2.self_attn.q_proj.bias',
 'encoder.layers.2.self_attn.out_proj.bias',
 'encoder.layers.3.fc2.bias',
 'encoder.layers.5.fc1.weight',
 'encoder.layernorm_embedding.bias',
 'encoder.layers.0.fc1.bias',
 'encoder.layers.3.self_attn_layer_norm.bias',
 'encoder.layers.5.self_attn.k_proj.weight',
 'encoder.layers.5.fc1.bias',
 'encoder.layers.3.fc2.weight',
 'encoder.layers.4.fc2.weight',
 'encoder.layers.0.self_attn.v_proj.bias',
 'encoder.layers.0.self_attn.q_proj.weight',
 'encoder.layers.1.self_attn.q_proj.bias',
 'encoder.layers.3.self_attn_layer_norm.weight',
 'encoder.layers.2.self_attn.k_proj.weight',
 'encoder.layers.2.self_attn.v_proj.bias',
 'encoder.layers.5.final_layer_norm.weight',
 'encoder.layers.5.self_attn.out_proj.bias',
 'encoder.layers.0.self_attn.out_proj.weight',
 'encoder.layers.5.fc2.weight',
 'encoder.layers.5.fc2.bias',
 'encoder.layers.1.self_attn_layer_norm.bias',
 'encoder.layers.4.self_attn.k_proj.weight',
 'encoder.layers.5.self_attn.k_proj.bias',
 'encoder.layers.3.self_attn.q_proj.bias',
 'encoder.layers.4.self_attn.q_proj.bias',
 'encoder.layers.1.self_attn_layer_norm.weight',
 'encoder.layers.2.self_attn_layer_norm.bias',
 'encoder.layers.4.final_layer_norm.weight',
 'encoder.layers.4.self_attn.v_proj.weight',
 'encoder.layers.2.fc2.weight',
 'encoder.layers.2.fc2.bias',
 'encoder.layers.4.self_attn_layer_norm.bias',
 'encoder.layers.3.self_attn.out_proj.weight']
- This IS expected if you are initializing BartForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BartForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BartForConditionalGeneration were not initialized from the model checkpoint at model/bart-base-chinese and are newly initialized: 
['encoder.encoder.layer.1.output.dense.bias',
 'encoder.encoder.layer.3.attention.self.key.bias',
 'encoder.encoder.layer.3.attention.output.LayerNorm.weight',
 'encoder.encoder.layer.4.attention.self.value.bias',
 'encoder.encoder.layer.2.attention.output.dense.bias',
 'encoder.encoder.layer.4.output.LayerNorm.bias',
 'encoder.encoder.layer.4.output.LayerNorm.weight',
 'encoder.encoder.layer.4.attention.output.LayerNorm.weight',
 'encoder.encoder.layer.0.intermediate.dense.bias',
 'encoder.encoder.layer.5.attention.output.LayerNorm.weight',
 'encoder.encoder.layer.0.output.LayerNorm.bias',
 'encoder.encoder.layer.5.attention.output.LayerNorm.bias',
 'encoder.encoder.layer.2.attention.output.LayerNorm.weight',
 'encoder.encoder.layer.2.attention.self.key.weight',
 'encoder.embeddings.LayerNorm.weight',
 'encoder.encoder.layer.0.attention.output.LayerNorm.weight',
 'encoder.encoder.layer.1.attention.self.key.bias',
 'encoder.encoder.layer.3.intermediate.dense.weight',
 'encoder.encoder.layer.5.intermediate.dense.weight',
 'encoder.encoder.layer.0.output.dense.weight',
 'encoder.encoder.layer.5.output.LayerNorm.bias',
 'encoder.encoder.layer.1.output.dense.weight',
 'encoder.encoder.layer.5.attention.self.query.weight',
 'encoder.encoder.layer.1.output.LayerNorm.weight',
 'encoder.encoder.layer.4.attention.self.key.bias',
 'encoder.encoder.layer.3.output.LayerNorm.bias',
 'encoder.encoder.layer.5.output.dense.bias',
 'encoder.encoder.layer.4.attention.self.key.weight',
 'encoder.encoder.layer.0.attention.self.key.bias',
 'encoder.encoder.layer.0.attention.self.query.weight',
 'encoder.encoder.layer.0.intermediate.dense.weight',
 'encoder.encoder.layer.3.output.LayerNorm.weight',
 'encoder.encoder.layer.3.attention.output.dense.bias',
 'encoder.encoder.layer.5.output.dense.weight',
 'encoder.embeddings.LayerNorm.bias',
 'encoder.encoder.layer.1.attention.self.value.weight',
 'encoder.encoder.layer.2.output.dense.weight',
 'encoder.encoder.layer.4.intermediate.dense.weight',
 'encoder.encoder.layer.2.attention.self.value.weight',
 'encoder.encoder.layer.0.attention.self.value.weight',
 'encoder.encoder.layer.0.attention.output.dense.bias',
 'encoder.encoder.layer.2.attention.output.LayerNorm.bias',
 'encoder.encoder.layer.3.output.dense.bias',
 'encoder.encoder.layer.5.output.LayerNorm.weight',
 'encoder.encoder.layer.5.attention.output.dense.bias',
 'encoder.encoder.layer.4.attention.self.value.weight',
 'encoder.encoder.layer.3.attention.self.query.bias',
 'encoder.encoder.layer.3.attention.self.value.weight',
 'encoder.encoder.layer.3.attention.self.key.weight',
 'encoder.encoder.layer.0.output.dense.bias',
 'encoder.encoder.layer.1.intermediate.dense.bias',
 'encoder.encoder.layer.0.attention.self.query.bias',
 'encoder.encoder.layer.1.intermediate.dense.weight',
 'encoder.encoder.layer.0.attention.output.dense.weight',
 'encoder.encoder.layer.5.attention.self.value.bias',
 'encoder.embeddings.token_type_embeddings.weight',
 'encoder.encoder.layer.1.attention.output.dense.weight',
 'encoder.encoder.layer.2.attention.self.query.bias',
 'encoder.encoder.layer.2.attention.self.query.weight',
 'encoder.encoder.layer.2.attention.output.dense.weight',
 'encoder.encoder.layer.5.attention.self.query.bias',
 'encoder.embeddings.position_ids',
 'encoder.embeddings.position_embeddings.weight',
 'encoder.encoder.layer.3.attention.self.query.weight',
 'encoder.embeddings.word_embeddings.weight',
 'encoder.encoder.layer.4.output.dense.bias',
 'encoder.encoder.layer.1.attention.output.LayerNorm.weight',
 'encoder.encoder.layer.4.attention.self.query.bias',
 'encoder.encoder.layer.3.attention.self.value.bias',
 'encoder.encoder.layer.5.intermediate.dense.bias',
 'encoder.encoder.layer.1.output.LayerNorm.bias',
 'encoder.encoder.layer.3.attention.output.dense.weight',
 'encoder.encoder.layer.3.attention.output.LayerNorm.bias',
 'encoder.encoder.layer.2.output.LayerNorm.weight',
 'encoder.encoder.layer.4.attention.output.dense.weight',
 'encoder.encoder.layer.4.intermediate.dense.bias',
 'encoder.encoder.layer.2.attention.self.value.bias',
 'encoder.encoder.layer.0.attention.self.key.weight',
 'encoder.encoder.layer.1.attention.self.query.weight',
 'encoder.encoder.layer.2.intermediate.dense.bias',
 'encoder.encoder.layer.2.intermediate.dense.weight',
 'encoder.encoder.layer.5.attention.self.key.bias',
 'encoder.encoder.layer.2.attention.self.key.bias',
 'encoder.encoder.layer.2.output.LayerNorm.bias',
 'encoder.encoder.layer.5.attention.self.key.weight',
 'encoder.encoder.layer.0.attention.output.LayerNorm.bias',
 'encoder.encoder.layer.5.attention.self.value.weight',
 'encoder.encoder.layer.4.attention.output.dense.bias',
 'encoder.encoder.layer.1.attention.output.LayerNorm.bias',
 'encoder.encoder.layer.1.attention.output.dense.bias',
 'encoder.encoder.layer.5.attention.output.dense.weight',
 'encoder.encoder.layer.4.output.dense.weight',
 'encoder.encoder.layer.0.attention.self.value.bias',
 'encoder.encoder.layer.1.attention.self.value.bias',
 'encoder.encoder.layer.0.output.LayerNorm.weight',
 'encoder.encoder.layer.1.attention.self.key.weight',
 'encoder.encoder.layer.3.intermediate.dense.bias',
 'encoder.encoder.layer.1.attention.self.query.bias',
 'encoder.encoder.layer.4.attention.self.query.weight',
 'encoder.encoder.layer.3.output.dense.weight',
 'encoder.encoder.layer.2.output.dense.bias',
 'encoder.encoder.layer.4.attention.output.LayerNorm.bias']
@choosewhatulike
Member

Replace line 111 of run_gen.py with BartForConditionalGeneration, then just run:

python run_gen.py --model_path fnlp/bart-base-chinese --dataset adgen --data_dir demo_data
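
For reference, a minimal sketch of what that swap might look like (the surrounding code and variable names in run_gen.py are assumptions; only the class substitution is the point):

from transformers import BartForConditionalGeneration

# Hypothetical stand-in for the model loading around line 111 of run_gen.py:
# the CPT-specific model class is replaced with the standard transformers class.
model_path = "fnlp/bart-base-chinese"  # value normally passed via --model_path
model = BartForConditionalGeneration.from_pretrained(model_path)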


6666ev commented Nov 1, 2021

Thanks a lot. Since there is no modeling_bart.py file, I had to adapt a copy of the code from megatron; my earlier adaptation had a mistake, and it is fixed now.

@choosewhatulike
Member

You're welcome :)
You can also use the BART implementation from the transformers library:

from transformers import BartForConditionalGeneration
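
A minimal end-to-end sketch of that route (assuming the fnlp/bart-base-chinese checkpoint on the Hugging Face hub and that its tokenizer is BERT-style, as in the CPT README):

from transformers import BartForConditionalGeneration, BertTokenizer

# bart-base-chinese ships a BERT-style vocabulary, so BertTokenizer is used here.
tokenizer = BertTokenizer.from_pretrained("fnlp/bart-base-chinese")
model = BartForConditionalGeneration.from_pretrained("fnlp/bart-base-chinese")

# Encode a sample sentence and generate with the seq2seq model.
inputs = tokenizer("北京是[MASK]的首都", return_tensors="pt")
output_ids = model.generate(**inputs, num_beams=4, max_length=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))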
