
Could you provide a BART version of run_gen.py? #7

Closed

6666ev opened this issue Oct 31, 2021 · 3 comments

Comments


6666ev commented Oct 31, 2021

The script at CPT/blob/master/finetune/generation/run_gen.py is the CPT version.
I adapted a BART version from it myself, but many layers are reported as not used or not initialized:
Some weights of the model checkpoint at model/bart-base-chinese were not used when initializing BartForConditionalGeneration
Some weights of BartForConditionalGeneration were not initialized
Do these warnings have any effect, or could you provide a BART version of run_gen.py?

The full details are shown below:

loading weights file model/bart-base-chinese/pytorch_model.bin
Some weights of the model checkpoint at model/bart-base-chinese were not used when initializing BartForConditionalGeneration: ['encoder.layers.4.fc1.bias',
 'encoder.layers.0.self_attn.k_proj.bias',
 'encoder.layers.3.fc1.bias',
 'encoder.layers.4.fc1.weight',
 'encoder.layers.1.final_layer_norm.bias',
 'encoder.layers.0.fc2.weight',
 'encoder.layers.0.self_attn.out_proj.bias',
 'encoder.layers.1.self_attn.out_proj.weight',
 'encoder.layers.3.self_attn.k_proj.bias',
 'encoder.layernorm_embedding.weight',
 'encoder.layers.1.fc2.weight',
 'encoder.layers.5.self_attn.q_proj.weight',
 'encoder.layers.5.self_attn.q_proj.bias',
 'encoder.layers.0.final_layer_norm.weight',
 'encoder.layers.1.self_attn.v_proj.weight',
 'encoder.layers.4.self_attn.out_proj.weight',
 'encoder.layers.5.self_attn_layer_norm.bias',
 'encoder.layers.0.self_attn_layer_norm.bias',
 'encoder.layers.3.self_attn.k_proj.weight',
 'encoder.embed_tokens.weight',
 'encoder.layers.1.self_attn.v_proj.bias',
 'encoder.layers.5.final_layer_norm.bias',
 'encoder.layers.1.fc1.weight',
 'encoder.layers.5.self_attn_layer_norm.weight',
 'encoder.layers.2.fc1.weight',
 'encoder.layers.0.final_layer_norm.bias',
 'encoder.layers.1.fc2.bias',
 'encoder.layers.3.self_attn.v_proj.weight',
 'encoder.layers.3.final_layer_norm.bias',
 'encoder.layers.2.fc1.bias',
 'encoder.layers.3.self_attn.q_proj.weight',
 'encoder.layers.1.final_layer_norm.weight',
 'encoder.layers.4.fc2.bias',
 'encoder.layers.4.self_attn.out_proj.bias',
 'encoder.layers.2.self_attn.q_proj.weight',
 'encoder.layers.2.final_layer_norm.weight',
 'encoder.embed_positions.weight',
 'encoder.layers.3.self_attn.out_proj.bias',
 'encoder.layers.3.fc1.weight',
 'encoder.layers.1.fc1.bias',
 'encoder.layers.0.self_attn.k_proj.weight',
 'encoder.layers.1.self_attn.k_proj.bias',
 'encoder.layers.0.fc2.bias',
 'encoder.layers.1.self_attn.k_proj.weight',
 'encoder.layers.5.self_attn.v_proj.bias',
 'encoder.layers.1.self_attn.q_proj.weight',
 'encoder.layers.2.final_layer_norm.bias',
 'encoder.layers.4.self_attn_layer_norm.weight',
 'encoder.layers.4.self_attn.v_proj.bias',
 'encoder.layers.2.self_attn_layer_norm.weight',
 'encoder.layers.0.fc1.weight',
 'encoder.layers.4.self_attn.k_proj.bias',
 'encoder.layers.0.self_attn.q_proj.bias',
 'encoder.layers.4.final_layer_norm.bias',
 'encoder.layers.0.self_attn.v_proj.weight',
 'encoder.layers.3.final_layer_norm.weight',
 'encoder.layers.5.self_attn.out_proj.weight',
 'encoder.layers.4.self_attn.q_proj.weight',
 'encoder.layers.0.self_attn_layer_norm.weight',
 'encoder.layers.5.self_attn.v_proj.weight',
 'encoder.layers.2.self_attn.v_proj.weight',
 'encoder.layers.1.self_attn.out_proj.bias',
 'encoder.layers.2.self_attn.k_proj.bias',
 'encoder.layers.2.self_attn.out_proj.weight',
 'encoder.layers.3.self_attn.v_proj.bias',
 'encoder.layers.2.self_attn.q_proj.bias',
 'encoder.layers.2.self_attn.out_proj.bias',
 'encoder.layers.3.fc2.bias',
 'encoder.layers.5.fc1.weight',
 'encoder.layernorm_embedding.bias',
 'encoder.layers.0.fc1.bias',
 'encoder.layers.3.self_attn_layer_norm.bias',
 'encoder.layers.5.self_attn.k_proj.weight',
 'encoder.layers.5.fc1.bias',
 'encoder.layers.3.fc2.weight',
 'encoder.layers.4.fc2.weight',
 'encoder.layers.0.self_attn.v_proj.bias',
 'encoder.layers.0.self_attn.q_proj.weight',
 'encoder.layers.1.self_attn.q_proj.bias',
 'encoder.layers.3.self_attn_layer_norm.weight',
 'encoder.layers.2.self_attn.k_proj.weight',
 'encoder.layers.2.self_attn.v_proj.bias',
 'encoder.layers.5.final_layer_norm.weight',
 'encoder.layers.5.self_attn.out_proj.bias',
 'encoder.layers.0.self_attn.out_proj.weight',
 'encoder.layers.5.fc2.weight',
 'encoder.layers.5.fc2.bias',
 'encoder.layers.1.self_attn_layer_norm.bias',
 'encoder.layers.4.self_attn.k_proj.weight',
 'encoder.layers.5.self_attn.k_proj.bias',
 'encoder.layers.3.self_attn.q_proj.bias',
 'encoder.layers.4.self_attn.q_proj.bias',
 'encoder.layers.1.self_attn_layer_norm.weight',
 'encoder.layers.2.self_attn_layer_norm.bias',
 'encoder.layers.4.final_layer_norm.weight',
 'encoder.layers.4.self_attn.v_proj.weight',
 'encoder.layers.2.fc2.weight',
 'encoder.layers.2.fc2.bias',
 'encoder.layers.4.self_attn_layer_norm.bias',
 'encoder.layers.3.self_attn.out_proj.weight']
- This IS expected if you are initializing BartForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BartForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BartForConditionalGeneration were not initialized from the model checkpoint at model/bart-base-chinese and are newly initialized: 
['encoder.encoder.layer.1.output.dense.bias',
 'encoder.encoder.layer.3.attention.self.key.bias',
 'encoder.encoder.layer.3.attention.output.LayerNorm.weight',
 'encoder.encoder.layer.4.attention.self.value.bias',
 'encoder.encoder.layer.2.attention.output.dense.bias',
 'encoder.encoder.layer.4.output.LayerNorm.bias',
 'encoder.encoder.layer.4.output.LayerNorm.weight',
 'encoder.encoder.layer.4.attention.output.LayerNorm.weight',
 'encoder.encoder.layer.0.intermediate.dense.bias',
 'encoder.encoder.layer.5.attention.output.LayerNorm.weight',
 'encoder.encoder.layer.0.output.LayerNorm.bias',
 'encoder.encoder.layer.5.attention.output.LayerNorm.bias',
 'encoder.encoder.layer.2.attention.output.LayerNorm.weight',
 'encoder.encoder.layer.2.attention.self.key.weight',
 'encoder.embeddings.LayerNorm.weight',
 'encoder.encoder.layer.0.attention.output.LayerNorm.weight',
 'encoder.encoder.layer.1.attention.self.key.bias',
 'encoder.encoder.layer.3.intermediate.dense.weight',
 'encoder.encoder.layer.5.intermediate.dense.weight',
 'encoder.encoder.layer.0.output.dense.weight',
 'encoder.encoder.layer.5.output.LayerNorm.bias',
 'encoder.encoder.layer.1.output.dense.weight',
 'encoder.encoder.layer.5.attention.self.query.weight',
 'encoder.encoder.layer.1.output.LayerNorm.weight',
 'encoder.encoder.layer.4.attention.self.key.bias',
 'encoder.encoder.layer.3.output.LayerNorm.bias',
 'encoder.encoder.layer.5.output.dense.bias',
 'encoder.encoder.layer.4.attention.self.key.weight',
 'encoder.encoder.layer.0.attention.self.key.bias',
 'encoder.encoder.layer.0.attention.self.query.weight',
 'encoder.encoder.layer.0.intermediate.dense.weight',
 'encoder.encoder.layer.3.output.LayerNorm.weight',
 'encoder.encoder.layer.3.attention.output.dense.bias',
 'encoder.encoder.layer.5.output.dense.weight',
 'encoder.embeddings.LayerNorm.bias',
 'encoder.encoder.layer.1.attention.self.value.weight',
 'encoder.encoder.layer.2.output.dense.weight',
 'encoder.encoder.layer.4.intermediate.dense.weight',
 'encoder.encoder.layer.2.attention.self.value.weight',
 'encoder.encoder.layer.0.attention.self.value.weight',
 'encoder.encoder.layer.0.attention.output.dense.bias',
 'encoder.encoder.layer.2.attention.output.LayerNorm.bias',
 'encoder.encoder.layer.3.output.dense.bias',
 'encoder.encoder.layer.5.output.LayerNorm.weight',
 'encoder.encoder.layer.5.attention.output.dense.bias',
 'encoder.encoder.layer.4.attention.self.value.weight',
 'encoder.encoder.layer.3.attention.self.query.bias',
 'encoder.encoder.layer.3.attention.self.value.weight',
 'encoder.encoder.layer.3.attention.self.key.weight',
 'encoder.encoder.layer.0.output.dense.bias',
 'encoder.encoder.layer.1.intermediate.dense.bias',
 'encoder.encoder.layer.0.attention.self.query.bias',
 'encoder.encoder.layer.1.intermediate.dense.weight',
 'encoder.encoder.layer.0.attention.output.dense.weight',
 'encoder.encoder.layer.5.attention.self.value.bias',
 'encoder.embeddings.token_type_embeddings.weight',
 'encoder.encoder.layer.1.attention.output.dense.weight',
 'encoder.encoder.layer.2.attention.self.query.bias',
 'encoder.encoder.layer.2.attention.self.query.weight',
 'encoder.encoder.layer.2.attention.output.dense.weight',
 'encoder.encoder.layer.5.attention.self.query.bias',
 'encoder.embeddings.position_ids',
 'encoder.embeddings.position_embeddings.weight',
 'encoder.encoder.layer.3.attention.self.query.weight',
 'encoder.embeddings.word_embeddings.weight',
 'encoder.encoder.layer.4.output.dense.bias',
 'encoder.encoder.layer.1.attention.output.LayerNorm.weight',
 'encoder.encoder.layer.4.attention.self.query.bias',
 'encoder.encoder.layer.3.attention.self.value.bias',
 'encoder.encoder.layer.5.intermediate.dense.bias',
 'encoder.encoder.layer.1.output.LayerNorm.bias',
 'encoder.encoder.layer.3.attention.output.dense.weight',
 'encoder.encoder.layer.3.attention.output.LayerNorm.bias',
 'encoder.encoder.layer.2.output.LayerNorm.weight',
 'encoder.encoder.layer.4.attention.output.dense.weight',
 'encoder.encoder.layer.4.intermediate.dense.bias',
 'encoder.encoder.layer.2.attention.self.value.bias',
 'encoder.encoder.layer.0.attention.self.key.weight',
 'encoder.encoder.layer.1.attention.self.query.weight',
 'encoder.encoder.layer.2.intermediate.dense.bias',
 'encoder.encoder.layer.2.intermediate.dense.weight',
 'encoder.encoder.layer.5.attention.self.key.bias',
 'encoder.encoder.layer.2.attention.self.key.bias',
 'encoder.encoder.layer.2.output.LayerNorm.bias',
 'encoder.encoder.layer.5.attention.self.key.weight',
 'encoder.encoder.layer.0.attention.output.LayerNorm.bias',
 'encoder.encoder.layer.5.attention.self.value.weight',
 'encoder.encoder.layer.4.attention.output.dense.bias',
 'encoder.encoder.layer.1.attention.output.LayerNorm.bias',
 'encoder.encoder.layer.1.attention.output.dense.bias',
 'encoder.encoder.layer.5.attention.output.dense.weight',
 'encoder.encoder.layer.4.output.dense.weight',
 'encoder.encoder.layer.0.attention.self.value.bias',
 'encoder.encoder.layer.1.attention.self.value.bias',
 'encoder.encoder.layer.0.output.LayerNorm.weight',
 'encoder.encoder.layer.1.attention.self.key.weight',
 'encoder.encoder.layer.3.intermediate.dense.bias',
 'encoder.encoder.layer.1.attention.self.query.bias',
 'encoder.encoder.layer.4.attention.self.query.weight',
 'encoder.encoder.layer.3.output.dense.weight',
 'encoder.encoder.layer.2.output.dense.bias',
 'encoder.encoder.layer.4.attention.output.LayerNorm.bias']
@choosewhatulike
Member

Replace line 111 of run_gen.py with BartForConditionalGeneration, then just run:

python run_gen.py --model_path fnlp/bart-base-chinese --dataset adgen --data_dir demo_data
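
For reference, a minimal sketch of what that swap might look like (the surrounding code and variable names in run_gen.py are assumptions; only the class substitution is the point):

from transformers import BartForConditionalGeneration

# Hypothetical stand-in for the model loading around line 111 of run_gen.py:
# the CPT-specific model class is replaced with the standard transformers class.
model_path = "fnlp/bart-base-chinese"  # value normally passed via --model_path
model = BartForConditionalGeneration.from_pretrained(model_path)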


6666ev commented Nov 1, 2021

Thanks a lot. Since there is no modeling_bart.py file, I had to adapt a copy of the code from megatron; my earlier adaptation had a mistake, and it is fixed now.

@choosewhatulike
Member

You're welcome :)
You can also use the BART implementation from the transformers library:

from transformers import BartForConditionalGeneration
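
A minimal end-to-end sketch of that route (assuming the fnlp/bart-base-chinese checkpoint on the Hugging Face hub and that its tokenizer is BERT-style, as in the CPT README):

from transformers import BartForConditionalGeneration, BertTokenizer

# bart-base-chinese ships a BERT-style vocabulary, so BertTokenizer is used here.
tokenizer = BertTokenizer.from_pretrained("fnlp/bart-base-chinese")
model = BartForConditionalGeneration.from_pretrained("fnlp/bart-base-chinese")

# Encode a sample sentence and generate with the seq2seq model.
inputs = tokenizer("北京是[MASK]的首都", return_tensors="pt")
output_ids = model.generate(**inputs, num_beams=4, max_length=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))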
