
[Bug]: ValueError: PretrainedConfig instance not found in the arguments, you can set it as args or kwargs with config field #4345

Closed
datalee opened this issue Jan 5, 2023 · 24 comments
Labels
bug Something isn't working

Comments

@datalee

datalee commented Jan 5, 2023

Software environment

- paddlepaddle:2.4.1
- paddlepaddle-gpu: None
- paddlenlp: 2.4.9

Duplicate check

  • I have searched the existing issues

Error description

When converting a dynamic-graph model to a static-graph one with paddlenlp 2.4.9, the following error is raised:

"PretrainedConfig instance not found in the arguments, you can set it as args or kwargs with config field"
ValueError: PretrainedConfig instance not found in the arguments, you can set it as args or kwargs with config field

The model directory was trained under version 2.4.5 and contains the following files:
model_config.json
model_state.pdparams
special_tokens_map.json
tokenizer_config.json
vocab.txt



### Steps to reproduce & code

import argparse
import os

import paddle

from model import UIE  # custom UIE model definition

parser = argparse.ArgumentParser()
parser.add_argument("--model_path", type=str, default='./checkpoint/model_best',
                    help="The path to the model parameters to be loaded.")
parser.add_argument("--output_path", type=str, default='./export',
                    help="The path where the static-graph model parameters are saved.")
args = parser.parse_args()

if __name__ == "__main__":
    model = UIE.from_pretrained(args.model_path)
    model.eval()

    # Convert to static graph with a specific input description
    model = paddle.jit.to_static(model,
                                 input_spec=[
                                     paddle.static.InputSpec(shape=[None, None],
                                                             dtype="int64",
                                                             name='input_ids'),
                                     paddle.static.InputSpec(shape=[None, None],
                                                             dtype="int64",
                                                             name='token_type_ids'),
                                     paddle.static.InputSpec(shape=[None, None],
                                                             dtype="int64",
                                                             name='pos_ids'),
                                     paddle.static.InputSpec(shape=[None, None],
                                                             dtype="int64",
                                                             name='att_mask'),
                                 ])
    # Save as a static-graph model.
    save_path = os.path.join(args.output_path, "inference")
    paddle.jit.save(model, save_path)
@datalee datalee added the bug Something isn't working label Jan 5, 2023
@github-actions github-actions bot added the triage label Jan 5, 2023
@LemonNoel
Contributor

from_pretrained had a backward-incompatible upgrade in 2.4.9; we recommend staying on 2.4.5 for dynamic-to-static conversion for now.

@datalee
Author

datalee commented Jan 5, 2023

we recommend staying on 2.4.5

OK, fair enough.

@sijunhe sijunhe removed the triage label Jan 5, 2023
@sijunhe
Collaborator

sijunhe commented Jan 5, 2023

@datalee
You can fix this by manually editing model_config.json:

  1. Rename model_config.json to config.json
  2. Update its content and format accordingly; see https://huggingface.co/PaddlePaddle/uie-base/blob/main/config.json for reference
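A minimal sketch of that manual edit (the helper name `upgrade_model_config` and the exact fields injected are my assumptions; verify them against the reference config.json linked above):

```python
import json
import os


def upgrade_model_config(model_dir):
    """Rename model_config.json to config.json and add the fields a
    2.4.9-style config is expected to carry (field values are
    assumptions; check them against the reference config above)."""
    old_path = os.path.join(model_dir, "model_config.json")
    new_path = os.path.join(model_dir, "config.json")
    with open(old_path, "r", encoding="utf-8") as f:
        config = json.load(f)
    config.setdefault("model_type", "ernie")     # assumed required field
    config.setdefault("architectures", ["UIE"])  # assumed required field
    with open(new_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2, ensure_ascii=False)
    os.remove(old_path)
```

Existing keys are preserved; only the missing ones are filled in, so a hand-edited config is not overwritten.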

@datalee
Author

datalee commented Jan 5, 2023

2. Update its content and format accordingly

That doesn't work. Even after changing the config to this, it still fails:

{
  "architectures": [
    "UIE"
  ],
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 2048,
  "num_attention_heads": 12,
  "num_hidden_layers": 6,
  "paddlenlp_version": null,
  "task_type_vocab_size": 16,
  "type_vocab_size": 4,
  "use_task_id": true,
  "vocab_size": 40000,
  "model_type": "ernie"
}

@sijunhe
Collaborator

sijunhe commented Jan 5, 2023

Which line raised the error? Please post the error stack.

@datalee
Author

datalee commented Jan 5, 2023

Which line raised the error? Please post the error stack.

[2023-01-05 15:39:59,078] [    INFO] - loading configuration file ./checkpoint/xxx/model_best/config.json
[2023-01-05 15:39:59,078] [ WARNING] - You are using a model of type ernie to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
[2023-01-05 15:39:59,079] [    INFO] - Model config PretrainedConfig {
  "architectures": [
    "UIE"
  ],
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 2048,
  "num_attention_heads": 12,
  "num_hidden_layers": 6,
  "paddlenlp_version": null,
  "task_type_vocab_size": 16,
  "type_vocab_size": 4,
  "use_task_id": true,
  "vocab_size": 40000
}

[2023-01-05 15:39:59,080] [    INFO] - loading configuration file ./checkpoint/xxx/model_best\config.json
[2023-01-05 15:39:59,081] [    INFO] - Model config ErnieConfig {
  "architectures": [
    "UIE"
  ],
  "attention_probs_dropout_prob": 0.1,
  "enable_recompute": false,
  "fuse": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 2048,
  "model_type": "ernie",
  "num_attention_heads": 12,
  "num_hidden_layers": 6,
  "pad_token_id": 0,
  "paddlenlp_version": null,
  "pool_act": "tanh",
  "task_id": 0,
  "task_type_vocab_size": 16,
  "type_vocab_size": 4,
  "use_task_id": true,
  "vocab_size": 40000
}

Traceback (most recent call last):
  File "xxx/uie/load_model.py", line 13, in <module>
    model = UIE.from_pretrained(model_path)
  File "xxx\venv\lib\site-packages\paddlenlp\transformers\model_utils.py", line 440, in from_pretrained
    return cls.from_pretrained_v2(pretrained_model_name_or_path, from_hf_hub=from_hf_hub, *args, **kwargs)
  File "xxx\venv\lib\site-packages\paddlenlp\transformers\model_utils.py", line 1261, in from_pretrained_v2
    model = cls(config, *init_args, **model_kwargs)
  File "xxx\venv\lib\site-packages\paddlenlp\transformers\utils.py", line 170, in __impl__
    init_func(self, *args, **kwargs)
  File "xxx\uie\model.py", line 23, in __init__
    super(UIE, self).__init__()
  File "xxx\venv\lib\site-packages\paddlenlp\transformers\utils.py", line 170, in __impl__
    init_func(self, *args, **kwargs)
  File "xxx\venv\lib\site-packages\paddlenlp\transformers\utils.py", line 170, in __impl__
    init_func(self, *args, **kwargs)
  File "xxx\venv\lib\site-packages\paddlenlp\transformers\model_utils.py", line 262, in __init__
    "PretrainedConfig instance not found in the arguments, you can set it as args or kwargs with config field"
ValueError: PretrainedConfig instance not found in the arguments, you can set it as args or kwargs with config field

@sijunhe
Collaborator

sijunhe commented Jan 5, 2023

From the log, this config is being loaded twice, which looks off. Does model = UIE.from_pretrained("uie-base") run on its own?

@datalee
Author

datalee commented Jan 5, 2023

Does model = UIE.from_pretrained("uie-base") run on its own?

No, it fails with the same error stack as the custom model.

@datalee
Author

datalee commented Jan 5, 2023

From the log, this config is being loaded twice, which looks off.

It loads twice because I separately loaded the config once with PretrainedConfig.from_pretrained; you can ignore that.

[2023-01-05 17:59:42,554] [    INFO] - Model config ErnieConfig {
  "attention_probs_dropout_prob": 0.1,
  "enable_recompute": false,
  "fuse": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 2048,
  "model_type": "ernie",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "paddlenlp_version": null,
  "pool_act": "tanh",
  "task_id": 0,
  "task_type_vocab_size": 3,
  "type_vocab_size": 4,
  "use_task_id": true,
  "vocab_size": 40000
}

Traceback (most recent call last):
  File "xxxxx/uie/load_model.py", line 14, in <module>
    model = UIE.from_pretrained("uie-base")
  File "xxxxx\venv\lib\site-packages\paddlenlp\transformers\model_utils.py", line 440, in from_pretrained
    return cls.from_pretrained_v2(pretrained_model_name_or_path, from_hf_hub=from_hf_hub, *args, **kwargs)
  File "xxxxx\venv\lib\site-packages\paddlenlp\transformers\model_utils.py", line 1261, in from_pretrained_v2
    model = cls(config, *init_args, **model_kwargs)
  File "xxxxx\venv\lib\site-packages\paddlenlp\transformers\utils.py", line 170, in __impl__
    init_func(self, *args, **kwargs)
  File "D:\py_code_isp_dev\uie\model.py", line 23, in __init__
    super(UIE, self).__init__()
  File "xxxxx\venv\lib\site-packages\paddlenlp\transformers\utils.py", line 170, in __impl__
    init_func(self, *args, **kwargs)
  File "xxxxx\venv\lib\site-packages\paddlenlp\transformers\utils.py", line 170, in __impl__
    init_func(self, *args, **kwargs)
  File "xxxxx\venv\lib\site-packages\paddlenlp\transformers\model_utils.py", line 262, in __init__
    "PretrainedConfig instance not found in the arguments, you can set it as args or kwargs with config field"
ValueError: PretrainedConfig instance not found in the arguments, you can set it as args or kwargs with config field

@sijunhe
Collaborator

sijunhe commented Jan 5, 2023

No, it fails with the same error stack as the custom model.

That's odd. Running model = UIE.from_pretrained("uie-base") on v2.4.9 definitely works for me. Try deleting the model cache at ~/.paddlenlp/models/uie-base/.
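For reference, clearing that cache is a one-liner (path as given above; the model is re-downloaded on the next from_pretrained call):

```shell
# Delete the cached uie-base weights so paddlenlp fetches a fresh copy.
rm -rf ~/.paddlenlp/models/uie-base/
```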

@datalee
Author

datalee commented Jan 5, 2023

Try deleting the model cache at ~/.paddlenlp/models/uie-base/.

So it's simply incompatible with models trained under older versions?

@datalee
Author

datalee commented Jan 5, 2023

I'll just downgrade to paddlenlp 2.4.5 for now; 2.4.5 works fine [crying]

@sijunhe
Collaborator

sijunhe commented Jan 5, 2023

I just saved a model with 2.4.5 and loaded it with paddlenlp 2.4.9, also without any problem:

$ ls /ssd2/hesijun/workspace/uie-base
model_config.json  model_state.pdparams
$ python3
>>> from paddlenlp.transformers import UIE
>>> model = UIE.from_pretrained("/ssd2/hesijun/workspace/uie-base")
[2023-01-05 21:05:55,018] [    INFO] - loading configuration file /ssd2/hesijun/workspace/uie-base/model_config.json
[2023-01-05 21:05:55,022] [    INFO] - Model config ErnieConfig {
  "architectures": [
    "UIE"
  ],
  "attention_probs_dropout_prob": 0.1,
  "enable_recompute": false,
  "fuse": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 2048,
  "model_type": "ernie",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "paddlenlp_version": null,
  "pool_act": "tanh",
  "task_id": 0,
  "task_type_vocab_size": 3,
  "type_vocab_size": 4,
  "use_task_id": true,
  "vocab_size": 40000
}

[2023-01-05 21:05:55,024] [    INFO] - Configuration saved in /ssd2/hesijun/workspace/uie-base/config.json
[2023-01-05 21:06:10,390] [    INFO] - All model checkpoint weights were used when initializing UIE.

[2023-01-05 21:06:10,390] [    INFO] - All the weights of UIE were initialized from the model checkpoint at /ssd2/hesijun/workspace/uie-base.
If your task is similar to the task the model of the checkpoint was trained on, you can already use UIE for predictions without further training.

If model = UIE.from_pretrained("uie-base") does not run for you on 2.4.9, try clearing the cache as suggested above.

@datalee
Author

datalee commented Jan 6, 2023

from paddlenlp.transformers import UIE

The problem turned out to be in the UIE model definition: I was using a custom model defined earlier:

import paddle
import paddle.nn as nn
from paddlenlp.transformers import ErniePretrainedModel


class UIE(ErniePretrainedModel):

    def __init__(self, encoding_model):
        super(UIE, self).__init__()
        self.encoder = encoding_model
        hidden_size = self.encoder.config["hidden_size"]
        self.linear_start = paddle.nn.Linear(hidden_size, 1)
        self.linear_end = paddle.nn.Linear(hidden_size, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, input_ids, token_type_ids, pos_ids, att_mask):
        sequence_output, pooled_output = self.encoder(
            input_ids=input_ids,
            token_type_ids=token_type_ids,
            position_ids=pos_ids,
            attention_mask=att_mask)
        start_logits = self.linear_start(sequence_output)
        start_logits = paddle.squeeze(start_logits, -1)
        start_prob = self.sigmoid(start_logits)
        end_logits = self.linear_end(sequence_output)
        end_logits = paddle.squeeze(end_logits, -1)
        end_prob = self.sigmoid(end_logits)
        return start_prob, end_prob

When I import UIE directly via from paddlenlp.transformers import UIE, the error above no longer occurs, but loading my custom-trained checkpoint fails to initialize the parameters:

[2023-01-06 08:37:37,102] [ WARNING] - Some weights of the model checkpoint at ./checkpoint/xxx/model_best were not used when initializing UIE: ['encoder.encoder.layers.4.self_attn.v_proj.weight', 'encoder.embeddings.layer_norm.bias', 'encoder.encoder.layers.0.self_attn.out_proj.bias', 'encoder.encoder.layers.0.linear2.bias', 'encoder.encoder.layers.2.self_attn.q_proj.bias', 'encoder.encoder.layers.4.norm1.weight', 'encoder.encoder.layers.4.linear2.weight', 'encoder.encoder.layers.0.self_attn.out_proj.weight', 'encoder.encoder.layers.4.norm1.bias', 'encoder.embeddings.position_embeddings.weight', 'encoder.encoder.layers.4.norm2.bias', 'encoder.encoder.layers.0.norm2.bias', 'encoder.encoder.layers.1.self_attn.v_proj.weight', 'encoder.encoder.layers.3.linear1.bias', 'encoder.encoder.layers.2.self_attn.k_proj.bias', 'encoder.encoder.layers.2.self_attn.k_proj.weight', 'encoder.encoder.layers.5.linear2.bias', 'encoder.encoder.layers.5.norm2.weight', 'encoder.encoder.layers.4.linear1.bias', 'encoder.encoder.layers.0.self_attn.v_proj.weight', 'encoder.encoder.layers.2.linear2.weight', 'encoder.encoder.layers.4.self_attn.k_proj.bias', 'encoder.encoder.layers.0.self_attn.q_proj.weight', 'encoder.encoder.layers.1.linear2.weight', 'encoder.encoder.layers.4.self_attn.out_proj.bias', 'encoder.encoder.layers.2.self_attn.v_proj.weight', 'encoder.encoder.layers.3.norm1.bias', 'encoder.encoder.layers.4.self_attn.q_proj.bias', 'encoder.encoder.layers.3.self_attn.out_proj.bias', 'encoder.embeddings.layer_norm.weight', 'encoder.encoder.layers.2.self_attn.q_proj.weight', 'encoder.encoder.layers.3.linear1.weight', 'encoder.encoder.layers.5.self_attn.out_proj.bias', 'encoder.pooler.dense.bias', 'encoder.encoder.layers.3.self_attn.q_proj.bias', 'encoder.encoder.layers.5.self_attn.q_proj.weight', 'encoder.encoder.layers.2.self_attn.out_proj.weight', 'encoder.encoder.layers.0.linear1.bias', 'encoder.encoder.layers.0.linear2.weight', 'encoder.embeddings.task_type_embeddings.weight', 
'encoder.encoder.layers.0.self_attn.k_proj.weight', 'encoder.encoder.layers.1.linear2.bias', 'encoder.encoder.layers.4.linear2.bias', 'encoder.encoder.layers.5.self_attn.q_proj.bias', 'encoder.encoder.layers.5.norm1.weight', 'encoder.encoder.layers.3.self_attn.v_proj.weight', 'encoder.encoder.layers.1.norm2.weight', 'encoder.encoder.layers.3.self_attn.v_proj.bias', 'encoder.embeddings.word_embeddings.weight', 'encoder.encoder.layers.0.linear1.weight', 'encoder.encoder.layers.1.self_attn.k_proj.bias', 'encoder.encoder.layers.3.norm2.bias', 'encoder.encoder.layers.1.linear1.weight', 'encoder.encoder.layers.0.self_attn.k_proj.bias', 'encoder.encoder.layers.1.norm1.bias', 'encoder.encoder.layers.5.linear1.bias', 'encoder.encoder.layers.3.self_attn.out_proj.weight', 'encoder.encoder.layers.4.self_attn.q_proj.weight', 'encoder.encoder.layers.2.self_attn.v_proj.bias', 'encoder.encoder.layers.1.self_attn.v_proj.bias', 'encoder.encoder.layers.5.self_attn.k_proj.bias', 'encoder.encoder.layers.3.linear2.weight', 'encoder.encoder.layers.2.linear2.bias', 'encoder.encoder.layers.0.norm1.bias', 'encoder.encoder.layers.1.self_attn.q_proj.bias', 'encoder.encoder.layers.5.self_attn.out_proj.weight', 'encoder.encoder.layers.3.norm2.weight', 'encoder.encoder.layers.2.norm1.bias', 'encoder.encoder.layers.2.self_attn.out_proj.bias', 'encoder.encoder.layers.5.linear2.weight', 'encoder.encoder.layers.3.self_attn.k_proj.weight', 'encoder.encoder.layers.5.linear1.weight', 'encoder.encoder.layers.5.self_attn.v_proj.weight', 'encoder.encoder.layers.4.linear1.weight', 'encoder.encoder.layers.1.linear1.bias', 'encoder.encoder.layers.2.norm2.weight', 'encoder.encoder.layers.4.norm2.weight', 'encoder.encoder.layers.2.linear1.weight', 'encoder.encoder.layers.5.self_attn.v_proj.bias', 'encoder.encoder.layers.3.self_attn.k_proj.bias', 'encoder.encoder.layers.0.norm1.weight', 'encoder.encoder.layers.0.self_attn.v_proj.bias', 'encoder.encoder.layers.3.linear2.bias', 
'encoder.encoder.layers.0.self_attn.q_proj.bias', 'encoder.encoder.layers.1.self_attn.k_proj.weight', 'encoder.encoder.layers.2.norm2.bias', 'encoder.encoder.layers.4.self_attn.k_proj.weight', 'encoder.encoder.layers.5.self_attn.k_proj.weight', 'encoder.encoder.layers.5.norm2.bias', 'encoder.encoder.layers.2.linear1.bias', 'encoder.encoder.layers.1.norm2.bias', 'encoder.encoder.layers.5.norm1.bias', 'encoder.embeddings.token_type_embeddings.weight', 'encoder.encoder.layers.4.self_attn.out_proj.weight', 'encoder.pooler.dense.weight', 'encoder.encoder.layers.0.norm2.weight', 'encoder.encoder.layers.2.norm1.weight', 'encoder.encoder.layers.1.self_attn.out_proj.weight', 'encoder.encoder.layers.1.norm1.weight', 'encoder.encoder.layers.4.self_attn.v_proj.bias', 'encoder.encoder.layers.3.self_attn.q_proj.weight', 'encoder.encoder.layers.1.self_attn.out_proj.bias', 'encoder.encoder.layers.1.self_attn.q_proj.weight', 'encoder.encoder.layers.3.norm1.weight']

- This IS expected if you are initializing UIE from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing UIE from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2023-01-06 08:37:37,103] [ WARNING] - Some weights of UIE were not initialized from the model checkpoint at ./checkpoint/xxxxx/model_best and are newly initialized: ['encoder.layers.4.linear1.weight', 'encoder.layers.5.self_attn.out_proj.weight', 'encoder.layers.4.linear1.bias', 'encoder.layers.3.norm2.weight', 'encoder.layers.1.linear1.bias', 'encoder.layers.3.self_attn.v_proj.weight', 'encoder.layers.0.linear1.weight', 'encoder.layers.5.self_attn.v_proj.bias', 'encoder.layers.4.self_attn.k_proj.bias', 'encoder.layers.1.linear1.weight', 'encoder.layers.4.self_attn.q_proj.weight', 'encoder.layers.3.self_attn.v_proj.bias', 'encoder.layers.2.self_attn.v_proj.weight', 'encoder.layers.2.norm1.bias', 'encoder.layers.5.self_attn.k_proj.weight', 'encoder.layers.4.norm1.bias', 'encoder.layers.5.norm2.weight', 'encoder.layers.5.linear2.weight', 'encoder.layers.2.self_attn.q_proj.bias', 'encoder.layers.4.self_attn.v_proj.weight', 'encoder.layers.3.self_attn.q_proj.bias', 'encoder.layers.5.self_attn.q_proj.weight', 'encoder.layers.3.norm1.weight', 'encoder.layers.2.norm2.bias', 'encoder.layers.0.norm1.weight', 'encoder.layers.0.self_attn.k_proj.weight', 'encoder.layers.5.norm1.weight', 'encoder.layers.3.linear1.bias', 'encoder.layers.3.norm2.bias', 'encoder.layers.3.self_attn.q_proj.weight', 'encoder.layers.2.self_attn.q_proj.weight', 'encoder.layers.4.linear2.bias', 'encoder.layers.1.norm1.bias', 'encoder.layers.2.self_attn.k_proj.bias', 'encoder.layers.3.norm1.bias', 'encoder.layers.3.self_attn.out_proj.weight', 'encoder.layers.2.norm1.weight', 'pooler.dense.weight', 'encoder.layers.3.self_attn.k_proj.weight', 'embeddings.layer_norm.weight', 'encoder.layers.0.linear2.bias', 'encoder.layers.0.self_attn.v_proj.weight', 'encoder.layers.1.self_attn.v_proj.weight', 'encoder.layers.0.linear1.bias', 'encoder.layers.1.self_attn.k_proj.weight', 'encoder.layers.2.norm2.weight', 'encoder.layers.3.self_attn.k_proj.bias', 'encoder.layers.0.norm1.bias', 'encoder.layers.2.linear2.bias', 
'embeddings.layer_norm.bias', 'encoder.layers.0.self_attn.v_proj.bias', 'encoder.layers.0.norm2.bias', 'encoder.layers.4.self_attn.out_proj.bias', 'encoder.layers.5.self_attn.k_proj.bias', 'encoder.layers.3.linear1.weight', 'embeddings.task_type_embeddings.weight', 'encoder.layers.0.self_attn.out_proj.bias', 'encoder.layers.0.self_attn.q_proj.bias', 'encoder.layers.1.linear2.bias', 'encoder.layers.4.norm2.bias', 'encoder.layers.2.self_attn.out_proj.bias', 'encoder.layers.2.linear1.weight', 'encoder.layers.1.self_attn.out_proj.weight', 'encoder.layers.3.self_attn.out_proj.bias', 'encoder.layers.5.linear1.bias', 'encoder.layers.2.linear2.weight', 'encoder.layers.0.self_attn.q_proj.weight', 'encoder.layers.0.norm2.weight', 'encoder.layers.4.self_attn.q_proj.bias', 'encoder.layers.1.self_attn.q_proj.weight', 'encoder.layers.1.norm1.weight', 'embeddings.position_embeddings.weight', 'encoder.layers.2.self_attn.v_proj.bias', 'encoder.layers.5.norm1.bias', 'encoder.layers.4.self_attn.v_proj.bias', 'encoder.layers.3.linear2.bias', 'encoder.layers.4.self_attn.out_proj.weight', 'encoder.layers.5.linear1.weight', 'embeddings.word_embeddings.weight', 'encoder.layers.0.self_attn.out_proj.weight', 'encoder.layers.5.norm2.bias', 'encoder.layers.2.self_attn.k_proj.weight', 'encoder.layers.0.linear2.weight', 'encoder.layers.1.self_attn.q_proj.bias', 'encoder.layers.1.norm2.bias', 'encoder.layers.4.norm1.weight', 'encoder.layers.5.self_attn.q_proj.bias', 'encoder.layers.4.norm2.weight', 'encoder.layers.5.linear2.bias', 'encoder.layers.2.linear1.bias', 'encoder.layers.5.self_attn.v_proj.weight', 'encoder.layers.2.self_attn.out_proj.weight', 'encoder.layers.1.norm2.weight', 'encoder.layers.1.self_attn.k_proj.bias', 'encoder.layers.4.linear2.weight', 'encoder.layers.4.self_attn.k_proj.weight', 'encoder.layers.1.self_attn.out_proj.bias', 'encoder.layers.0.self_attn.k_proj.bias', 'embeddings.token_type_embeddings.weight', 'encoder.layers.1.linear2.weight', 
'encoder.layers.3.linear2.weight', 'encoder.layers.1.self_attn.v_proj.bias', 'pooler.dense.bias', 'encoder.layers.5.self_attn.out_proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

@sijunhe
Collaborator

sijunhe commented Jan 6, 2023

@datalee
The parameter mismatch happens because your custom model differs slightly from ours (you use self.encoder, while ours uses self.ernie), so the weight names differ. This is fixable; try the following:

import paddle

# Load the old checkpoint, rename the mismatched weight prefix, and save a new copy.
state_dict = paddle.load("old_model/model_state.pdparams")
for key in list(state_dict.keys()):
    state_dict[key.replace('encoder.encoder', 'ernie.encoder')] = state_dict.pop(key)
paddle.save(state_dict, "new_model/model_state.pdparams")
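As a quick sanity check, here is the same rename on a plain dict (the keys are made up for illustration; a real state_dict maps names to tensors). Note that only keys containing the substring encoder.encoder are renamed; keys like encoder.embeddings.* keep their old names:

```python
# Illustration of the rename above on a plain dict with made-up keys.
state_dict = {
    "encoder.encoder.layers.0.linear1.weight": 1,    # renamed
    "encoder.embeddings.word_embeddings.weight": 2,  # NOT renamed (no 'encoder.encoder')
    "linear_start.weight": 3,                        # head weight, unchanged
}
for key in list(state_dict.keys()):
    state_dict[key.replace("encoder.encoder", "ernie.encoder")] = state_dict.pop(key)

assert sorted(state_dict) == [
    "encoder.embeddings.word_embeddings.weight",
    "ernie.encoder.layers.0.linear1.weight",
    "linear_start.weight",
]
```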

@datalee
Author

datalee commented Jan 6, 2023

so the weight names differ. This is fixable.

Got it, understood. Thanks!

@datalee datalee closed this as completed Jan 6, 2023
@ZzyChris97

I ran into the same problem. What is the difference between the UIE model loaded in 2.4.9 via the statement below and the UIE model in the 2.4.2 model file?

from paddlenlp.transformers import UIE

@sijunhe
Collaborator

sijunhe commented Jan 30, 2023

I ran into the same problem. What is the difference between the UIE model loaded in 2.4.9 via the statement below and the UIE model in the 2.4.2 model file?

from paddlenlp.transformers import UIE

In 2.4.2 UIE had only just been open-sourced, so it was not yet part of paddlenlp.transformers; it shipped as an example under model_zoo. With the 2.4.9 upgrade UIE was formally moved into paddlenlp.transformers, and its model definition changed slightly in the process. To load an old UIE checkpoint, see my answer above.

@ZzyChris97

To load an old UIE checkpoint, see my answer above.

Loading the model itself gave me no trouble, but the train and evaluate code that goes with the paddlenlp.transformers UIE model changed as well, right? I found that if I only swap in the new UIE model, accuracy and F1 stay at 0 during evaluation.

@sijunhe
Collaborator

sijunhe commented Jan 30, 2023

I found that if I only swap in the new UIE model, accuracy and F1 stay at 0 during evaluation.

If everything is 0, check carefully whether any warnings appear when the model loads. The weights may not have matched, leaving the model effectively trained from scratch.

@ZzyChris97

The weights may not have matched, leaving the model effectively trained from scratch.

You're right: loading did produce the same warnings as the earlier poster, which is why evaluation stayed at 0 in the early epochs; after roughly 10 epochs it started producing results.

[2023-01-30 14:54:21,039] [ WARNING] - Some weights of the model checkpoint at uie-base were not used when initializing UIE: ['encoder.encoder.layers.6.self_attn.q_proj.bias', ....]
- This IS expected if you are initializing UIE from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing UIE from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

One more question about the finetune code. Previously it loaded the model like this (UIE here is the earlier custom model):

    resource_file_urls = MODEL_MAP[args.model]['resource_file_urls']

    logger.info("Downloading resource files...")
    for key, val in resource_file_urls.items():
        file_path = os.path.join(args.model, key)
        if not os.path.exists(file_path):
            get_path_from_url(val, args.model)

    tokenizer = AutoTokenizer.from_pretrained(args.model)
    model = UIE.from_pretrained(args.model)

Now that my UIE model is the one under paddlenlp.transformers, can I simply use the statements below to download and load the pretrained model automatically, without hitting the issue above?

from paddlenlp.transformers import UIE

model = UIE.from_pretrained(model_args.model_name_or_path)

@sijunhe
Collaborator

sijunhe commented Jan 30, 2023

from paddlenlp.transformers import UIE

model = UIE.from_pretrained(model_args.model_name_or_path)

Yes, those statements work; you just need to convert the old weights as described in my earlier comment, and then they will load correctly.

@ZzyChris97

you just need to convert the old weights as described in my earlier comment

While doing this I found that besides encoder.encoder, parameters such as encoder.embeddings, encoder.pooler, and so on also need handling, so I adapted your logic into the code below. All parameters now load correctly. Thanks for the patient replies.

import paddle


def encoder2ernie(src, dst):
    """
    Rename every 'encoder.*' parameter in the checkpoint to 'ernie.*'.

    Works around the parameter-name change between paddlenlp versions.
    """
    state_dict = paddle.load(src)
    for key in list(state_dict.keys()):
        if key.startswith('encoder'):
            state_dict['ernie' + key[7:]] = state_dict.pop(key)
    paddle.save(state_dict, dst)
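The prefix rename can be checked without paddle; this sketch (made-up keys, plain values instead of tensors) applies the same logic to an ordinary dict:

```python
def rename_encoder_prefix(state_dict):
    """Rename every key starting with 'encoder' to start with 'ernie',
    mirroring encoder2ernie above on a plain dict."""
    for key in list(state_dict.keys()):
        if key.startswith("encoder"):
            state_dict["ernie" + key[7:]] = state_dict.pop(key)
    return state_dict


demo = rename_encoder_prefix({
    "encoder.encoder.layers.0.linear1.weight": 1,
    "encoder.embeddings.word_embeddings.weight": 2,
    "encoder.pooler.dense.weight": 3,
    "linear_start.weight": 4,  # head weights keep their names
})
```

Unlike the earlier substring replace, this catches encoder.embeddings and encoder.pooler as well, while leaving the head weights (linear_start, linear_end) untouched.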

@sijunhe
Collaborator

sijunhe commented Jan 30, 2023

You're welcome!
