
【Hackathon 4th No.102】Add AutoConverter support for a new model architecture: CLIPModel #5595

Merged: 6 commits into PaddlePaddle:develop on Apr 14, 2023

Conversation

megemini (Contributor)

PR types

New features

PR changes

APIs

Description

【Hackathon 4th No.102】Add AutoConverter support for a new model architecture: CLIPModel

The Hackathon 4th No.102 task covers five models. I plan to submit a separate PR for each model; this PR handles the CLIP model.

  • Test model used: hf-internal-testing/tiny-random-CLIPModel

A few issues to report:

  • With hf-internal-testing/tiny-random-CLIPModel, converting the transformers CLIPTextModelWithProjection and CLIPVisionModelWithProjection models fails with:
    RuntimeError: Error(s) in loading state_dict for CLIPTextModelWithProjection:
        size mismatch for text_projection.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([32, 32]).
        You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
    After adding ignore_mismatched_sizes=True, initialization works, but the final torch_logit.text_embeds.shape no longer matches. Is there a good way to handle this? (See the repro sketch after this list.)
  • Testing with the openai/clip-vit-base-patch32 model raises this error:
    [2023-04-10 20:10:21,733] [    INFO] - start to convert pytorch weight file</tmp/tmp85dqy6e_/models--openai--clip-vit-base-patch32/snapshots/e6a30b603a447e251fdaca1c3056b2a16cdfebeb/pytorch_model.bin> to paddle weight file</tmp/tmp85dqy6e_/model_state.pdparams> ...
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    /tmp/ipykernel_13624/2676101560.py in <module>
          5 with tempfile.TemporaryDirectory() as tempdir:
          6     model_id = "openai/clip-vit-base-patch32"
    ----> 7     clip_p = CLIPModel.from_pretrained(model_id, from_hf_hub=True, cache_dir=tempdir)
    
    /mnt/workspace/PaddleNLP/paddlenlp/transformers/model_utils.py in from_pretrained(cls, pretrained_model_name_or_path, from_hf_hub, subfolder, *args, **kwargs)
        661         if cls.constructed_from_pretrained_config():
        662             return cls.from_pretrained_v2(
    --> 663                 pretrained_model_name_or_path, from_hf_hub=from_hf_hub, subfolder=subfolder, *args, **kwargs
        664             )
        665 
    
    /mnt/workspace/PaddleNLP/paddlenlp/transformers/model_utils.py in from_pretrained_v2(cls, pretrained_model_name_or_path, from_hf_hub, subfolder, *args, **kwargs)
       1547                     f"paddle weight file<{os.path.join(cache_dir, PADDLE_WEIGHT_FILE_NAME)}> ..."
       1548                 )
    -> 1549                 model_state_dict = cls.convert(model_weight_file, config, cache_dir)
       1550             else:
       1551                 raise ValueError(
    
    /mnt/workspace/PaddleNLP/paddlenlp/transformers/conversion_utils.py in convert(cls, weight_file, config, cache_dir)
        867         name_mappings = cls._get_name_mappings(config)
        868 
    --> 869         state_dict = load_torch(weight_file)
        870 
        871         # 3. convert state_dict
    
    /mnt/workspace/PaddleNLP/paddlenlp/utils/serialization.py in load_torch(path, **pickle_load_args)
        208     unpickler_stage = UnpicklerWrapperStage(io.BytesIO(data_iostream), **pickle_load_args)
        209     unpickler_stage.persistent_load = persistent_load
    --> 210     state_dict = unpickler_stage.load()
        211     torch_zip.close()
        212     return state_dict
    
    /mnt/workspace/PaddleNLP/paddlenlp/utils/serialization.py in _rebuild_tensor_stage(storage, storage_offset, size, stride, requires_grad, backward_hooks)
        150         order = "C"
        151 
    --> 152     return storage.reshape(size, order=order)
        153 
        154 
    
    ValueError: cannot reshape array of size 786432 into shape (512,512)
    This happens before the name mapping is applied; the weight-file conversion itself appears to fail.
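
A minimal repro sketch for the first issue (my own illustration, not code from this PR; it assumes the mismatch comes from projection_dim resolving differently in the nested text_config than in the checkpoint):

    # Hypothetical sketch: inspect why the text_projection sizes disagree.
    from transformers import CLIPTextConfig, CLIPTextModelWithProjection

    model_id = "hf-internal-testing/tiny-random-CLIPModel"
    config = CLIPTextConfig.from_pretrained(model_id)
    print(config.hidden_size, config.projection_dim)

    # ignore_mismatched_sizes=True re-initializes any mismatched weight, so
    # the loaded projection (and therefore text_embeds.shape) follows the
    # config rather than the checkpoint -- which would explain why the final
    # torch_logit.text_embeds.shape no longer lines up.
    model = CLIPTextModelWithProjection.from_pretrained(
        model_id, ignore_mismatched_sizes=True
    )
    print(model.text_projection.weight.shape)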

Since this is my first time doing this kind of task, I'd appreciate some guidance. Thanks! :)

paddle-bot (bot) commented Apr 10, 2023:

Thanks for your contribution!

codecov (bot) commented Apr 10, 2023:

Codecov Report

Merging #5595 (6048685) into develop (65db5e5) will increase coverage by 0.20%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop    #5595      +/-   ##
===========================================
+ Coverage    59.70%   59.91%   +0.20%     
===========================================
  Files          482      481       -1     
  Lines        68138    68113      -25     
===========================================
+ Hits         40685    40809     +124     
+ Misses       27453    27304     -149     
Impacted Files                              Coverage Δ
paddlenlp/transformers/clip/modeling.py    66.73% <100.00%> (+3.27%) ⬆️

... and 6 files with indirect coverage changes


sijunhe requested a review from wj-Mcat on April 11, 2023 at 07:49.
@wj-Mcat (Contributor) left a comment:

The current load_torch loader in paddlenlp is incompatible with some of the CLIP models on the HF Hub. For now you don't need to try loading the full openai/clip-vit-base-patch32 model; being compatible with hf-internal-testing/tiny-random-CLIPModel is enough.

That said, if you want to test against it locally, you can patch the paddlenlp.utils.serialization.load_torch method:

def load_torch(path: str, **pickle_load_args):
    from paddlenlp.utils.import_utils import import_module

    # Bypass paddlenlp's pure-python unpickler and let torch itself read the
    # checkpoint, then convert every tensor to a numpy array for paddle.
    torch_module = import_module("torch")
    if torch_module is None:
        raise ImportError("this workaround requires torch to be installed")
    state_dict = torch_module.load(path, map_location="cpu")
    return {key: value.cpu().numpy() for key, value in state_dict.items()}
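
With that patch in place, a quick local sanity check could look like this (path hypothetical):

    state_dict = load_torch("/tmp/pytorch_model.bin")
    print(sorted(state_dict)[:5])  # peek at a few converted parameter names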

["text_model.embeddings.position_embedding.weight", "text_model.positional_embedding.weight"],
["text_model.final_layer_norm.weight", "text_model.ln_final.weight"],
["text_model.final_layer_norm.bias", "text_model.ln_final.bias"],
["text_projection.weight", "text_projection", "transpose"],
@wj-Mcat (Contributor) commented on the mapping lines above, Apr 11, 2023:

This needs a check via cls for whether the text_projection layer exists before adding the text_projection mapping; if the layer is absent, the mapping should not be added to this configuration.
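
A minimal sketch of what that check could look like (the architectures-based test is my assumption, not necessarily how this PR implements it):

    # Hypothetical: emit the projection mapping only when the target
    # architecture actually defines a text_projection layer; plain
    # CLIPTextModel has no projection head.
    @classmethod
    def _get_name_mappings(cls, config):
        mappings = [
            ["text_model.final_layer_norm.weight", "text_model.ln_final.weight"],
            ["text_model.final_layer_norm.bias", "text_model.ln_final.bias"],
        ]
        if "CLIPTextModelWithProjection" in (config.architectures or []):
            mappings.append(["text_projection.weight", "text_projection", "transpose"])
        return mappings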

@megemini (Contributor, Author):

Added the text/vision projection check. Please review, thanks!

@sijunhe (Collaborator) left a comment:

lgtm. Please also delete clip/converter.py while you're at it.

@megemini (Contributor, Author):

> lgtm. Please also delete clip/converter.py while you're at it.

Deleted!

@sijunhe (Collaborator) left a comment:

lgtm

sijunhe merged commit 9aae2ff into PaddlePaddle:develop on Apr 14, 2023.
4 checks passed