Kandinsky2.2 training support #268

Closed
chenjjcccc opened this issue Oct 31, 2023 · 1 comment
Labels
HappyOpenSource Pro: Happy Open Source issues and PRs, more challenging tasks

Comments

@chenjjcccc

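Kandinsky2.2 training support

Task description

Task background

  • Add a kandinsky2_2 training pipeline to PaddleMIX ppdiffusers.

Steps

  1. Align the implementation with the reference code.

Deliverables:

  1. Submit to the directory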
@shiyutang shiyutang added the HappyOpenSource Pro (Happy Open Source issues and PRs, more challenging tasks) label Oct 31, 2023
JunnYu pushed a commit that referenced this issue Jan 15, 2024
## [Kandinsky2.2 training support · Issue #268 · PaddlePaddle/PaddleMIX](#268)

### 1 Loss alignment results for the first 200 steps:

- decoder w/o LoRA:
  

![decoder](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/ac52377b-5522-4ffb-8ea8-3ad73668cbc5)

- prior w/o LoRA:
  

![prior](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/af24f7c2-2618-4db0-bdaf-764f72f47c9a)
  
- decoder with LoRA:
  

![decoder_lora](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/231573c1-9d7c-46da-8b16-592a22d248af)

- prior with LoRA:
  

![prior_lora](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/79c166d9-0a08-48b6-84e0-3c802e857ff9)
  
- decoder fine-tune results after 3k steps (prompt: A robot pokemon, 4k photo):
  

![robot-pokemon](https://github.com/PaddlePaddle/PaddleMIX/assets/46399096/a7e8ef2d-08b1-4ef2-80d8-826704340de2)
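
The loss curves above are compared visually. If a numeric check is also wanted, a minimal sketch along the following lines could summarize how far two per-step loss logs drift apart. The function name `compare_losses`, the tolerance `atol=1e-3`, and the three sample values are all illustrative assumptions; they are not taken from the actual runs.

```python
import numpy as np


def compare_losses(ref_losses, test_losses, atol=1e-3):
    """Summarize the per-step gap between two loss logs of equal length."""
    ref = np.asarray(ref_losses, dtype="float64")
    test = np.asarray(test_losses, dtype="float64")
    if ref.shape != test.shape:
        raise ValueError("loss logs must cover the same number of steps")
    abs_diff = np.abs(ref - test)
    return {
        "max_abs_diff": float(abs_diff.max()),    # largest per-step gap
        "worst_step": int(abs_diff.argmax()),     # step index where it occurs
        "aligned": bool(abs_diff.max() <= atol),  # True if within tolerance
    }


# Placeholder values only; real logs would come from the two training scripts.
report = compare_losses([0.50, 0.40, 0.30], [0.50, 0.4005, 0.30])
print(report)
```

An absolute tolerance is the simplest choice here; a relative tolerance may be more suitable when the loss scale changes a lot during training.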

### 2 Other changes


[ppdiffusers/models/attention_processor.py/LoRAAttnAddedKVProcessor.call](https://github.com/PaddlePaddle/PaddleMIX/blob/ff0d2f25c79cc6e34e7d9c071328a7ed8bea4bc3/ppdiffusers/ppdiffusers/models/attention_processor.py#L789C57-L789C79)
: axis = 1 -> axis = 2

Reason for the change: running python train_text_to_image_decoder_lora.py with LoRAAttnAddedKVProcessor raised a concat dimension error.
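
To illustrate why the concat axis matters, here is a minimal sketch using NumPy in place of Paddle. It assumes the key/value tensors are laid out as `[batch, heads, seq_len, head_dim]` at the point of concatenation; that layout and all the sizes below are assumptions made for illustration, not values read from the ppdiffusers source.

```python
import numpy as np

batch, heads, head_dim = 2, 4, 8
seq_len, enc_seq_len = 16, 5  # image tokens vs. added (encoder) tokens

# Assumed 4-D layout: [batch, heads, seq_len, head_dim]
key = np.zeros((batch, heads, seq_len, head_dim), dtype="float32")
encoder_key = np.zeros((batch, heads, enc_seq_len, head_dim), dtype="float32")

# With this layout axis=1 is the heads axis. The two sequence lengths differ,
# so concatenating along axis=1 is rejected with a shape mismatch.
try:
    np.concatenate([encoder_key, key], axis=1)
    axis1_ok = True
except ValueError:
    axis1_ok = False

# axis=2 is the sequence axis, so the added tokens are joined to the image tokens.
merged = np.concatenate([encoder_key, key], axis=2)
print(axis1_ok, merged.shape)
```

In a layout where batch and heads are merged into one leading dimension, the sequence axis would be 1 instead, which may be why the two frameworks use different axis values.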

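### 3 Alignment notes

- Disable shuffle in the dataloaders of both diffusers and ppdiffusers so that the data order is identical;
  
- Set the same random seed, and replace the noise and timesteps, which introduce randomness in the training loop, with values generated by numpy so that both frameworks use identical random values (this logic has been removed from the submitted code).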
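
Since the numpy-based logic was removed from the submitted code, the following is only a sketch of how such shared randomness could be produced. The helper name `shared_noise_and_timesteps`, the per-step seeding scheme, the latent shape, and the default of 1000 training timesteps are all assumptions for illustration.

```python
import numpy as np


def shared_noise_and_timesteps(seed, step, latent_shape, num_train_timesteps=1000):
    """Return (noise, timesteps) that depend only on seed and step."""
    # A fresh generator per step keeps both scripts in sync even if one of
    # them draws extra random numbers elsewhere.
    rng = np.random.RandomState(seed + step)
    noise = rng.standard_normal(latent_shape).astype("float32")
    timesteps = rng.randint(
        0, num_train_timesteps, size=(latent_shape[0],)
    ).astype("int64")
    return noise, timesteps


noise, timesteps = shared_noise_and_timesteps(seed=42, step=0, latent_shape=(2, 4, 8, 8))

# Each framework then wraps the same arrays, for example:
#   PyTorch script: torch.from_numpy(noise)
#   Paddle script:  paddle.to_tensor(noise)
# and both dataloaders are created with shuffle=False.
```

Calling the helper with the same `seed` and `step` in both training scripts yields bit-identical arrays, so any remaining loss difference comes from the model code rather than from the random number generators.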
  

### 4 Known issues

- Using AutoPipelineForText2Image(args.pretrained_decoder_model_name_or_path) in ppdiffusers reports missing components:

```bash
     ValueError: Pipeline <class 'ppdiffusers.pipelines.kandinsky2_2.pipeline_kandinsky2_2_combined.KandinskyV22CombinedPipeline'> expected {'unet', 'prior_image_processor', 'prior_text_encoder', 'prior_image_encoder', 'movq', 'prior_prior', 'prior_scheduler', 'prior_tokenizer', 'scheduler'}, but only {'unet', 'movq', 'scheduler'} were passed.
```

   
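Only some of the components are recognized; unlike diffusers, ppdiffusers does not detect all of them automatically. As a stopgap, the submitted code defines each component individually and passes them in ahead of AutoPipelineForText2Image, which is not very concise. The cause is not yet determined; I came across a diffusers issue: huggingface/diffusers#5044

- With pip install ppdiffusers==0.19.4, loading the prior's LoRA weights fails because PriorTransformer has no load_attn_procs, so pipeline.prior_prior.load_attn_procs(args.output_dir) cannot be used. A ppdiffusers package built from the latest develop branch does not have this problem.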

----------Looking forward to replies and suggestions on merging, Thx :)------------------

---------

Co-authored-by: Tsaiyue <tsaiyue01@gamil.com>
@shiyutang
Collaborator

Closed due to #378.
