Closed
Labels
bug (Something isn't working)
Description
Describe the bug
Resuming SDXL DreamBooth LoRA training from a checkpoint fails with a size mismatch error when the LoRA weights are loaded into the text encoder.
Reproduction
The command run each time is:
accelerate launch train_dreambooth_lora_sdxl.py \
  --pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0 \
  --instance_data_dir=/instance_images \
  --pretrained_vae_model_name_or_path=madebyollin/sdxl-vae-fp16-fix \
  --output_dir=/output \
  --instance_prompt="timberdog" \
  --mixed_precision=fp16 \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-06 \
  --lr_scheduler=constant \
  --lr_warmup_steps=0 \
  --max_train_steps=1400 \
  --checkpointing_steps=50 \
  --seed=0 \
  --resume_from_checkpoint=latest \
  --checkpoints_total_limit=1 \
  --train_text_encoder \
  --validation_prompt="timberdog as an ace space pilot, detailed illustration" \
  --validation_epochs=10 \
  --report_to=wandb
If left uninterrupted, training completes successfully and produces the expected results. If training is interrupted and then resumed, the size mismatch error occurs. See the logs below.
Logs
dreambooth-1 | 02/28/2024 17:52:24 - INFO - __main__ - ***** Running training *****
dreambooth-1 | 02/28/2024 17:52:24 - INFO - __main__ - Num examples = 36
dreambooth-1 | 02/28/2024 17:52:24 - INFO - __main__ - Num batches each epoch = 36
dreambooth-1 | 02/28/2024 17:52:24 - INFO - __main__ - Num Epochs = 156
dreambooth-1 | 02/28/2024 17:52:24 - INFO - __main__ - Instantaneous batch size per device = 1
dreambooth-1 | 02/28/2024 17:52:24 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 4
dreambooth-1 | 02/28/2024 17:52:24 - INFO - __main__ - Gradient Accumulation steps = 4
dreambooth-1 | 02/28/2024 17:52:24 - INFO - __main__ - Total optimization steps = 1400
dreambooth-1 | Resuming from checkpoint checkpoint-50
dreambooth-1 | 02/28/2024 17:52:24 - INFO - accelerate.accelerator - Loading states from /output/checkpoint-50
dreambooth-1 | Traceback (most recent call last):
dreambooth-1 | File "/app/diffusers/examples/dreambooth/train_dreambooth_lora_sdxl.py", line 1757, in <module>
dreambooth-1 | main(args)
dreambooth-1 | File "/app/diffusers/examples/dreambooth/train_dreambooth_lora_sdxl.py", line 1378, in main
dreambooth-1 | accelerator.load_state(os.path.join(args.output_dir, path))
dreambooth-1 | File "/opt/conda/lib/python3.10/site-packages/accelerate/accelerator.py", line 2908, in load_state
dreambooth-1 | hook(models, input_dir)
dreambooth-1 | File "/app/diffusers/examples/dreambooth/train_dreambooth_lora_sdxl.py", line 1078, in load_model_hook
dreambooth-1 | _set_state_dict_into_text_encoder(
dreambooth-1 | File "/app/diffusers/src/diffusers/training_utils.py", line 158, in _set_state_dict_into_text_encoder
dreambooth-1 | set_peft_model_state_dict(text_encoder, text_encoder_state_dict, adapter_name="default")
dreambooth-1 | File "/opt/conda/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 201, in set_peft_model_state_dict
dreambooth-1 | load_result = model.load_state_dict(peft_model_state_dict, strict=False)
dreambooth-1 | File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
dreambooth-1 | raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
dreambooth-1 | RuntimeError: Error(s) in loading state_dict for CLIPTextModel:
dreambooth-1 | size mismatch for text_model.encoder.layers.0.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.0.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.0.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.0.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.0.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.0.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.0.self_attn.out_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.0.self_attn.out_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.1.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.1.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.1.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.1.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.1.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.1.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.1.self_attn.out_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.1.self_attn.out_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.2.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.2.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.2.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.2.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.2.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.2.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.2.self_attn.out_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.2.self_attn.out_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.3.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.3.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.3.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.3.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.3.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.3.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.3.self_attn.out_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.3.self_attn.out_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.4.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.4.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.4.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.4.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.4.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.4.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.4.self_attn.out_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.4.self_attn.out_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.5.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.5.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.5.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.5.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.5.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.5.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.5.self_attn.out_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.5.self_attn.out_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.6.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.6.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.6.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.6.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.6.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.6.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.6.self_attn.out_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.6.self_attn.out_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.7.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.7.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.7.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.7.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.7.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.7.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.7.self_attn.out_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.7.self_attn.out_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.8.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.8.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.8.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.8.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.8.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.8.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.8.self_attn.out_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.8.self_attn.out_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.9.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.9.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.9.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.9.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.9.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.9.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.9.self_attn.out_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.9.self_attn.out_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.10.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.10.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.10.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.10.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.10.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.10.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.10.self_attn.out_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.10.self_attn.out_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.11.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.11.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.11.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.11.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.11.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.11.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
dreambooth-1 | size mismatch for text_model.encoder.layers.11.self_attn.out_proj.lora_A.default.weight: copying a param with shape torch.Size([4, 1280]) from checkpoint, the shape in current model is torch.Size([4, 768]).
dreambooth-1 | size mismatch for text_model.encoder.layers.11.self_attn.out_proj.lora_B.default.weight: copying a param with shape torch.Size([1280, 4]) from checkpoint, the shape in current model is torch.Size([768, 4]).
System Info
- diffusers version: 0.26.1
- Platform: Linux-5.15.133.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
- Python version: 3.10.13
- PyTorch version (GPU?): 2.2.0 (True)
- Huggingface_hub version: 0.20.3
- Transformers version: 4.37.2
- Accelerate version: 0.26.1
- xFormers version: 0.0.24
- Using GPU in script?: yes, RTX 3080 Ti
- Using distributed or parallel set-up in script?: no
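A note on reading the log above: every mismatch pairs a checkpoint tensor of width 1280 with a model tensor of width 768, and the failing module is `CLIPTextModel`. In SDXL the first text encoder has hidden size 768 and the second has hidden size 1280, so the shapes are consistent with the second text encoder's LoRA weights being loaded into the first text encoder. This is an inference from the shapes, not a confirmed root cause. The minimal sketch below reproduces that pattern using plain shape tuples instead of real tensors. The key prefixes `text_encoder.` and `text_encoder_2.`, the single-layer excerpt, and the helper names `split_by_prefix` and `find_mismatches` are assumptions made for illustration and are not taken from the training script.

```python
from typing import Dict, Tuple

Shape = Tuple[int, ...]

# Hypothetical excerpt of a saved LoRA state dict, reduced to shapes only.
# The prefixes "text_encoder." and "text_encoder_2." are assumed for illustration.
LAYER = "text_model.encoder.layers.0.self_attn.k_proj"
checkpoint_shapes: Dict[str, Shape] = {
    f"text_encoder.{LAYER}.lora_A.weight": (4, 768),
    f"text_encoder.{LAYER}.lora_B.weight": (768, 4),
    f"text_encoder_2.{LAYER}.lora_A.weight": (4, 1280),
    f"text_encoder_2.{LAYER}.lora_B.weight": (1280, 4),
}

# Shapes the first text encoder (hidden size 768) expects for a rank-4 LoRA.
encoder_one_shapes: Dict[str, Shape] = {
    f"{LAYER}.lora_A.weight": (4, 768),
    f"{LAYER}.lora_B.weight": (768, 4),
}


def split_by_prefix(shapes: Dict[str, Shape], prefix: str) -> Dict[str, Shape]:
    """Keep only the keys that start with `prefix`, with the prefix stripped."""
    return {k[len(prefix):]: v for k, v in shapes.items() if k.startswith(prefix)}


def find_mismatches(
    checkpoint: Dict[str, Shape], model: Dict[str, Shape]
) -> Dict[str, Tuple[Shape, Shape]]:
    """Map each shared key to (checkpoint shape, model shape) where they differ."""
    return {
        k: (checkpoint[k], model[k])
        for k in checkpoint
        if k in model and checkpoint[k] != model[k]
    }


# Entries saved for the first encoder fit the first encoder.
ok = find_mismatches(split_by_prefix(checkpoint_shapes, "text_encoder."), encoder_one_shapes)

# Entries saved for the second encoder do not: 1280 vs. 768, as in the log.
bad = find_mismatches(split_by_prefix(checkpoint_shapes, "text_encoder_2."), encoder_one_shapes)

for name, (ckpt_shape, model_shape) in bad.items():
    print(f"size mismatch for {name}: checkpoint {ckpt_shape}, model {model_shape}")
```

Running the same comparison against the shapes stored in `/output/checkpoint-50` would show whether the 1280-wide tensors are saved under the second encoder's keys, as this sketch assumes, or under the first encoder's keys.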
Who can help?