Describe the bug
When --train_text_encoder is enabled in the train_dreambooth_lora_sdxl.py script, the LoRA parameters for the two text encoders are initialized, but their requires_grad is False. As a result, no gradients flow to these parameters during backpropagation and they are never updated.

It also shows this warning, raised by torch.utils.checkpoint because gradient checkpointing is enabled but none of the checkpointed inputs require gradients:
/home/smarjit/miniconda3/envs/diffusers/lib/python3.11/site-packages/torch/utils/checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
I am also attaching the output of the text encoder LoRA parameters. There we can see that the text_model.encoder.layers.11.self_attn.v_proj.lora_linear_layer.up.weight tensor is all zeros (expected, since LoRA up weights are initialized to zero) and that its requires_grad is False.
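For reference, a snippet along these lines can be used to print those flags right after the LoRA layers are injected (variable names text_encoder_one / text_encoder_two are assumed from the script):

import torch

# Inspect every LoRA parameter of both text encoders: name, whether it
# receives gradients, and the sum of its absolute values (zero for the
# freshly initialized "up" weights).
for te_name, te in (("text_encoder_one", text_encoder_one),
                    ("text_encoder_two", text_encoder_two)):
    for name, param in te.named_parameters():
        if "lora" in name:
            print(te_name, name, param.requires_grad, param.abs().sum().item())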
Reproduction
export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0"
export OUTPUT_DIR=""
export INSTANCE_DIR=""
accelerate launch train_dreambooth_lora_sdxl.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of sks {subject}" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=1000 \
  --seed="0" \
  --use_8bit_adam \
  --gradient_checkpointing \
  --enable_xformers_memory_efficient_attention \
  --train_text_encoder
Note: here {subject} is the dataset name.
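A possible temporary workaround (an assumption on my side, not a verified upstream fix) is to re-enable gradients on the text encoders' LoRA parameters before the optimizer is created, so they actually train:

# Hypothetical workaround: force requires_grad=True on all LoRA parameters
# of both text encoders (variable names assumed from the script), placed
# after the LoRA layers are added and before the optimizer is built.
for te in (text_encoder_one, text_encoder_two):
    for name, param in te.named_parameters():
        if "lora" in name:
            param.requires_grad_(True)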
Logs
No response
System Info
- diffusers version: 0.21.0.dev0
- Platform: Linux-5.4.0-155-generic-x86_64-with-glibc2.31
- Python version: 3.11.5
- PyTorch version (GPU?): 2.0.1+cu117 (True)
- Huggingface_hub version: 0.17.1
- Transformers version: 4.33.1
- Accelerate version: 0.22.0
- xFormers version: 0.0.21
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
Questions on LoRA text encoder training with SDXL: @williamberman, @patrickvonplaten, and @sayakpaul
