Description
Describe the bug
This `.to()` cast on the text encoder, in `diffusers/src/diffusers/loaders/lora_base.py`, line 421 (commit 9836f0e):

```python
text_encoder.to(device=text_encoder.device, dtype=text_encoder.dtype)
```
is invalid when working with an SD1.5 / SDXL pipeline whose text encoder uses a bitsandbytes quantization config.

Perhaps something like this would fix it:
```python
if is_bitsandbytes_available():
    quant, is_4bit, _ = _check_bnb_status(text_encoder)
else:
    quant, is_4bit = False, False

if not quant:
    text_encoder.to(device=text_encoder.device, dtype=text_encoder.dtype)
elif is_4bit:
    # 4-bit bnb models may change device but not dtype; 8-bit models get no cast at all.
    text_encoder.to(device=text_encoder.device)
```

This problem does not seem to affect Flux / SD3, so I am not sure how this would affect other pipelines.
Reproduction
```python
import torch
import transformers
import diffusers
import diffusers.quantizers.quantization_config as _qc

text_encoder = transformers.CLIPTextModel.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0', subfolder='text_encoder', variant='fp16',
    torch_dtype=torch.float16, quantization_config=_qc.BitsAndBytesConfig(load_in_8bit=True))

pipeline = diffusers.StableDiffusionXLPipeline.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0',
    variant='fp16',
    torch_dtype=torch.float16,
    text_encoder=text_encoder,
)

pipeline.load_lora_weights('Norod78/sdxl-emoji-lora')
pipeline.to('cuda')
pipeline(prompt='test')
```
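The failure can also be reproduced without diffusers at all: the restriction comes from transformers, which refuses dtype casts on bitsandbytes-quantized models. A minimal sketch, assuming the same checkpoint and a working bitsandbytes install:

```python
import torch
import transformers

# Any bnb-quantized transformers model will do; reusing the SDXL text encoder here.
enc = transformers.CLIPTextModel.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0', subfolder='text_encoder',
    quantization_config=transformers.BitsAndBytesConfig(load_in_8bit=True))

# Raises: ValueError: You cannot cast a bitsandbytes model in a new `dtype`. ...
enc.to(dtype=torch.float16)
```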
Logs

```
REDACT\diffusers\venv\Scripts\python.exe REDACT\diffusers\test.py
WARNING:torchao.kernel.intmm:Warning: Detected no triton, on systems without Triton certain kernels will not work
`low_cpu_mem_usage` was None, now default to True since model is quantized.
Loading pipeline components...: 100%|██████████| 7/7 [00:00<00:00, 15.86it/s]
Traceback (most recent call last):
  File "REDACT\diffusers\test.py", line 18, in <module>
    pipeline.load_lora_weights('Norod78/sdxl-emoji-lora')
  File "REDACT\diffusers\src\diffusers\loaders\lora_pipeline.py", line 657, in load_lora_weights
    self.load_lora_into_text_encoder(
  File "REDACT\diffusers\src\diffusers\loaders\lora_pipeline.py", line 894, in load_lora_into_text_encoder
    _load_lora_into_text_encoder(
  File "REDACT\diffusers\src\diffusers\loaders\lora_base.py", line 430, in _load_lora_into_text_encoder
    text_encoder.to(device=text_encoder.device, dtype=text_encoder.dtype)
  File "REDACT\diffusers\venv\Lib\site-packages\transformers\modeling_utils.py", line 3089, in to
    raise ValueError(
ValueError: You cannot cast a bitsandbytes model in a new `dtype`. Make sure to load the model using `from_pretrained` using the desired `dtype` by passing the correct `torch_dtype` argument.

Process finished with exit code 1
```

System Info
diffusers == 0.34.0.dev0
Who can help?