
Conversation

Disty0 (Contributor) commented Oct 11, 2025

What does this PR do?

Adds update_expected_keys and update_unexpected_keys APIs to DiffusersQuantizer.
Makes load_model_dict_into_meta compatible with updated unexpected / expected keys added in DiffusersQuantizer.

Fixes #12470
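
As a rough illustration of how a backend could use these hooks, here is a minimal sketch. It is not code from this PR: the method signatures are assumed to mirror the corresponding HfQuantizer hooks in Transformers and may differ from the final DiffusersQuantizer API, and the ".scale" suffix is just a stand-in for a parameter a quantizer creates at load time.

from typing import List

from diffusers.quantizers.base import DiffusersQuantizer


class ExampleQuantizer(DiffusersQuantizer):
    # (other required DiffusersQuantizer methods omitted for brevity)

    def update_expected_keys(self, model, expected_keys: List[str], loaded_keys: List[str]) -> List[str]:
        # Keep checkpoint keys created by quantization (e.g. ".scale") in the expected set
        # so the loader does not skip them.
        extra = [k for k in loaded_keys if k.endswith(".scale") and k not in expected_keys]
        return expected_keys + extra

    def update_unexpected_keys(self, model, unexpected_keys: List[str]) -> List[str]:
        # Drop quantization-created keys from the unexpected list so no warning is emitted
        # and they still reach create_quantized_param during loading.
        return [k for k in unexpected_keys if not k.endswith(".scale")]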

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Core library:

Disty0 (Contributor, Author) commented Oct 11, 2025

Here is a use case of this PR with SDNQ:

pip install git+https://github.com/Disty0/sdnq
import torch
import diffusers
from sdnq import SDNQConfig # import sdnq to register it into diffusers and transformers

pipe = diffusers.FluxPipeline.from_pretrained("Disty0/FLUX.1-dev-SDNQ-uint4-svd-r32", torch_dtype=torch.bfloat16)

if (
    hasattr(pipe.text_encoder_2.encoder.block[0].layer[0].SelfAttention.k, "scale")
    and pipe.text_encoder_2.encoder.block[0].layer[0].SelfAttention.k.scale.device.type != "meta"
    and hasattr(pipe.transformer.single_transformer_blocks[0].attn.to_k, "scale")
    and pipe.transformer.single_transformer_blocks[0].attn.to_k.scale.device.type != "meta"
):
    print("SDNQ model loaded succesfully")
else:
    print("SDNQ model failed to load")
    exit()

pipe.enable_model_cpu_offload()
prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("flux-dev-sdnq-uint4-svd-r32.png")

(This result was generated on an Intel Arc A770, with the FP64 RoPE downcast to FP32 so it works on Alchemist, which doesn't support FP64.)
[Image: generated output for the prompt above]

sayakpaul (Member) commented Oct 13, 2025

Thanks for the work!

I am still a bit confused about the utility of the APIs. Possible to explain it in simpler terms?

Cc: @SunMarc as well.

Disty0 (Contributor, Author) commented Oct 13, 2025

Related PR on Transformers: huggingface/transformers#41138

Currently I have to access the state dict to load the params newly created during quantization, and this can break if the state dict is sharded. Diffusers will also throw an unexpected-keys warning regardless of whether those keys were actually used, as seen in the screenshot:

[Screenshot: unexpected-keys warning emitted during loading]

This PR makes it possible to update the expected and unexpected keys, so the params that will be added during quantization are not skipped by Diffusers, and Diffusers does not throw an unnecessary unexpected-keys warning.
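
To make the intent concrete, here is a hypothetical sketch of the flow described above. The helper name resolve_keys and the variable names are illustrative only and do not come from the PR.

def resolve_keys(model, state_dict, hf_quantizer):
    # Hypothetical illustration of the flow described above; not the PR's actual code.
    expected_keys = list(model.state_dict().keys())
    if hf_quantizer is not None:
        # Let the backend add keys it will create during quantization (e.g. scales).
        expected_keys = hf_quantizer.update_expected_keys(model, expected_keys, list(state_dict.keys()))
    unexpected_keys = [k for k in state_dict if k not in expected_keys]
    if hf_quantizer is not None:
        # Let the backend drop keys it actually consumes, so no warning is raised for them.
        unexpected_keys = hf_quantizer.update_unexpected_keys(model, unexpected_keys)
    return expected_keys, unexpected_keys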

sayakpaul (Member) left a comment

Looks okay to me! Thanks!

@SunMarc could you review as well?


for param_name, param in state_dict.items():
    if param_name not in empty_state_dict:
        if param_name in unexpected_keys:
Member:

Should this not be?

Suggested change:
- if param_name in unexpected_keys:
+ if param_name not in empty_state_dict or param_name in unexpected_keys:

Disty0 (Contributor, Author) replied Oct 13, 2025:

Parameters that will be added during quantization aren't in the empty_state_dict yet. They will be added to the model in create_quantized_param within this loop.

Transformers uses param_name not in expected_keys for this check. I used the unexpected keys here instead because Diffusers doesn't pass the expected keys to this loop.
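
For clarity, a simplified, purely illustrative comparison of the two checks (the function names are made up for this example):

def should_skip_transformers_style(param_name, expected_keys):
    # Transformers can consult expected_keys directly inside the loading loop.
    return param_name not in expected_keys


def should_skip_diffusers_style(param_name, empty_state_dict, unexpected_keys):
    # Diffusers only has the model's empty_state_dict and the unexpected_keys list here,
    # so a key missing from the model is skipped only when it is also marked unexpected;
    # quantizer-created keys removed from unexpected_keys fall through and get loaded.
    return param_name not in empty_state_dict and (
        unexpected_keys is not None and param_name in unexpected_keys
    )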

Comment on lines 269 to 270
# hf_quantizer can add parameters that don't exist yet
# they will be in the loaded_state_dict when pre_quantized
Member:

I would also provide more details on when this can arise.

Disty0 (Contributor, Author):

Pushed a new commit that fixes the failing pipeline tests when unexpected_keys is None. Also added more details to these comment lines.

sayakpaul requested reviews from DN6 and SunMarc, October 13, 2025 11:17
Disty0 (Contributor, Author) commented Oct 13, 2025

Also, Transformers has a requires_parameters_quantization flag for HfQuantizer classes that require creating a new parameter that doesn't exist in the model before the create_quantized_param step. We could add this flag to DiffusersQuantizer as well.

From Transformers:
requires_parameters_quantization (bool):
Whether the quantization method requires to create a new Parameter. For example, for bitsandbytes, it is required to create a new xxxParameter in order to properly quantize the model.

SunMarc (Member) left a comment

Thanks for adding this! Eager to see the integration with SDNQ!

Comment on lines +263 to +272
if param_name in empty_state_dict:
    old_param = model
    splits = param_name.split(".")
    for split in splits:
        old_param = getattr(old_param, split)
else:
    # hf_quantizer can add parameters that don't exist yet in the model and the empty_state_dict
    # they will be created in create_quantized_param and hf_quantizer should handle the loading of these parameters
    # these parameters will be in the loaded_state_dict from the model file instead when loading a pre_quantized model
    old_param = None
Member:

Yeah, indeed, this is kind of what we did in _infer_parameter_dtype in transformers.

  # bnb params are flattened.
  # gguf quants have a different shape based on the type of quantization applied
- if empty_state_dict[param_name].shape != param.shape:
+ if param_name in empty_state_dict and empty_state_dict[param_name].shape != param.shape:
Member:

Just add a small comment for that, as we will probably refactor the loading at some point to match what we have in transformers.


for param_name, param in state_dict.items():
    if param_name not in empty_state_dict:
        if unexpected_keys is not None and param_name in unexpected_keys:
Member:

Yeah, that's better; actually, in transformers we rely on unexpected keys.

SunMarc (Member) commented Oct 13, 2025

> Also, Transformers has a requires_parameters_quantization flag for HfQuantizer classes that require creating a new parameter that doesn't exist in the model before the create_quantized_param step. We could add this flag to DiffusersQuantizer as well.
>
> From Transformers:
> requires_parameters_quantization (bool):
> Whether the quantization method requires to create a new Parameter. For example, for bitsandbytes, it is required to create a new xxxParameter in order to properly quantize the model.

Actually this is not that useful and we will probably remove it in transformers; check_quantized_param should be enough. This is why we didn't add it here.
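
For context, a minimal sketch of leaning on check_quantized_param instead of a dedicated flag; the signature is assumed from the existing DiffusersQuantizer base class, and the ".scale" suffix is only an example:

from diffusers.quantizers.base import DiffusersQuantizer


class ParamCreatingQuantizer(DiffusersQuantizer):
    # (other required DiffusersQuantizer methods omitted for brevity)

    def check_quantized_param(self, model, param_value, param_name, state_dict, **kwargs) -> bool:
        # Returning True routes the parameter through create_quantized_param, which is where a
        # backend can build any new Parameter it needs, so no requires_parameters_quantization
        # flag is necessary.
        return param_name.endswith((".weight", ".scale"))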
