Add diffusion README #923

mengniwang95 · 2025-10-21T02:19:09Z

No description provided.

Signed-off-by: Mengni Wang <mengni.wang@intel.com>

for more information, see https://pre-commit.ci

wenhuach21 · 2025-10-27T01:37:34Z

If the PR is ready, please ask others to review it directly.

wenhuach21 · 2025-10-27T01:38:09Z

auto_round/compressors/diffusion/README.md

+pipe = AutoPipelineForText2Image.from_pretrained(model_name, torch_dtype=torch.bfloat16)
+
+## quantize the model
+autoround = AutoRoundDiffusion(


use AutoRound interface if possible.

wenhuach21 · 2025-10-27T01:38:40Z

auto_round/compressors/diffusion/README.md

+from auto_round import AutoRoundDiffusion
+from diffusers import AutoPipelineForText2Image
+
+## load the model


better follow google style, single # and upper case

wenhuach21 · 2025-10-27T01:38:52Z

auto_round/compressors/diffusion/README.md

@@ -0,0 +1,87 @@
+# AutoRound for Diffusion models
+
+This feature is experimental and may be subject to changes, including potential bug fixes, API modifications, or adjustments to default parameters.


link this file to step_by_step.md

wenhuach21 · 2025-10-27T01:40:14Z

auto_round/compressors/diffusion/README.md

@@ -0,0 +1,87 @@
+# AutoRound for Diffusion models


how about adding an experimental in the title

wenhuach21 · 2025-10-27T01:40:38Z

auto_round/compressors/diffusion/README.md

+
+for more hyperparameters introduction, please refer [Homepage Detailed Hyperparameters](../../README.md#api-usage-gaudi2cpugpu)
+
+### Basic Usage


Signed-off-by: Mengni Wang <mengni.wang@intel.com>

wenhuach21 · 2025-10-27T03:16:49Z

docs/step_by_step.md

 ============

-This document presents step-by-step instructions for auto-round llm quantization. For vlms quantization, please refer to [vlms user guide](../auto_round/mllm/README.md)
+This document presents step-by-step instructions for auto-round llm quantization. You can refer to [vlms user guide](../auto_round/compressors/mllm/README.md) for vlms quantization and [diffusions user guide](../auto_round/compressors/diffusion/README.md) for diffusions quantization.


@n1ck-guo please refine the mllm doc following the comments in this pr

* Fix rtn tuning_device issue (#893) Signed-off-by: Kaihui-intel <kaihui.tang@intel.com> * fix vlm gguf ut (#895) Signed-off-by: n1ck-guo <heng.guo@intel.com> * update alg_ext.abi3.so with python compatible version (#894) * move ste from quant to round for nvfp4 (#889) Signed-off-by: He, Xin3 <xin3.he@intel.com> * Add GPT-OSS quant support (#887) * better help printing information (#883) * better help printing information Signed-off-by: n1ck-guo <heng.guo@intel.com> * speedup quant and evaluation, fix recompile issue (#897) * rewrite the implementation for ease-of-maintain Signed-off-by: He, Xin3 <xin3.he@intel.com> * fix bug Signed-off-by: He, Xin3 <xin3.he@intel.com> * fix quant performance Signed-off-by: He, Xin3 <xin3.he@intel.com> * Update auto_round/compressors/base.py --------- Signed-off-by: He, Xin3 <xin3.he@intel.com> * fix nvfp act quantization bug (#891) * fix nvfp act quantization bug Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * add cuda ut for moe nvfp quantize Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * add cpu UT, refine cuda UT Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix ut typo Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix cpu ut Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * enhance experts amax match, refine UT Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * support automatic mixed bits assignment (#851) * try to fix gguf issue (#886) * remove numba from requirments (#905) Signed-off-by: yiliu30 <yi4.liu@intel.com> * Extend mxfp loading dtypes (#907) * block dataset logger info (#908) Signed-off-by: n1ck-guo <heng.guo@intel.com> * fix torch compile issue in AutoScheme (#909) * Revert "Extend mxfp loading dtypes (#907)" (#915) This reverts commit 0c2619c. * support disable_opt_rtn in auto-scheme (#913) * fix llama 4 ut (#896) * fix ut of llama 4 Signed-off-by: n1ck-guo <heng.guo@intel.com> * add numba for cpu lib (#919) Signed-off-by: yiliu30 <yi4.liu@intel.com> * Loosen the packing restrictions for mxfp&nvfp (#911) * Loosen the packing restrictions for mxfp&nvfp, enable Qwen1.5-MoE-A2.7B quantize Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix UT Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refine mxfp&nvfp layer checker Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * fix pylint Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Extend mxfp loading dtypes (#916) Signed-off-by: root <root@clx5673.ra.intel.com> Signed-off-by: yiliu30 <yi4.liu@intel.com> Co-authored-by: root <root@clx5673.ra.intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix act config exporting for mixed schemes (#903) * fp8 exporting bugfix Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * fix act related config saving Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add ut for act_config check Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refine extra_config saving, add UTs Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * fix ut typo Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * fix ut typo Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * fixtypo Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix CI Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * fix scan issue Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * fix scan issue Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * rm global variable Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rerun ut Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refine ut Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * optimize rtn for int woq (#924) * fix bug of gguf and support for LiquidAI/LFM2-1.2B (#927) Signed-off-by: n1ck-guo <heng.guo@intel.com> * remove numpy<2.0 limitation (#921) * enable regex quantization config saving for mixed bits (#825) * enable dynamic quantization config saving Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixtypo Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rebase code, refine config saving Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refine ut Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * fix UT Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * enable hf loading for regex, add UTs Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refine export, enhance gptq UT Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix Flux tuning issue (#936) Signed-off-by: Mengni Wang <mengni.wang@intel.com> * gguf support for inclusionAI/Ling-flash-2.0 (#940) * remove low_cpu_mem (#934) * Add compatibility test (#918) * Add commit hash to version (#941) Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com> * gguf weight type align with original, output.weight, token_embed (#900) * support attention mask in user's dataset (#930) * Add diffusion README (#923) * update readme (#949) * refactor utils file (#943) * refact utils Signed-off-by: n1ck-guo <heng.guo@intel.com> * update readme for sglang support (#953) * update readme for sglang support Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * refine doc Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> * Update README.md --------- Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> Co-authored-by: Wenhua Cheng <wenhua.cheng@intel.com> * update gguf and support for CompressedLinear (#950) * Reduce AutoSchem VRAM usage by up to 10X (#944) * add self attribution and fix avg_bits error (#956) * add self attribution and fix avg_bits error --------- Signed-off-by: He, Xin3 <xin3.he@intel.com> Co-authored-by: Wenhua Cheng <wenhua.cheng@intel.com> * add logo (#960) * refine AutoScheme readme/code (#958) * update readme (#962) * fix critic disable_opt_rtn regression (#963) * [1/N] Initial vllm-ext evaluation support (MXFP4 MOE) (#935) Signed-off-by: yiliu30 <yi4.liu@intel.com> * fix bug of imatrix contains 0 (#955) * fix rtn bug (#966) * enhance flux doc (#967) * clean code (#968) * support for model scope (#957) * support for model scope Signed-off-by: n1ck-guo <heng.guo@intel.com> * merge main branch to alg_ext (#970) * fix cuda CI backend issue, fixtypo (#974) * disable compile packing by default (#975) Signed-off-by: yiliu30 <yi4.liu@intel.com> * enhance auto device map and support XPU (#961) * enhance auto device map and support XPU --------- Signed-off-by: He, Xin3 <xin3.he@intel.com> * refine readme (#978) * cli support for positional arguments model (#979) Signed-off-by: n1ck-guo <heng.guo@intel.com> * update bits (#986) Signed-off-by: He, Xin3 <xin3.he@intel.com> * fix guff scheme and device_map bug (#969) * add support for Magistral-Small (#980) * support model_dtype and fix bug of scheme contains quotes, mllm eval (#985) --------- Signed-off-by: Kaihui-intel <kaihui.tang@intel.com> Signed-off-by: n1ck-guo <heng.guo@intel.com> Signed-off-by: He, Xin3 <xin3.he@intel.com> Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com> Signed-off-by: yiliu30 <yi4.liu@intel.com> Signed-off-by: root <root@clx5673.ra.intel.com> Signed-off-by: Mengni Wang <mengni.wang@intel.com> Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com> Co-authored-by: Tang Kaihui <kaihui.tang@intel.com> Co-authored-by: Heng Guo <heng.guo@intel.com> Co-authored-by: Xin He <xin3.he@intel.com> Co-authored-by: Yi Liu <yi4.liu@intel.com> Co-authored-by: Weiwei <weiwei1.zhang@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Wenhua Cheng <wenhua.cheng@intel.com> Co-authored-by: root <root@clx5673.ra.intel.com> Co-authored-by: Wang, Mengni <mengni.wang@intel.com> Co-authored-by: Sun, Xuehao <xuehao.sun@intel.com>

mengniwang95 and others added 5 commits October 21, 2025 02:18

add diffusion README

ec1ae2b

Signed-off-by: Mengni Wang <mengni.wang@intel.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

ec61206

for more information, see https://pre-commit.ci

Update README.md

a5ce1a1

[pre-commit.ci] auto fixes from pre-commit.com hooks

5760930

for more information, see https://pre-commit.ci

Merge branch 'main' into mengni/diffusion_readme

a49c09f

wenhuach21 reviewed Oct 27, 2025

View reviewed changes

mengniwang95 added 2 commits October 26, 2025 23:10

update readme

25611f2

Signed-off-by: Mengni Wang <mengni.wang@intel.com>

Merge branch 'main' into mengni/diffusion_readme

fe1ca90

wenhuach21 reviewed Oct 27, 2025

View reviewed changes

wenhuach21 approved these changes Oct 27, 2025

View reviewed changes

wenhuach21 merged commit 97512a4 into main Oct 27, 2025
9 checks passed

wenhuach21 deleted the mengni/diffusion_readme branch October 27, 2025 03:17

		@@ -0,0 +1,87 @@
		# AutoRound for Diffusion models

		This feature is experimental and may be subject to changes, including potential bug fixes, API modifications, or adjustments to default parameters.


		for more hyperparameters introduction, please refer [Homepage Detailed Hyperparameters](../../README.md#api-usage-gaudi2cpugpu)

		### Basic Usage

Add diffusion README #923

Add diffusion README #923

Uh oh!

Conversation

mengniwang95 commented Oct 21, 2025

Uh oh!

wenhuach21 commented Oct 27, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants