Support GLM-Image model quantization #1512
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Pull request overview
Adds GLM-Image (zai-org/GLM-Image) support for w4a16 quantization workflows, including pipeline-style (model_index.json) loading and correct export layout for vLLM-Omni-style directories.
Changes:
- Add GLM-Image multimodal block discovery and register `glm_image` in the MLLM handling/template registries.
- Support loading models from diffusers-style pipeline directories (local + remote) and propagate pipeline subfolder metadata for downstream export/sharding.
- Update export paths to save model weights under the pipeline’s model component subfolder while saving processor/tokenizer assets in the appropriate location.
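The pipeline-style loading above hinges on resolving the model component subfolder from a diffusers-style `model_index.json`. A minimal sketch of that resolution, assuming a local directory (the function name and candidate keys are illustrative, not the PR's actual helper):

```python
import json
import os

def find_pipeline_model_subfolder(pipeline_dir, candidate_keys=("transformer", "model")):
    """Return the pipeline component subfolder that holds the main model
    weights, or None if pipeline_dir is a plain (non-pipeline) checkpoint."""
    index_path = os.path.join(pipeline_dir, "model_index.json")
    if not os.path.isfile(index_path):
        return None  # not a diffusers-style pipeline directory
    with open(index_path) as f:
        index = json.load(f)
    for key in candidate_keys:
        # diffusers-style entries look like "transformer": ["library", "ClassName"]
        if key in index and os.path.isdir(os.path.join(pipeline_dir, key)):
            return key
    return None
```

The resolved subfolder can then be passed through to `from_pretrained(..., subfolder=...)` and remembered for export.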
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| test/test_cpu/models/test_glm_image.py | Adds unit tests for GLM-Image helpers and pipeline subfolder discovery (but currently imports a missing module). |
| auto_round/utils/model.py | Adds local/remote pipeline subfolder resolution and uses it in mllm_load_model; passes subfolder to from_pretrained; tags loaded models with pipeline component metadata. |
| auto_round/utils/common.py | Adds vqmodel to multimodal key list. |
| auto_round/special_model_handler.py | Registers glm_image in limited-batch and only-text quantization lists; adds GLM-Image block-name helper to SPECIAL_MULTIMODAL_BLOCK. |
| auto_round/export/utils.py | Adds pipeline directory detection and export layout resolution; copies pipeline artifacts to output layout. |
| auto_round/export/export_to_autoround/export.py | Saves tokenizer/processor to processor output dir; saves model weights to model component dir when exporting pipeline models. |
| auto_round/export/export_to_autogptq/export.py | Same as above for AutoGPTQ export paths. |
| auto_round/compressors/shard_writer.py | Writes shard outputs under the pipeline model component subfolder when applicable. |
| auto_round/compressors/mllm/utils.py | Adds vqmodel to VISUAL_KEYS. |
| auto_round/compressors/mllm/template.py | Registers glm_image template using the HF processor. |
| auto_round/compressors/mllm/compressor.py | Ensures image_processor is forwarded into export/save path. |
| auto_round/autoround.py | Switches to MLLM mode when processor/image_processor are provided via kwargs. |
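The export-layout changes in the table above share one idea: quantized weight shards land under the pipeline's model component subfolder while processor/tokenizer assets stay at the pipeline root. A hedged sketch of that layout resolution (the helper name is illustrative, not the PR's actual code):

```python
import os

def resolve_export_dirs(output_dir, model_subfolder=None):
    """Return (weights_dir, assets_dir). For a pipeline export, weight
    shards go under the model component subfolder and processor/tokenizer
    assets stay at the root; for a plain checkpoint both are the same."""
    if not model_subfolder:
        return output_dir, output_dir
    weights_dir = os.path.join(output_dir, model_subfolder)
    os.makedirs(weights_dir, exist_ok=True)
    return weights_dir, output_dir
```

This keeps the output directory loadable as a pipeline (e.g. by vLLM-Omni), since `model_index.json` and the processor files remain at the top level.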
Inference test with quantized model using transformers:

1. T2I test. Prompt: A watercolor fox reading a book
2. I2I test:

CUDA_VISIBLE_DEVICES=5 python run_glm_image.py --model-dir tmp_glm_image_w4a16 --i2i-demo

Output: No reference image provided; generating a synthetic condition image ...
Currently only the autoregressive part is quantized; I'm working on a hybrid mode that quantizes both the autoregressive and diffusion parts.
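Since only the autoregressive part is quantized for now, the only-text policy amounts to partitioning modules by visual keys (the PR adds `vqmodel` to the real `VISUAL_KEYS` list; the other keys and the helper below are illustrative examples, not the project's actual code):

```python
# "vqmodel" matches the key added in this PR; the others are common examples.
VISUAL_KEYS = ("visual", "vision_tower", "vqmodel")

def split_quant_targets(layer_names, visual_keys=VISUAL_KEYS):
    """Partition layer names into text layers (to be quantized) and
    visual/diffusion layers (kept in full precision)."""
    text, visual = [], []
    for name in layer_names:
        (visual if any(k in name for k in visual_keys) else text).append(name)
    return text, visual
```

A hybrid mode would presumably route the `visual` group through a diffusion-aware quantization path instead of skipping it.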
Verified with vllm-omni (based on PR vllm-project/vllm-omni#1777 and some adaptations):

1. Original GLM-Image Model
2. Quantized GLM-Image Model

CUDA_VISIBLE_DEVICES=5 python end2end.py --model-path /mnt/disk1/lvl/auto-round-main/tmp_glm_image_w4a16/ --config-path /mnt/disk1/lvl/vllm-omni/vllm_omni/model_executor/stage_configs/glm_image.yaml --prompt "A cat sitting on the table" --output cat_quantized.png


Description
Quantize, save, and evaluate zai-org/GLM-Image in w4a16 format.
Model: https://huggingface.co/zai-org/GLM-Image
Target dtypes: w4a16
Save the quantized model in a layout consumable by vllm-omni.
Type of Change
Related Issues
#1509
Checklist Before Submitting