Add CogView4 model support #7770

RyanJDick · 2025-03-12T14:25:34Z

Summary

Add support for the CogView4 model in nodes.

Example workflows:

Text-to-image: cogview4.json
Inpainting: cogview4_inpaint.json

Example

Expanded prompt:

A whimsical stuffed gnome sits on a golden sandy beach, its plush fabric slightly textured and well-worn. The gnome has a round, cheerful face with a fluffy white beard, a bulbous nose, and a tall, slightly floppy red hat with a few decorative stitching details. It wears a tiny blue vest over a soft, earthy-toned tunic, and its stubby arms grasp a ripe yellow banana with a few brown speckles. The ocean waves gently roll onto the shore in the background, with turquoise water reflecting the warm glow of the late afternoon sun. A few scattered seashells and driftwood pieces are near the gnome, while a colorful beach umbrella and footprints in the sand hint at a lively beach scene. The sky is a soft pastel blend of pink, orange, and light blue, with wispy clouds stretching across the horizon.

Result:

Follow-up work

Linear UI integration
CogView4 performs best with expanded prompts. Add an LLM preprocessing node for prompt expansion. See the system prompts here for reference: https://github.com/THUDM/CogView4/blob/962816cc760188032713dc5293c4588d42fe88e5/inference/prompt_optimize.py
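A minimal sketch of what such a prompt-expansion step could look like. Everything here is hypothetical: `expand_prompt`, the `llm` callable, the `min_words` threshold, and the placeholder system prompt are illustrative only and are not taken from the linked `prompt_optimize.py` or from this PR.

```python
from typing import Callable

# Placeholder only: the real system prompts live in the linked prompt_optimize.py.
PLACEHOLDER_SYSTEM_PROMPT = (
    "Rewrite the user's image prompt as one detailed, descriptive paragraph."
)


def expand_prompt(
    prompt: str,
    llm: Callable[[str, str], str],
    min_words: int = 30,
) -> str:
    """Expand a short prompt with an LLM; pass already detailed prompts through.

    `llm` is a hypothetical callable taking (system_prompt, user_prompt) and
    returning the model's text response.
    """
    # Assumption: prompts that are already long do not need expansion.
    if len(prompt.split()) >= min_words:
        return prompt
    expanded = llm(PLACEHOLDER_SYSTEM_PROMPT, prompt).strip()
    # Fall back to the original prompt if the LLM returns nothing usable.
    return expanded or prompt
```

Keeping the LLM behind a plain callable would let a node wrap any backend without changing the expansion logic.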

Related Issues / Discussions

N/A

QA Instructions

Merge Plan

This PR pins an arbitrary diffusers commit to get access to the CogView4 model. We must decide if we are ok with this, or want to wait for an official release.

Checklist

The PR has a short but descriptive title, suitable for a changelog
Tests added / updated (if applicable)
Documentation added / updated (if applicable)
Updated What's New copy (if doing a release after this PR)

RyanJDick · 2025-03-12T15:08:56Z

pyproject.toml

@@ -38,7 +38,7 @@ dependencies = [
  "clip_anytorch==2.6.0",       # replacing "clip @ https://github.com/openai/CLIP/archive/eaa22acb90a5876642d0507623e859909230a52d.zip",
  "compel==2.0.2",
  "controlnet-aux==0.0.7",
-  "diffusers[torch]==0.31.0",
+  "diffusers[torch] @ git+https://github.com/huggingface/diffusers.git@fbf6b856cc61fd22ad8635547bff4aafe05723f3", # We are pinning to a commit to get access to CogView4, which hasn't been released yet.


We need to decide if we are comfortable with this, or want to wait for the next diffusers release.
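One way to check whether an installed diffusers build already exposes CogView4, sketched under the assumption that the pipeline class is exported as `CogView4Pipeline` (that name is not stated in this PR). `module_exposes` is a hypothetical helper, not part of InvokeAI or diffusers.

```python
import importlib


def module_exposes(module_name: str, attr: str) -> bool:
    """Return True if `module_name` can be imported and has attribute `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed in this environment.
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    # Assumed class name; adjust if the diffusers export is named differently.
    print(module_exposes("diffusers", "CogView4Pipeline"))
```

A check like this could be run against a stable diffusers release to decide when the commit pin can be dropped.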

psychedelicious · 2025-03-18T03:12:16Z

While reviewing the CogView4 HF repo, I noticed this inference restriction:

Resolution: Width and height must be between 512px and 2048px, divisible by 32, and ensure the maximum number of pixels does not exceed 2^21 px.

See: https://huggingface.co/THUDM/CogView4-6B#inference-requirements-and-model-introduction

This introduces a new type of constraint. You'd expect the max dimensions to be 2048 x 2048, but that is 4,194,304 pixels, which exceeds the maximum pixel count of 2^21 = 2,097,152. So we may need to make some changes to dimension constraints to support CogView4.

Also, image sizes must be divisible by 32. This needs to be handled in a number of areas.
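A rough sketch of the documented constraints as a validation function. The function name, constants, and the `enforce_max_pixels` switch are hypothetical and are not the checks implemented in this PR; the switch exists only in case the pixel cap turns out not to matter in practice.

```python
MIN_SIDE = 512        # documented minimum width/height in px
MAX_SIDE = 2048       # documented maximum width/height in px
MULTIPLE = 32         # width/height must be divisible by this
MAX_PIXELS = 2**21    # 2,097,152 px


def check_cogview4_dimensions(
    width: int, height: int, enforce_max_pixels: bool = True
) -> list[str]:
    """Return a list of problems with the requested size; empty means valid."""
    problems = []
    for name, value in (("width", width), ("height", height)):
        if not MIN_SIDE <= value <= MAX_SIDE:
            problems.append(f"{name}={value} is outside [{MIN_SIDE}, {MAX_SIDE}]")
        if value % MULTIPLE != 0:
            problems.append(f"{name}={value} is not divisible by {MULTIPLE}")
    if enforce_max_pixels and width * height > MAX_PIXELS:
        problems.append(f"{width}x{height} = {width * height} px exceeds {MAX_PIXELS}")
    return problems
```

With these rules, 1024x2048 sits exactly at the pixel cap and passes, while 2048x2048 satisfies the per-side limits but fails the pixel-count check.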

Note: I'm still downloading the model - slow internet today - so I haven't actually tested yet. Maybe this is a non-issue. Just reviewing the model docs and taking notes.

psychedelicious · 2025-03-18T04:13:53Z

The max number of pixels requirement seems to be fake news. I can generate larger images than 1024x2048, though I OOM with 24GB VRAM around 1700x2000 on VAE decode.

I've added checks for the dimensions.

psychedelicious · 2025-03-18T04:41:23Z

Tested text to image, image to image, inpainting/outpainting - all working well.
Added CogView4 Text to Image to default workflows.

This PR has diffusers pinned to a pre-release commit. We are on diffusers==0.31.0 right now - about 5 months old.

Feels risky to merge this and release w/ an unreleased, potentially unstable diffusers dep. Let's wait for the next stable diffusers release and do thorough testing before merging this PR.

Marked as draft to prevent premature merge.

a-r-r-o-w · 2025-03-22T19:18:34Z

Great to see InvokeAI support for CogView4! We'll try to do a diffusers release asap to unblock this (hopefully next week) 🤗

github-actions bot added python Root invocations backend services frontend python-deps labels Mar 12, 2025

RyanJDick marked this pull request as ready for review March 13, 2025 13:20

RyanJDick requested review from psychedelicious, blessedcoolant, maryhipp, hipsterusername, lstein, brandonrising, jazzhaiku and ebr as code owners March 13, 2025 13:20

RyanJDick force-pushed the ryan/cogview4 branch from 251044c to febcb72 Compare March 14, 2025 16:10

psychedelicious force-pushed the ryan/cogview4 branch from e5e6f28 to 6586293 Compare March 18, 2025 04:12

RyanJDick added 9 commits March 18, 2025 14:26

Add CogView4 model probing.

025262e

Add CogView4TextEncoderInvocation

02d02d0

Bump diffusers to dev version with CogView4 support.

1925bfc

WIP - CogView4DenoiseInvocation.

a338fb3

Simplify CogView4 timestep schedule initialization.

55e9e96

Completed first pass of CogView4Denoise.

47422bb

Add CogView4LatentsToImageInvocation.

e0ec833

Require the cogview4 height/width are multiples of 32. This requirement is documented here: https://huggingface.co/THUDM/CogView4-6B. I haven't tracked down the underlying source of this requirement.

00f8e85

Add CogView4ModelLoaderInvocation. (Not wired up with frontend yet.)

8b870dc

RyanJDick and others added 23 commits March 18, 2025 14:26

Add CogView4 model loader. And various other fixes to get a CogView4 workflow running (though quality is still below expectations).

9b80559

Switch to sequential CFG for CogView4 (for now, until I sort out the padding).

7255aab

Fix bug in CogView4 noise schedule handling that was resulting in low-quality images.

0df348b

Add CogView4ImageToLatentsInvocation.

6541e1d

Simplify CogView4 timesteps schedule generation in preparation for timestep schedule slipping.

45f1678

Update CogView4Denoise to support image-to-image.

7f41518

Support cfg_scale list in CogView4Denoise.

3d7270c

Consolidate InpaintExtension implementations for SD3 and FLUX.

612a762

Add inpainting to CogView4DenoiseInvocation.

105de3d

Add CogView4 VAE approximation for progress images.

2c11afc

typegen

d2ebc52

Fix lint error.

2494965

add generation modes for cogview linear

930b989

build graph for cogview4

1ae4ed6

create hook for managing entity type enabledness for given base model and update usage

5917e88

update available params for cogview4

02ad11a

feat(app): add cogview4 default workflows

459722a

feat(ui): add cogview4 and inpainting tags to library

ee41e58

feat(nodes): rename CogView4 nodes to match naming format

5ed3650

fix(ui): add checks for cogview4's dimension restrictions

e6dc7b0

refactor(ui): simplify useIsEntityTypeEnabled

08af54e

chore(ui): lint

7c2a00e

chore(ui): typegen

c2c5766

psychedelicious force-pushed the ryan/cogview4 branch from e3b4b29 to c2c5766 Compare March 18, 2025 04:27

psychedelicious approved these changes Mar 18, 2025

psychedelicious marked this pull request as draft March 18, 2025 04:28

psychedelicious added 2 commits March 18, 2025 14:34

feat(app): update cogview4 t2i workflow w/ form

37791b6

feat(app): remove cogview4 inpaint workflow

a9e5977

This doesn't make sense to have as a default workflow given the trickiness of producing alpha masks.
