Add CogView4 model support #7770
@@ -38,7 +38,7 @@ dependencies = [
   "clip_anytorch==2.6.0", # replacing "clip @ https://github.com/openai/CLIP/archive/eaa22acb90a5876642d0507623e859909230a52d.zip",
   "compel==2.0.2",
   "controlnet-aux==0.0.7",
-  "diffusers[torch]==0.31.0",
+  "diffusers[torch] @ git+https://github.com/huggingface/diffusers.git@fbf6b856cc61fd22ad8635547bff4aafe05723f3", # We are pinning to a commit to get access to CogView4, which hasn't been released yet.
We need to decide if we are comfortable with this, or want to wait for the next diffusers release.
251044c to febcb72
While reviewing the CogView4 HF repo, I noticed this inference restriction:
See: https://huggingface.co/THUDM/CogView4-6B#inference-requirements-and-model-introduction

This introduces a new type of constraint. You'd expect the max dimensions to be 2048 x 2048, but that is 4,194,304 pixels, which exceeds the max pixel count of 2^21 = 2,097,152. So we may need to make some changes to dimension constraints to support CogView4.

Also, image sizes must be divisible by 32. This needs to be handled in a number of areas.

Note: I'm still downloading the model (slow internet today), so I haven't actually tested yet. Maybe this is a non-issue. Just reviewing the model docs and taking notes.
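For reference, here is a minimal sketch of the two documented constraints as a standalone check. It only encodes what the model card is quoted as saying above (pixel cap of 2^21, sides divisible by 32). The function and constant names are hypothetical and are not taken from this PR or from diffusers, and any per-side minimum or maximum is deliberately left out.

```python
# Hypothetical helper illustrating the documented CogView4 size constraints.
# These names are assumptions for illustration, not part of InvokeAI or diffusers.
MAX_PIXELS = 2**21  # 2,097,152: documented cap on width * height
MULTIPLE = 32       # documented: each side must be divisible by 32


def check_cogview4_dims(width: int, height: int) -> list[str]:
    """Return a list of violated constraints (empty if the size passes both checks)."""
    problems = []
    for name, value in (("width", width), ("height", height)):
        if value % MULTIPLE != 0:
            problems.append(f"{name}={value} is not divisible by {MULTIPLE}")
    if width * height > MAX_PIXELS:
        problems.append(
            f"{width}x{height} = {width * height} pixels exceeds {MAX_PIXELS}"
        )
    return problems


if __name__ == "__main__":
    for size in [(1024, 2048), (2048, 2048), (1000, 1000)]:
        print(size, check_cogview4_dims(*size) or "ok")
```

Under these assumptions 1024x2048 sits exactly at the cap and passes, while 2048x2048 fails on pixel count alone. Whether the model actually enforces the cap still needs to be tested.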
e5e6f28 to 6586293
The max number of pixels requirement seems to be fake news. I can generate larger images than 1024x2048, though I OOM with 24GB VRAM around 1700x2000 on VAE decode. I've added checks for the dimensions.
…nt is documented here: https://huggingface.co/THUDM/CogView4-6B. I haven't tracked down the underlying source of this requirement.
…workflow running (though quality is still below expectations).
…mestep schedule slipping.
… and update usage
e3b4b29 to c2c5766
This doesn't make sense to have as a default workflow given the trickiness of producing alpha masks.
This PR has Feels risky to merge this and release w/ an unreleased, potentially unstable Marked as draft to prevent premature merge.
Great to see InvokeAI support for CogView4! We'll try to do a diffusers release asap to unblock this (hopefully next week) 🤗
Summary
Add support for the CogView4 model in nodes.
Example workflows:
Example
Expanded prompt:
Result:

Follow-up work
Related Issues / Discussions
N/A
QA Instructions
Merge Plan
Checklist
What's New copy (if doing a release after this PR)