|
I think there is some good stuff we can pull out of here (: btw, why gemma-3-12b-it-qat-IQ4_XS.gguf? |
|
@Green-Sky that's a very good question, probably "because I wasn't thinking about it" is the proper answer ;) |
|
Great work. How does this perform compared to ComfyUI? |
|
Haven't compared yet but gonna optimize further. |
|
Slightly funky still, so guess there's a subtle error somewhere, but I added fitting, so I managed to get 80 frames at 720p ("a black cat jumping at a brown mouse on green grass"): ltx2_cat_mouse_720p.webm |
|
@mudler yours looks much better, wonder if that's quants or if my implementation has a bug somewhere. Edit: might be distilled vs full too though. |
|
Probably FA is the culprit here - I'm running this on 26 GB VRAM total (3080 10 GB + 5060 16 GB), so really struggling to get anything reasonable :) |
// SD_CUDA_DEVICE_VAE      VAE (falls back to SD_CUDA_DEVICE)
// SD_CUDA_DEVICE_CONTROL  ControlNet (falls back to SD_CUDA_DEVICE)
// SD_VK_DEVICE            same pattern for the Vulkan build
// Setting any of these to -1 forces CPU for that component.
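For readers following the review, here is a minimal sketch of the fallback rule those comments describe: a component-specific variable wins, otherwise the general one applies, and -1 means CPU. The helper names `parse_device`, `resolve_device` and `vae_device` are hypothetical, and the default of device 0 when nothing is set is an assumption; none of this is taken from the PR's actual code.

```cpp
#include <cstdlib>

// Hypothetical helper (not from the PR): parse a device index from the raw
// value of an environment variable. Returns false when the value is unset,
// empty, or not a plain integer.
static bool parse_device(const char* value, int& out) {
    if (value == nullptr || *value == '\0') {
        return false;
    }
    char* end = nullptr;
    long parsed = std::strtol(value, &end, 10);
    if (*end != '\0') {
        return false;  // trailing garbage or no digits at all
    }
    out = static_cast<int>(parsed);
    return true;
}

// Resolve the device for one component. The component-specific value takes
// precedence, then the general value, then default_device (assumed to be 0).
// A result of -1 means "run this component on the CPU".
static int resolve_device(const char* specific, const char* general, int default_device = 0) {
    int device = default_device;
    if (parse_device(specific, device)) {
        return device;
    }
    if (parse_device(general, device)) {
        return device;
    }
    return default_device;
}

// Example wiring for the VAE, using the variable names from the diff comments.
static int vae_device() {
    return resolve_device(std::getenv("SD_CUDA_DEVICE_VAE"), std::getenv("SD_CUDA_DEVICE"));
}
```

Under this sketch, running with `SD_CUDA_DEVICE=0 SD_CUDA_DEVICE_VAE=1` would place the VAE on device 1, and `SD_CUDA_DEVICE_VAE=-1` would keep only the VAE on the CPU. Keeping the parsing separate from `std::getenv` makes the rule easy to check without touching the process environment.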
Just as a reminder: this should be coordinated with #1184.
Yeah, this is just a rough PoC for now.
I'm using the distilled model: |
|
@mudler yeah I'm doing full for some reason (probably the same one that caused me to pick IQ4_XS :D) |
|
So apparently there are some major divergences between CPU and CUDA Gemma3, which is a bit surprising (and it happens on both Q4_0 and the IQ4_XS quants). |

Please have mercy, had to murder my Claude Code to get this working.
ltx2_smoke_v2.webm