add chroma radiance support #910

leejet · 2025-10-22T17:08:14Z

.\bin\Release\sd.exe --diffusion-model  ..\..\ComfyUI\models\diffusion_models\Chroma1-Radiance-v0.4-Q8_0.gguf --t5xxl ..\..\ComfyUI\models\clip\t5xxl_fp16.safetensors  -p "a lovely cat holding a sign says 'chroma  radiance cpp'" --cfg-scale 4.0 --sampling-method euler -v

stduhpf · 2025-10-22T23:26:36Z

The Vulkan backend shows a deep-frying issue of its own that does not happen on ROCm:

(same prompt and settings as the example)

It looks kinda cool, but that's obviously not the expected result.

But at least it doesn't crash at higher resolutions:

Edit: running it with previews, the first couple of steps look fine, but then as the denoising progresses, the image gets more and more deep fried at every step.

Green-Sky · 2025-10-23T14:25:47Z

Even with the Q4_K_S quant (v0.3), I had to close every other process to make it fit into 8 GB of VRAM.

same settings as op, but clip-on-cpu + offload

and with smoothstep schedule

(so 20 steps, 4.0 cfg, and euler)

leejet · 2025-10-23T14:45:37Z

.\bin\Release\sd.exe --diffusion-model  ..\..\ComfyUI\models\diffusion_models\Chroma1-Radiance-v0.4-Q8_0.gguf --t5xxl ..\..\ComfyUI\models\clip\t5xxl_fp16.safetensors  -p "a lovely cat holding a sign says 'chroma  radiance cpp'" --cfg-scale 4.0 --sampling-method euler -v -H 1024 -W 1024 --diffusion-fa --chroma-disable-dit-mask

leejet · 2025-10-23T14:46:36Z

I implemented a simple workaround, and now large images can be generated correctly.

leejet · 2025-10-23T15:02:26Z

This workaround is temporary and can be removed after the PR ggml-org/llama.cpp#16744 is merged.

leejet · 2025-10-23T15:53:14Z

> .\bin\Release\sd.exe --diffusion-model  ..\..\ComfyUI\models\diffusion_models\Chroma1-Radiance-v0.4-Q8_0.gguf --t5xxl ..\..\ComfyUI\models\clip\t5xxl_fp16.safetensors  -p "a lovely cat holding a sign says 'chroma  radiance cpp'" --cfg-scale 4.0 --sampling-method euler -v -H 1024 -W 1024 --diffusion-fa --chroma-disable-dit-maskoma  radiance cpp'" --cfg-scale 4.0 --sampling-method euler -v -H 1024 -W 1024 --diffusion-fa --chroma-disable-dit-mask
>
> Paste error in the command.

Fixed.

leejet · 2025-10-23T16:49:45Z

I can generate images using the Vulkan backend that are similar to those produced by the CUDA backend. @stduhpf could you try again? Use the latest code.

stduhpf · 2025-10-23T18:00:20Z

@leejet I still have the exact same issue on Vulkan as before. Maybe it's driver-related?

leejet · 2025-10-24T16:59:09Z

> @leejet I still have the exact same issue on Vulkan as before. Maybe it's driver-related?

@stduhpf I wonder if you’ve tried updating the Vulkan SDK or your graphics driver to the latest version?

stduhpf · 2025-10-24T17:17:57Z

> @stduhpf I wonder if you’ve tried updating the Vulkan SDK or your graphics driver to the latest version?

I haven't.

> vulkaninfo --summary
WARNING: [Loader Message] Code 0 : Layer VK_LAYER_OBS_HOOK uses API version 1.3 which is older than the application specified API version of 1.4. May cause issues.
WARNING: [Loader Message] Code 0 : Layer VK_LAYER_RTSS uses API version 1.3 which is older than the application specified API version of 1.4. May cause issues.
==========
VULKANINFO
==========

Vulkan Instance Version: 1.4.304


Instance Extensions: count = 13
-------------------------------
VK_EXT_debug_report                    : extension revision 10
VK_EXT_debug_utils                     : extension revision 2
VK_EXT_swapchain_colorspace            : extension revision 5
VK_KHR_device_group_creation           : extension revision 1
VK_KHR_external_fence_capabilities     : extension revision 1
VK_KHR_external_memory_capabilities    : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2       : extension revision 1
VK_KHR_portability_enumeration         : extension revision 1
VK_KHR_surface                         : extension revision 25
VK_KHR_win32_surface                   : extension revision 6
VK_LUNARG_direct_driver_loading        : extension revision 1

Instance Layers: count = 17
---------------------------
VK_LAYER_AMD_switchable_graphics    AMD switchable graphics layer                 1.4.308  version 1
VK_LAYER_EOS_Overlay                Vulkan overlay layer for Epic Online Services 1.2.136  version 1
VK_LAYER_EOS_Overlay                Vulkan overlay layer for Epic Online Services 1.2.136  version 1
VK_LAYER_KHRONOS_profiles           Khronos Profiles layer                        1.3.283  version 1
VK_LAYER_KHRONOS_shader_object      Khronos Shader object layer                   1.3.283  version 1
VK_LAYER_KHRONOS_synchronization2   Khronos Synchronization2 layer                1.3.283  version 1
VK_LAYER_KHRONOS_validation         Khronos Validation Layer                      1.3.283  version 1
VK_LAYER_LUNARG_api_dump            LunarG API dump layer                         1.3.283  version 2
VK_LAYER_LUNARG_gfxreconstruct      GFXReconstruct Capture Layer Version 1.0.4    1.3.283  version 4194308
VK_LAYER_LUNARG_monitor             Execution Monitoring Layer                    1.3.283  version 1
VK_LAYER_LUNARG_screenshot          LunarG image capture layer                    1.3.283  version 1
VK_LAYER_OBS_HOOK                   Open Broadcaster Software hook                1.3.216  version 1
VK_LAYER_RENDERDOC_Capture          Debugging capture layer for RenderDoc         1.2.131  version 17
VK_LAYER_ROCKSTAR_GAMES_social_club Rockstar Games Social Club Layer              1.0.70   version 1
VK_LAYER_RTSS                       RTSS overlay hook bootstrap                   1.3.224  version 1
VK_LAYER_VALVE_steam_fossilize      Steam Pipeline Caching Layer                  1.4.303  version 1
VK_LAYER_VALVE_steam_overlay        Steam Overlay Layer                           1.3.207  version 1

Devices:
========
GPU0:
        apiVersion         = 1.4.308
        driverVersion      = 2.0.342
        vendorID           = 0x1002
        deviceID           = 0x731f
        deviceType         = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
        deviceName         = AMD Radeon RX 5700 XT
        driverID           = DRIVER_ID_AMD_PROPRIETARY
        driverName         = AMD proprietary driver
        driverInfo         = 25.6.1 (AMD proprietary shader compiler)
        conformanceVersion = 1.4.0.0
        deviceUUID         = 00000000-2700-0000-0000-000000000000
        driverUUID         = 414d442d-5749-4e2d-4452-560000000000
GPU1:
        apiVersion         = 1.4.308
        driverVersion      = 2.0.342
        vendorID           = 0x1002
        deviceID           = 0x73bf
        deviceType         = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
        deviceName         = AMD Radeon RX 6800
        driverID           = DRIVER_ID_AMD_PROPRIETARY
        driverName         = AMD proprietary driver
        driverInfo         = 25.6.1 (AMD proprietary shader compiler)
        conformanceVersion = 1.4.0.0
        deviceUUID         = 00000000-2a00-0000-0000-000000000000
        driverUUID         = 414d442d-5749-4e2d-4452-560000000000

I know my drivers are a few months out of date, but I don't think that should matter too much?

leejet · 2025-10-24T17:38:23Z

I'm not quite sure because I didn't have any issues when testing Vulkan on my end.

stduhpf · 2025-10-24T20:18:33Z

> test-backend-ops.exe | Select-String -pattern "FAIL"
ggml_vulkan: Found 2 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 5700 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 32768 | int dot: 0 | matrix cores: none
ggml_vulkan: 1 = AMD Radeon RX 6800 (AMD proprietary driver) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 32768 | int dot: 0 | matrix cores: none

[SSM_SCAN] NMSE = 0.006895084 > 0.000000100
SSM_SCAN(type=f32,d_state=128,head_dim=64,n_head=16,n_group=2,n_seq_tokens=32,n_seqs=4): FAIL
[SSM_SCAN] NMSE = 0.018997427 > 0.000000100
SSM_SCAN(type=f32,d_state=256,head_dim=64,n_head=8,n_group=2,n_seq_tokens=32,n_seqs=4): FAIL
  Backend Vulkan0: FAIL
[SSM_SCAN] NMSE = 0.014833752 > 0.000000100
SSM_SCAN(type=f32,d_state=128,head_dim=64,n_head=16,n_group=2,n_seq_tokens=32,n_seqs=4): FAIL
[SSM_SCAN] NMSE = 0.018720127 > 0.000000100
SSM_SCAN(type=f32,d_state=256,head_dim=64,n_head=8,n_group=2,n_seq_tokens=32,n_seqs=4): FAIL
  Backend Vulkan1: FAIL
FAIL

I don't think sd.cpp uses SSM_SCAN anywhere?
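For longer logs, the failing ops can be grouped per backend with a few lines of Python. This is only a sketch: it assumes the output keeps the shape shown above (op lines ending in `: FAIL`, followed by a `Backend <name>: <status>` summary line for that device), and the function name `failing_ops_by_backend` is made up for illustration.

```python
import re
from collections import defaultdict

# Matches op result lines such as "SSM_SCAN(type=f32,...): FAIL".
OP_FAIL = re.compile(r"^\s*([A-Z0-9_]+)\(.*\): FAIL\s*$")
# Matches per-device summary lines such as "  Backend Vulkan0: FAIL".
BACKEND = re.compile(r"^\s*Backend (\S+): \w+\s*$")


def failing_ops_by_backend(log_text):
    """Group failing op names under the backend summary line that follows them.

    Assumption: each backend's op lines are printed before its summary line,
    as in the log pasted in this thread.
    """
    result = defaultdict(set)
    pending = set()  # failing ops seen since the last backend summary line
    for line in log_text.splitlines():
        op = OP_FAIL.match(line)
        if op:
            pending.add(op.group(1))
            continue
        backend = BACKEND.match(line)
        if backend:
            result[backend.group(1)] |= pending
            pending = set()
    return {name: sorted(ops) for name, ops in result.items()}
```

Feeding it the filtered output above would map both `Vulkan0` and `Vulkan1` to a single failing op name, `SSM_SCAN`, which makes it quick to check whether any failing op is one the model actually uses.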

leejet · 2025-10-25T12:25:36Z

Yes, ssm_scan isn’t used in sd.cpp. Since I can’t reproduce your issue and both Vulkan and CUDA work fine in my tests, I think this PR can be merged.

stduhpf · 2025-10-25T14:49:14Z

@leejet I just noticed a somewhat similar (but less obvious) issue with Qwen-Image on Vulkan.

(Side-by-side image comparison: ROCm | Vulkan)

I'm thinking it could be the ggml_chunk operation that is behaving strangely on my end. As far as I know, only Chroma Radiance, Qwen Image and Wan use that op, right?

Edit: There is a (slight) difference between ROCm and Vulkan builds on most models I have tested so far. Only for Qwen, and especially Chroma Radiance, does the image look significantly worse on Vulkan.
Wan 14B is the only one so far where the images were absolutely identical (tested sd1.x, Flux Dev, Wan 14B and 5B, Qwen Image, Chroma HD, and Chroma Radiance).
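To put a number on "slight" versus "significantly worse", the two outputs could be compared with a normalized mean squared error, similar in spirit to the NMSE figures printed by test-backend-ops. The sketch below is an assumption about how one might do that, not the metric ggml uses internally: it takes two equally sized flat sequences of pixel values (decoding the images is left out) and uses the definition sum((r - c)^2) / sum(r^2).

```python
def nmse(reference, candidate):
    """Normalized mean squared error between two equally sized sequences.

    Defined here as sum((r - c)^2) / sum(r^2); this normalization is an
    assumption chosen for illustration.
    """
    if len(reference) != len(candidate):
        raise ValueError("inputs must have the same length")
    squared_error = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    reference_energy = sum(r * r for r in reference)
    if reference_energy == 0:
        raise ValueError("reference is all zeros; NMSE is undefined")
    return squared_error / reference_energy
```

With the ROCm image as `reference` and the Vulkan image as `candidate` (same seed and settings), identical outputs give exactly 0, and larger values mean the backends diverge more.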

leejet · 2025-10-25T15:54:29Z

> As far as I know, only Chroma Radiance, Qwen Image and Wan use that op, right?

Yes. If the issue is really caused by ggml_chunk, I’m a bit suspicious that it might actually be a buffer management problem with Vulkan on certain devices.
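To illustrate why a chunk-style op could expose a buffer problem, here is a conceptual Python sketch. It is not sd.cpp's code: it assumes chunking splits a contiguous tensor into equal pieces along its outermost axis and that each piece is a view at a non-zero offset into the parent buffer rather than a copy. The helper names `chunk_ranges` and `read_chunk` are hypothetical.

```python
from array import array
from math import prod


def chunk_ranges(shape, num_chunks):
    """Describe equal chunks along the outermost axis of a row-major tensor.

    Returns a list of (element_offset, chunk_shape) pairs.
    """
    outer = shape[0]
    if outer % num_chunks != 0:
        raise ValueError("outermost axis must divide evenly into chunks")
    step = outer // num_chunks
    inner = prod(shape[1:])  # elements per index of the outermost axis
    return [(i * step * inner, (step, *shape[1:])) for i in range(num_chunks)]


def read_chunk(buffer, offset, chunk_shape):
    """Return a zero-copy view of one chunk inside the parent buffer."""
    count = prod(chunk_shape)
    return memoryview(buffer)[offset:offset + count]


# A 6x4 tensor stored flat, split into 3 chunks of shape 2x4.
parent = array("f", range(24))
offset, shape = chunk_ranges((6, 4), 3)[1]
middle = read_chunk(parent, offset, shape)
```

Under these assumptions every chunk after the first starts at a non-zero offset in the same allocation, so a backend that mishandles view offsets on some devices would corrupt only the models that rely on such views.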

stduhpf · 2025-10-25T16:01:03Z

At least, whatever causes it, the error is deterministic. I get the same broken image every time.

leejet · 2025-10-25T16:05:33Z

My device can't reproduce this issue, which makes it very difficult for me to locate and fix this bug.

stduhpf · 2025-10-25T16:11:52Z

I'm mostly using ROCm anyway; I only tried Vulkan because I wanted to test the model at higher resolutions before you implemented the workaround. I'm going to check how it behaves on my "GPU0", but it will take a while.
Edit: I won't be able to test it before Monday. I'm away for the weekend and forgot to set up my PC for remote access.

add chroma radiance support

6a46206

leejet mentioned this pull request Oct 22, 2025

Chroma Radiance (v0.4) #807

Open

leejet added 3 commits October 23, 2025 01:14

fix ci

c916a6b

Merge branch 'master' into chroma_radiance

2c8b907

simply generate_init_latent

18a2804

stduhpf approved these changes Oct 22, 2025

stduhpf mentioned this pull request Oct 23, 2025

Eval bug: ROCm error: CUBLAS_STATUS_INTERNAL_ERROR ggml-org/llama.cpp#15244

Closed

workaround: avoid ggml cuda error

27272ef

format code

458c365

Merge branch 'master' into chroma_radiance

0d9d5e0

leejet added 2 commits October 25, 2025 23:41

Merge branch 'master' into chroma_radiance

c052f03

add chroma radiance doc

a64034e

leejet merged commit 9e28be6 into master Oct 25, 2025
8 checks passed
