Description
Git commit
5e4579c
Operating System & Version
Windows 11
GGML backends
CUDA
Command-line arguments used
--diffusion-model z_image_turbo-Q4_K.gguf --vae ae.safetensors -p "A cinematic, melancholic photograph of a solitary hooded figure walking through a sprawling, rain-slicked metropolis at night. The city lights are a chaotic blur of neon orange and cool blue, reflecting on the wet asphalt. The scene evokes a sense of being a single component in a vast machine. Superimposed over the image in a sleek, modern, slightly glitched font is the philosophical quote: 'THE CITY IS A CIRCUIT BOARD, AND I AM A BROKEN TRANSISTOR.' -- moody, atmospheric, profound, dark academic" --cfg-scale 1.0 -v --offload-to-cpu --diffusion-fa
Steps to reproduce
Download the release build and the models, then run the example from the wiki.
What you expected to happen
The command should run and generate an image, as shown in the wiki.
What actually happened
[ERROR] model.cpp:1697 - tensor 'text_encoders.llm.model.embed_tokens.weight' not in model file
[ERROR] model.cpp:1697 - tensor 'text_encoders.llm.model.norm.weight' not in model file
[ERROR] stable-diffusion.cpp:782 - load tensors from model loader failed
[INFO ] main.cpp:694 - new_sd_ctx_t failed
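To make the failure easier to triage, here is a minimal sketch that pulls the missing tensor names out of a log like this one and groups them by name prefix. It assumes the error line format shown above stays the same across versions. The function name `group_missing_tensors` and the `depth` parameter are hypothetical helpers, not part of stable-diffusion.cpp.

```python
import re
from collections import defaultdict

# Matches the loader error format seen in this report.
# Assumption: the "[ERROR] <file>:<line> - tensor '<name>' not in model file"
# layout is stable; adjust the pattern if your build prints something else.
MISSING_RE = re.compile(r"\[ERROR\]\s+\S+\s+-\s+tensor '([^']+)' not in model file")


def group_missing_tensors(log_text, depth=2):
    """Group missing tensor names by their first `depth` dot-separated parts."""
    groups = defaultdict(list)
    for line in log_text.splitlines():
        match = MISSING_RE.search(line)
        if match:
            name = match.group(1)
            prefix = ".".join(name.split(".")[:depth])
            groups[prefix].append(name)
    return dict(groups)


# Error lines copied from this report.
SAMPLE = """\
[ERROR] model.cpp:1697 - tensor 'text_encoders.llm.model.embed_tokens.weight' not in model file
[ERROR] model.cpp:1697 - tensor 'text_encoders.llm.model.norm.weight' not in model file
[ERROR] stable-diffusion.cpp:782 - load tensors from model loader failed
"""

if __name__ == "__main__":
    for prefix, names in group_missing_tensors(SAMPLE).items():
        print(f"{prefix}: {len(names)} missing")
```

On the lines above, both missing tensors fall under the `text_encoders.llm` prefix, so the failure appears to be limited to the text encoder rather than the diffusion model or the VAE.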
Logs / error messages / stack trace
.\bin\sd-cli.exe --diffusion-model z_image_turbo-Q4_K.gguf --vae ae.safetensors -p "A cinematic, melancholic photograph of a solitary hooded figure walking through a sprawling, rain-slicked metropolis at night. The city lights are a chaotic blur of neon orange and cool blue, reflecting on the wet asphalt. The scene evokes a sense of being a single component in a vast machine. Superimposed over the image in a sleek, modern, slightly glitched font is the philosophical quote: 'THE CITY IS A CIRCUIT BOARD, AND I AM A BROKEN TRANSISTOR.' -- moody, atmospheric, profound, dark academic" --cfg-scale 1.0 -v --offload-to-cpu --diffusion-fa
[DEBUG] main.cpp:500 - version: stable-diffusion.cpp version unknown, commit 5e4579c
[DEBUG] main.cpp:501 - System Info:
SSE3 = 1 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | VSX = 0 |
[DEBUG] main.cpp:502 - SDCliParams {
mode: img_gen,
output_path: "output.png",
verbose: true,
color: false,
canny_preprocess: false,
convert_name: false,
preview_method: none,
preview_interval: 1,
preview_path: "preview.png",
preview_fps: 16,
taesd_preview: false,
preview_noisy: false
}
[DEBUG] main.cpp:503 - SDContextParams {
n_threads: 16,
model_path: "",
clip_l_path: "",
clip_g_path: "",
clip_vision_path: "",
t5xxl_path: "",
llm_path: "",
llm_vision_path: "",
diffusion_model_path: "z_image_turbo-Q4_K.gguf",
high_noise_diffusion_model_path: "",
vae_path: "ae.safetensors",
taesd_path: "",
esrgan_path: "",
control_net_path: "",
embedding_dir: "",
embeddings: {
}
wtype: NONE,
tensor_type_rules: "",
lora_model_dir: "",
photo_maker_path: "",
rng_type: cuda,
sampler_rng_type: NONE,
flow_shift: INF
offload_params_to_cpu: true,
enable_mmap: false,
control_net_cpu: false,
clip_on_cpu: false,
vae_on_cpu: false,
diffusion_flash_attn: true,
diffusion_conv_direct: false,
vae_conv_direct: false,
circular: false,
circular_x: false,
circular_y: false,
chroma_use_dit_mask: true,
qwen_image_zero_cond_t: false,
chroma_use_t5_mask: false,
chroma_t5_mask_pad: 1,
prediction: NONE,
lora_apply_mode: auto,
vae_tiling_params: { 0, 0, 0, 0.5, 0, 0 },
force_sdxl_vae_conv_scale: false
}
[DEBUG] main.cpp:504 - SDGenerationParams {
loras: "{
}",
high_noise_loras: "{
}",
prompt: "A cinematic, melancholic photograph of a solitary hooded figure walking through a sprawling, rain-slicked metropolis at night. The city lights are a chaotic blur of neon orange and cool blue, reflecting on the wet asphalt. The scene evokes a sense of being a single component in a vast machine. Superimposed over the image in a sleek, modern, slightly glitched font is the philosophical quote: 'THE CITY IS A CIRCUIT BOARD, AND I AM A BROKEN TRANSISTOR.' -- moody, atmospheric, profound, dark academic",
negative_prompt: "",
clip_skip: -1,
width: -1,
height: -1,
batch_count: 1,
init_image_path: "",
end_image_path: "",
mask_image_path: "",
control_image_path: "",
ref_image_paths: [],
control_video_path: "",
auto_resize_ref_image: true,
increase_ref_index: false,
pm_id_images_dir: "",
pm_id_embed_path: "",
pm_style_strength: 20,
skip_layers: [7, 8, 9],
sample_params: (txt_cfg: 1.00, img_cfg: 1.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: NONE, sample_method: NONE, sample_steps: 20, eta: 0.00, shifted_timestep: 0),
high_noise_skip_layers: [7, 8, 9],
high_noise_sample_params: (txt_cfg: 7.00, img_cfg: 7.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: NONE, sample_method: NONE, sample_steps: 20, eta: 0.00, shifted_timestep: 0),
custom_sigmas: [],
cache_mode: "",
cache_option: "",
cache: disabled (threshold=1, start=0.15, end=0.95),
moe_boundary: 0.875,
video_frames: 1,
fps: 16,
vace_strength: 1,
strength: 0.75,
control_strength: 0.9,
seed: 42,
upscale_repeats: 1,
upscale_tile_size: 128,
}
[DEBUG] stable-diffusion.cpp:164 - Using CUDA backend
[INFO ] ggml_extend.hpp:78 - ggml_cuda_init: found 1 CUDA devices:
[INFO ] ggml_extend.hpp:78 - Device 0: NVIDIA GeForce RTX 5060 Laptop GPU, compute capability 12.0, VMM: yes
[INFO ] stable-diffusion.cpp:258 - loading diffusion model from 'z_image_turbo-Q4_K.gguf'
[INFO ] model.cpp:370 - load z_image_turbo-Q4_K.gguf using gguf format
[DEBUG] model.cpp:416 - init from 'z_image_turbo-Q4_K.gguf'
[INFO ] stable-diffusion.cpp:319 - loading vae from 'ae.safetensors'
[INFO ] model.cpp:373 - load ae.safetensors using safetensors format
[DEBUG] model.cpp:507 - init from 'ae.safetensors', prefix = 'vae.'
[INFO ] stable-diffusion.cpp:335 - Version: Z-Image
[INFO ] stable-diffusion.cpp:363 - Weight type stat: f32: 495 | q8_0: 22 | q4_K: 180
[INFO ] stable-diffusion.cpp:364 - Conditioner weight type stat:
[INFO ] stable-diffusion.cpp:365 - Diffusion model weight type stat: f32: 251 | q8_0: 22 | q4_K: 180
[INFO ] stable-diffusion.cpp:366 - VAE weight type stat: f32: 244
[DEBUG] stable-diffusion.cpp:368 - ggml tensor size = 400 bytes
[DEBUG] llm.hpp:285 - merges size 151387
[DEBUG] llm.hpp:317 - vocab size: 151669
[DEBUG] llm.hpp:1139 - llm: num_layers = 0, vocab_size = 152064, hidden_size = 3584, intermediate_size = 18944
[INFO ] stable-diffusion.cpp:573 - Using flash attention in the diffusion model
[DEBUG] ggml_extend.hpp:1922 - qwen3 params backend buffer size = 2079.01 MB(RAM) (2 tensors)
[DEBUG] ggml_extend.hpp:1922 - z_image params backend buffer size = 3685.21 MB(RAM) (453 tensors)
[DEBUG] ggml_extend.hpp:1922 - vae params backend buffer size = 94.57 MB(RAM) (138 tensors)
[DEBUG] stable-diffusion.cpp:752 - loading weights
[DEBUG] model.cpp:1381 - using 16 threads for model loading
[DEBUG] model.cpp:1403 - loading tensors from z_image_turbo-Q4_K.gguf
|================================> | 453/697 - 341.63it/s
[DEBUG] model.cpp:1403 - loading tensors from ae.safetensors
|==================================================| 697/697 - 454.96it/s
[INFO ] model.cpp:1629 - loading tensors completed, taking 1.53s (process: 0.00s, read: 1.35s, memcpy: 0.00s, convert: 0.01s, copy_to_backend: 0.00s)
[ERROR] model.cpp:1697 - tensor 'text_encoders.llm.model.embed_tokens.weight' not in model file
[ERROR] model.cpp:1697 - tensor 'text_encoders.llm.model.norm.weight' not in model file
[ERROR] stable-diffusion.cpp:782 - load tensors from model loader failed
[INFO ] main.cpp:694 - new_sd_ctx_t failed
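As a cross-check on the log above, the sketch below lists which `*_path` fields in the `SDContextParams` dump are set and which are empty. It assumes the `key: "value",` layout printed by this build. The helper name `split_path_params` is hypothetical. That an empty `llm_path` is related to the missing `text_encoders.llm.*` tensors is an inference from the names and from the empty "Conditioner weight type stat" line, not something the log states directly.

```python
import re

# Matches dump lines such as:  llm_path: "",
# Assumption: path parameters end in "_path" and are printed as quoted strings.
PARAM_RE = re.compile(r'^\s*(\w+_path):\s*"([^"]*)"\s*,?\s*$')


def split_path_params(dump_text):
    """Return (set_params, empty_params) for the *_path fields in a params dump."""
    set_params = {}
    empty_params = []
    for line in dump_text.splitlines():
        match = PARAM_RE.match(line)
        if not match:
            continue  # not a path parameter line
        key, value = match.groups()
        if value:
            set_params[key] = value
        else:
            empty_params.append(key)
    return set_params, empty_params


# A few lines copied from the SDContextParams dump in this report.
SAMPLE_DUMP = """\
n_threads: 16,
model_path: "",
llm_path: "",
diffusion_model_path: "z_image_turbo-Q4_K.gguf",
vae_path: "ae.safetensors",
"""

if __name__ == "__main__":
    set_params, empty_params = split_path_params(SAMPLE_DUMP)
    print("set:", set_params)
    print("empty:", empty_params)
```

For this run only `diffusion_model_path` and `vae_path` are set, so no text encoder file was passed on the command line.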
Additional context / environment details
NVIDIA GeForce RTX 5060 Laptop GPU