[Bug] cannot run z-image #1225

@sdfdewewewd14-lgtm

Description

Git commit

fa61ea7

Operating System & Version

Windows 11

GGML backends

CUDA

Command-line arguments used

--diffusion-model z_image_turbo-Q4_K.gguf --vae ae.safetensors -p "A cinematic, melancholic photograph of a solitary hooded figure walking through a sprawling, rain-slicked metropolis at night. The city lights are a chaotic blur of neon orange and cool blue, reflecting on the wet asphalt. The scene evokes a sense of being a single component in a vast machine. Superimposed over the image in a sleek, modern, slightly glitched font is the philosophical quote: 'THE CITY IS A CIRCUIT BOARD, AND I AM A BROKEN TRANSISTOR.' -- moody, atmospheric, profound, dark academic" --cfg-scale 1.0 -v --offload-to-cpu --diffusion-fa

Steps to reproduce

Download the release build and the models, then run the example from the wiki.

What you expected to happen

The example should run as shown in the wiki.

What actually happened

[ERROR] model.cpp:1697 - tensor 'text_encoders.llm.model.embed_tokens.weight' not in model file
[ERROR] model.cpp:1697 - tensor 'text_encoders.llm.model.norm.weight' not in model file
[ERROR] stable-diffusion.cpp:782 - load tensors from model loader failed
[INFO ] main.cpp:694 - new_sd_ctx_t failed
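
To make the failure easier to triage, here is a minimal sketch that pulls the missing tensor names out of the log and groups them by prefix. The helper name `missing_tensors_by_prefix` and the `depth` parameter are hypothetical, not part of stable-diffusion.cpp; the regular expression assumes the error line format shown above.

```python
import re
from collections import defaultdict

# Matches the loader's "tensor '<name>' not in model file" error lines.
MISSING_RE = re.compile(r"tensor '([^']+)' not in model file")


def missing_tensors_by_prefix(log_text, depth=2):
    """Group missing tensor names by their first `depth` dotted components.

    Hypothetical helper for reading the log; duplicates are dropped
    so a name reported twice is listed once.
    """
    groups = defaultdict(list)
    for name in MISSING_RE.findall(log_text):
        prefix = ".".join(name.split(".")[:depth])
        if name not in groups[prefix]:
            groups[prefix].append(name)
    return dict(groups)


# Error lines copied from this report.
log = """\
[ERROR] model.cpp:1697 - tensor 'text_encoders.llm.model.embed_tokens.weight' not in model file
[ERROR] model.cpp:1697 - tensor 'text_encoders.llm.model.norm.weight' not in model file
[ERROR] stable-diffusion.cpp:782 - load tensors from model loader failed
"""

for prefix, names in missing_tensors_by_prefix(log).items():
    print(prefix, len(names))
```

Both missing tensors fall under `text_encoders.llm`. The verbose log below also shows `llm_path: ""`, an empty "Conditioner weight type stat" line, and `llm: num_layers = 0`. Together these suggest that no text encoder weights were loaded, not that the diffusion model file is damaged. If the build in use exposes an `--llm` option (as the `llm_path` field implies), passing a text encoder file through it may be the missing step; this is a guess from the log, not a confirmed fix.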

Logs / error messages / stack trace

.\bin\sd-cli.exe --diffusion-model  z_image_turbo-Q4_K.gguf --vae ae.safetensors -p "A cinematic, melancholic photograph of a solitary hooded figure walking through a sprawling, rain-slicked metropolis at night. The city lights are a chaotic blur of neon orange and cool blue, reflecting on the wet asphalt. The scene evokes a sense of being a single component in a vast machine. Superimposed over the image in a sleek, modern, slightly glitched font is the philosophical quote: 'THE CITY IS A CIRCUIT BOARD, AND I AM A BROKEN TRANSISTOR.' -- moody, atmospheric, profound, dark academic" --cfg-scale 1.0 -v --offload-to-cpu --diffusion-fa
[DEBUG] main.cpp:500  - version: stable-diffusion.cpp version unknown, commit 5e4579c
[DEBUG] main.cpp:501  - System Info:
    SSE3 = 1 |     AVX = 1 |     AVX2 = 1 |     AVX512 = 0 |     AVX512_VBMI = 0 |     AVX512_VNNI = 0 |     FMA = 1 |     NEON = 0 |     ARM_FMA = 0 |     F16C = 1 |     FP16_VA = 0 |     WASM_SIMD = 0 |     VSX = 0 |
[DEBUG] main.cpp:502  - SDCliParams {
  mode: img_gen,
  output_path: "output.png",
  verbose: true,
  color: false,
  canny_preprocess: false,
  convert_name: false,
  preview_method: none,
  preview_interval: 1,
  preview_path: "preview.png",
  preview_fps: 16,
  taesd_preview: false,
  preview_noisy: false
}
[DEBUG] main.cpp:503  - SDContextParams {
  n_threads: 16,
  model_path: "",
  clip_l_path: "",
  clip_g_path: "",
  clip_vision_path: "",
  t5xxl_path: "",
  llm_path: "",
  llm_vision_path: "",
  diffusion_model_path: "z_image_turbo-Q4_K.gguf",
  high_noise_diffusion_model_path: "",
  vae_path: "ae.safetensors",
  taesd_path: "",
  esrgan_path: "",
  control_net_path: "",
  embedding_dir: "",
  embeddings: {
  }
  wtype: NONE,
  tensor_type_rules: "",
  lora_model_dir: "",
  photo_maker_path: "",
  rng_type: cuda,
  sampler_rng_type: NONE,
  flow_shift: INF
  offload_params_to_cpu: true,
  enable_mmap: false,
  control_net_cpu: false,
  clip_on_cpu: false,
  vae_on_cpu: false,
  diffusion_flash_attn: true,
  diffusion_conv_direct: false,
  vae_conv_direct: false,
  circular: false,
  circular_x: false,
  circular_y: false,
  chroma_use_dit_mask: true,
  qwen_image_zero_cond_t: false,
  chroma_use_t5_mask: false,
  chroma_t5_mask_pad: 1,
  prediction: NONE,
  lora_apply_mode: auto,
  vae_tiling_params: { 0, 0, 0, 0.5, 0, 0 },
  force_sdxl_vae_conv_scale: false
}
[DEBUG] main.cpp:504  - SDGenerationParams {
  loras: "{
  }",
  high_noise_loras: "{
  }",
  prompt: "A cinematic, melancholic photograph of a solitary hooded figure walking through a sprawling, rain-slicked metropolis at night. The city lights are a chaotic blur of neon orange and cool blue, reflecting on the wet asphalt. The scene evokes a sense of being a single component in a vast machine. Superimposed over the image in a sleek, modern, slightly glitched font is the philosophical quote: 'THE CITY IS A CIRCUIT BOARD, AND I AM A BROKEN TRANSISTOR.' -- moody, atmospheric, profound, dark academic",
  negative_prompt: "",
  clip_skip: -1,
  width: -1,
  height: -1,
  batch_count: 1,
  init_image_path: "",
  end_image_path: "",
  mask_image_path: "",
  control_image_path: "",
  ref_image_paths: [],
  control_video_path: "",
  auto_resize_ref_image: true,
  increase_ref_index: false,
  pm_id_images_dir: "",
  pm_id_embed_path: "",
  pm_style_strength: 20,
  skip_layers: [7, 8, 9],
  sample_params: (txt_cfg: 1.00, img_cfg: 1.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: NONE, sample_method: NONE, sample_steps: 20, eta: 0.00, shifted_timestep: 0),
  high_noise_skip_layers: [7, 8, 9],
  high_noise_sample_params: (txt_cfg: 7.00, img_cfg: 7.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: NONE, sample_method: NONE, sample_steps: 20, eta: 0.00, shifted_timestep: 0),
  custom_sigmas: [],
  cache_mode: "",
  cache_option: "",
  cache: disabled (threshold=1, start=0.15, end=0.95),
  moe_boundary: 0.875,
  video_frames: 1,
  fps: 16,
  vace_strength: 1,
  strength: 0.75,
  control_strength: 0.9,
  seed: 42,
  upscale_repeats: 1,
  upscale_tile_size: 128,
}
[DEBUG] stable-diffusion.cpp:164  - Using CUDA backend
[INFO ] ggml_extend.hpp:78   - ggml_cuda_init: found 1 CUDA devices:
[INFO ] ggml_extend.hpp:78   -   Device 0: NVIDIA GeForce RTX 5060 Laptop GPU, compute capability 12.0, VMM: yes
[INFO ] stable-diffusion.cpp:258  - loading diffusion model from 'z_image_turbo-Q4_K.gguf'
[INFO ] model.cpp:370  - load z_image_turbo-Q4_K.gguf using gguf format
[DEBUG] model.cpp:416  - init from 'z_image_turbo-Q4_K.gguf'
[INFO ] stable-diffusion.cpp:319  - loading vae from 'ae.safetensors'
[INFO ] model.cpp:373  - load ae.safetensors using safetensors format
[DEBUG] model.cpp:507  - init from 'ae.safetensors', prefix = 'vae.'
[INFO ] stable-diffusion.cpp:335  - Version: Z-Image
[INFO ] stable-diffusion.cpp:363  - Weight type stat:                      f32: 495  |    q8_0: 22   |    q4_K: 180
[INFO ] stable-diffusion.cpp:364  - Conditioner weight type stat:
[INFO ] stable-diffusion.cpp:365  - Diffusion model weight type stat:      f32: 251  |    q8_0: 22   |    q4_K: 180
[INFO ] stable-diffusion.cpp:366  - VAE weight type stat:                  f32: 244
[DEBUG] stable-diffusion.cpp:368  - ggml tensor size = 400 bytes
[DEBUG] llm.hpp:285  - merges size 151387
[DEBUG] llm.hpp:317  - vocab size: 151669
[DEBUG] llm.hpp:1139 - llm: num_layers = 0, vocab_size = 152064, hidden_size = 3584, intermediate_size = 18944
[INFO ] stable-diffusion.cpp:573  - Using flash attention in the diffusion model
[DEBUG] ggml_extend.hpp:1922 - qwen3 params backend buffer size =  2079.01 MB(RAM) (2 tensors)
[DEBUG] ggml_extend.hpp:1922 - z_image params backend buffer size =  3685.21 MB(RAM) (453 tensors)
[DEBUG] ggml_extend.hpp:1922 - vae params backend buffer size =  94.57 MB(RAM) (138 tensors)
[DEBUG] stable-diffusion.cpp:752  - loading weights
[DEBUG] model.cpp:1381 - using 16 threads for model loading
[DEBUG] model.cpp:1403 - loading tensors from z_image_turbo-Q4_K.gguf
  |================================>                 | 453/697 - 341.63it/s
[DEBUG] model.cpp:1403 - loading tensors from ae.safetensors
  |==================================================| 697/697 - 454.96it/s
[INFO ] model.cpp:1629 - loading tensors completed, taking 1.53s (process: 0.00s, read: 1.35s, memcpy: 0.00s, convert: 0.01s, copy_to_backend: 0.00s)
[ERROR] model.cpp:1697 - tensor 'text_encoders.llm.model.embed_tokens.weight' not in model file
[ERROR] model.cpp:1697 - tensor 'text_encoders.llm.model.norm.weight' not in model file
[ERROR] stable-diffusion.cpp:782  - load tensors from model loader failed
[INFO ] main.cpp:694  - new_sd_ctx_t failed

Additional context / environment details

NVIDIA GeForce RTX 5060 Laptop GPU (compute capability 12.0)
