v0.44.0 — SD 3.5-medium rescued + corpus breadth

@vulogov released this 06 Jun 07:57

SD 3.5-medium: broken → working, verified

SD 3.5-medium had been listed as supported since the SD3 line landed but had never once rendered an image: it is the sixth "shape-tested, never verified" model the proof corpus has caught. It now generates end to end on a 24 GB Mac (BF16-native on Metal).

It didn't even load: plakat's MMDiT loader expects the SAI single-file layout (fused joint_blocks QKV), whereas Stability ships the diffusers transformer (split transformer_blocks Q/K/V). A diffusers→SAI remapper fixed the load. The forward pass and conditioning path hid five more bugs, none of which a single-forward check could catch:

  • pooled-y concatenated [CLIP-G, CLIP-L] vs diffusers' [CLIP-L, CLIP-G] — scrambled the adaLN conditioning so it never denoised
  • flow-match timestep passed raw [0,1] instead of ×1000
  • AdaLayerNormContinuous read (shift, scale) vs diffusers' (scale, shift) (×2) — a 2700-magnitude outlier
  • missing QK-norm on the context-qkv-only block; F16 timestep embed; sd35-medium variant mis-detect
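
The first three fixes in the list above come down to matching diffusers' ordering and scaling conventions exactly. A minimal sketch of those conventions follows. The helper names (`pooled_y`, `scale_timestep`, `split_scale_shift`, `modulate`) are hypothetical and are not plakat's API, and the `x * (1 + scale) + shift` form is the modulation assumed here for an already-normalised input:

```rust
/// Concatenate pooled text embeddings in the order diffusers uses:
/// CLIP-L first, then CLIP-G. (Hypothetical helper, not plakat's API.)
fn pooled_y(clip_l: &[f32], clip_g: &[f32]) -> Vec<f32> {
    let mut y = Vec::with_capacity(clip_l.len() + clip_g.len());
    y.extend_from_slice(clip_l);
    y.extend_from_slice(clip_g);
    y
}

/// Map a flow-match time in [0, 1] to the value fed to the timestep
/// embedding, which expects the x1000 range.
fn scale_timestep(sigma: f32) -> f32 {
    sigma * 1000.0
}

/// Split a modulation vector of length 2*d into (scale, shift),
/// with scale in the first half and shift in the second.
fn split_scale_shift(modulation: &[f32]) -> (&[f32], &[f32]) {
    let d = modulation.len() / 2;
    modulation.split_at(d)
}

/// Apply x * (1 + scale) + shift element-wise to a normalised vector.
fn modulate(x: &[f32], scale: &[f32], shift: &[f32]) -> Vec<f32> {
    x.iter()
        .zip(scale)
        .zip(shift)
        .map(|((x, s), b)| x * (1.0 + s) + b)
        .collect()
}
```

Swapping either pair (CLIP-G before CLIP-L, or shift before scale) still yields tensors of the right shape, which is why a shape-only check cannot see these bugs.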

The MMDiT now matches diffusers' SD3Transformer2DModel at corr 1.0.
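
For readers unfamiliar with the metric, here is a sketch of how such a comparison can be computed, assuming "corr" means the Pearson correlation between the flattened plakat output and the flattened reference output. The function name `pearson` is hypothetical and this is not necessarily how the proof corpus computes it:

```rust
/// Pearson correlation between two equally sized, flattened tensors.
/// Returns None for mismatched or empty inputs, or when either input
/// has zero variance (the correlation is undefined in that case).
fn pearson(a: &[f32], b: &[f32]) -> Option<f64> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let n = a.len() as f64;
    let mean_a = a.iter().map(|&v| v as f64).sum::<f64>() / n;
    let mean_b = b.iter().map(|&v| v as f64).sum::<f64>() / n;

    let (mut cov, mut var_a, mut var_b) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (&x, &y) in a.iter().zip(b) {
        let dx = x as f64 - mean_a;
        let dy = y as f64 - mean_b;
        cov += dx * dy;
        var_a += dx * dx;
        var_b += dy * dy;
    }
    if var_a == 0.0 || var_b == 0.0 {
        return None;
    }
    Some(cov / (var_a.sqrt() * var_b.sqrt()))
}
```

Correlation is invariant to a uniform scale or offset, so a check like this is usually paired with a maximum absolute difference when exact numerical agreement matters.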

Proof-corpus breadth (19 → 38 entries)

  • sd35.hjson — SD 3.5-medium, incl. a legible "FRESH BREAD" sign
  • upscale.sh — ML upscale (Real-ESRGAN ×2), opens the Transforms & post category
  • img2img.sh — prompt-steered style transfer (photo → oil / watercolour / sumi-e)
  • portrait.sh + portrait.hjson — text personas + a reference-photo lookalike (IP-Adapter-Plus-Face)
  • weather-scene.hjson — one area re-lit + re-weathered across the scenario scene × weather axes

Regression held at 1269 lib tests throughout; no other model affected. Published to crates.io.