Ideogram
Quote: Ideogram 4 is Ideogram's first open-weight text-to-image model. It is a state-of-the-art foundation model trained from scratch — not a fine-tune of any existing model. It introduces a structured JSON prompting interface, with multilingual text rendering, deep language understanding, explicit bounding-box layout and color-palette controls, and native 2k resolution images.
The original Ideogram-4 model was released in FP8 and NF4 variants. SD.Next ships it with BF16 weights so users can experiment with any type of quantization using native quantization-during-load features.
Warning
Ideogram-4 is a large model, and aggressive quantization is highly recommended.
Because of its RAM and VRAM requirements, it is not usable without quantization on most consumer hardware.
Ideogram-4 consists of:
- text-encoder: qwen-3-vl-8b
- transformer: primary 9b transformer
- unconditional_transformer: second 9b transformer
- vae: flux-2-vae
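To make the quantization advice concrete, here is a rough weight-only estimate of the component sizes listed above. It is a sketch under stated assumptions: parameter counts are read off the component names (8B, 9B, 9B), the VAE is ignored, and the bytes-per-parameter figures are nominal values that leave out quantization metadata, activations, and framework overhead.

```python
# Approximate parameter counts in billions, taken from the component names.
# Assumption: the VAE is small enough to ignore in this estimate.
PARAMS_BILLIONS = {
    "text-encoder": 8.0,
    "transformer": 9.0,
    "unconditional_transformer": 9.0,
}

# Nominal storage cost per parameter; real quantized checkpoints carry
# some extra metadata, so treat these as lower bounds.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "nf4": 0.5}


def weight_gb(dtype: str) -> float:
    """Return the approximate total weight size in GB (1e9 bytes)."""
    total_params = sum(PARAMS_BILLIONS.values()) * 1e9
    return total_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_gb(dtype):.0f} GB of weights")
```

Under these assumptions the unquantized BF16 weights alone come to roughly 52 GB, which is why quantization is needed on most consumer hardware.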
Note
A typical model uses the same UNet or transformer for both positive and negative guidance.
For Ideogram-4, there is no negative prompt as such. Instead, the second transformer denoises with zeroed text features. Using two separate transformers during each step enables stronger guidance and better prompt adherence, but it also raises VRAM requirements above typical models.
Tip
Running two transformers can cause VRAM swapping if both are not pinned to VRAM. Pin both transformers to VRAM if you have enough memory.
Experimental: Settings -> Model options -> Ideogram 4 -> Pin transformers to VRAM
Tip
SD.Next provides an experimental option to disable the second transformer. This saves VRAM, but quality drops and noise increases, so keep it enabled if you can afford it.
Experimental: Settings -> Model options -> Ideogram 4 -> Enable conditional guidance
By default, Ideogram-4 uses adaptive guidance scheduling: a guidance scale of 7.0 for the first 90% of steps, then 3.0 for the remaining steps.
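The schedule described above can be sketched as a simple step function. This is a minimal illustration, not SD.Next's implementation: the function name, the parameter names, and the rounding of the switch point are assumptions.

```python
def guidance_for_step(
    step: int,
    total_steps: int,
    high: float = 7.0,
    low: float = 3.0,
    switch_fraction: float = 0.9,
) -> float:
    """Return the guidance scale for a zero-based step index.

    Assumption: the switch point is rounded down, so with 48 steps
    the first 43 steps use `high` and the last 5 use `low`.
    """
    switch_step = int(total_steps * switch_fraction)
    return high if step < switch_step else low


# Print the schedule for a 48-step run.
schedule = [guidance_for_step(i, 48) for i in range(48)]
print(schedule.count(7.0), "steps at 7.0,", schedule.count(3.0), "steps at 3.0")
```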
Tip
If guidance scale is 0 or 1, negative guidance is not calculated and the second transformer is not used. This is equivalent to running a TURBO-style model.
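The shortcut described here can be illustrated with a small sketch. It assumes the two predictions are combined with the standard classifier-free guidance formula, which the source does not confirm for Ideogram-4. The function and its callables are hypothetical stand-ins for the two transformers, and plain lists of floats stand in for tensors.

```python
from typing import Callable, List

Vector = List[float]


def guided_prediction(
    predict_cond: Callable[[], Vector],
    predict_uncond: Callable[[], Vector],
    guidance_scale: float,
) -> Vector:
    """Combine conditional and unconditional predictions for one step."""
    cond = predict_cond()
    # Following the text: a scale of 0 or 1 means guidance is disabled,
    # so the second (unconditional) transformer is never called.
    if guidance_scale in (0.0, 1.0):
        return cond
    uncond = predict_uncond()
    # Standard classifier-free guidance formula (an assumption here).
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


# With scale 1.0 only the conditional prediction is computed.
print(guided_prediction(lambda: [1.0, 2.0], lambda: [0.0, 0.0], 1.0))
```

Skipping the second call roughly halves the transformer work per step, which is where the TURBO-style comparison comes from.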
Ideogram-4 requires a large number of steps to generate good results.
The authors' default is 48.
Warning
Ideogram-4 is not usable without its specific JSON-based prompt format.
Structure:
{
  "high_level_description": "",
  "compositional_deconstruction": {
    "background": "",
    "elements": [
      {
        "type": "obj",
        "desc": ""
      }
    ]
  }
}
Tip
SD.Next provides an LLM-based prompt enhancer that can rewrite a normal text prompt into Ideogram-4 JSON format.
Alternatively, you can write the JSON prompt yourself and pass it directly.
Settings: Model options -> Ideogram 4 -> Enable prompt-enhance
This enables or disables the prompt enhancer, which uses the same Qwen-3-VL-8B model that Ideogram-4 uses for text encoding.
The enhancer expands and rewrites prompts into Ideogram-4 JSON format.
It is enabled by default because Ideogram-4 is not usable without a JSON prompt.
It is not fast: enhancement takes approximately 30 seconds on an RTX 4090.
If you pass a detailed JSON prompt, the enhancer detects it and skips enhancement. If prompt enhance is disabled and you pass a non-JSON prompt, SD.Next converts it naively into Ideogram-4 JSON format.
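The detect-or-convert behavior can be illustrated with a short sketch. SD.Next's actual detection rule and naive conversion are not described here, so both helpers below are hypothetical. The sketch assumes a prompt counts as Ideogram-4 JSON when it parses as a JSON object with a `high_level_description` key, and that the naive conversion places the whole text prompt in that field.

```python
import json


def is_ideogram_json(prompt: str) -> bool:
    """Hypothetical check: does the prompt already look like Ideogram-4 JSON?"""
    try:
        data = json.loads(prompt)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and "high_level_description" in data


def naive_wrap(prompt: str) -> str:
    """Hypothetical naive conversion of a plain text prompt."""
    if is_ideogram_json(prompt):
        return prompt  # already structured, pass it through unchanged
    data = {
        "high_level_description": prompt,
        "compositional_deconstruction": {"background": "", "elements": []},
    }
    return json.dumps(data, ensure_ascii=False)


print(naive_wrap("a red fox reading a newspaper"))
```

A naive wrapper like this leaves the background and elements empty, so a hand-written or enhanced JSON prompt carries far more layout detail.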