Skip to content

Implement Ernie Image model.#13369

Merged
comfyanonymous merged 1 commit into
masterfrom
temp_pr
Apr 12, 2026
Merged

Implement Ernie Image model.#13369
comfyanonymous merged 1 commit into
masterfrom
temp_pr

Conversation

@comfyanonymous
Copy link
Copy Markdown
Member

No description provided.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 12, 2026

📝 Walkthrough

Walkthrough

This pull request introduces support for ERNIE image diffusion models and integrates a Ministral 3.3B text encoder. The changes include a new ERNIE transformer implementation with rotary position embeddings, patch embeddings, and adaptive layer normalization for diffusion conditioning; a corresponding model class registered in the supported models framework; text encoder integration via new tokenizer and model wrapper classes for Ministral; and model detection logic to identify ERNIE checkpoints by their weight key patterns. The modifications span image model architecture, text encoder configuration, model registry, and checkpoint detection.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive No pull request description was provided, making it impossible to assess whether a description exists that relates to the changeset. Add a pull request description explaining the changes, motivation, and any relevant context for reviewers.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: implementing a new Ernie Image model with core components and integration across multiple modules.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@comfyanonymous comfyanonymous merged commit 31283d2 into master Apr 12, 2026
15 of 16 checks passed
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@comfy/sd.py`:
- Around line 1306-1307: The detector currently returns TEModel.MINISTRAL_3_3B
whenever weight.shape[0] == 3072 which is too broad; update the branch to first
check that the tensor key corresponds to a Ministral-specific parameter (e.g.
the key string equals or contains "model.layers.0.linear_attn.in_proj_a.weight")
and only then use the shape check as a secondary guard before returning
TEModel.MINRAL_3_3B (or TEModel.MINISTRAL_3_3B). Locate the check around
weight.shape[0] == 3072 and add the key/name presence test so only tensors from
Ministral's in_proj_a weight path are routed to the Ministral loader.

In `@comfy/text_encoders/ernie.py`:
- Around line 29-38: The wrapper factory te() defines a subclass ErnieTEModel_
that applies dtype_llama and llama_quantization_metadata but currently returns
the original ErnieTEModel; change the return to the configured subclass by
returning ErnieTEModel_ from te() so the overridden __init__ (and its applied
dtype_llama and model_options["quantization_metadata"]) is used when consumers
instantiate the model.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7a2120a5-e41a-48d1-9d29-dfdd68f39c22

📥 Commits

Reviewing files that changed from the base of the PR and between 55ebd28 and 27e7ad3.

📒 Files selected for processing (8)
  • comfy/ldm/ernie/model.py
  • comfy/model_base.py
  • comfy/model_detection.py
  • comfy/sd.py
  • comfy/supported_models.py
  • comfy/text_encoders/ernie.py
  • comfy/text_encoders/flux.py
  • comfy/text_encoders/llama.py

Comment thread comfy/sd.py
Comment on lines +1306 to +1307
if weight.shape[0] == 3072:
return TEModel.MINISTRAL_3_3B
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use a Ministral-specific key in the detector.

Line 1306 only checks for hidden size 3072, which is too broad for model selection. Any 3072-wide decoder-only TE that reaches this branch will now be routed through the ERNIE loader. Please gate this on a Ministral-specific key such as model.layers.0.linear_attn.in_proj_a.weight and keep the shape check as a secondary guard.

🐛 Proposed fix
-        if weight.shape[0] == 3072:
+        if weight.shape[0] == 3072 and "model.layers.0.linear_attn.in_proj_a.weight" in sd:
             return TEModel.MINISTRAL_3_3B
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy/sd.py` around lines 1306 - 1307, The detector currently returns
TEModel.MINISTRAL_3_3B whenever weight.shape[0] == 3072 which is too broad;
update the branch to first check that the tensor key corresponds to a
Ministral-specific parameter (e.g. the key string equals or contains
"model.layers.0.linear_attn.in_proj_a.weight") and only then use the shape check
as a secondary guard before returning TEModel.MINRAL_3_3B (or
TEModel.MINISTRAL_3_3B). Locate the check around weight.shape[0] == 3072 and add
the key/name presence test so only tensors from Ministral's in_proj_a weight
path are routed to the Ministral loader.

Comment on lines +29 to +38
def te(dtype_llama=None, llama_quantization_metadata=None):
class ErnieTEModel_(ErnieTEModel):
def __init__(self, device="cpu", dtype=None, model_options={}):
if dtype_llama is not None:
dtype = dtype_llama
if llama_quantization_metadata is not None:
model_options = model_options.copy()
model_options["quantization_metadata"] = llama_quantization_metadata
super().__init__(device=device, dtype=dtype, model_options=model_options)
return ErnieTEModel
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Return the configured subclass from te().

Line 38 returns ErnieTEModel instead of ErnieTEModel_, so dtype_llama and llama_quantization_metadata are never applied. That makes the new loader ignore the detection info passed in from both comfy/sd.py and comfy/supported_models.py.

🐛 Proposed fix
 def te(dtype_llama=None, llama_quantization_metadata=None):
     class ErnieTEModel_(ErnieTEModel):
         def __init__(self, device="cpu", dtype=None, model_options={}):
             if dtype_llama is not None:
                 dtype = dtype_llama
             if llama_quantization_metadata is not None:
                 model_options = model_options.copy()
                 model_options["quantization_metadata"] = llama_quantization_metadata
             super().__init__(device=device, dtype=dtype, model_options=model_options)
-    return ErnieTEModel
+    return ErnieTEModel_
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy/text_encoders/ernie.py` around lines 29 - 38, The wrapper factory te()
defines a subclass ErnieTEModel_ that applies dtype_llama and
llama_quantization_metadata but currently returns the original ErnieTEModel;
change the return to the configured subclass by returning ErnieTEModel_ from
te() so the overridden __init__ (and its applied dtype_llama and
model_options["quantization_metadata"]) is used when consumers instantiate the
model.

@comfyanonymous comfyanonymous deleted the temp_pr branch April 12, 2026 02:29
@makisekurisu-jp
Copy link
Copy Markdown

The model has been removed from Hugging Face, so it can be considered as ultimately not going to be open-sourced.

@envy-ai
Copy link
Copy Markdown

envy-ai commented Apr 15, 2026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants