Fix Kohya UNet LoRA key conversion for conv_in/conv_out/time_embedding #14006
Merged
sayakpaul merged 4 commits on Jun 28, 2026
_convert_unet_lora_key() had no mapping for these three top-level UNet submodules, so Kohya-format keys touching them (e.g. lora_unet_conv_in, lora_unet_time_embed_0/2) came out as conv.in/conv.out/time.embed.0/2 instead of conv_in/conv_out/time_embedding.linear_1/2, and were reported as unexpected keys instead of being applied.
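The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the diffusers implementation: `naive_convert`, `patched_convert`, and `TOP_LEVEL_PATCHES` are hypothetical names, and the sketch assumes the converter strips the `lora_unet_` prefix and replaces every underscore with a dot before applying name patches.

```python
PREFIX = "lora_unet_"


def naive_convert(kohya_module: str) -> str:
    """Strip the Kohya prefix and turn every underscore into a dot."""
    if kohya_module.startswith(PREFIX):
        kohya_module = kohya_module[len(PREFIX):]
    return kohya_module.replace("_", ".")


# Hypothetical patch table: dotted spelling -> diffusers module name.
TOP_LEVEL_PATCHES = {
    "conv.in": "conv_in",
    "conv.out": "conv_out",
    "time.embed.0": "time_embedding.linear_1",
    "time.embed.2": "time_embedding.linear_2",
}


def patched_convert(kohya_module: str) -> str:
    """Apply the naive conversion, then repair the top-level module names."""
    name = naive_convert(kohya_module)
    for dotted, target in TOP_LEVEL_PATCHES.items():
        # Match whole path components only, never a partial component.
        if name == dotted or name.startswith(dotted + "."):
            return target + name[len(dotted):]
    return name


print(naive_convert("lora_unet_conv_in"))        # conv.in
print(patched_convert("lora_unet_time_embed_2"))  # time_embedding.linear_2
```

Without the patch table, `conv.in` matches no UNet parameter, which is why such keys end up in the unexpected-keys report.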
Author (Contributor): Needs a test. I'll mark the PR as ready as soon as that passes.
The initial fix mapped conv_in/conv_out in the diffusers spelling (conv.in/conv.out) and time_embedding in the sgm spelling (time_embed.0/.2), so neither SD1.x nor SDXL was fully covered. Add the missing spellings:
- sgm conv_in/conv_out: input_blocks.0.0 / out.2 (kohya-ss SDXL sgm UNet), mapped before the block renames so input_blocks.0.0 does not become down_blocks.0.0.
- diffusers time_embedding: time_embedding.linear_1/2 (kohya-ss trains SD1.x on the diffusers UNet).
Verified against kohya-ss source (sdxl_original_unet.py, networks/lora.py) and the diffusers UNet module names; regression set unchanged.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
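The ordering constraint in this commit message (specific renames before the generic block rename) can be illustrated with a small sketch. The rule lists and function names here are hypothetical and only model the two sgm spellings named above plus one generic `input.blocks` rename; the real converter has more rules.

```python
# Specific top-level renames (sgm spelling after underscore -> dot conversion).
SPECIFIC = [("input.blocks.0.0", "conv_in"), ("out.2", "conv_out")]
# Generic block rename that would otherwise swallow input.blocks.0.0.
GENERIC = [("input.blocks", "down_blocks")]


def apply_renames(name: str, rules) -> str:
    """Apply the first rule whose prefix matches on a component boundary."""
    for old, new in rules:
        if name == old or name.startswith(old + "."):
            return new + name[len(old):]
    return name


def specific_first(name: str) -> str:
    return apply_renames(name, SPECIFIC + GENERIC)


def generic_first(name: str) -> str:
    return apply_renames(name, GENERIC + SPECIFIC)


print(specific_first("input.blocks.0.0"))  # conv_in
print(generic_first("input.blocks.0.0"))   # down_blocks.0.0
print(specific_first("input.blocks.1.1"))  # down_blocks.1.1
```

Running the generic rule first turns `input.blocks.0.0` into `down_blocks.0.0`, which is the misclassification the commit avoids.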
The conv_in/conv_out/time_embedding fix only reached _convert_unet_lora_key; for the SDXL sgm UNet those keys never got there, because _maybe_map_sgm_blocks_to_diffusers treats every non-text key as a down/mid/up block. The top-level modules that live outside that block structure (time_embed, label_emb, out = conv_out, and input_blocks.0.0 = conv_in) hit the "layer not supported" raise, or crashed the inner block-index int() parse.
- Pass those top-level modules through unchanged so _convert_unet_lora_key maps them, instead of block-remapping or raising.
- Map the sgm label_emb (SDXL added-conditioning MLP) to diffusers add_embedding: label_emb.0.0/0.2 -> add_embedding.linear_1/2, before the SDXL index-strip heuristic that would otherwise collapse the layer index.
All additions follow the kohya/sgm naming pattern and are no-ops on real kohya-ss files (which contain none of these top-level UNet LoRA keys); verified end-to-end loading a full SDXL sgm UNet LoRA into the diffusers pipeline with no unexpected/missing adapter keys.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
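The pass-through idea can be sketched as a pre-filter that separates top-level sgm modules from block keys before any block remapping happens. This is an illustration under assumptions: `TOP_LEVEL_SGM`, `is_top_level`, `split_keys`, and `map_label_emb` are hypothetical names, and the keys use the dotted sgm spelling from the commit message rather than the exact key format the library sees.

```python
# Top-level sgm modules that live outside the down/mid/up block structure.
TOP_LEVEL_SGM = ("time_embed", "label_emb", "out", "input_blocks.0.0")

# sgm added-conditioning MLP -> diffusers add_embedding.
LABEL_EMB = {
    "label_emb.0.0": "add_embedding.linear_1",
    "label_emb.0.2": "add_embedding.linear_2",
}


def is_top_level(sgm_key: str) -> bool:
    """True if the key names one of the top-level modules (component-wise match)."""
    return any(sgm_key == p or sgm_key.startswith(p + ".") for p in TOP_LEVEL_SGM)


def split_keys(keys):
    """Separate pass-through keys from keys that need block remapping."""
    passthrough, block_keys = [], []
    for key in keys:
        (passthrough if is_top_level(key) else block_keys).append(key)
    return passthrough, block_keys


def map_label_emb(sgm_key: str) -> str:
    """Rename label_emb layers; leave every other key untouched."""
    for old, new in LABEL_EMB.items():
        if sgm_key == old or sgm_key.startswith(old + "."):
            return new + sgm_key[len(old):]
    return sgm_key


keys = ["time_embed.0", "input_blocks.0.0", "input_blocks.4.1", "out.2", "output_blocks.3.0"]
print(split_keys(keys))
```

Matching on component boundaries keeps `output_blocks.*` out of the `out` pass-through, so only the final convolution is treated as top-level.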
Author (Contributor): This PR is ready for review now.
sayakpaul approved these changes on Jun 28, 2026
sayakpaul (Member) left a comment:
Thanks for the massive contribution!
Fixes #14005.
It is in principle related to #14080, as it addresses layers that aren't trained that often, but can be trained.
What does this PR do?
Kohya-format UNet LoRA keys for several top-level UNet submodules weren't being
converted to diffusers names, so they didn't match any parameter and were reported as
unexpected keys instead of being applied. This covers both UNet dialects kohya-ss
trains on: the diffusers UNet (SD 1.x) and the sgm/LDM UNet (SDXL).
_convert_unet_lora_key: add the missing name patches:
- conv.in → conv_in, conv.out → conv_out
- input.blocks.0.0 → conv_in, out.2 → conv_out (mapped before the block renames so input_blocks.0.0 isn't mistaken for a down-block)
- time.embed.0/.2 → time_embedding.linear_1/2 (sgm) and time.embedding.linear.1/2 → time_embedding.linear_1/2 (diffusers)
- label.emb.0.0/0.2 → add_embedding.linear_1/2 (SDXL added-conditioning MLP)
_maybe_map_sgm_blocks_to_diffusers: pass top-level sgm modules (time_embed, label_emb, out, input_blocks.0.0) through unchanged so the key converter handles them, instead of block-remapping or hitting the "layer not supported" raise.
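As a rough sanity check of the patch list above, the sketch below collects the listed renames in one table and compares names against a set of UNet module names. `NAME_PATCHES`, `UNET_MODULES`, and `unexpected` are hypothetical names for illustration; the module set is limited to the six targets named in this PR and is not the full diffusers UNet.

```python
# Dotted (post underscore -> dot) spelling -> diffusers module name.
NAME_PATCHES = {
    "conv.in": "conv_in",
    "conv.out": "conv_out",
    "input.blocks.0.0": "conv_in",
    "out.2": "conv_out",
    "time.embed.0": "time_embedding.linear_1",
    "time.embed.2": "time_embedding.linear_2",
    "time.embedding.linear.1": "time_embedding.linear_1",
    "time.embedding.linear.2": "time_embedding.linear_2",
    "label.emb.0.0": "add_embedding.linear_1",
    "label.emb.0.2": "add_embedding.linear_2",
}

# Assumed subset of UNet module names that the patches target.
UNET_MODULES = {
    "conv_in",
    "conv_out",
    "time_embedding.linear_1",
    "time_embedding.linear_2",
    "add_embedding.linear_1",
    "add_embedding.linear_2",
}


def unexpected(names):
    """Return the names that match no known module, sorted for stable output."""
    return sorted(set(names) - UNET_MODULES)


print(unexpected(NAME_PATCHES))           # every unpatched spelling is unexpected
print(unexpected(NAME_PATCHES.values()))  # []
```

Every unpatched spelling is reported as unexpected, while every patched target resolves to a known module, which mirrors the before/after behavior the description claims.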
Who can review?
PEFT: @sayakpaul @BenjaminBossan