LoRA Technical
How LoRA and the wider LyCORIS adapter family are loaded and applied on the Diffusers backend. This is a code map for contributors. For usage and settings, see LoRA and Networks.
All code lives in modules/lora/, with per-architecture native loaders in pipelines/<arch>/<arch>_lora.py.
| File | Responsibility |
|---|---|
| extra_networks_lora.py | Entry point. Parses <lora:name:weight> from the prompt and drives activate/deactivate. Registered as an ExtraNetwork. |
| lora_overrides.py | Method selection (get_method), the allow_native arch list, and the force lists. |
| lora_load.py | Load orchestration: network_load, the native loader load_safetensors, the _NATIVE_DISPATCH registry, disk scanning, and lora_cache. |
| lora_diffusers.py | Diffusers/PEFT path: load_diffusers, adapter and scale tracking. |
| lora_nunchaku.py | Nunchaku path for Nunchaku-quantized models. |
| native_adapter.py | Generic LyCORIS family loaders and key-parsing scaffolding shared by the per-arch native loaders. |
| network.py | Data model: NetworkOnDisk, Network, NetworkModule, ModuleType. |
| network_*.py | Per-family ModuleType and NetworkModule subclasses that implement calc_updown (lora, lokr, hada, oft, ia3, glora, norm, full, boft). |
| lora_apply.py | Weight patching: backup, network_calc_weights (accumulate updown), and the two apply functions. |
| networks.py | Apply lifecycle: network_activate, network_deactivate, the native_active flag. |
| lora_common.py | Shared module-level state, imported everywhere as l: loaded_networks, previously_loaded_networks, module_types, timer, debug. |
| lora_convert.py | Key conversion: kohya to diffusers, compvis module-name assignment, KeyConvert. |
| pipelines/<arch>/<arch>_lora.py | Per-arch native loaders for the arches in _NATIVE_DISPATCH. |
Activation runs once per generation through the extra-networks system, after the prompt is parsed.
flowchart TD
P["prompt with <lora:name:weight>"] --> A["ExtraNetworkLora.activate"]
A --> PA["parse: names, te/unet multipliers, dyn_dims, lora_modules"]
PA --> GM["lora_overrides.get_method"]
GM -->|diffusers| ND1["network_load -> load_diffusers per net"]
GM -->|nunchaku| NK["lora_nunchaku.load_nunchaku"]
GM -->|native| ND2["network_load -> load_safetensors per net"]
ND1 --> SA["pipe.set_adapters, optional fuse_lora"]
ND2 --> CH{"changed?"}
CH -->|yes| DE["network_deactivate (fuse mode only)"]
DE --> AC["network_activate: patch module weights"]
CH -->|no| SKIP["skip, weights already applied"]
SA --> DONE["generation runs with adapters live"]
AC --> DONE
NK --> DONE
SKIP --> DONE
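The first step of the flow, pulling LoRA tags out of the prompt, can be sketched as follows. This is a minimal illustration, not the project's parser: the regular expression, the function name parse_lora_tags, and the default_weight parameter are assumptions, and the sketch handles only the <lora:name> and <lora:name:weight> forms, not the te/unet multipliers or dyn_dims mentioned above.

```python
import re

# Hypothetical pattern covering only <lora:NAME> and <lora:NAME:WEIGHT>.
LORA_TAG = re.compile(r"<lora:([^:>]+)(?::([^:>]+))?>")


def parse_lora_tags(prompt, default_weight=1.0):
    """Return the prompt with tags removed and a list of (name, weight) pairs."""
    networks = []
    for name, weight in LORA_TAG.findall(prompt):
        # A missing weight falls back to the default multiplier (assumed behavior).
        networks.append((name, float(weight) if weight else default_weight))
    cleaned = LORA_TAG.sub("", prompt)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()  # collapse gaps left by removed tags
    return cleaned, networks


cleaned, networks = parse_lora_tags("a cat <lora:detail:0.8> on a sofa <lora:style>")
print(cleaned)   # a cat on a sofa
print(networks)  # [('detail', 0.8), ('style', 1.0)]
```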
network_load runs the per-network load:
- l.loaded_networks is cleared and each name is resolved to a NetworkOnDisk through gather_networks.
- Per network, get_method(shorthash) is called again and the load is dispatched to load_diffusers, the nunchaku no-op (handled separately), or load_safetensors.
- Each loaded Network is appended to l.loaded_networks with its multipliers set.
- lora_cache is trimmed to lora_in_memory_limit.
The apply step differs by path:
- Diffusers adapters are activated by set_adapters, optionally fused.
- Native networks are applied by network_activate, which patches the model's module weights directly.
- Nunchaku is handled inside its own loader.
get_method returns (method, reason). The reason is logged so an automatic fallback can be told apart from a user opt-in. Method is decided per network so a single file hash can be forced even when the arch supports native.
flowchart TD
S["get_method(shorthash)"] --> NQ{"Nunchaku transformer or unet loaded?"}
NQ -->|yes| RN["nunchaku"]
NQ -->|no| FD{"opts.lora_force_diffusers?"}
FD -->|yes| RD1["diffusers: opt-in"]
FD -->|no| FC{"pipeline class in force_classes_diffusers?"}
FC -->|yes| RD2["diffusers: class-forced"]
FC -->|no| AN{"sd_model_type in allow_native?"}
AN -->|no| RD3["diffusers: arch-unsupported"]
AN -->|yes| FH{"file hash in force_hashes_diffusers?"}
FH -->|yes| RD4["diffusers: hash-forced"]
FH -->|no| RNAT["native: default"]
allow_native lists the arches eligible for native loading: sd, sdxl, sd3, f1, f2, chroma, zimage, anima, ernieimage. Any other arch always takes the diffusers path. force_classes_diffusers currently holds the FluxKontext pipeline classes.
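The decision order in the flowchart can be written as a chain of early returns. The sketch below is an assumption-laden illustration: the real get_method reads its inputs from shared state, whereas get_method_sketch takes them as hypothetical keyword arguments, and the reason string for the nunchaku branch is a placeholder because the flowchart does not give one.

```python
# Arch list copied from the text above; all other inputs are stand-ins for real state.
ALLOW_NATIVE = {"sd", "sdxl", "sd3", "f1", "f2", "chroma", "zimage", "anima", "ernieimage"}


def get_method_sketch(shorthash, *, model_type, pipeline_class,
                      nunchaku_loaded=False, force_diffusers=False,
                      force_classes=frozenset(), force_hashes=frozenset()):
    """Return (method, reason), checking the conditions in flowchart order."""
    if nunchaku_loaded:
        return "nunchaku", "nunchaku-model"  # reason text is a placeholder
    if force_diffusers:
        return "diffusers", "opt-in"
    if pipeline_class in force_classes:
        return "diffusers", "class-forced"
    if model_type not in ALLOW_NATIVE:
        return "diffusers", "arch-unsupported"
    if shorthash in force_hashes:
        return "diffusers", "hash-forced"
    return "native", "default"


default = get_method_sketch("abc123", model_type="sdxl",
                            pipeline_class="StableDiffusionXLPipeline")
forced = get_method_sketch("abc123", model_type="sdxl",
                           pipeline_class="StableDiffusionXLPipeline",
                           force_hashes={"abc123"})
print(default)  # ('native', 'default')
print(forced)   # ('diffusers', 'hash-forced')
```

Because the hash check comes last, a single file can be pushed to the diffusers path without affecting other networks on the same arch.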
load_safetensors is the native loader. It splits on shared.sd_model_type.
flowchart TD
LS["load_safetensors(name, network_on_disk)"] --> C{"name in lora_cache?"}
C -->|yes| RET["return cached Network"]
C -->|no| D{"sd_model_type in _NATIVE_DISPATCH?"}
D -->|yes| PA["import pipelines/<arch>/<arch>_lora.py"]
PA --> TL["mod.try_load -> native_adapter.try_load_chain"]
TL --> NET1["Network with NetworkModule per matched key"]
D -->|no| GEN["generic key-parsing loop"]
GEN --> KC["lora_convert.KeyConvert maps keys to model modules"]
KC --> MT["l.module_types create_module per match"]
MT --> NET2["Network with NetworkModule per matched key"]
Per-arch native adapters cover the arches in _NATIVE_DISPATCH: zimage, chroma, ernieimage, f2, anima. Each maps to a module exposing try_load(name, network_on_disk, lora_scale).
The generic loop covers the remaining native-eligible arches (sd, sdxl, sd3, f1):
- The state dict is read, and kohya to diffusers conversion is applied for f1 and sd3.
- Keys are matched to model modules through KeyConvert.
- NetworkModule instances are built using the factories in l.module_types.
Both sub-paths return a Network whose modules dict is keyed by layer name. They share the same apply machinery. The diffusers path does not build modules, which is why an applied network is detected by len(net.modules) > 0.
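The cache check and the split between per-arch and generic loading can be sketched as a lookup with a fallback. This is a simplified model under stated assumptions: the real loader imports a module per arch and calls its try_load, while the sketch uses plain callables, and storing the result in the cache inside this function is an assumption. All names ending in _sketch or starting with fake_ are hypothetical.

```python
def load_safetensors_sketch(name, model_type, cache, native_dispatch, generic_loader):
    """Return a cached network, else load via the per-arch loader or the generic one."""
    if name in cache:
        return cache[name]
    # Arches missing from the dispatch table fall through to the generic loop.
    loader = native_dispatch.get(model_type, generic_loader)
    network = loader(name)
    cache[name] = network  # assumption: the result is cached at this point
    return network


calls = []


def fake_zimage_loader(name):
    calls.append(("zimage", name))
    return {"name": name, "modules": {"layers.0.attention.to_q": "module"}}


def fake_generic_loader(name):
    calls.append(("generic", name))
    return {"name": name, "modules": {"unet.down.0.attn": "module"}}


cache = {}
dispatch = {"zimage": fake_zimage_loader}
first = load_safetensors_sketch("detail", "zimage", cache, dispatch, fake_generic_loader)
second = load_safetensors_sketch("detail", "zimage", cache, dispatch, fake_generic_loader)
third = load_safetensors_sketch("style", "sdxl", cache, dispatch, fake_generic_loader)
print(calls)  # [('zimage', 'detail'), ('generic', 'style')]
```

The second call for the same name never reaches a loader, which is why only two loader calls are recorded.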
native_adapter.py holds the generic, arch-independent half of the per-arch loaders:
- Family loaders, one per adapter type: try_load_lora (with DoRA), try_load_lokr, try_load_loha, try_load_oft, try_load_ia3, try_load_glora, try_load_norm, try_load_full.
- try_load_chain(name, nod, scale, family_loaders) runs each family loader in order and merges the non-empty results into one Network.
- Key-parsing helpers: parse_key, group_by_suffixes, prefix detection, marker checks, PEFT name stripping.
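The chain-and-merge behavior described for try_load_chain can be illustrated with a small stand-in. The sketch makes several assumptions: it passes a plain state dict where the real function receives a NetworkOnDisk, it represents a network as a dict, it returns None when nothing matched, and the two fake_ loaders match keys by the common lora_down.weight and lokr_w1 suffixes only.

```python
def try_load_chain_sketch(name, state_dict, scale, family_loaders):
    """Run each family loader in order and merge the non-empty results."""
    modules = {}
    for loader in family_loaders:
        found = loader(state_dict, scale)
        if found:  # empty results are skipped
            modules.update(found)
    if not modules:
        return None  # assumption: no match means no network
    return {"name": name, "modules": modules}


def fake_lora_loader(state_dict, scale):
    suffix = ".lora_down.weight"
    return {k[:-len(suffix)]: ("lora", scale) for k in state_dict if k.endswith(suffix)}


def fake_lokr_loader(state_dict, scale):
    suffix = ".lokr_w1"
    return {k[:-len(suffix)]: ("lokr", scale) for k in state_dict if k.endswith(suffix)}


state_dict = {
    "blocks.0.attn.lora_down.weight": None,
    "blocks.0.attn.lora_up.weight": None,
    "blocks.1.mlp.lokr_w1": None,
    "blocks.1.mlp.lokr_w2": None,
}
network = try_load_chain_sketch("mixed", state_dict, 1.0,
                                [fake_lora_loader, fake_lokr_loader])
print(network["modules"])
# {'blocks.0.attn': ('lora', 1.0), 'blocks.1.mlp': ('lokr', 1.0)}
```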
A per-arch module is thin. In it:
- The arch's key prefixes and a resolve_targets remap are set.
- The needed family loaders are bound.
- A single try_load that calls try_load_chain is exposed.
Coverage varies by arch:
| Arch | Module | Families |
|---|---|---|
| zimage | pipelines/z_image/zimage_lora.py | LoRA, LoKR, LoHA, OFT |
| chroma | pipelines/chroma/chroma_lora.py | LoRA, LoKR, LoHA, OFT |
| ernieimage | pipelines/ernie/ernie_lora.py | LoRA, LoKR, LoHA, OFT |
| f2 | pipelines/flux/flux2_lora.py | LoRA, LoKR, LoHA, OFT, IA3, GLoRA, Norm |
| anima | pipelines/anima/anima_lora.py | full set including Full |
resolve_targets is where an arch rewrites legacy or fused key layouts to the current diffusers module names. For example, Z-Image rewrites the fused attention.qkv key to to_q/to_k/to_v and the bare attention.out key to to_out.0, chunking the fused up-weight at load time.
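Chunking a fused up-weight can be pictured with plain nested lists. The sketch below is only an illustration of the idea: the function name split_fused_qkv is hypothetical, and it assumes the fused rows are stacked in q, k, v order and that all three projections reuse the same down-weight, neither of which is confirmed by the text above.

```python
def split_fused_qkv(up_rows, down_rows):
    """Split a fused qkv LoRA up-weight (list of rows) into three equal chunks."""
    if len(up_rows) % 3 != 0:
        raise ValueError("fused up-weight row count must be divisible by 3")
    n = len(up_rows) // 3
    # Assumption: rows are ordered q, k, v and the down-weight is shared.
    return {
        "to_q": (up_rows[:n], down_rows),
        "to_k": (up_rows[n:2 * n], down_rows),
        "to_v": (up_rows[2 * n:], down_rows),
    }


up = [[1], [2], [3], [4], [5], [6]]  # 6 x 1: two output rows per projection, rank 1
down = [[1, 0]]                      # 1 x 2: rank 1, two input features
parts = split_fused_qkv(up, down)
print(parts["to_k"][0])  # [[3], [4]]
```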
classDiagram
class NetworkOnDisk {
name
filename
shorthash
metadata
sd_version
detect_version()
}
class Network {
name
network_on_disk
te_multiplier
unet_multiplier
modules
bundle_embeddings
}
class NetworkModule {
network
sd_module
alpha
scale
dora_scale
calc_updown()
calc_scale()
multiplier()
}
class ModuleType {
create_module()
}
NetworkOnDisk "1" --> "*" Network : loaded into
Network "1" --> "*" NetworkModule : modules
ModuleType ..> NetworkModule : creates
NetworkModule <|-- NetworkModuleLora
NetworkModule <|-- NetworkModuleLokr
NetworkModule <|-- NetworkModuleHada
NetworkModule <|-- NetworkModuleOFT
NetworkModule <|-- NetworkModuleIa3
NetworkModule <|-- NetworkModuleGLora
NetworkModule <|-- NetworkModuleNorm
NetworkModule <|-- NetworkModuleFull
- NetworkOnDisk describes a file: name, hash, parsed safetensors metadata, and a detect_version heuristic over metadata and filename.
- Network is one loaded adapter: its multipliers and a modules dict keyed by layer name.
- NetworkModule is one layer's weights inside one network. It owns the family math in calc_updown, plus alpha, scale, and DoRA handling.
- ModuleType is the factory the generic path uses to turn matched weights into a NetworkModule. l.module_types lists the eight factories tried in order.
network_activate iterates every module of the active components (text_encoder*, unet, transformer*, llm_adapter). For each module that carries a network_layer_name and whose network_current_names differ from the wanted set, it backs up, computes the combined delta, and applies it.
flowchart TD
NA["network_activate"] --> IT["for each module with network_layer_name"]
IT --> SK{"current_names == wanted_names?"}
SK -->|yes| NEXT["skip"]
SK -->|no| BK["network_backup_weights"]
BK --> CW["network_calc_weights: sum updown over loaded_networks"]
CW --> FU{"opts.lora_fuse_native?"}
FU -->|yes| AD["network_apply_direct: write delta into weight"]
FU -->|no| AW["network_apply_weights: restore from backup then add delta"]
AD --> MARK["set network_current_names = wanted_names"]
AW --> MARK
network_calc_weights walks l.loaded_networks (or previously_loaded_networks when use_previous is set), pulls each network's NetworkModule for this layer, calls calc_updown, scales by the module multiplier, and accumulates one combined delta. It dequantizes through the SDNQ backup when the target weight is quantized.
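The accumulation step can be sketched with nested lists standing in for tensors. This is a simplified model, not the real function: networks and modules are plain dicts, each module carries a precomputed updown matrix instead of calling calc_updown, and quantization handling is left out. The name calc_weights_sketch is hypothetical.

```python
def calc_weights_sketch(layer_name, networks):
    """Sum multiplier-scaled deltas for one layer across all loaded networks."""
    total = None
    for net in networks:
        module = net["modules"].get(layer_name)
        if module is None:
            continue  # this network does not touch the layer
        scaled = [[v * module["multiplier"] for v in row] for row in module["updown"]]
        if total is None:
            total = scaled
        else:
            total = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(total, scaled)]
    return total  # None when no network targets the layer


networks = [
    {"modules": {"attn.to_q": {"updown": [[2, 4]], "multiplier": 0.5}}},
    {"modules": {"attn.to_q": {"updown": [[1, 1]], "multiplier": 2.0}}},
    {"modules": {"attn.to_k": {"updown": [[9, 9]], "multiplier": 1.0}}},
]
print(calc_weights_sketch("attn.to_q", networks))  # [[3.0, 4.0]]
```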
Fuse and backup are a tradeoff set by lora_fuse_native:
- Backup mode (default): network_backup_weights clones the original weight to CPU. network_apply_weights restores from that backup and adds the new delta, so switching or removing a LoRA just recomputes from the backup. network_deactivate is a no-op in this mode and returns early.
- Fuse mode: no tensor backup is kept, only the sentinel True. network_apply_direct writes the delta into the weight in place for lower peak VRAM. Removal needs an explicit subtract, so network_deactivate recomputes the previous delta with use_previous=True and applies it with deactivate=True.
Storing True versus a tensor also lets a mid-session toggle of the setting invalidate the stale backup. networks.native_active records whether native weights are currently applied and gates restoration when LoRAs are removed.
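The difference between the two modes can be seen in a toy model with lists as weights. This sketch is an illustration under assumptions: Layer, apply_with_backup, and apply_direct are hypothetical names, the deltas are supplied directly instead of being recomputed, and toggling the setting mid-session is not modeled.

```python
class Layer:
    def __init__(self, weight):
        self.weight = list(weight)
        self.backup = None  # None, a cloned weight (backup mode), or True (fuse mode)


def apply_with_backup(layer, delta):
    """Backup mode: always rebuild the weight from the stored original."""
    if layer.backup is None:
        layer.backup = list(layer.weight)
    layer.weight = [w + d for w, d in zip(layer.backup, delta)]


def apply_direct(layer, delta, deactivate=False):
    """Fuse mode: add (or subtract, when deactivating) the delta in place."""
    layer.backup = True  # sentinel only, no tensor copy is kept
    sign = -1 if deactivate else 1
    layer.weight = [w + sign * d for w, d in zip(layer.weight, delta)]


a = Layer([10, 20])
apply_with_backup(a, [1, 1])  # first LoRA
apply_with_backup(a, [2, 2])  # switching needs no explicit removal
backup_result = a.weight

b = Layer([10, 20])
apply_direct(b, [1, 1])                   # first LoRA
apply_direct(b, [1, 1], deactivate=True)  # subtract the previous delta
apply_direct(b, [2, 2])                   # second LoRA
fuse_result = b.weight

print(backup_result, fuse_result)  # [12, 22] [12, 22]
```

Both modes reach the same weights; backup mode pays with a stored copy of the original, while fuse mode pays with an extra subtract on removal.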
When get_method returns diffusers, lora_diffusers.load_diffusers calls the pipeline's own load_lora_weights and records the adapter name and scale in diffuser_loaded and diffuser_scales. After the load loop, network_load calls set_adapters with the collected names and weights, and fuses when configured. No NetworkModule objects are built, and lora_apply is not involved. unload_diffusers tears the adapters down.
Native LoRA is the adapter step of the larger process of adding a new architecture. The LoRA-specific touchpoints are:
- pipelines/<arch>/<arch>_lora.py: the arch's key prefixes and a resolve_targets remap are set, the needed native_adapter family loaders are bound, and a single try_load is exposed via try_load_chain.
- lora_overrides.allow_native: the arch's sd_model_type is added so get_method returns native.
- lora_load._NATIVE_DISPATCH: the type is mapped to the new module.
Key fixups belong in resolve_targets or the family loader, not in load_safetensors. Arches not added to allow_native keep working through the diffusers path.
- SD_LORA_DEBUG=1 enables verbose load, activate, and deactivate logs, including per-network names, applied layer counts, backup size, and timings.
- SD_LORA_DUMP=1 writes the sorted key list of each loaded file to a temp file, useful when keys fail to match.
- The single-line load log reports load=method(reason) and method=actual, where actual is native only if modules were built. A mismatch points to a silent fallback to the diffusers path.
- Relevant settings: lora_force_diffusers, lora_fuse_native, lora_fuse_diffusers, lora_in_memory_limit, lora_force_reload, extra_networks_default_multiplier.