LoRA Technical

CalamitousFelicitousness edited this page Jun 2, 2026 · 1 revision

How LoRA and the wider LyCORIS adapter family are loaded and applied on the Diffusers backend. This is a code map for contributors. For usage and settings, see LoRA and Networks.

All code lives in modules/lora/, with per-architecture native loaders in pipelines/<arch>/<arch>_lora.py.

Files

| File | Responsibility |
| --- | --- |
| extra_networks_lora.py | Entry point. Parses <lora:name:weight> from the prompt and drives activate/deactivate. Registered as an ExtraNetwork. |
| lora_overrides.py | Method selection (get_method), the allow_native arch list, and the force lists. |
| lora_load.py | Load orchestration: network_load, the native loader load_safetensors, the _NATIVE_DISPATCH registry, disk scanning, and lora_cache. |
| lora_diffusers.py | Diffusers/PEFT path: load_diffusers, adapter and scale tracking. |
| lora_nunchaku.py | Nunchaku path for Nunchaku-quantized models. |
| native_adapter.py | Generic LyCORIS family loaders and key-parsing scaffolding shared by the per-arch native loaders. |
| network.py | Data model: NetworkOnDisk, Network, NetworkModule, ModuleType. |
| network_*.py | Per-family ModuleType and NetworkModule subclasses that implement calc_updown (lora, lokr, hada, oft, ia3, glora, norm, full, boft). |
| lora_apply.py | Weight patching: backup, network_calc_weights (accumulate updown), and the two apply functions. |
| networks.py | Apply lifecycle: network_activate, network_deactivate, the native_active flag. |
| lora_common.py | Shared module-level state, imported everywhere as l: loaded_networks, previously_loaded_networks, module_types, timer, debug. |
| lora_convert.py | Key conversion: kohya to diffusers, compvis module-name assignment, KeyConvert. |
| pipelines/<arch>/<arch>_lora.py | Per-arch native loaders for the arches in _NATIVE_DISPATCH. |

Load lifecycle

Activation runs once per generation through the extra-networks system, after the prompt is parsed.

flowchart TD
    P["prompt with &lt;lora:name:weight&gt;"] --> A["ExtraNetworkLora.activate"]
    A --> PA["parse: names, te/unet multipliers, dyn_dims, lora_modules"]
    PA --> GM["lora_overrides.get_method"]
    GM -->|diffusers| ND1["network_load -> load_diffusers per net"]
    GM -->|nunchaku| NK["lora_nunchaku.load_nunchaku"]
    GM -->|native| ND2["network_load -> load_safetensors per net"]
    ND1 --> SA["pipe.set_adapters, optional fuse_lora"]
    ND2 --> CH{"changed?"}
    CH -->|yes| DE["network_deactivate (fuse mode only)"]
    DE --> AC["network_activate: patch module weights"]
    CH -->|no| SKIP["skip, weights already applied"]
    SA --> DONE["generation runs with adapters live"]
    AC --> DONE
    NK --> DONE
    SKIP --> DONE
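As a rough illustration of the first step in the flow above, the sketch below pulls <lora:name:weight> tags out of a prompt. It is not the parser in extra_networks_lora.py: the regex, the function name parse_lora_tags, and the default weight of 1.0 are assumptions made for this example, and only a plain numeric weight is handled (the real parser also yields te/unet multipliers and dyn_dims).

```python
import re

# Hypothetical pattern: <lora:name> or <lora:name:weight>. Any extra fields
# after the weight are matched but ignored in this sketch.
LORA_TAG = re.compile(r"<lora:([^:>]+)(?::(-?\d+(?:\.\d+)?))?[^>]*>")


def parse_lora_tags(prompt, default_weight=1.0):
    """Return (clean_prompt, [(name, weight), ...]) for every lora tag found."""
    found = []
    for match in LORA_TAG.finditer(prompt):
        name, weight = match.group(1), match.group(2)
        # A missing weight falls back to the default (an assumption here).
        found.append((name, float(weight) if weight is not None else default_weight))
    # Strip the tags and collapse the whitespace they leave behind.
    clean = re.sub(r"\s{2,}", " ", LORA_TAG.sub("", prompt)).strip()
    return clean, found


clean_prompt, tags = parse_lora_tags("a cat <lora:foo:0.8> at sunset <lora:bar>")
```

Here tags holds one (name, weight) pair per tag in prompt order, and clean_prompt is the prompt with the tags removed.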

network_load runs the per-network load:

  • l.loaded_networks is cleared and each name is resolved to a NetworkOnDisk through gather_networks.
  • Per network, get_method(shorthash) is called again and the load is dispatched to load_diffusers or load_safetensors. For nunchaku this step is a no-op, because that path is handled by its own loader.
  • Each loaded Network is appended to l.loaded_networks with its multipliers set.
  • lora_cache is trimmed to lora_in_memory_limit.
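The last step above, trimming lora_cache to lora_in_memory_limit, can be pictured as a bounded mapping. This is a minimal sketch, assuming the cache behaves like an insertion-ordered dict and that the oldest entries are evicted first; the real eviction order is not stated here, and trim_cache is a hypothetical helper name.

```python
from collections import OrderedDict


def trim_cache(cache, limit):
    """Evict the oldest entries until at most `limit` remain.

    Returns the evicted names so the caller can log them.
    Oldest-first eviction is an assumption of this sketch.
    """
    evicted = []
    while len(cache) > max(limit, 0):
        name, _network = cache.popitem(last=False)  # pop the oldest entry
        evicted.append(name)
    return evicted


cache = OrderedDict()
for name in ["a", "b", "c"]:
    cache[name] = object()  # stand-in for a loaded Network

evicted = trim_cache(cache, 2)
```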

The apply step differs by path:

  • Diffusers adapters are activated by set_adapters, optionally fused.
  • Native networks are applied by network_activate, which patches the model's module weights directly.
  • Nunchaku is handled inside its own loader.

Method selection

get_method returns (method, reason). The reason is logged so that an automatic fallback can be told apart from a user opt-in. The method is decided per network, so a single file can be forced onto the diffusers path by hash even when the arch supports native loading.

flowchart TD
    S["get_method(shorthash)"] --> NQ{"Nunchaku transformer or unet loaded?"}
    NQ -->|yes| RN["nunchaku"]
    NQ -->|no| FD{"opts.lora_force_diffusers?"}
    FD -->|yes| RD1["diffusers: opt-in"]
    FD -->|no| FC{"pipeline class in force_classes_diffusers?"}
    FC -->|yes| RD2["diffusers: class-forced"]
    FC -->|no| AN{"sd_model_type in allow_native?"}
    AN -->|no| RD3["diffusers: arch-unsupported"]
    AN -->|yes| FH{"file hash in force_hashes_diffusers?"}
    FH -->|yes| RD4["diffusers: hash-forced"]
    FH -->|no| RNAT["native: default"]

allow_native lists the arches eligible for native loading: sd, sdxl, sd3, f1, f2, chroma, zimage, anima, ernieimage. Any other arch always takes the diffusers path. force_classes_diffusers currently holds the FluxKontext pipeline classes.
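The decision order in the flowchart can be written as a pure function. The sketch below is not the code in lora_overrides.py: the parameter names are hypothetical, the real function reads its inputs from shared state rather than arguments, and the reason string for the nunchaku branch is an assumption (the flowchart does not show one).

```python
ALLOW_NATIVE = frozenset(
    {"sd", "sdxl", "sd3", "f1", "f2", "chroma", "zimage", "anima", "ernieimage"}
)


def get_method_sketch(shorthash, model_type, pipeline_class, *,
                      nunchaku_loaded=False, force_diffusers=False,
                      force_classes=frozenset(), force_hashes=frozenset()):
    """Return (method, reason), checking conditions in flowchart order."""
    if nunchaku_loaded:
        return "nunchaku", "nunchaku-loaded"  # reason text is an assumption
    if force_diffusers:
        return "diffusers", "opt-in"
    if pipeline_class in force_classes:
        return "diffusers", "class-forced"
    if model_type not in ALLOW_NATIVE:
        return "diffusers", "arch-unsupported"
    if shorthash in force_hashes:
        return "diffusers", "hash-forced"
    return "native", "default"
```

Because the hash check comes last, it only matters for arches that would otherwise load natively, which matches the per-network role described above.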

Native path

load_safetensors is the native loader. It branches on shared.sd_model_type.

flowchart TD
    LS["load_safetensors(name, network_on_disk)"] --> C{"name in lora_cache?"}
    C -->|yes| RET["return cached Network"]
    C -->|no| D{"sd_model_type in _NATIVE_DISPATCH?"}
    D -->|yes| PA["import pipelines/&lt;arch&gt;/&lt;arch&gt;_lora.py"]
    PA --> TL["mod.try_load -> native_adapter.try_load_chain"]
    TL --> NET1["Network with NetworkModule per matched key"]
    D -->|no| GEN["generic key-parsing loop"]
    GEN --> KC["lora_convert.KeyConvert maps keys to model modules"]
    KC --> MT["l.module_types create_module per match"]
    MT --> NET2["Network with NetworkModule per matched key"]

Per-arch native adapters cover the arches in _NATIVE_DISPATCH: zimage, chroma, ernieimage, f2, anima. Each maps to a module exposing try_load(name, network_on_disk, lora_scale).

The generic loop covers the remaining native-eligible arches (sd, sdxl, sd3, f1):

  • The state dict is read, and kohya to diffusers conversion is applied for f1 and sd3.
  • Keys are matched to model modules through KeyConvert.
  • NetworkModule instances are built using the factories in l.module_types.

Both sub-paths return a Network whose modules dict is keyed by layer name. They share the same apply machinery. The diffusers path does not build modules, which is why an applied network is detected by len(net.modules) > 0.
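The cache check, the per-arch dispatch, and the len(net.modules) > 0 test can be sketched together. This is an illustration under assumptions: networks are plain dicts, the dispatch registry maps an arch to a callable (the real _NATIVE_DISPATCH resolves to a module that is imported), and storing the result in the cache after a load is assumed rather than documented. All names are hypothetical.

```python
def load_native_sketch(name, arch, cache, dispatch, generic_loader):
    """Return a cached network, else load through the per-arch or generic path."""
    if name in cache:
        return cache[name]
    loader = dispatch.get(arch, generic_loader)  # per-arch loader if registered
    network = loader(name)
    if network is not None:
        cache[name] = network  # assumption: successful loads are cached
    return network


def is_applied_natively(network):
    """A network counts as native only if modules were built for it."""
    return len(network["modules"]) > 0


calls = []


def zimage_loader(name):
    calls.append(("zimage", name))
    return {"name": name, "modules": {"layer.0": "module"}}


def generic_loader(name):
    calls.append(("generic", name))
    return {"name": name, "modules": {}}  # nothing matched in this example


cache = {}
dispatch = {"zimage": zimage_loader}
first = load_native_sketch("style", "zimage", cache, dispatch, generic_loader)
second = load_native_sketch("style", "zimage", cache, dispatch, generic_loader)
other = load_native_sketch("detail", "sdxl", cache, dispatch, generic_loader)
```

The second call for "style" is served from the cache, so the per-arch loader runs only once.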

native_adapter

native_adapter.py holds the generic, arch-independent half of the per-arch loaders:

  • Family loaders, one per adapter type: try_load_lora (with DoRA), try_load_lokr, try_load_loha, try_load_oft, try_load_ia3, try_load_glora, try_load_norm, try_load_full.
  • try_load_chain(name, nod, scale, family_loaders) runs each family loader in order and merges the non-empty results into one Network.
  • Key-parsing helpers: parse_key, group_by_suffixes, prefix detection, marker checks, PEFT name stripping.
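The chain behaviour described above can be sketched as follows. This is not the native_adapter implementation: the sketch passes a state dict instead of a NetworkOnDisk, represents modules as tuples, returns None when nothing matched, and keeps the first loader's module if two loaders claim the same layer. All of those choices are assumptions. The key suffixes lora_down.weight and lokr_w1 follow the common kohya and LyCORIS naming.

```python
def try_load_chain_sketch(name, state_dict, scale, family_loaders):
    """Run each family loader in order and merge the non-empty results."""
    merged = {}
    for loader in family_loaders:
        modules = loader(state_dict, scale)
        if not modules:
            continue  # this family found nothing in the file
        for layer, module in modules.items():
            merged.setdefault(layer, module)  # first match wins (assumption)
    return {"name": name, "modules": merged} if merged else None


def lora_family(state_dict, scale):
    suffix = ".lora_down.weight"
    return {k[: -len(suffix)]: ("lora", scale) for k in state_dict if k.endswith(suffix)}


def lokr_family(state_dict, scale):
    suffix = ".lokr_w1"
    return {k[: -len(suffix)]: ("lokr", scale) for k in state_dict if k.endswith(suffix)}


state_dict = {
    "blocks.0.attn.lora_down.weight": None,
    "blocks.0.attn.lora_up.weight": None,
    "blocks.1.mlp.lokr_w1": None,
    "blocks.1.mlp.lokr_w2": None,
}
network = try_load_chain_sketch("mixed", state_dict, 1.0, [lora_family, lokr_family])
```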

A per-arch module is thin. It:

  • Sets the arch's key prefixes and a resolve_targets remap.
  • Binds the needed family loaders.
  • Exposes a single try_load that calls try_load_chain.

Coverage varies by arch:

| Arch | Module | Families |
| --- | --- | --- |
| zimage | pipelines/z_image/zimage_lora.py | LoRA, LoKR, LoHA, OFT |
| chroma | pipelines/chroma/chroma_lora.py | LoRA, LoKR, LoHA, OFT |
| ernieimage | pipelines/ernie/ernie_lora.py | LoRA, LoKR, LoHA, OFT |
| f2 | pipelines/flux/flux2_lora.py | LoRA, LoKR, LoHA, OFT, IA3, GLoRA, Norm |
| anima | pipelines/anima/anima_lora.py | full set including Full |

resolve_targets is where an arch rewrites legacy or fused key layouts to the current diffusers module names. For example, Z-Image rewrites fused attention.qkv and bare attention.out to to_q/to_k/to_v and to_out.0, chunking the fused up-weight at load time.
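The fused-key rewrite can be illustrated with plain lists. This is a minimal sketch, not the Z-Image loader: it assumes a LoRA pair where the down-weight is shared by all three targets, the up-weight rows are laid out in q, k, v order, and matrices are lists of rows. The function name and signature are hypothetical.

```python
def resolve_targets_sketch(layer, down, up):
    """Map one (layer, down, up) entry to current module names.

    down: rank x in_features rows, up: out_features x rank rows.
    """
    if layer.endswith("attention.qkv"):
        if len(up) % 3 != 0:
            raise ValueError("fused up-weight rows must split evenly into q, k, v")
        rows = len(up) // 3
        base = layer[: -len("qkv")]
        # Chunk the fused up-weight; every chunk reuses the same down-weight.
        return {
            base + target: (down, up[i * rows:(i + 1) * rows])
            for i, target in enumerate(("to_q", "to_k", "to_v"))
        }
    if layer.endswith("attention.out"):
        return {layer[: -len("out")] + "to_out.0": (down, up)}
    return {layer: (down, up)}


down = [[1.0, 2.0]]                                      # rank 1, in_features 2
up = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]          # 3 * 2 output rows
split = resolve_targets_sketch("layers.0.attention.qkv", down, up)
renamed = resolve_targets_sketch("layers.0.attention.out", down, up[:2])
```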

Data model

classDiagram
    class NetworkOnDisk {
        name
        filename
        shorthash
        metadata
        sd_version
        detect_version()
    }
    class Network {
        name
        network_on_disk
        te_multiplier
        unet_multiplier
        modules
        bundle_embeddings
    }
    class NetworkModule {
        network
        sd_module
        alpha
        scale
        dora_scale
        calc_updown()
        calc_scale()
        multiplier()
    }
    class ModuleType {
        create_module()
    }
    NetworkOnDisk "1" --> "*" Network : loaded into
    Network "1" --> "*" NetworkModule : modules
    ModuleType ..> NetworkModule : creates
    NetworkModule <|-- NetworkModuleLora
    NetworkModule <|-- NetworkModuleLokr
    NetworkModule <|-- NetworkModuleHada
    NetworkModule <|-- NetworkModuleOFT
    NetworkModule <|-- NetworkModuleIa3
    NetworkModule <|-- NetworkModuleGLora
    NetworkModule <|-- NetworkModuleNorm
    NetworkModule <|-- NetworkModuleFull
  • NetworkOnDisk describes a file: name, hash, parsed safetensors metadata, and a detect_version heuristic over metadata and filename.
  • Network is one loaded adapter: its multipliers and a modules dict keyed by layer name.
  • NetworkModule is one layer's weights inside one network. It owns the family math in calc_updown, plus alpha, scale, and DoRA handling.
  • ModuleType is the factory the generic path uses to turn matched weights into a NetworkModule. l.module_types lists the eight factories tried in order.
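A stripped-down version of the data model, for orientation only. The field sets are reduced from the diagram, the class names carry a Sketch suffix to mark them as hypothetical, and the LoRA math uses the conventional scale of alpha / rank applied to up @ down. The real classes hold tensors and also handle DoRA, which is omitted here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class NetworkOnDiskSketch:
    name: str
    filename: str
    shorthash: str = ""
    metadata: dict = field(default_factory=dict)


@dataclass
class NetworkSketch:
    name: str
    network_on_disk: NetworkOnDiskSketch
    te_multiplier: float = 1.0
    unet_multiplier: float = 1.0
    modules: dict = field(default_factory=dict)  # keyed by layer name


@dataclass
class LoraModuleSketch:
    up: list                       # out_features x rank rows
    down: list                     # rank x in_features rows
    alpha: Optional[float] = None

    def calc_scale(self):
        rank = len(self.down)
        return self.alpha / rank if self.alpha is not None else 1.0

    def calc_updown(self):
        """Return scale * (up @ down) as a list of rows."""
        scale = self.calc_scale()
        cols = len(self.down[0])
        return [
            [scale * sum(u * d[c] for u, d in zip(up_row, self.down)) for c in range(cols)]
            for up_row in self.up
        ]


disk = NetworkOnDiskSketch(name="style", filename="style.safetensors")
net = NetworkSketch(name="style", network_on_disk=disk)
net.modules["layer.0"] = LoraModuleSketch(up=[[1.0], [2.0]], down=[[3.0, 4.0]], alpha=0.5)
delta = net.modules["layer.0"].calc_updown()
```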

Apply and restore

network_activate iterates every module of the active components (text_encoder*, unet, transformer*, llm_adapter). For each module that carries a network_layer_name and whose network_current_names differ from the wanted set, it backs up, computes the combined delta, and applies it.

flowchart TD
    NA["network_activate"] --> IT["for each module with network_layer_name"]
    IT --> SK{"current_names == wanted_names?"}
    SK -->|yes| NEXT["skip"]
    SK -->|no| BK["network_backup_weights"]
    BK --> CW["network_calc_weights: sum updown over loaded_networks"]
    CW --> FU{"opts.lora_fuse_native?"}
    FU -->|yes| AD["network_apply_direct: write delta into weight"]
    FU -->|no| AW["network_apply_weights: restore from backup then add delta"]
    AD --> MARK["set network_current_names = wanted_names"]
    AW --> MARK

network_calc_weights walks l.loaded_networks (or previously_loaded_networks when use_previous is set), pulls each network's NetworkModule for this layer, calls calc_updown, scales by the module multiplier, and accumulates one combined delta. It dequantizes through the SDNQ backup when the target weight is quantized.
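The accumulation can be sketched with flat lists standing in for tensors. This is an illustration only: networks are dicts, each module carries a precomputed updown instead of calling calc_updown, and a single multiplier per network replaces the te/unet split. Dequantization is not modelled.

```python
def calc_combined_delta(layer, networks):
    """Sum multiplier * updown over every network that has a module for `layer`."""
    total = None
    for net in networks:
        module = net["modules"].get(layer)
        if module is None:
            continue  # this network does not touch the layer
        delta = [value * net["multiplier"] for value in module["updown"]]
        total = delta if total is None else [a + b for a, b in zip(total, delta)]
    return total  # None means no network targets this layer


networks = [
    {"multiplier": 0.5, "modules": {"layer.0": {"updown": [1.0, 2.0]}}},
    {"multiplier": 1.0, "modules": {"layer.0": {"updown": [10.0, 20.0]}}},
    {"multiplier": 2.0, "modules": {"layer.1": {"updown": [3.0, 3.0]}}},
]
combined = calc_combined_delta("layer.0", networks)
```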

Fuse and backup are a tradeoff set by lora_fuse_native:

  • Backup mode (default): network_backup_weights clones the original weight to CPU. network_apply_weights restores from that backup and adds the new delta, so switching or removing a LoRA just recomputes from the backup. network_deactivate is a no-op in this mode and returns early.
  • Fuse mode: no tensor backup is kept, only the sentinel True. network_apply_direct writes the delta into the weight in place for lower peak VRAM. Removal needs an explicit subtract, so network_deactivate recomputes the previous delta with use_previous=True and applies it with deactivate=True.

Because the backup slot holds either the sentinel True or a tensor, a mid-session toggle of lora_fuse_native can be detected and the stale backup invalidated. networks.native_active records whether native weights are currently applied and gates restoration when LoRAs are removed.
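The difference between the two modes shows up clearly on a toy layer. The sketch below uses lists for weights and hypothetical names throughout. It does not model the CPU offload of the backup or a mid-session toggle of the setting, so each mode gets its own layer.

```python
class LayerSketch:
    def __init__(self, weight):
        self.weight = list(weight)
        self.backup = None  # a cloned weight in backup mode, True in fuse mode


def apply_backup_mode(layer, delta):
    """Restore from the stored original, then add the new combined delta."""
    if layer.backup is None:
        layer.backup = list(layer.weight)  # clone the pristine weight once
    layer.weight = [w + d for w, d in zip(layer.backup, delta)]


def apply_fuse_mode(layer, delta, deactivate=False):
    """Write the delta in place; removal subtracts the previous delta."""
    layer.backup = True  # sentinel only, no tensor is kept
    sign = -1.0 if deactivate else 1.0
    layer.weight = [w + sign * d for w, d in zip(layer.weight, delta)]


# Backup mode: switching LoRAs just recomputes from the backup.
a = LayerSketch([1.0, 2.0])
apply_backup_mode(a, [0.5, 0.5])
apply_backup_mode(a, [1.0, 1.0])
backup_result = list(a.weight)

# Fuse mode: the previous delta must be subtracted before the next one is added.
b = LayerSketch([1.0, 2.0])
apply_fuse_mode(b, [0.5, 0.5])
apply_fuse_mode(b, [0.5, 0.5], deactivate=True)
apply_fuse_mode(b, [1.0, 1.0])
fuse_result = list(b.weight)
```

Both sequences end at the same weights, but only backup mode holds a second copy of the layer in memory.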

Diffusers path

When get_method returns diffusers, lora_diffusers.load_diffusers calls the pipeline's own load_lora_weights and records the adapter name and scale in diffuser_loaded and diffuser_scales. After the load loop, network_load calls set_adapters with the collected names and weights, and fuses when configured. No NetworkModule objects are built, and lora_apply is not involved. unload_diffusers tears the adapters down.
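The call sequence on this path can be sketched against the pipeline methods load_lora_weights, set_adapters, and fuse_lora. The wrapper function, its arguments, and the RecordingPipe stand-in are hypothetical. The stand-in only records calls so the example runs without a model, and the real load_diffusers tracks names and scales in diffuser_loaded and diffuser_scales rather than in local lists.

```python
def apply_diffusers_adapters(pipe, networks, fuse=False):
    """Load each adapter, then activate them together with their scales.

    networks: iterable of (adapter_name, path, scale) tuples.
    """
    loaded, scales = [], []
    for adapter_name, path, scale in networks:
        pipe.load_lora_weights(path, adapter_name=adapter_name)
        loaded.append(adapter_name)
        scales.append(scale)
    if loaded:
        pipe.set_adapters(loaded, adapter_weights=scales)
        if fuse:
            pipe.fuse_lora()
    return loaded, scales


class RecordingPipe:
    """Stand-in for a pipeline; records the calls it receives."""

    def __init__(self):
        self.calls = []

    def load_lora_weights(self, path, adapter_name=None):
        self.calls.append(("load_lora_weights", path, adapter_name))

    def set_adapters(self, adapter_names, adapter_weights=None):
        self.calls.append(("set_adapters", list(adapter_names), list(adapter_weights)))

    def fuse_lora(self):
        self.calls.append(("fuse_lora",))


pipe = RecordingPipe()
names, weights = apply_diffusers_adapters(
    pipe, [("style", "style.safetensors", 0.8), ("detail", "detail.safetensors", 1.0)], fuse=True
)
```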

Adding native support for a new arch

Native LoRA is the adapter step of the larger process of adding a new architecture. The LoRA-specific touchpoints are:

  • pipelines/<arch>/<arch>_lora.py: the arch's key prefixes and a resolve_targets remap are set, the needed native_adapter family loaders are bound, and a single try_load is exposed via try_load_chain.
  • lora_overrides.allow_native: the arch's sd_model_type is added so get_method returns native.
  • lora_load._NATIVE_DISPATCH: the type is mapped to the new module.

Key fixups belong in resolve_targets or the family loader, not in load_safetensors. Arches not added to allow_native keep working through the diffusers path.
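Since get_method consults allow_native and load_safetensors consults _NATIVE_DISPATCH, an arch registered in only one of them ends up on a different path than intended. The sketch below classifies an arch and flags dispatch entries that get_method would never route to native. The helper names and the "myarch" entry are hypothetical, and the dispatch values are shown as file paths for readability, which may not match the registry's actual value format.

```python
ALLOW_NATIVE = ["sd", "sdxl", "sd3", "f1", "f2", "chroma", "zimage", "anima", "ernieimage"]
NATIVE_DISPATCH = {
    "zimage": "pipelines/z_image/zimage_lora.py",
    "chroma": "pipelines/chroma/chroma_lora.py",
    "ernieimage": "pipelines/ernie/ernie_lora.py",
    "f2": "pipelines/flux/flux2_lora.py",
    "anima": "pipelines/anima/anima_lora.py",
}


def native_route(arch, allow_native, native_dispatch):
    """Classify which load path an arch takes when nothing is forced."""
    if arch not in allow_native:
        return "diffusers"
    return "per-arch" if arch in native_dispatch else "generic"


def unreachable_dispatch(allow_native, native_dispatch):
    """Dispatch entries that are missing from allow_native."""
    return sorted(arch for arch in native_dispatch if arch not in allow_native)


# Hypothetical new arch added to the dispatch registry but not to allow_native.
incomplete = dict(NATIVE_DISPATCH, myarch="pipelines/myarch/myarch_lora.py")
missing = unreachable_dispatch(ALLOW_NATIVE, incomplete)
```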

Debugging

  • SD_LORA_DEBUG=1 enables verbose load, activate, and deactivate logs, including per-network names, applied layer counts, backup size, and timings.
  • SD_LORA_DUMP=1 writes the sorted key list of each loaded file to a temp file, useful when keys fail to match.
  • The single-line load log reports load=method(reason) and method=actual, where actual is native only if modules were built. A mismatch points to a silent fallback to the diffusers path.
  • Relevant settings: lora_force_diffusers, lora_fuse_native, lora_fuse_diffusers, lora_in_memory_limit, lora_force_reload, extra_networks_default_multiplier.
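The mismatch check on the load log can be expressed in a few lines. This is a sketch of the rule as described above, not code from the repository: the function names are hypothetical, and treating every network without modules as "diffusers" is an assumption.

```python
def actual_method(module_count):
    """Native only if modules were built; otherwise assumed to be diffusers."""
    return "native" if module_count > 0 else "diffusers"


def is_silent_fallback(requested, module_count):
    """True when native was selected but no modules were built."""
    return requested == "native" and actual_method(module_count) != "native"


fallback = is_silent_fallback("native", 0)
healthy = is_silent_fallback("native", 128)
```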