LoRA weight underflows #49

@CCRcmcpe

I encountered the same problem as #41, which states "LoRAs have no effects".

Background

I'm using SSDT to train LoRAs. The LoRA layer implementations come from loralib, which scales the weights by (alpha / rank).

So to use an alpha = 1, rank-16 LoRA produced by SSDT in AddNet, the scale should be set to 1/16.

Some users found this extra scaling inconvenient, so I added an "unscale weight" option that multiplies the weights by (alpha / rank) when converting an SSDT checkpoint to AddNet format.
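
As a rough illustration of the arithmetic above, the sketch below assumes the loralib convention of scaling the update by alpha / rank. The helper names `lora_scale` and `bake_scale` are made up for this example and are not part of SSDT, loralib, or AddNet.

```python
def lora_scale(alpha: float, rank: int) -> float:
    """Scale applied to the LoRA update, assuming the alpha / rank convention."""
    return alpha / rank


def bake_scale(weights: list[float], alpha: float, rank: int) -> list[float]:
    """Hypothetical "unscale weight" step: fold alpha / rank into the stored values."""
    s = lora_scale(alpha, rank)
    return [w * s for w in weights]


# alpha = 1, rank = 16 gives the 1/16 factor mentioned above.
print(lora_scale(1, 16))
# Folding that factor into the weights makes every stored value 16 times smaller.
print(bake_scale([0.5, -0.25], 1, 16))
```

Mathematically, folding the factor into the weights and applying it at load time are equivalent. They differ only once the stored values are rounded to a low-precision dtype.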

Investigation

I looked at the state dict after unscaling.

All of the tensors have very small values, which hurts numerical stability. About 20% of them contain zeros; of that 20%, 15% are in the text encoder and 85% are in the UNet.

Experiment

In one LoRA I trained (rank = 16, alpha = 1):

  • All unscaled LoRAs had basically no effect.
  • The LoRA without unscaling, used with scale = 0.0625 (1 / 16), worked as normal.

Conclusion and Solution

I suspect those zeros are the product of underflow, which is probably the cause of #41.

These underflows happen more often when the rank is high.
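
The issue does not say which dtype the checkpoints are stored in. Assuming half precision (fp16), the effect can be reproduced with the standard library alone, since the `e` format of `struct` rounds a value to IEEE 754 binary16. The helper names and the sample value below are illustrative and not taken from a real checkpoint.

```python
import struct

# Smallest positive fp16 value (a subnormal), about 5.96e-8.
FP16_MIN_SUBNORMAL = 2.0 ** -24


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def unscale_fp16(w: float, alpha: float, rank: int) -> float:
    """Multiply an fp16 weight by alpha / rank and store the result as fp16 again."""
    return to_fp16(to_fp16(w) * alpha / rank)


w = 1e-6  # an assumed small weight value
for rank in (16, 128):
    stored = unscale_fp16(w, alpha=1, rank=rank)
    print(f"rank={rank}: stored={stored!r}, underflowed={stored == 0.0}")
```

With these sample numbers the rank-16 result still lands on the smallest fp16 subnormal, while the rank-128 result rounds to exactly zero. That matches the observation that higher ranks underflow more often.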

At training time, add an "alpha" option that scales the LoRA the way loralib does, and save alpha to the LoRA metadata.

At inference time, add a "scale weight" option that scales the LoRA weight by (alpha / rank).

Backward Compatibility

Unfortunately, as you can imagine, almost all existing LoRAs have already underflowed.

To keep old LoRAs usable when "scale weight" is enabled, do not scale a LoRA that has no alpha in its metadata.
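
A minimal sketch of that fallback, assuming the alpha / rank convention that gives 1/16 for alpha = 1 and rank 16. The function name `resolve_scale` and the `"alpha"` metadata key are hypothetical. The real metadata layout is not specified here.

```python
def resolve_scale(metadata: dict, rank: int, scale_weight: bool) -> float:
    """Pick the multiplier for a LoRA at inference time.

    `metadata` stands in for whatever the LoRA file stores; the "alpha" key
    is an assumed name, not a confirmed field.
    """
    if not scale_weight:
        return 1.0  # option disabled: leave the weights untouched
    alpha = metadata.get("alpha")
    if alpha is None:
        return 1.0  # old LoRA without alpha: do not scale
    return float(alpha) / rank


print(resolve_scale({"alpha": 1}, rank=16, scale_weight=True))  # new LoRA
print(resolve_scale({}, rank=16, scale_weight=True))            # old LoRA
```

Treating a missing alpha as "no scaling" means existing files behave the same as before, and only newly trained LoRAs have the factor applied when they are loaded.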

Additional: NaNs

After AUTOMATIC1111/stable-diffusion-webui@9991967, those underflowed LoRAs sometimes produce NaN errors when generating images.

Some users reported loss=NaN when using https://github.com/Linaqruf/kohya-trainer/ and https://github.com/Mikubill/naifu-diffusion/, especially at high rank. I suspect that's related to this issue.
