
More convenient way to initialize LoftQ #1543

Commits on Mar 7, 2024

  1. [WIP] More convenient way to initialize LoftQ

    Related to huggingface#1532
    
    At the moment, using LoftQ is quite cumbersome, as shown in this
    example:
    
    https://github.com/huggingface/peft/tree/7e84dec20b3106bdd0a90ba8e80187f0aec835b7/examples/loftq_finetuning
    
    Essentially, users have to:
    
    1. Load the non-quantized model with LoftQ (which can be quite huge)
    2. Modify the PEFT config
    3. Save the adapter
    4. Unwrap the base model with custom functions
    5. Save the base model with modified weights (i.e. a whole copy of the
       base model)
    6. Load the base model from step 5 with bnb quantization
    7. Load the adapter from step 3
    
    Yes, there is a helper script to do this, but it still has the
    drawback that we need to load the non-quantized model and that we have
    to create a completely new model checkpoint with the modified weights.
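
    For reference, step 1 of that workflow corresponds roughly to the
    LoftQConfig-based initialization documented for PEFT, sketched below;
    the model id is only a placeholder, and steps 2-7 would follow the
    helper script from the linked example.

    ```python
    from transformers import AutoModelForCausalLM
    from peft import LoftQConfig, LoraConfig, get_peft_model

    # Step 1: the *non-quantized* base model has to be loaded in full.
    base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")  # placeholder id

    # LoftQ initialization baked into the LoRA config.
    loftq_config = LoftQConfig(loftq_bits=4)
    lora_config = LoraConfig(
        task_type="CAUSAL_LM",
        init_lora_weights="loftq",
        loftq_config=loftq_config,
    )
    peft_model = get_peft_model(base_model, lora_config)
    # Steps 2-7 (adjusting the config, saving the adapter and the modified base
    # model, re-loading it with bnb quantization) follow the linked helper script.
    ```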
    
    This PR aims to make this process more convenient by adding a single
    function replace_lora_weights_loftq. This function takes the
    bnb-quantized LoRA model as input. Then it goes through each module with
    LoRA weights, lazily loads the corresponding non-quantized weights one
    at a time using safetensors, computes the quantization error, and
    replaces the LoRA weights with LoftQ-initialized LoRA weights.
    
    This is much more convenient because we only require very little extra
    memory thanks to lazy loading, and we don't have to keep an extra copy
    of the weights.
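
    A minimal usage sketch of the new function, assuming it can be imported
    from peft.utils.loftq_utils and that it locates the safetensors
    checkpoint from the model itself (otherwise a path would have to be
    passed explicitly); the model id is only an example:

    ```python
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model
    from peft.utils.loftq_utils import replace_lora_weights_loftq  # added by this PR

    model_id = "bigscience/bloomz-560m"  # example; the checkpoint must ship safetensors weights
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    base_model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

    # Plain LoRA on top of the bnb-quantized model, no LoftQ at init time.
    peft_model = get_peft_model(base_model, LoraConfig(task_type="CAUSAL_LM"))

    # Go through the LoRA modules, lazily load the corresponding non-quantized
    # weights from safetensors, and swap in LoftQ-initialized LoRA weights.
    replace_lora_weights_loftq(peft_model)
    ```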
    
    While working on this, I still found that LoftQ initialization often did
    not seem to help a lot, as mentioned in huggingface#1532. I measured this by
    creating (1) logits with the base model, (2) with the quantized+LoRA
    model, and (3) with the quantized+LoRA+LoftQ model. The expectation is
    that (1) should be closer to (3) than to (2). This was often not the
    case.
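
    The comparison could be measured roughly as sketched below (assuming
    `inputs` is some tokenized batch; a lower error against the
    full-precision logits is better):

    ```python
    import torch

    @torch.inference_mode()
    def mean_logits_error(model, inputs, reference_logits):
        # Mean absolute deviation from the logits of the non-quantized base model (1).
        return (model(**inputs).logits - reference_logits).abs().mean().item()

    # err_lora  = mean_logits_error(quantized_lora_model, inputs, reference_logits)   # (2)
    # err_loftq = mean_logits_error(quantized_loftq_model, inputs, reference_logits)  # (3)
    # Expectation: err_loftq < err_lora, which was often not the case.
    ```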
    
    I therefore added the possibility to run a check each time that we
    replace a LoRA weight with the LoftQ weights. If this check returns
    True, we proceed to the next weight, otherwise we discard the change.
    That way, we only make the replacement with LoftQ weights if we see a
    real improvement. Of course, this is only a form of greedy optimization,
    but it seems to work in practice. And since it's optional, users can
    choose not to use it.
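
    Sketched below is how such a check could look, assuming the callback is
    called after each tentative replacement with the model and the name of
    the current module and returns a bool (False discards the change); the
    error measure reuses the hypothetical helper from the sketch above.

    ```python
    current_error = float("inf")  # best error seen so far (greedy)

    def loftq_callback(model, module_name):
        """Keep the LoftQ replacement for `module_name` only if it improves the logits."""
        global current_error
        error = mean_logits_error(model, inputs, reference_logits)
        if error < current_error:
            current_error = error
            return True   # keep the LoftQ-initialized LoRA weights
        return False      # roll back the change for this module

    replace_lora_weights_loftq(peft_model, callback=loftq_callback)
    ```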
    
    This PR is not yet finished, since I ran into an issue with the key
    names from safetensors not matching.
    
    Furthermore, for now this doesn't support 8-bit quantization or the
    num_iter argument of LoftQ, which I'm not sure is really working anyway.
    However, I guess the replace_lora_weights_loftq function could be called
    multiple times in a row instead.
    BenjaminBossan committed Mar 7, 2024 (cd597ca)
  2. 127bb44
  3. Make style

    BenjaminBossan committed Mar 7, 2024 (897fd71)

Commits on Mar 11, 2024

  1. Apply suggestions from code review

    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    BenjaminBossan and stevhliu authored Mar 11, 2024 (2f287da)

Commits on Mar 12, 2024

  1. c52624a
  2. fc11323

Commits on Mar 13, 2024

  1. bc6e0ea

Commits on Mar 19, 2024

  1. e0cf8d2
  2. Make style

    BenjaminBossan committed Mar 19, 2024 (8038a6d)
  3. Re-order tests

    Makes more sense than the order produced by the automatic merge.
    BenjaminBossan committed Mar 19, 2024 (3f39640)
  4. Target all linear layers in test

    Better results, bigger margins.
    BenjaminBossan committed Mar 19, 2024 (0ee3e38)
  5. d6f3d29

Commits on Mar 20, 2024

  1. eea22b1
  2. dd3cfab