
Support peft LoRA adapters #335

Conversation

artek0chumak
Collaborator

No description provided.

@borzunov borzunov self-requested a review July 1, 2023 01:19


NOSAFE_PEFT_REPO = "timdettmers/guanaco-7b"
SAFE_PEFT_REPO = "artek0chumak/guanaco-7b"
Collaborator


Can you please replace this with a smaller pair of adapters? The current ones are too large to download in CI, this slows down tests by ~30%.

Collaborator Author


Created smaller peft weights for bloom-560m.
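Not part of the PR itself, but for illustration: a minimal sketch of how such a small test adapter could be produced with peft and saved in safetensors format. The rank, target modules, and output path here are assumptions, not the values actually used.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# A small base model plus a low rank keeps the adapter tiny,
# so CI does not spend long downloading it.
base = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m", torch_dtype=torch.float32)
config = LoraConfig(
    r=8,                                 # assumed rank; anything small works
    lora_alpha=16,
    target_modules=["query_key_value"],  # BLOOM's fused attention projection
    lora_dropout=0.0,
    task_type="CAUSAL_LM",
)
adapter = get_peft_model(base, config)

# safe_serialization=True writes adapter_model.safetensors,
# which can then be uploaded to the Hub as the test repo.
adapter.save_pretrained("bloom-560m-lora-test", safe_serialization=True)
```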

borzunov added a commit that referenced this pull request Jul 5, 2023
…339)

Before this PR, `free_disk_space_for()` was able to remove **(a)** only entire cached revisions (= git commits/branches) and **(b)** only from the repository we're loading right now.

This PR allows this function to remove arbitrary files from any repository.

This is useful for transition to Petals 1.2.0+, since it now uses original repos instead of the ones with converted models (see #323). In particular, the cache for `bigscience/bloom-petals` is now deprecated and should be removed in favor of `bigscience/bloom`. This is also useful as a way to free space before loading LoRA adapters (#335).
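The actual petals helper is `free_disk_space_for()`; the snippet below is only a rough sketch of the general idea — deleting individual cached files, least recently used first, via huggingface_hub's cache-scanning API — with a hypothetical function name and signature.

```python
import os
from huggingface_hub import scan_cache_dir

def free_cache_space(required_bytes: int) -> None:
    """Hypothetical helper: delete cached HF files, LRU-first, until enough space is freed."""
    cache = scan_cache_dir()
    # Deduplicate by blob path, since several revisions may point to the same file on disk.
    blobs = {
        f.blob_path: f
        for repo in cache.repos
        for rev in repo.revisions
        for f in rev.files
    }
    freed = 0
    for f in sorted(blobs.values(), key=lambda f: f.blob_last_accessed):
        if freed >= required_bytes:
            break
        freed += f.size_on_disk
        os.unlink(f.blob_path)  # remove a single file, not the whole cached revision
```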
src/petals/utils/peft.py — review thread (outdated, resolved)
@artek0chumak artek0chumak force-pushed the artek0chumak/peft_safetensors branch from d639331 to 27bb946 Compare July 7, 2023 08:20
src/petals/utils/peft.py — review thread (outdated, resolved)
@artek0chumak artek0chumak marked this pull request as ready for review July 11, 2023 20:56
@artek0chumak artek0chumak force-pushed the artek0chumak/peft_safetensors branch from 45cf7f1 to 2244e7d Compare July 11, 2023 20:58
@justheuristic
Collaborator

justheuristic commented Jul 12, 2023

Review / comments:

  1. It seems that the test adapter is initialized to zeros; therefore, the test for exact match with adapters does not actually exercise the adapters.
     • Can you please update the "safe" peft weights so that they affect outputs in a measurable way?
     • Any dummy weights will do, as long as not allclose(outputs_with_peft, outputs_without_peft). (See the sketch after the reply below.)
  2. Currently we use tensor_parallel with adapters naively, as though the model were non-parallel.
     • This may cause problems when the adapters are non-zero.
     • Unless it is super easy to fix, consider simply assert not adapters or not tensor_parallel for now.
     • Naive but correct solution for later: initialize shards[0] with the actual adapters and shards[1, 2, ...] with no adapters.
     • Optimal but complicated solution for later: split each adapter evenly along the low-rank dimension; this can be done via tensor_parallel's config. (A small decomposition sketch follows this comment.)
  3. I may have screwed up some of the tests.
     • There's one change I made that seemed kinda sus: bf498a4.
  4. Load balancing in the presence of adapters.
     • The current load-balancing algorithm ignores adapters altogether.
     • This is fine if peers are allowed to download missing adapters on demand (hopefully this is what we're going to do).
     • If not, we will have to somehow adapt load balancing to handle cases where:
       a. the bottleneck for the vanilla model is in one spot,
       b. but if we consider only the layers that serve a given pair of adapters, the bottleneck is in another spot,
       c. and one small, rarely used adapter doesn't even have enough servers for a full chain.
     • In either case, let's create an issue.
  5. Adapter dtypes.
     • Not necessarily in this PR, but it would be nice to make sure that the "safe" guanaco is cast to FP16 to save GPU memory.

Note to @borzunov: the tests do indeed take longer now; the reason is that we run two more "heavy" tests that verify full-model exact match with adapters. We can probably skip them if this is a problem.
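Not petals code — just a self-contained illustration (plain torch, made-up sizes) of the decomposition behind the "split the adapter along the low-rank dimension" idea from item 2: each tensor-parallel shard applies a slice of the ranks, and the shards' partial updates sum back to the full low-rank update.

```python
import torch

torch.manual_seed(0)
hidden, rank, n_shards = 64, 8, 2
A = torch.randn(rank, hidden)   # lora_A
B = torch.randn(hidden, rank)   # lora_B

full_update = B @ A

# Each shard holds rank // n_shards of the low-rank dimension; summing the
# partial updates (as a tensor-parallel all-reduce would) recovers B @ A.
chunk = rank // n_shards
partial = sum(
    B[:, i * chunk:(i + 1) * chunk] @ A[i * chunk:(i + 1) * chunk, :]
    for i in range(n_shards)
)
assert torch.allclose(full_update, partial)
```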

@artek0chumak artek0chumak changed the title Add peft safetensors loading Add peft lora adapters. Jul 12, 2023
@artek0chumak
Collaborator Author

  1. It seems that the test adapter is initialized to zeros; therefore, the test for exact match with adapters does not actually exercise the adapters.

Thank you for mentioning this! I've changed the LoRA weights on the Hub; they now have non-zero parameters.
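For context, a self-contained sketch (plain torch, made-up sizes) of why zero-initialized LoRA weights made the original exact-match test vacuous: peft initializes lora_B to zeros by default, so the low-rank update B @ A vanishes unless the test weights are explicitly made non-zero.

```python
import torch

torch.manual_seed(0)
hidden, rank = 64, 8
x = torch.randn(4, hidden)
W = torch.randn(hidden, hidden)              # stand-in for a frozen base weight

A = torch.randn(rank, hidden)                # lora_A (random by default)
B_zero = torch.zeros(hidden, rank)           # lora_B: peft's default zero init
B_nonzero = 0.1 * torch.randn(hidden, rank)  # what the test adapter should contain

out_base = x @ W.T
out_zero_adapter = x @ (W + B_zero @ A).T
out_real_adapter = x @ (W + B_nonzero @ A).T

assert torch.allclose(out_base, out_zero_adapter)      # zero adapter has no effect at all
assert not torch.allclose(out_base, out_real_adapter)  # measurable difference, as requested
```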

@borzunov borzunov changed the title Add peft lora adapters. Add peft LoRA adapters Jul 12, 2023
@borzunov borzunov changed the title Add peft LoRA adapters Support peft LoRA adapters Jul 12, 2023
@justheuristic justheuristic merged commit b9f0a54 into bigscience-workshop:main Jul 12, 2023
6 of 7 checks passed