
Support peft LoRA adapters #335

Conversation

artek0chumak
Collaborator

No description provided.

@borzunov borzunov self-requested a review July 1, 2023 01:19


NOSAFE_PEFT_REPO = "timdettmers/guanaco-7b"
SAFE_PEFT_REPO = "artek0chumak/guanaco-7b"
Collaborator


Can you please replace this with a smaller pair of adapters? The current ones are too large to download in CI, this slows down tests by ~30%.

Collaborator Author


Created smaller peft weights for bloom-560m.
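Not part of the PR itself, but for illustration: a minimal sketch of how such a small test adapter could be produced with peft and saved in safetensors format. The rank, target modules, and output path here are assumptions, not the values actually used.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# A small base model plus a low rank keeps the adapter tiny,
# so CI does not spend long downloading it.
base = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m", torch_dtype=torch.float32)
config = LoraConfig(
    r=8,                                 # assumed rank; anything small works
    lora_alpha=16,
    target_modules=["query_key_value"],  # BLOOM's fused attention projection
    lora_dropout=0.0,
    task_type="CAUSAL_LM",
)
adapter = get_peft_model(base, config)

# safe_serialization=True writes adapter_model.safetensors,
# which can then be uploaded to the Hub as the test repo.
adapter.save_pretrained("bloom-560m-lora-test", safe_serialization=True)
```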

borzunov added a commit that referenced this pull request Jul 5, 2023
…339)

Before this PR, `free_disk_space_for()` was able to remove **(a)** only entire cached revisions (= git commits/branches) and **(b)** only from the repository we're loading right now.

This PR allows this function to remove arbitrary files from any repository.

This is useful for transition to Petals 1.2.0+, since it now uses original repos instead of the ones with converted models (see #323). In particular, the cache for `bigscience/bloom-petals` is now deprecated and should be removed in favor of `bigscience/bloom`. This is also useful as a way to free space before loading LoRA adapters (#335).
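The actual petals helper is `free_disk_space_for()`; the snippet below is only a rough sketch of the general idea — deleting individual cached files, least recently used first, via huggingface_hub's cache-scanning API — with a hypothetical function name and signature.

```python
import os
from huggingface_hub import scan_cache_dir

def free_cache_space(required_bytes: int) -> None:
    """Hypothetical helper: delete cached HF files, LRU-first, until enough space is freed."""
    cache = scan_cache_dir()
    # Deduplicate by blob path, since several revisions may point to the same file on disk.
    blobs = {
        f.blob_path: f
        for repo in cache.repos
        for rev in repo.revisions
        for f in rev.files
    }
    freed = 0
    for f in sorted(blobs.values(), key=lambda f: f.blob_last_accessed):
        if freed >= required_bytes:
            break
        freed += f.size_on_disk
        os.unlink(f.blob_path)  # remove a single file, not the whole cached revision
```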
src/petals/utils/peft.py — review thread (outdated, resolved)
@artek0chumak artek0chumak force-pushed the artek0chumak/peft_safetensors branch from d639331 to 27bb946 Compare July 7, 2023 08:20
src/petals/utils/peft.py — review thread (outdated, resolved)
@artek0chumak artek0chumak marked this pull request as ready for review July 11, 2023 20:56
@artek0chumak artek0chumak force-pushed the artek0chumak/peft_safetensors branch from 45cf7f1 to 2244e7d Compare July 11, 2023 20:58
@justheuristic
Collaborator

justheuristic commented Jul 12, 2023

Review / comments:

  1. It seems that the test adapter is initialized to zeros; therefore, the test for exact match with adapters does not actually exercise the adapters.
     • Can you please update the "safe" peft weights so that they affect outputs in a measurable way?
     • Any dummy weights will do, as long as not allclose(outputs_with_peft, outputs_without_peft). (See the sketch after the reply below.)
  2. Currently we use tensor_parallel with adapters naively, as though the model were non-parallel.
     • This may cause problems when the adapters are non-zero.
     • Unless it is super easy to fix, consider simply assert not adapters or not tensor_parallel for now.
     • Naive but correct solution for later: initialize shards[0] with the actual adapters and shards[1, 2, ...] with no adapters.
     • Optimal but complicated solution for later: split each adapter evenly along the low-rank dimension; this can be done via tensor_parallel's config. (A small decomposition sketch follows this comment.)
  3. I may have screwed up some of the tests.
     • There's one change I made that seemed kinda sus: bf498a4.
  4. Load balancing in the presence of adapters.
     • The current load-balancing algorithm ignores adapters altogether.
     • This is fine if peers are allowed to download missing adapters on demand (hopefully this is what we're going to do).
     • If not, we will have to somehow adapt load balancing to handle cases where:
       a. the bottleneck for the vanilla model is in one spot,
       b. but if we consider only the layers that serve a given pair of adapters, the bottleneck is in another spot,
       c. and one small, rarely used adapter doesn't even have enough servers for a full chain.
     • In either case, let's create an issue.
  5. Adapter dtypes.
     • Not necessarily in this PR, but it would be nice to make sure that the "safe" guanaco is cast to FP16 to save GPU memory.

Note to @borzunov: the tests do indeed take longer now; the reason is that we run two more "heavy" tests that verify full-model exact match with adapters. We can probably skip them if this is a problem.
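Not petals code — just a self-contained illustration (plain torch, made-up sizes) of the decomposition behind the "split the adapter along the low-rank dimension" idea from item 2: each tensor-parallel shard applies a slice of the ranks, and the shards' partial updates sum back to the full low-rank update.

```python
import torch

torch.manual_seed(0)
hidden, rank, n_shards = 64, 8, 2
A = torch.randn(rank, hidden)   # lora_A
B = torch.randn(hidden, rank)   # lora_B

full_update = B @ A

# Each shard holds rank // n_shards of the low-rank dimension; summing the
# partial updates (as a tensor-parallel all-reduce would) recovers B @ A.
chunk = rank // n_shards
partial = sum(
    B[:, i * chunk:(i + 1) * chunk] @ A[i * chunk:(i + 1) * chunk, :]
    for i in range(n_shards)
)
assert torch.allclose(full_update, partial)
```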

@artek0chumak artek0chumak changed the title Add peft safetensors loading Add peft lora adapters. Jul 12, 2023
@artek0chumak
Collaborator Author

  1. It seems that the test adapter is initialized to zeros; therefore, the test for exact match with adapters does not actually exercise the adapters.

Thank you for mentioning this! I've changed the LoRA weights on the Hub; they now have non-zero parameters.
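For context, a self-contained sketch (plain torch, made-up sizes) of why zero-initialized LoRA weights made the original exact-match test vacuous: peft initializes lora_B to zeros by default, so the low-rank update B @ A vanishes unless the test weights are explicitly made non-zero.

```python
import torch

torch.manual_seed(0)
hidden, rank = 64, 8
x = torch.randn(4, hidden)
W = torch.randn(hidden, hidden)              # stand-in for a frozen base weight

A = torch.randn(rank, hidden)                # lora_A (random by default)
B_zero = torch.zeros(hidden, rank)           # lora_B: peft's default zero init
B_nonzero = 0.1 * torch.randn(hidden, rank)  # what the test adapter should contain

out_base = x @ W.T
out_zero_adapter = x @ (W + B_zero @ A).T
out_real_adapter = x @ (W + B_nonzero @ A).T

assert torch.allclose(out_base, out_zero_adapter)      # zero adapter has no effect at all
assert not torch.allclose(out_base, out_real_adapter)  # measurable difference, as requested
```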

@borzunov borzunov changed the title Add peft lora adapters. Add peft LoRA adapters Jul 12, 2023
@borzunov borzunov changed the title Add peft LoRA adapters Support peft LoRA adapters Jul 12, 2023
@justheuristic justheuristic merged commit b9f0a54 into bigscience-workshop:main Jul 12, 2023
6 of 7 checks passed