initial multi-lora support #1103

Merged · 5 commits · Apr 14, 2023

Conversation

mcmonkey4eva (Contributor)

For #853

Contains initial support for loading multiple LoRAs at once.

Works as a checkbox group, with a refresh button, and an apply button.

If you check new LoRAs that weren't checked before, they load very quickly. If you uncheck previously loaded LoRAs, it currently removes all of them and then re-adds the ones still selected. Either way, it's much faster than a full model reload.

The merge_and_unload function seems to not support 8-bit.

This changes the requirements to pull peft directly from git for now, as they haven't published a release with this feature yet.

I have not fully tested the results of generating with multiple LoRAs, only that they load/unload and the model still works.

I have also not tested for memory leaks or other issues that might arise from repeatedly swapping LoRAs on the fly.

I have only tested in 8-bit with LLaMA-13B so far.
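
For reference, a rough sketch of the loading flow described above, assuming the dev build of peft with multi-adapter support (the function name and arguments are illustrative, not the exact code in this PR):

```python
from pathlib import Path

from peft import PeftModel


def attach_loras(model, lora_dir, lora_names):
    """Illustrative sketch: wrap a plain transformers model with the first LoRA,
    then attach the remaining adapters. Removal is not shown -- the current
    fallback for unchecking is to drop everything and run this step again."""
    if not lora_names:
        return model

    # Wrapping is cheap: only the small adapter weights are read from disk.
    model = PeftModel.from_pretrained(
        model, Path(lora_dir) / lora_names[0], adapter_name=lora_names[0])

    # Additional adapters attach to the already-wrapped model even faster;
    # this is the "check a new LoRA and it loads very quickly" path.
    for name in lora_names[1:]:
        model.load_adapter(Path(lora_dir) / name, adapter_name=name)

    return model
```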

@oobabooga (Owner)

I'll try to review this one and #1098 later today. Better LoRA support is a priority for me now, and your PRs are very helpful. I'm also considering using a custom version of PEFT in the requirements.txt to support applying LoRAs to 4-bit models.

server.py Outdated
@@ -211,8 +211,9 @@ def create_model_menus():
ui.create_refresh_button(shared.gradio['model_menu'], lambda: None, lambda: {'choices': get_available_models()}, 'refresh-button')
with gr.Column():
with gr.Row():
-shared.gradio['lora_menu'] = gr.Dropdown(choices=available_loras, value=shared.lora_name, label='LoRA')
+shared.gradio['lora_menu'] = gr.CheckboxGroup(choices=available_loras, value=shared.lora_names, label='LoRA model(s)')
Contributor:

I don't know how many LoRAs people might end up with, but you could maybe keep this a Dropdown and add the multiselect=True argument. Probably a clearer UI experience?
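
For illustration, the suggested change would look roughly like this; `multiselect` is a standard `gr.Dropdown` argument, and the choices below are placeholders:

```python
import gradio as gr

with gr.Blocks() as demo:
    # A multiselect dropdown stays compact even with many LoRAs installed,
    # while still letting the user pick several at once.
    lora_menu = gr.Dropdown(
        choices=["alpaca-lora", "gpt4-x-lora"],  # placeholder choices
        value=[],
        label='LoRA model(s)',
        multiselect=True,
    )
```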

Contributor (author):

Oh wow, I'm dumb. In #853 this is what the auto webui used for styles (my suggested option #4), but I never looked into how that was done. This is indeed much cleaner. Going to change that and push, though it seems I'll need to rebase and force-push, as the main branch changed code right next to the server.py edits here.

rebuilt off main
@mcmonkey4eva (Contributor, author)

Rebuilt from main and repushed with a multiselect dropdown.

[screenshot of the new LoRA multiselect dropdown]

Should probably find a way to make that Apply button smaller.

@oobabooga (Owner)

If you select multiple LoRAs (like 4 or 5), the row containing the new button and the LoRA dropdown grows in an awkward way. Is it possible to implement this menu in such a way that it occupies a constant area and never grows?

@mcmonkey4eva (Contributor, author) commented Apr 14, 2023

uhh... by really forcing it with CSS

overflow: scroll;
max-height: 3rem;

it prevents vertical growth, at the cost of a different awkwardness: if you have more LoRAs than fit on one line, they're hidden behind a scrollbar.

I don't know if that's worth doing? I can add it if you prefer it.
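
If that route were ever wanted, one way to keep the rule scoped (rather than global) would be an `elem_id` plus the `css` argument of `gr.Blocks`; this is just a sketch, not something in the PR:

```python
import gradio as gr

# Confine the scroll/max-height rule to the LoRA menu only.
custom_css = "#lora-menu { overflow: scroll; max-height: 3rem; }"

with gr.Blocks(css=custom_css) as demo:
    lora_menu = gr.Dropdown(
        choices=["alpaca-lora", "gpt4-x-lora"],  # placeholder choices
        label='LoRA model(s)',
        multiselect=True,
        elem_id='lora-menu',  # ties this component to the CSS rule above
    )
```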


if lora_name not in ['None', '']:
    print(f"Adding the LoRA {lora_name} to the model...")
# Only adding, and already peft? Do it the easy way.
Owner:

Is it correct to assume that the model is already peft? For instance, if you load llama-7b without the --lora argument, it will not have been loaded with PeftModel.from_pretrained.

Edit: okay, this is only executed if len(set(shared.lora_names)) > 0, in which case the model will have been loaded with PeftModel.from_pretrained.

@@ -25,7 +38,11 @@ def add_lora_to_model(lora_name):
elif shared.args.load_in_8bit:
    params['device_map'] = {'': 0}

-shared.model = PeftModel.from_pretrained(shared.model, Path(f"{shared.args.lora_dir}/{lora_name}"), **params)
+shared.model = PeftModel.from_pretrained(shared.model, Path(f"{shared.args.lora_dir}/{lora_names[0]}"), **params)
Owner:

Also related to the comment above, if the model is "fresh", is it necessary to reload it with PeftModel.from_pretrained?

Contributor (author):

If I'm not mistaken, this isn't actually a full reload; it just takes the non-peft model and wraps it (and applies the first LoRA). It definitely runs a lot faster than a full model load, and it doesn't print any of the usual loading output to the console.
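
For clarity, a minimal illustration of that wrap-versus-reload distinction (the model and adapter paths below are placeholders):

```python
from pathlib import Path

from peft import PeftModel
from transformers import AutoModelForCausalLM

# The slow part happens once: loading the base model into memory.
base_model = AutoModelForCausalLM.from_pretrained("models/llama-13b")  # placeholder path

# This step wraps the in-memory model and reads only the small adapter
# weights from disk; the base checkpoint is not loaded a second time,
# which is why it is fast and quiet.
peft_model = PeftModel.from_pretrained(base_model, Path("loras/my-lora"))  # placeholder path
```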

requests
rwkv==0.7.3
safetensors==0.3.0
sentencepiece
pyyaml
tqdm
git+https://github.com/huggingface/peft
Owner:

Just to be sure, is the dev version of peft required? The code seems to run without errors with peft==0.2.0.

Contributor (author):

peft 0.2.0 was released March 9th (https://github.com/huggingface/peft/releases) and multi-adapter support was merged April 6th (huggingface/peft#263), so yes, it's needed. I'm not sure why it seemed to work on 0.2.0 for you; possibly you had a different version installed while testing, since you were recently testing johnsmith0031/alpaca_lora_4bit#13 as well? idk.
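
A quick way to rule out a stale install when debugging this kind of version confusion is to check what's actually importable (standard library only):

```python
# Prints the peft version installed in the current environment; a git install
# of the dev branch typically reports something like "0.3.0.dev0".
import importlib.metadata

print(importlib.metadata.version("peft"))
```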

Owner:

It's definitely needed then.

@oobabooga (Owner)

> uhh... by really forcing it with CSS
>
> overflow: scroll;
> max-height: 3rem;
>
> it prevents vertical growth, at the cost of a different awkwardness: if you have more LoRAs than fit on one line, they're hidden behind a scrollbar.
>
> I don't know if that's worth doing? I can add it if you prefer it.

I did some reorganizing and it looks fine now, no need to change the CSS. What I really wanted was for the model and the lora dropdowns to be on the same line.

oobabooga merged commit 64e3b44 into oobabooga:main on Apr 14, 2023