LoRA hot swap for service deployment #863
Conversation
Summary of Changes

Hello @xjq, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request enhances the flexibility of the service by implementing dynamic LoRA (Low-Rank Adaptation) hot-swapping: the system can load, switch, and remove LoRA weights at runtime without requiring a service restart. This is particularly beneficial for deployments serving multiple LoRAs, since different model adaptations can be applied on the fly, improving operational efficiency and responsiveness to varying user demands.
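For context, the runtime flow described above might look roughly like the sketch below. This is a minimal illustration only: the `pipeline.load_lora` / `pipeline.remove_lora` calls, their parameters, and the request handler are assumed names for illustration, not necessarily the API introduced by this PR.

```python
def handle_request(pipeline, prompt, lora_path=None, strength=1.0):
    # Hypothetical hot-swap flow: the pipeline stays resident in memory while
    # LoRA weights are attached for a request and detached afterwards,
    # so no service restart is needed between different LoRAs.
    if lora_path is not None:
        # Attach the requested LoRA to the resident model (assumed entry point).
        pipeline.load_lora(lora_path, strength=strength)
    try:
        return pipeline(prompt)
    finally:
        if lora_path is not None:
            # Detach the LoRA so the next request starts from clean base weights.
            pipeline.remove_lora()
```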
Code Review
This pull request introduces LoRA hot-swapping capabilities, which is a great feature for service deployments. The changes are well-structured, adding remove_lora functions and updating the pipeline to handle LoRA removal. I've provided a few suggestions to improve code conciseness, robustness, and script usability. Overall, this is a solid contribution.
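As a rough illustration of what a pipeline-level removal helper could look like, here is a short sketch. It is assumption-based: the function name and module traversal are hypothetical, while the per-attribute cleanup mirrors the module-level snippet discussed in the comment below.

```python
import torch.nn as nn

def remove_lora_from_pipeline(model: nn.Module):
    # Hypothetical pipeline-level helper: walk every submodule and drop any
    # attached LoRA tensors so subsequent requests are served by the base weights.
    for module in model.modules():
        for attr in ("lora_down", "lora_up", "lora_alpha", "lora_scale"):
            if hasattr(module, attr):
                delattr(module, attr)
```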
```python
if hasattr(self, "lora_down"):
    del self.lora_down
if hasattr(self, "lora_up"):
    del self.lora_up
if hasattr(self, "lora_alpha"):
    del self.lora_alpha
if hasattr(self, "lora_scale"):
    del self.lora_scale
```
The repeated if hasattr/del blocks can be made more concise and easier to maintain by using a loop.
```diff
-if hasattr(self, "lora_down"):
-    del self.lora_down
-if hasattr(self, "lora_up"):
-    del self.lora_up
-if hasattr(self, "lora_alpha"):
-    del self.lora_alpha
-if hasattr(self, "lora_scale"):
-    del self.lora_scale
+for attr in ("lora_down", "lora_up", "lora_alpha", "lora_scale"):
+    if hasattr(self, attr):
+        delattr(self, attr)
```
```python
if hasattr(self, "current_lora_strength"):
    del self.current_lora_strength
```
Instead of deleting current_lora_strength, it's more idiomatic and slightly cleaner to set it back to None. The attribute is initialized to None in __init__, so this keeps the object's structure consistent. The hasattr check is also redundant since the attribute is always present after initialization.
```python
self.current_lora_strength = None
```

```bash
lightx2v_path=
model_path=
lora_dir=
```
To improve usability, it's helpful to add a check that verifies if these required variables have been set. This provides a clear error message to the user if they forget to edit the script, preventing confusing errors later on.
```diff
 lightx2v_path=
 model_path=
 lora_dir=
+
+if [[ -z "${lightx2v_path}" || -z "${model_path}" || -z "${lora_dir}" ]]; then
+    echo "Error: lightx2v_path, model_path, and lora_dir must be set." >&2
+    exit 1
+fi
```
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
I am serving qwen-image with several LoRAs; hot swapping is vital for service deployment.
The patch includes: