-
Notifications
You must be signed in to change notification settings - Fork 143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support lazy loading the lora module for reducing the loading p… #434
feat: support lazy loading the lora module for reducing the loading p… #434
Conversation
It seems that caching the handle from safe_open might be a better solution, but need to consider the file handle reference management that used by multiple layers, I will refine it later. |
bad816f
to
9b1ac96
Compare
Still keep cache the filenames instead of filehandles, since that 1) safe_open needs the device info which differs during loading the lora modules, 2) safe_open is lazy loading until specific tensor loaded by get_tensor invoked, which is already the optimized behavior for our case. |
@tgaddair could you help review this change ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @thincal, thanks for the PR, and apologies for the slow review!
I had one question about the file handle, but happy to land this and iterate on it to see if there's any room to further optimize.
It is fine to land it firstly, since the safe_open is already lazy behavior and main overhead is about reading out the specific tensor. |
@thincal I noticed there's a failing test:
Would you be able to take a look before we merge? We should be good to go once that's resolved. |
OK, I will finish it today, thanks. |
9b1ac96
to
6d91a88
Compare
@tgaddair fix passed for |
@tgaddair ping, sorry for my late response. Could you help review the revised commit? Thanks. |
Hey @thincal, very sorry for the delay here. Let me take a look now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR and again really sorry for the delay in getting it landed! Sanity checks look good on my side. May run some more benchmarking later, but good to go ahead and land.
@tgaddair it's glad to see this change be merged, and thanks for your support :) |
What does this PR do?
Fixes #433
Before submitting
to it if that's the case.
Who can review?
@tgaddair