Llama2: Multi-lora example notebook, Custom generator #1114

Merged
merged 4 commits into from
Apr 26, 2024

Conversation

jambayk
Contributor

@jambayk jambayk commented Apr 25, 2024

Describe your changes

  • Add a notebook showing an end-to-end example of QLoRA fine-tuning, model optimization, adapter extraction, and deployment with multiple adapters.
  • Implement a custom text generator that can handle multiple LoRA adapters, along with several KV cache management strategies. These are stored under examples/utils and can be reused for other models.
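
For readers unfamiliar with the idea, here is a minimal sketch of what "one generator, multiple adapters" can look like. It is not the code added in this PR: `toy_model`, `MultiAdapterGenerator`, and the `(scale, shift)` adapter format are hypothetical stand-ins, and a real generator would run a model session and manage a KV cache rather than do arithmetic on token ids.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical adapter format: a (scale, shift) pair standing in for LoRA weights.
Adapter = Tuple[int, int]


def toy_model(token_ids: Sequence[int], adapter: Adapter) -> int:
    """Stand-in for a base model forward pass that takes adapter weights as input.

    Returns the next token id. The arithmetic is arbitrary and only makes the
    output depend on both the context and the selected adapter.
    """
    scale, shift = adapter
    return (sum(token_ids) * scale + shift) % 50


class MultiAdapterGenerator:
    """Shares one base model across several named adapters (illustrative only)."""

    def __init__(
        self,
        model: Callable[[Sequence[int], Adapter], int],
        adapters: Dict[str, Adapter],
    ) -> None:
        self.model = model
        self.adapters = dict(adapters)

    def generate(
        self, prompt_ids: Sequence[int], adapter_name: str, max_new_tokens: int = 4
    ) -> List[int]:
        # Pick the adapter per request; the base model is never reloaded.
        if adapter_name not in self.adapters:
            raise KeyError(f"unknown adapter: {adapter_name}")
        adapter = self.adapters[adapter_name]

        tokens = list(prompt_ids)
        for _ in range(max_new_tokens):
            # Greedy loop: append one token per step using the chosen adapter.
            tokens.append(self.model(tokens, adapter))
        return tokens


generator = MultiAdapterGenerator(toy_model, {"task_a": (1, 0), "task_b": (2, 1)})
out_a = generator.generate([1, 2], "task_a", max_new_tokens=3)
out_b = generator.generate([1, 2], "task_b", max_new_tokens=3)
```

The point of the design is that the same prompt produces different continuations depending on `adapter_name`, while the base model object is created once and shared.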

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
  • Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

(Optional) Issue link

@jambayk jambayk changed the title Llama2: Multi-lora example notebook Llama2: Multi-lora example notebook, Custom generator Apr 25, 2024
@jambayk jambayk requested a review from devang-ml April 25, 2024 22:51
current

complete example

clean

rename

nit
@jambayk jambayk merged commit 2e29d84 into main Apr 26, 2024
35 checks passed
@jambayk jambayk deleted the jambayk/llama-infer branch April 26, 2024 05:35