
Support both medusa v1 and v2 #421

Merged
merged 9 commits into main from medusa-lora on Apr 18, 2024
Conversation

tgaddair (Contributor)

TGI recently introduced a smaller version of Medusa that doesn't require additional LM heads (only a single dense projection per "medusa head"). This makes it a great candidate for dynamic adapter loading, as new variants for 7B-parameter models are under 100MB.

This implementation is taken largely from the one in TGI.
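The key idea behind the size reduction can be sketched in a few lines: each Medusa v2 head is just one residual dense projection over the base model's final hidden state, and all heads reuse the base model's shared LM head for logits instead of carrying their own. The sketch below is illustrative only (NumPy stand-in for the actual PyTorch modules; names like `MedusaV2Head` and `speculate` are hypothetical, not the TGI/LoRAX API):

```python
import numpy as np

def silu(x):
    # SiLU activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

class MedusaV2Head:
    """Illustrative Medusa v2 head: a single residual dense projection.

    Unlike Medusa v1, there is no per-head LM head; logits come from the
    base model's shared lm_head, which is why each adapter stays small
    (roughly hidden_size^2 parameters per head).
    """
    def __init__(self, hidden_size: int, rng: np.random.Generator):
        # The single dense projection is all each head adds.
        self.w = rng.standard_normal((hidden_size, hidden_size)) * 0.02

    def __call__(self, hidden: np.ndarray) -> np.ndarray:
        # Residual connection around the projection.
        return hidden + silu(hidden @ self.w)

def speculate(hidden, heads, shared_lm_head):
    """Head i proposes the token at offset i+1, reusing the shared LM head."""
    return [int(np.argmax(head(hidden) @ shared_lm_head)) for head in heads]

# Toy usage: 3 heads over an 8-dim hidden state and a 16-token vocab.
rng = np.random.default_rng(0)
heads = [MedusaV2Head(8, rng) for _ in range(3)]
lm_head = rng.standard_normal((8, 16))
draft_tokens = speculate(rng.standard_normal(8), heads, lm_head)
```

For scale: at fp16, one 4096x4096 projection is about 34MB, so a few such heads land well under the 100MB figure mentioned above, making per-request dynamic loading practical.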

@tgaddair tgaddair requested a review from noyoshi April 18, 2024 01:52
@tgaddair tgaddair merged commit 0a3c627 into main Apr 18, 2024
1 check passed
@tgaddair tgaddair deleted the medusa-lora branch April 18, 2024 15:54