Move to moe-kernels package and switch to common MoE layer #2511
Conversation
Force-pushed from 6098899 to d38d6e0.
The CI failure looks unrelated.
Narsil left a comment
LGTM.
Have you checked the tests before the PR on Mixtral to see if the output is the same?
    SpeculativeHead,
    get_linear,
)
from text_generation_server.layers.moe import SparseMoELayer
Can this be imported on ipex?
I made the import in text_generation_server.layers.moe.unquantized conditional on IPEX not being used (like the old code).
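For context, a guarded import along these lines is a minimal sketch of that approach (the `SYSTEM` flag from `text_generation_server.utils.import_utils` and the `moe_kernels.fused_moe` module path are assumptions here, not quoted from this diff):

```python
# Minimal sketch of a backend-conditional import, assuming the backend is
# reported through the SYSTEM string used elsewhere in text-generation-server.
from text_generation_server.utils.import_utils import SYSTEM

if SYSTEM != "ipex":
    # moe-kernels targets CUDA/ROCm, so only import it when IPEX is not in use.
    from moe_kernels.fused_moe import fused_moe
else:
    fused_moe = None
```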
Yeah, I created the test first on the
Force-pushed from d38d6e0 to c997e3f.
This change introduces the new `moe-kernels` package:
- Add `moe-kernels` as a dependency.
- Introduce a `SparseMoELayer` module that can be used by MoE models.
- Port over Mixtral and Deepseek.
Force-pushed from c997e3f to 5726a9c.
  {
    "id": 369,
-   "logprob": -0.06585693,
+   "logprob": 0.0,
Huh ?
Also see the " me" above, and that was before the changes.
Guess that's what a temperature of 0.5 does. Happens a lot though:
❯ git grep '"logprob": 0\.0,' | uniq | wc -l
33
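For what it's worth, a low temperature sharpens the softmax enough that an already dominant token's log-probability collapses toward 0.0, which would explain the many exact zeros in the fixtures. A small self-contained illustration with made-up logits (not values from the tests):

```python
import math

def log_softmax(logits, temperature=1.0):
    # Scale logits by 1/temperature, then compute a numerically stable log-softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]

logits = [10.0, 2.0, 1.0]                       # one clearly dominant token
print(log_softmax(logits, temperature=1.0)[0])  # ~ -4.6e-4
print(log_softmax(logits, temperature=0.5)[0])  # ~ -1.3e-7, effectively 0.0
```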
What does this PR do?
This change introduces the new `moe-kernels` package:
- Add `moe-kernels` as a dependency.
- Introduce a `SparseMoELayer` module that can be used by MoE models.
- Port over Mixtral and Deepseek.

GPTQ-Marlin support and porting DBRX are for follow-up PRs.
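As a rough sketch of how a model could wire in the new layer (the helper function and the constructor keyword names below are assumptions for illustration, not the exact signature added in this PR):

```python
from text_generation_server.layers.moe import SparseMoELayer

def build_sparse_moe(prefix: str, config, weights):
    # Hypothetical wiring inside a model's MoE block: the layer loads the
    # expert weights itself and replaces the per-model fused-MoE plumbing.
    # Keyword names (n_experts, topk, renormalize, weights) are assumptions.
    return SparseMoELayer(
        prefix=f"{prefix}.experts",
        n_experts=config.num_local_experts,
        topk=config.num_experts_per_tok,
        renormalize=True,
        weights=weights,
    )
```

The idea, per the description, is that Mixtral and Deepseek construct such a shared layer instead of each carrying their own fused-MoE code.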
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.