
Conversation


@danieldk danieldk commented Sep 11, 2024

What does this PR do?

This change introduces the new moe-kernels package:

  • Add moe-kernels as a dependency.
  • Introduce a SparseMoELayer module that can be used by MoE models.
  • Port over Mixtral and Deepseek.

GPTQ-Marlin support and porting DBRX are left for follow-up PRs.
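
For readers unfamiliar with sparse MoE routing, the following is a minimal, self-contained PyTorch sketch of what a layer like `SparseMoELayer` computes conceptually: a softmax router scores the experts, the top-k experts are selected per token, and their outputs are combined with the renormalized router weights. This is not the implementation from this PR; all names and shapes are illustrative.

```python
import torch
from torch import nn


class TinySparseMoE(nn.Module):
    """Illustrative top-k sparse MoE layer; not the PR's SparseMoELayer."""

    def __init__(self, hidden: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(hidden, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden, 4 * hidden), nn.SiLU(), nn.Linear(4 * hidden, hidden)
            )
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, hidden). Score every expert for every token.
        weights, idx = torch.topk(
            torch.softmax(self.router(x), dim=-1), self.top_k, dim=-1
        )
        # Renormalize over the selected experts only.
        weights = weights / weights.sum(dim=-1, keepdim=True)

        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out
```

The fused kernels in `moe-kernels` presumably replace the per-expert Python loop above with efficient GPU kernels; the PR's `SparseMoELayer` wraps that so MoE models such as Mixtral and Deepseek can share one implementation instead of carrying their own routing code.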

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@danieldk danieldk force-pushed the feature/moe-kernels branch 5 times, most recently from 6098899 to d38d6e0 on September 12, 2024 at 14:23
@danieldk danieldk marked this pull request as ready for review September 13, 2024 07:05
@danieldk
Member Author

The CI failure looks unrelated.


@Narsil Narsil left a comment


LGTM.

Have you checked the tests on Mixtral before the PR to see if the output is the same?

SpeculativeHead,
get_linear,
)
from text_generation_server.layers.moe import SparseMoELayer
Contributor


Can this be imported on ipex?

Member Author


I made the import in text_generation_server.layers.moe.unquantized conditional on IPEX not being used (like the old code).
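
(For illustration, a minimal sketch of the conditional-import pattern described here; the actual guard in text_generation_server.layers.moe.unquantized may look different, and the SYSTEM helper and moe_kernels.fused_moe path below are assumptions.)

```python
# Hypothetical sketch of the conditional import described above; the real
# guard and module paths in text-generation-server may differ.
from text_generation_server.utils.import_utils import SYSTEM  # assumed helper

if SYSTEM != "ipex":
    # Only pull in the moe-kernels backed implementation when IPEX is not used.
    from moe_kernels.fused_moe import fused_moe  # assumed module path
else:
    fused_moe = None  # on IPEX, callers fall back to the IPEX code path
```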

@danieldk
Member Author

Have you checked the tests on Mixtral before the PR to see if the output is the same?

Yeah, I created the test first on the main branch.

{
"id": 369,
"logprob": -0.06585693,
"logprob": 0.0,
Contributor


Huh?

Member Author

@danieldk danieldk Sep 17, 2024


Also see the " me" token above; that was before the changes.

Member Author

@danieldk danieldk Sep 17, 2024


Guess that's what a temperature of 0.5 does. Happens a lot though:

❯ git grep '"logprob": 0\.0,' | uniq | wc -l
33
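
(Aside, to make the reasoning above concrete: a stored logprob of 0.0 just means the selected token's probability rounded to 1.0, and sampling at temperature 0.5 divides the logits by 0.5 before the softmax, which, up to renormalization, squares every probability and makes an already-dominant token even more dominant. A toy sketch with made-up logits:)

```python
import math

def softmax(logits, temperature=1.0):
    # Standard temperature-scaled softmax over a small list of logits.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [5.0, 2.0, 1.0]                   # made-up logits with one dominant token
print(math.log(softmax(logits)[0]))        # ~ -0.066 at temperature 1.0
print(math.log(softmax(logits, 0.5)[0]))   # ~ -0.003 at temperature 0.5, much closer to 0.0
```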

@danieldk danieldk merged commit ce85efa into main Sep 17, 2024
@danieldk danieldk deleted the feature/moe-kernels branch September 17, 2024 16:09
@danieldk danieldk restored the feature/moe-kernels branch September 17, 2024 16:09