
Deepseek v2 support #693

Merged (6 commits, Jul 27, 2024)
Conversation

@hnyls2002 (Collaborator) commented Jul 21, 2024

To use DeepSeek-V2, please specify the --context-length or --max-num-reqs to avoid OOM. The context length for DeepSeek is quite large; with the current static req_to_token layout, we cannot support a large number of requests and a large context length at the same time.
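A minimal launch sketch following the advice above. The model path and the context-length value are illustrative assumptions, not taken from the PR; the point is simply to cap --context-length (or alternatively --max-num-reqs, as named in the comment) so the static req_to_token table fits in memory:

```shell
# Illustrative only: cap the context length so the static
# req_to_token layout does not OOM at DeepSeek's native length.
python -m sglang.launch_server \
    --model-path deepseek-ai/DeepSeek-V2 \
    --context-length 4096 \
    --trust-remote-code
```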

@hnyls2002 hnyls2002 marked this pull request as draft July 21, 2024 22:44
@m0g1cian

Looking forward to seeing DeepSeek-V2 get supported! I was trying to do the same thing two weeks ago but ran into the exact same issue:

RuntimeError: shape mismatch: value tensor of shape [7, 16, 256] cannot be broadcast to indexing result of shape [7, 16, 40]
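The broadcast failure above can be reproduced in miniature. NumPy stands in for PyTorch here, and the interpretation of the shapes is my assumption: a KV cache allocated with a per-head dimension of 40 receives values whose last dimension is 256, the size DeepSeek-V2's MLA projections actually produce.

```python
import numpy as np

# Cache buffer laid out for head_dim 40 (assumption: last dim is per-head size)
kv_cache = np.zeros((8, 16, 40))

# Incoming values with last dim 256, mirroring the reported error
value = np.zeros((7, 16, 256))

try:
    kv_cache[:7] = value  # last dims 40 vs 256 cannot broadcast
except ValueError as e:
    print(e)  # could not broadcast input array from shape (7,16,256) into (8... slice) shape (7,16,40)
```

The fix in the PR is to size the cache for the model's actual head dimension rather than assuming a fixed layout.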

@hnyls2002 hnyls2002 marked this pull request as ready for review July 26, 2024 23:31
@hnyls2002 hnyls2002 merged commit 679ebcb into main Jul 27, 2024
2 checks passed
@hnyls2002 hnyls2002 deleted the deepseek branch July 27, 2024 00:10
@Xu-Chen (Contributor) commented Jul 27, 2024

Thank you for your excellent work. Will you support MLA in the future? Reducing the KV cache would allow larger context lengths.

@Ying1123 Ying1123 mentioned this pull request Aug 2, 2024