Support mistral and sliding window attention #1075
Conversation
@grimoire is very productive, and the support for new models on the PyTorch engine is very timely. Currently, Mistral, Qwen 1.5, and DeepSeek MoE all rely on sliding window attention. @lvhan028 @RunningLeon Will we consider prioritizing this PR for review and merging it as soon as possible? Given that these are relatively large features, the community would likely appreciate being able to use them sooner rather than later.
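For context on what the models above have in common: sliding window attention restricts each query position to attend only to the most recent `window` key positions rather than the full causal prefix. The following is a minimal illustrative sketch of such a mask in plain Python; it is not taken from this PR's implementation, and the function name and shape conventions are assumptions for illustration only.

```python
def sliding_window_causal_mask(seq_len, window):
    """Boolean mask (hypothetical helper, not from this PR):
    query position i may attend to key position j iff 0 <= i - j < window,
    i.e. causal attention limited to the last `window` tokens."""
    return [[0 <= i - j < window for j in range(seq_len)]
            for i in range(seq_len)]
```

For example, with `seq_len=5` and `window=3`, position 4 can attend to positions 2, 3, and 4, but not to 0 or 1, which is what lets these models bound attention cost for long sequences.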
@zhyncs Hi, feel free to review and test this PR. Any comments would be sincerely appreciated.
This PR makes a great change. QA needs more time to test it.
@RunningLeon Are there evaluation results for this PR?
lmdeploy/pytorch/paging/eviction_helper/recompute_eviction_helper.py
@zhulinJulia24 may perform the regression test.
@lvhan028 The regression test passed, as shown here.
LGTM
Important: This PR refactors the core mechanism of the engine.