@qyh111 commented Aug 15, 2025

Purpose

What this PR does / why we need it?

Fix a bug that occurs when running the DeepSeek model.

Modifications

Does this PR introduce any user-facing change?

Test

[image]

How was this patch tested?

@ygwpz merged commit d060f78 into ModelEngine-Group:develop on Aug 16, 2025
ygwpz added a commit that referenced this pull request Sep 4, 2025
* fix issue#26 and issue#36 (#55)

* [Doc] Add vllm institution (#61)

* [CI] Add issue and pull request template; [Fix][Doc] Fix nfs doc error. (#62) (#64)

* [CI] Add issue template

* [CI] Add pr template

* [Fix][Doc] Fix nfs doc error, close #57

Co-authored-by: harrisonyhq <harrisonyhq@gmail.com>

* [Doc] update install doc using patch to build from source code (#68)

* [Feat] Merge 0.0.1 back into develop (#72)

* [CI] Add issue and pull request template; [Fix][Doc] Fix nfs doc error. (#62)

* [CI] Add issue template

* [CI] Add pr template

* [Fix][Doc] Fix nfs doc error, close #57

* [CI][Style] Add Github workflow for pre commit and format the codestyle (#70)

* [CI] Add github flow for pre-commit and unittest

* [Style] Fix typo and style problem in repo

---------

Co-authored-by: harrisonyhq <harrisonyhq@gmail.com>

* [Style] Fix codestyle problems and typo in develop (#75)

* [Style] Fix codestyle problems and typo

* [Fix] Fix CI bug

* [CI] Add workflow trigger on push

* [CI] Add support pyproject.toml to enable using python -m build to compile whl package

* ucm_sparse framework v1.0 (#79)

* [Fix] Fix can't find cmake error when using pip install -e .

* Revert "ucm_sparse framework v1.0 (#79)" (#82)

This reverts commit b965dc8.

* [Feature] add Mooncake Store

* [Fix bug] fix docker build err and installation.md (#87)

* adapt deepseek (#89)

* [Feature][P/D] add example for disaggregated prefill (#90)

* [Perf] Pipelined ucmnfsstore (#97)

* pipelined ucmnfsstore

* update default stream number

* Revert "[Feature] add Mooncake Store" (#98)

* [Fix bug] fix uc_connector ut and change hash generation method

* [Fix] Fix .so build error (#104)

[Fix] Fix so file import error in build and edit mode

[Fix] format the code

[Feat] Add device recognize function

* [Fix] Fix ascend compile error (#106)

* ESA 1.0

fix typo

ESA: add vllm and vllm-ascend patch

add vllm and vllm_ascend patch

* fix typo

* [fix] compatible with prefix cache

* add sparse_attn example

* add sparse_attn docs

* Modify start_load_kv (#103)

* [Fix] Fix duplicate create/commit errors upon preemption (#109)

* [refact] format

* adapt for vllm 0.9.1 (#113)

Co-authored-by: y00945504 <yuhui87@huawei.com>

* add patch

* fix: uc_connector,rm .gitkeep ucm_oceanstor.py

* rename vllm-adapt-2 to vllm-adapt-sparse

* [Fix] Fix spelling issues with PR templates (#119)

* remove load_tasks

* [bugfix] bugfix in ucmnfsstore (#123)

* trans task timeout support

* [Fix] posix file open interface bugfix

* add config parameter

* Fix rank handling in multi-node PP setup (#129)

* [Feat]Support UCM Sparse on cuda (#126)

* [Feat]Support UCM Sparse on cuda

* [DOCS]Add doc for format code.

* [Feature] Add mooncake store (#117)

* Stash (temporary save)

* [Feature] Mooncake connector support both config and file

* [Doc] Add docs for Ucm Mooncake Connector

* [Feature] Add mooncake to ucm factory

* [Doc][Fix] Modify the description of configuration to match usage.

* [Feature] [Fix] Load Mooncake config from dict, when lack params, load from env config file.

* [Doc] update the performance and modify description.

* [Test] Example config file for Mooncake test `test_mooncake_env.py`.

* [Test] [Del] Removed unnecessary tests that do not match the current functionality

* [Feat!] [Del] Adjust the mooncake configuration method, remove the configuration file method, and only retain the parameter transmission method

* [Doc] [Fix] modify the performance figure of Mooncake Store.

* [Feat] add __del__() to shutdown all the mooncake components

---------

Co-authored-by: z00452769 <zhangyichen@huawei.com>
Co-authored-by: propanone1006 <1035097916@qq.com>
Co-authored-by: propanone1006 <1035067916@qq.com>

* [bugfix]modify mla dump (#128)

* modify mla dump

* fix ci problem

* [BugFix] aggregate work outputs to decide dumped blocks

* [BugFix] Modify npu worker for aggregating modelrunner_outputs

* [CI] Add vllm patch for sparse in dockerfile (#134)

* [CI] Add vllm patch for sparse in dockerfile

* [Fix] Add patch in dockerfile and pip mirror.

* [Fix] Update version 0.0.2

* ESA: skip processing for short requests (#147)

* ucm_sparse: skip processing for short requests

* add comments

---------

Co-authored-by: flesher0813 <33923823+flesher0813@users.noreply.github.com>
Co-authored-by: harrisonyhq <harrisonyhq@gmail.com>
Co-authored-by: hek14 <1023129548@qq.com>
Co-authored-by: Chen Deng <120033622+propanone1006@users.noreply.github.com>
Co-authored-by: propanone1006 <1035067916@qq.com>
Co-authored-by: qyh111 <qiuyuhao1@huawei.com>
Co-authored-by: Mag1c.H <hemajun815@163.com>
Co-authored-by: t00939662 <tianxuehan@huawei.com>
Co-authored-by: Fate469434 <58885253+Fate469434@users.noreply.github.com>
Co-authored-by: y00945504 <yuhui87@huawei.com>
Co-authored-by: Zbm1996 <370478722@qq.com>
Co-authored-by: NaganooMei <290992347@qq.com>
Co-authored-by: NaganooMei <104300720+NaganooMei@users.noreply.github.com>
Co-authored-by: f00943869 <fenghao0720@outlook.com>
Co-authored-by: hufumans <113507465+hufumans@users.noreply.github.com>
Co-authored-by: z00452769 <zhangyichen@huawei.com>
Co-authored-by: propanone1006 <1035097916@qq.com>
Co-authored-by: zhou-haitao <74044944+zhou-haitao@users.noreply.github.com>
Co-authored-by: flesher0813 <1208954694@qq.com>
Co-authored-by: AooooooA-C <chenaozhu@outlook.com>