[Common] New KVCacheMgr to support CB #371

pujiang2018 · 2024-05-07T05:05:49Z

No description provided.


pujiang2018 · 2024-05-07T05:06:55Z

@Duyi-Wang @abenmao @changqi1 Please take time to review this.

Duyi-Wang · 2024-05-07T05:12:22Z

src/common/kvcache_mgr.h

+public:
+    virtual ~KVCacheMgrImplBase() = default;
+    virtual bool delSequence(int seqID) = 0;
+    virtual bool addSequence(int seqID, int prefixId = 0) = 0;


Would -1 be a better default value for prefixId, since IDs start from 0?
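To illustrate the concern in this comment, here is a minimal sketch of why a sentinel outside the valid ID range is safer as a default. `ToySequenceTable`, `kNoPrefix`, `addPrefix`, and `hasPrefix` are hypothetical names made up for this example and are not part of the PR; only the `addSequence`/`delSequence` signatures mirror the diff above.

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>

// Assumption for illustration: -1 marks "no prefix" because valid IDs start at 0.
constexpr int kNoPrefix = -1;

// Hypothetical stand-in, NOT the KVCacheMgr from this PR.
class ToySequenceTable {
public:
    // Registers a prefix ID; only non-negative IDs are accepted.
    bool addPrefix(int prefixId) {
        return prefixId >= 0 && prefixes.insert(prefixId).second;
    }

    // Registers a sequence; the default argument means "no shared prefix".
    bool addSequence(int seqID, int prefixId = kNoPrefix) {
        if (prefixId != kNoPrefix && prefixes.count(prefixId) == 0) return false; // unknown prefix
        return seqToPrefix.emplace(seqID, prefixId).second; // false if seqID already exists
    }

    // Removes a sequence; returns false if it was not registered.
    bool delSequence(int seqID) { return seqToPrefix.erase(seqID) > 0; }

    // True only if the sequence exists and refers to a real prefix.
    bool hasPrefix(int seqID) const {
        auto it = seqToPrefix.find(seqID);
        return it != seqToPrefix.end() && it->second != kNoPrefix;
    }

private:
    std::unordered_map<int, int> seqToPrefix;
    std::unordered_set<int> prefixes;
};

// Small usage example: a defaulted call is distinguishable from "use prefix 0".
inline void demoNoPrefixDefault() {
    ToySequenceTable table;
    assert(table.addPrefix(0));
    assert(table.addSequence(7));     // no prefix requested
    assert(!table.hasPrefix(7));
    assert(table.addSequence(8, 0));  // explicitly shares prefix 0
    assert(table.hasPrefix(8));
}
```

With a default of 0, the two `addSequence` calls in the usage example could not be told apart, because 0 is itself a valid prefix ID.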

Duyi-Wang · 2024-05-07T05:58:11Z

src/common/kvcache_mgr.h

+            readyList.push_back(it->second);
+        }
+
+        readyCaches == std::move(readyList);


This was a typo (`==` instead of `=`), and it is fixed now.
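The typo in the quoted line is easy to miss because the statement still compiles: `==` on two vectors is an ordinary comparison whose result is simply discarded. The sketch below contrasts the two forms under simplified assumptions. `publishWithTypo` and `publishFixed` are hypothetical helpers, and `int` stands in for whatever element type the real `readyCaches` holds.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper, not code from the PR: mirrors the line with the typo.
inline void publishWithTypo(std::vector<int> &readyCaches, std::vector<int> readyList) {
    // Comparison only: yields a bool that is thrown away, so readyCaches is unchanged.
    (void)(readyCaches == std::move(readyList));
}

// Hypothetical helper, not code from the PR: the intended behavior.
inline void publishFixed(std::vector<int> &readyCaches, std::vector<int> readyList) {
    // Move-assignment: readyCaches takes over the contents of readyList.
    readyCaches = std::move(readyList);
}

// Small usage example showing the observable difference.
inline void demoAssignVsCompare() {
    std::vector<int> caches{1};
    publishWithTypo(caches, {7, 8});
    assert(caches.size() == 1);  // still the old contents
    publishFixed(caches, {7, 8});
    assert(caches.size() == 2);  // now holds the new list
}
```

Depending on the toolchain and warning flags, a discarded comparison like this may or may not be reported, so it is worth catching in review as was done here.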

* [Common] Add sequenceMeta, sequenceGroup and sequenecePool. (#343)
* merge batchSize and seqLen into one in TokenEembedding
* merge batchSize and seqLen into one in TokenEembedding (#350)
* [Common] Move Martix into xft namespace. (#351)
* remove unsed function in DecoderLayer
* [Layer] Remove unused functions in Decoder layer (#353)
* fix compile error of embeddingForward
* [Model] Fix compile error of embeddingForward in YaRNLlama (#358)
* [Common] Add sampling params into group seq. (#356)
* remove DecoderContext in computeSoftmax
* [Util] Remove DecoderContext in computeSoftmax (#362)
* [Common] Refactor sequence.h. (#363)
* [kernels] refactor flash attention for continuous batching (#361)
* [models] Add attnMeta for continuous batching (#364)
* [Layers] fix build error (#365)
* [Model] add interface for seq meta. (#366)
* refactor resize function in DecoderContext to support CB, and qkScores member removed
* [Common] Modify resize() in DecoderContext to support (#367)
* add some code to CommonDecoder::forward()
* SequenceMeta refactor
* [Model] New CommonDecoder::forward impl. skeleton (#369)
* new KVCacheMgr supporting CB
* fix typo & set default prefixId to -1 in addSequence()
* [Common] New KVCacheMgr to support CB (#371)
* [Sampling] Add repetition penalty for new seq type. (#373)
* New foward to support CB (CommonDecoder->DecoderBlock->DecoderLayer->Attention/MLP)
* add todo
* [Sampling] Add greedy search for cb path. (#376)
* logic issue fix
* code fix to make new forward work
* add maxSeqLen limitation
* cross attention impl. for CB
* DecoderContext::resize fix
* correct the output of the new forward
* add cb_check
* fix incorrect buffer size calculation
* 2 sequences -> 3 sequences
* better method to prepare KV cache

---------

Co-authored-by: Changqing Li <changqing.li@intel.com>
Co-authored-by: Duyi-Wang <duyi.wang@intel.com>
Co-authored-by: Meng,Chen <chen.meng@intel.com>

pujiang2018 added 14 commits April 25, 2024 23:48

merge batchSize and seqLen into one in TokenEembedding

dbcb267

Merge commit '9a53fb2ea6b9141ba7c045bc0d135c1809e8f22c' into pujiang/feature/cb_dev

25ee312

remove unsed function in DecoderLayer

376b2bc

Merge commit '4ff47074fc85a27e13251c3fb618f36e338c456f' into pujiang/feature/cb_dev

d281a54

fix compile error of embeddingForward

b5b225a

remove DecoderContext in computeSoftmax

d5c9407

Merge commit 'f8f85714331c0df2ce4a8344e06972316770ec11' into pujiang/feature/cb_dev

be615b2

Merge commit '2499f602c22184ca5afaa2f013ae0ff4e3bd4263' into pujiang/feature/cb_dev

5833d41

refactor resize function in DecoderContext to support CB, and qkScores member removed

0514833

Merge commit 'c792aff5f7cc8554afa0399f4b6d241333b0b56c' into pujiang/feature/cb_dev

5315a5a

add some code to CommonDecoder::forward()

63e895a

SequenceMeta refactor

9cfc7c6

new KVCacheMgr supporting CB

41e692c

Merge commit '2e057165b456ef9b88591a880403ffe47e7500c3' into pujiang/feature/cb_dev

32c845c

Duyi-Wang added the enhancement (New feature or request) label May 7, 2024

Duyi-Wang reviewed May 7, 2024


fix typo & set default prefixId to -1 in addSequence()

c2ac8d2

Duyi-Wang approved these changes May 7, 2024


pujiang2018 merged commit dfa1d0e into cb_dev May 7, 2024

Duyi-Wang pushed a commit that referenced this pull request May 9, 2024

[Common] New KVCacheMgr to support CB (#371)

a3f8f6e

Duyi-Wang pushed a commit that referenced this pull request May 15, 2024

[Common] New KVCacheMgr to support CB (#371)

aa48f7e
