[Model] add interface for seq meta. #366

Duyi-Wang · 2024-04-30T07:08:03Z

Replace Model::input() and Model::config() with Model::set_input().
Refactor the Decoder.forward() parameters to (std::vector<xft::SequenceMeta *> &seq, bool logits_all = false).
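
The PR does not spell out what logits_all controls. As a rough sketch, assuming it follows the common convention (true returns logits for every input token, false returns only the last token of each sequence), the number of output rows would differ as shown here. SeqMetaSketch, inputLen, and logitRows are hypothetical names made up for this illustration and are not the real xft::SequenceMeta API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for xft::SequenceMeta; the real class is not shown in this PR.
struct SeqMetaSketch {
    int sequenceId;
    int inputLen; // number of tokens this sequence feeds into the current step
};

// Counts the logit rows a forward pass would produce under the ASSUMED semantics:
//   logits_all == true  -> one row per input token of every sequence
//   logits_all == false -> one row per sequence (last token only)
inline std::size_t logitRows(const std::vector<SeqMetaSketch *> &seqs, bool logits_all = false) {
    std::size_t rows = 0;
    for (const SeqMetaSketch *s : seqs) {
        assert(s != nullptr && s->inputLen > 0);
        rows += logits_all ? static_cast<std::size_t>(s->inputLen) : std::size_t{1};
    }
    return rows;
}
```

Under that assumption, a batch of two sequences with 5 and 3 input tokens would yield 2 rows with the default and 8 rows with logits_all set to true.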

abenmao · 2024-04-30T07:20:26Z

include/models.h

    bool isDone();

-    std::tuple<float *, int, int> forward();
+    std::tuple<float *, int, int> forward(bool logits_all = true);


default to false

It defaults to true for high-level API evaluation.

This keeps it aligned with the existing path; we will check and change it in the future.
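
One way to read the reply above: a default argument lets call sites that were written before the flag existed keep compiling and keep their old behaviour. The sketch below illustrates only that C++ mechanism. ModelSketch and legacyCall are hypothetical names, and the real forward() returns std::tuple<float *, int, int> rather than a bool.

```cpp
#include <cassert>

// Hypothetical stand-in for the high-level Model class; not the real type.
class ModelSketch {
public:
    // Mirrors only the default argument from the diff: callers that omit the
    // flag get true. Returns the effective flag so the effect is observable.
    bool forward(bool logits_all = true) { return logits_all; }
};

// A call site written before the flag was added still compiles unchanged
// and implicitly receives the default value.
inline bool legacyCall(ModelSketch &m) { return m.forward(); }
```

Changing the default to false later would silently change what such legacy call sites receive, which is presumably why the change is deferred.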

* [Common] Add sequenceMeta, sequenceGroup and sequenecePool. (#343)
* merge batchSize and seqLen into one in TokenEembedding
* merge batchSize and seqLen into one in TokenEembedding (#350)
* [Common] Move Martix into xft namespace. (#351)
* remove unsed function in DecoderLayer
* [Layer] Remove unused functions in Decoder layer (#353)
* fix compile error of embeddingForward
* [Model] Fix compile error of embeddingForward in YaRNLlama (#358)
* [Common] Add sampling params into group seq. (#356)
* remove DecoderContext in computeSoftmax
* [Util] Remove DecoderContext in computeSoftmax (#362)
* [Common] Refactor sequence.h. (#363)
* [kernels] refactor flash attention for continuous batching (#361)
* [models] Add attnMeta for continuous batching (#364)
* [Layers] fix build error (#365)
* [Model] add interface for seq meta. (#366)
* refactor resize function in DecoderContext to support CB, and qkScores member removed
* [Common] Modify resize() in DecoderContext to support (#367)
* add some code to CommonDecoder::forward()
* SequenceMeta refactor
* [Model] New CommonDecoder::forward impl. skeleton (#369)
* new KVCacheMgr supporting CB
* fix typo & set default prefixId to -1 in addSequence()
* [Common] New KVCacheMgr to support CB (#371)
* [Sampling] Add repetition penalty for new seq type. (#373)
* New foward to support CB (CommonDecoder->DecoderBlock->DecoderLayer->Attention/MLP)
* add todo
* [Sampling] Add greedy search for cb path. (#376)
* logic issue fix
* code fix to make new forward work
* add maxSeqLen limitation
* cross attention impl. for CB
* DecoderContext::resize fix
* correct the output of the new forward
* add cb_check
* fix incorrect buffer size calculation
* 2 sequences -> 3 sequences
* better method to prepare KV cache

---------

Co-authored-by: Changqing Li <changqing.li@intel.com>
Co-authored-by: Duyi-Wang <duyi.wang@intel.com>
Co-authored-by: Meng,Chen <chen.meng@intel.com>

[Model] add interface for seq meta.

480e9d8

Duyi-Wang added the enhancement New feature or request label Apr 30, 2024

abenmao reviewed Apr 30, 2024

changqi1 approved these changes Apr 30, 2024

abenmao approved these changes Apr 30, 2024

abenmao merged commit 2499f60 into intel:cb_dev Apr 30, 2024

Duyi-Wang deleted the refactor_forward branch April 30, 2024 07:34

Duyi-Wang added a commit that referenced this pull request May 9, 2024

[Model] add interface for seq meta. (#366)

239a045

Duyi-Wang added a commit that referenced this pull request May 15, 2024

[Model] add interface for seq meta. (#366)

e12ffa8
