[Sampling] Add repetition penalty for new seq type. #373

Duyi-Wang · 2024-05-08T01:40:05Z

No description provided.

* [Common] Add sequenceMeta, sequenceGroup and sequenecePool. (#343)
* merge batchSize and seqLen into one in TokenEembedding
* merge batchSize and seqLen into one in TokenEembedding (#350)
* [Common] Move Martix into xft namespace. (#351)
* remove unsed function in DecoderLayer
* [Layer] Remove unused functions in Decoder layer (#353)
* fix compile error of embeddingForward
* [Model] Fix compile error of embeddingForward in YaRNLlama (#358)
* [Common] Add sampling params into group seq. (#356)
* remove DecoderContext in computeSoftmax
* [Util] Remove DecoderContext in computeSoftmax (#362)
* [Common] Refactor sequence.h. (#363)
* [kernels] refactor flash attention for continuous batching (#361)
* [models] Add attnMeta for continuous batching (#364)
* [Layers] fix build error (#365)
* [Model] add interface for seq meta. (#366)
* refactor resize function in DecoderContext to support CB, and qkScores member removed
* [Common] Modify resize() in DecoderContext to support (#367)
* add some code to CommonDecoder::forward()
* SequenceMeta refactor
* [Model] New CommonDecoder::forward impl. skeleton (#369)
* new KVCacheMgr supporting CB
* fix typo & set default prefixId to -1 in addSequence()
* [Common] New KVCacheMgr to support CB (#371)
* [Sampling] Add repetition penalty for new seq type. (#373)
* New foward to support CB (CommonDecoder->DecoderBlock->DecoderLayer->Attention/MLP)
* add todo
* [Sampling] Add greedy search for cb path. (#376)
* logic issue fix
* code fix to make new forward work
* add maxSeqLen limitation
* cross attention impl. for CB
* DecoderContext::resize fix
* correct the output of the new forward
* add cb_check
* fix incorrect buffer size calculation
* 2 sequences -> 3 sequences
* better method to prepare KV cache

---------

Co-authored-by: Changqing Li <changqing.li@intel.com>
Co-authored-by: Duyi-Wang <duyi.wang@intel.com>
Co-authored-by: Meng,Chen <chen.meng@intel.com>

[Sampling] Add repetition penalty for new seq type.

2212871

Duyi-Wang requested a review from pujiang2018 May 8, 2024 01:40

Duyi-Wang added the build (related to project build) and enhancement (New feature or request) labels and removed the build label May 8, 2024

pujiang2018 approved these changes May 8, 2024

Duyi-Wang merged commit e79c9c2 into intel:cb_dev May 8, 2024

Duyi-Wang deleted the repetition_pen branch May 8, 2024 02:10

Duyi-Wang added a commit that referenced this pull request May 9, 2024

[Sampling] Add repetition penalty for new seq type. (#373)

dbb317b

Duyi-Wang added a commit that referenced this pull request May 15, 2024

[Sampling] Add repetition penalty for new seq type. (#373)

3f15904
