[Framework] Continuous Batching Support #357

Merged
35 commits merged on May 15, 2024

Commits on May 15, 2024

  1. 1cc9c1d
  2. d01be1a
  3. db0c4e9
  4. a4f4b25
  5. 45bcfa3
  6. a704873
  7. 987a874
  8. 451ef21
  9. 5e98e6d
  10. 2b5e266
  11. e12ffa8
  12. a4442f0
  13. fb52594
  14. aa48f7e
  15. 3f15904
  16. 8c2e6b4
  17. [Model/Layer] New forward to support CB (CommonDecoder->DecoderBlock->DecoderLayer->Attention/MLP) (#375)

    pujiang2018 authored and Duyi-Wang committed May 15, 2024
    aac0167
  18. 0e35c8f
  19. f441906
  20. [Layer] update mlp for CB. (#384)

    marvin-Yu authored and Duyi-Wang committed May 15, 2024
    f9bfb49
  21. 3f232c5
  22. 6625b01
  23. f220fe0
  24. eb417af
  25. b5bda0c
  26. 35562c0
  27. e67e455
  28. c576aff
  29. eff6a75
  30. 2b374ff
  31. 7e9d731
  32. [Layer] Better method to reinterpret KV cache (#397)

    * [Common] Add sequenceMeta, sequenceGroup and sequencePool. (#343)
    
    * merge batchSize and seqLen into one in TokenEmbedding
    
    * merge batchSize and seqLen into one in TokenEmbedding (#350)
    
    * [Common] Move Matrix into xft namespace. (#351)
    
    * remove unused function in DecoderLayer
    
    * [Layer] Remove unused functions in Decoder layer (#353)
    
    * fix compile error of embeddingForward
    
    * [Model] Fix compile error of embeddingForward in YaRNLlama (#358)
    
    * [Common] Add sampling params into group seq. (#356)
    
    * remove DecoderContext in computeSoftmax
    
    * [Util] Remove DecoderContext in computeSoftmax (#362)
    
    * [Common] Refactor sequence.h. (#363)
    
    * [kernels] refactor flash attention for continuous batching (#361)
    
    * [models] Add attnMeta for continuous batching (#364)
    
    * [Layers] fix build error (#365)
    
    * [Model] add interface for seq meta. (#366)
    
    * refactor resize function in DecoderContext to support CB, and qkScores member removed
    
    * [Common] Modify resize() in DecoderContext to support  (#367)
    
    * add some code to CommonDecoder::forward()
    
    * SequenceMeta refactor
    
    * [Model] New CommonDecoder::forward impl. skeleton (#369)
    
    * new KVCacheMgr supporting CB
    
    * fix typo & set default prefixId to -1 in addSequence()
    
    * [Common] New KVCacheMgr to support CB (#371)
    
    * [Sampling] Add repetition penalty for new seq type. (#373)
    
    * New forward to support CB (CommonDecoder->DecoderBlock->DecoderLayer->Attention/MLP)
    
    * add todo
    
    * [Sampling] Add greedy search for cb path. (#376)
    
    * logic issue fix
    
    * code fix to make new forward work
    
    * add maxSeqLen limitation
    
    * cross attention impl. for CB
    
    * DecoderContext::resize fix
    
    * correct the output of the new forward
    
    * add cb_check
    
    * fix incorrect buffer size calculation
    
    * 2 sequences -> 3 sequences
    
    * better method to prepare KV cache
    
    ---------
    
    Co-authored-by: Changqing Li <changqing.li@intel.com>
    Co-authored-by: Duyi-Wang <duyi.wang@intel.com>
    Co-authored-by: Meng,Chen <chen.meng@intel.com>
    4 people committed May 15, 2024
    524bf32
  33. 3865654
  34. af0aae8
  35. [Build] Fix build issue.

    Duyi-Wang committed May 15, 2024
    cef27bc