[Framework] Continuous Batching Support #357

Merged: 35 commits, May 15, 2024
Commits
1cc9c1d
[Common] Add sequenceMeta, sequenceGroup and sequenecePool. (#343)
changqi1 May 15, 2024
d01be1a
merge batchSize and seqLen into one in TokenEembedding (#350)
pujiang2018 May 15, 2024
db0c4e9
[Common] Move Martix into xft namespace. (#351)
Duyi-Wang May 15, 2024
a4f4b25
[Layer] Remove unused functions in Decoder layer (#353)
pujiang2018 May 15, 2024
45bcfa3
[Model] Fix compile error of embeddingForward in YaRNLlama (#358)
pujiang2018 May 15, 2024
a704873
[Common] Add sampling params into group seq. (#356)
Duyi-Wang May 15, 2024
987a874
[Util] Remove DecoderContext in computeSoftmax (#362)
pujiang2018 May 15, 2024
451ef21
[Common] Refactor sequence.h. (#363)
Duyi-Wang May 15, 2024
5e98e6d
[kernels] refactor flash attention for continuous batching (#361)
abenmao May 15, 2024
2b5e266
[models] Add attnMeta for continuous batching (#364)
abenmao May 15, 2024
e12ffa8
[Model] add interface for seq meta. (#366)
Duyi-Wang May 15, 2024
a4442f0
[Common] Modify resize() in DecoderContext to support (#367)
pujiang2018 May 15, 2024
fb52594
[Model] New CommonDecoder::forward impl. skeleton (#369)
pujiang2018 May 15, 2024
aa48f7e
[Common] New KVCacheMgr to support CB (#371)
pujiang2018 May 15, 2024
3f15904
[Sampling] Add repetition penalty for new seq type. (#373)
Duyi-Wang May 15, 2024
8c2e6b4
[Sampling] Add greedy search for cb path. (#376)
Duyi-Wang May 15, 2024
aac0167
[Model/Layer] New forward to support CB (CommonDecoder->DecoderBlock-…
pujiang2018 May 15, 2024
0e35c8f
[Model] Return seqIDs when set input. (#377)
Duyi-Wang May 15, 2024
f441906
[Framework] Code fix to make new path for CB work (#379)
pujiang2018 May 15, 2024
f9bfb49
[Layer] update mlp for CB. (#384)
marvin-Yu May 15, 2024
3f232c5
[Framework] Update set_input for cb. (#381)
Duyi-Wang May 15, 2024
6625b01
[Layers] Added RotaryEmbedding forward for cb mode & Fixed rope uts (…
abenmao May 11, 2024
f220fe0
[Layer] Cross attention impl. for CB (#382)
pujiang2018 May 11, 2024
eb417af
[Build] Fix namespace build issue. (#388)
Duyi-Wang May 11, 2024
b5bda0c
[Common] DecoderContext::resize bug fix (#387)
pujiang2018 May 11, 2024
35562c0
[Model][Layer] Correct output of the new forward (#389)
pujiang2018 May 11, 2024
e67e455
[Example] add cb_check example (#390)
pujiang2018 May 13, 2024
c576aff
[Bug] Fix incorrect buffer size calculation (#391)
pujiang2018 May 13, 2024
eff6a75
[Example] Fix continuous batching C++ example. (#392)
Duyi-Wang May 13, 2024
2b374ff
[Example] More check in C++ continuous batching example (#393)
pujiang2018 May 13, 2024
7e9d731
[Model] Check maxLen should be [input len, model max len]. (#394)
Duyi-Wang May 13, 2024
524bf32
[Layer] Better method to reinterpret KV cache (#397)
pujiang2018 May 13, 2024
3865654
[Interface] Add python api for continuous batching. (#398)
Duyi-Wang May 15, 2024
af0aae8
[Example] Reactivate the old path.
Duyi-Wang May 15, 2024
cef27bc
[Build] Fix build issue.
Duyi-Wang May 15, 2024
7 changes: 0 additions & 7 deletions ci_build
@@ -84,13 +84,6 @@ ut() {
     for file in ./*; do
         if [ -x "$file" ]; then

-            #Todo(marvin): delete me when the case is ready.
-            if [[ "$file" == "./rotary_embedding_test" ]]; then
-                Warning "Bypass the fail case of $file."
-                continue
-            fi
-            ##################################################
-
             if [[ "$file" != *_test ]]; then
                 Warning "$file is not ending with '_test', skip current loop."
                 continue
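With the bypass removed, `rotary_embedding_test` goes through the same filter as every other file in the loop. The filter that remains in this hunk can be sketched as below. This is a minimal illustration, not the actual `ci_build` code: `select_tests` is a hypothetical name, and `Warning` is redefined here as a stand-in for the helper that `ci_build` presumably defines elsewhere. What the real loop does with each selected file is not visible in the hunk, so the sketch only prints the path.

```shell
#!/usr/bin/env bash

# Stand-in for the Warning helper used in ci_build (assumed to be defined elsewhere there).
Warning() { echo "[WARN] $*" >&2; }

# Hypothetical helper: print every entry in a directory that passes the
# two checks visible in the hunk (executable bit set, name ends in "_test").
select_tests() {
    local dir="$1" file
    for file in "$dir"/*; do
        if [ -x "$file" ]; then
            if [[ "$file" != *_test ]]; then
                Warning "$file is not ending with '_test', skip current loop."
                continue
            fi
            # The real script's action on the file is not shown in the diff;
            # here we only report it.
            echo "$file"
        fi
    done
}
```

Calling `select_tests .` from the unit-test build directory would list the binaries the loop no longer skips, including `./rotary_embedding_test` if it is present and executable.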
16 changes: 13 additions & 3 deletions examples/cpp/CMakeLists.txt
@@ -14,26 +14,36 @@
 # ============================================================================
 cmake_minimum_required(VERSION 3.15.1)

-aux_source_directory(${CMAKE_CURRENT_SOURCE_DIR} EXAMPLE_SCR)
-
 include(${CMAKE_SOURCE_DIR}/cmake/cmdline.cmake)
 include(${CMAKE_SOURCE_DIR}/cmake/sentencepiece.cmake)

-add_executable(example ${EXAMPLE_SCR})
+set(EXAMPLE_SRCS "example.cpp" "vocab_opt.cpp" "vocab_qwen.cpp")
+set(CB_CHECK_SRCS "cb_check.cpp" "vocab_opt.cpp" "vocab_qwen.cpp")
+
+add_executable(example ${EXAMPLE_SRCS})
+add_executable(cb_check ${CB_CHECK_SRCS})

 target_include_directories(example PRIVATE ${CMAKE_SOURCE_DIR}/3rdparty/cmdline)
 target_include_directories(example PRIVATE ${CMAKE_SOURCE_DIR}/3rdparty/sentencepiece/include)
+target_include_directories(cb_check PRIVATE ${CMAKE_SOURCE_DIR}/3rdparty/cmdline)
+target_include_directories(cb_check PRIVATE ${CMAKE_SOURCE_DIR}/3rdparty/sentencepiece/include)

 target_link_directories(example PRIVATE ${CMAKE_SOURCE_DIR}/3rdparty/sentencepiece/${CMAKE_INSTALL_LIBDIR})
+target_link_directories(cb_check PRIVATE ${CMAKE_SOURCE_DIR}/3rdparty/sentencepiece/${CMAKE_INSTALL_LIBDIR})

 if(BUILD_WITH_SHARED_LIBS)
     target_link_libraries(example PRIVATE xfastertransformer)
+    target_link_libraries(cb_check PRIVATE xfastertransformer)
 else()
     target_link_libraries(example PRIVATE xfastertransformer_static)
+    target_link_libraries(cb_check PRIVATE xfastertransformer_static)
 endif()
 target_link_libraries(example PRIVATE sentencepiece -lstdc++fs)
+target_link_libraries(cb_check PRIVATE sentencepiece -lstdc++fs)
 if(WITH_GPU)
     target_link_libraries(example PRIVATE -fsycl -fsycl-device-code-split=per_kernel -lOpenCL)
+    target_link_libraries(cb_check PRIVATE -fsycl -fsycl-device-code-split=per_kernel -lOpenCL)
 endif()

 add_dependencies(example cmdline sentencepiece_lib)
+add_dependencies(cb_check cmdline sentencepiece_lib)