[Core][Optimization] change python dict to pytorch tensor for blocks to swap #4659

youkaichao · 2024-05-07T20:00:08Z

Follow-up to #4607.

csrc/cache_kernels.cu

cadedaniel · 2024-05-07T20:33:28Z

vllm/core/block_manager_v1.py

        assert (num_lookahead_slots == 0
                ), "BlockSpaceManagerV1 does not support lookahead allocation"

        # CPU block -> GPU block.
+        # dict is efficient in lookup `if cpu_block in mapping`


Would it be a better approach to do `list(mapping.items())` in the worker layer?

IMO the mapping semantics are better represented by `dict[int, int]`, which is why I ask.

@cadedaniel This is basically for performance. List append/extend is faster than Dict insert/update.
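To make the trade-off in this thread concrete, here is a minimal sketch of the reviewer's suggestion: keep a dict while planning (for the `if cpu_block in mapping` lookup) and convert it to a list of `(src, dst)` pairs at the boundary. The helper names `plan_swap_in` and `flatten_pairs` and the block numbers are hypothetical and are not taken from vLLM's code; the sketch uses only the standard library.

```python
from typing import Dict, List, Tuple


def plan_swap_in(
    cpu_blocks: List[int], free_gpu_blocks: List[int]
) -> Tuple[Dict[int, int], List[Tuple[int, int]]]:
    """Assign one GPU block to each distinct CPU block.

    Hypothetical helper for illustration only; not vLLM's block manager.
    """
    mapping: Dict[int, int] = {}  # dict gives a cheap `in` lookup
    for cpu_block in cpu_blocks:
        if cpu_block not in mapping:
            # Take the next free GPU block for a CPU block seen for the first time.
            mapping[cpu_block] = free_gpu_blocks.pop(0)
    # Convert to (src, dst) pairs once, at the boundary, as the review suggests.
    pairs = list(mapping.items())
    return mapping, pairs


def flatten_pairs(pairs: List[Tuple[int, int]]) -> List[int]:
    """Lay the pairs out as [src0, dst0, src1, dst1, ...]."""
    return [block for pair in pairs for block in pair]


mapping, pairs = plan_swap_in([7, 3, 7, 9], [100, 101, 102])
print(pairs)                 # [(7, 100), (3, 101), (9, 102)]
print(flatten_pairs(pairs))  # [7, 100, 3, 101, 9, 102]
```

A list of pairs like this can be turned into an N x 2 integer tensor with `torch.tensor(pairs, dtype=torch.int64)`. Whether the conversion belongs in the scheduler or in the worker layer is exactly the design question raised above, and the sketch does not settle it.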

youkaichao added 15 commits May 7, 2024 12:27

to tensor before broadcast

1240741

update cache_swap

c4900b9

update cache engine

826ce2f

update ExecuteModelRequest

2e5a2a6

update blocks_to_swap_in in schedule

051c4e9

update blocks_to_swap_out in schedule

822cc47

update attention interface

f7f0eee

update tests/worker/test_swap.py

6fbbe06

update tests/kernels/test_cache.py

7364faa

update tests/core/test_scheduler.py

688f237

update cpu kernel signature

b671a08

update cpp code

55f6686

update block manager

5cb2502

fix tests

34aac0a

update tests

3641db6

youkaichao mentioned this pull request May 7, 2024

[Core][Optimization] change python dict to pytorch tensor #4607

Merged

2 tasks

cadedaniel reviewed May 7, 2024

use const

25476e2

youkaichao requested a review from cadedaniel May 7, 2024 21:37

youkaichao mentioned this pull request May 7, 2024

[Core][Distributed] support both cpu and device tensor in broadcast tensor dict #4660

Merged

update tests

300b1d5

cadedaniel approved these changes May 7, 2024

update tests

04c5bac

youkaichao mentioned this pull request May 8, 2024

[RFC]: Inline Golden (Expected) Tests #4663

Open

youkaichao added 4 commits May 7, 2024 18:21

update tests

7099156

Merge branch 'main' into blocks_to_swap_out

ef72edd

use cpu tensor for blocks to swap

72dd8db

fix test

cd94e09

simon-mo merged commit 20cfcde into vllm-project:main May 8, 2024
53 of 55 checks passed

youkaichao deleted the blocks_to_swap_out branch May 8, 2024 19:14

z103cb pushed a commit to z103cb/opendatahub_vllm that referenced this pull request May 9, 2024

[Core][Optimization] change python dict to pytorch tensor for blocks …

8f229ec

…to swap (vllm-project#4659)

jikunshang mentioned this pull request May 10, 2024

[Core]fix type annotation for swap_blocks #4726

Merged

robertgshaw2-neuralmagic pushed a commit to neuralmagic/nm-vllm that referenced this pull request May 19, 2024

[Core][Optimization] change python dict to pytorch tensor for blocks …

8afd8f7

…to swap (vllm-project#4659)

dtrifiro pushed a commit to dtrifiro/vllm that referenced this pull request May 21, 2024

[Core][Optimization] change python dict to pytorch tensor for blocks …

2563537

…to swap (vllm-project#4659)
