[Unity] Paged KV Cache as LM Support #15910

Merged
tqchen merged 1 commit into apache:unity from MasterJH5574:unity-dev/2023-10-10-paged-kv-cache
Oct 15, 2023

Conversation

@MasterJH5574
Contributor

This PR adds the PagedKVCache object to lm_support.cc to manage KV cache values in batched settings.

One test file is included. Note that it does not test the attention function/kernel; that part will be uploaded and tested separately.
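
For readers unfamiliar with the idea, here is a minimal sketch of what "paged" KV cache management for a batch of sequences can look like. It is an illustration under stated assumptions, not the PagedKVCache implemented in this PR: the class name `ToyPagedKVCache`, its methods, and the use of plain Python lists in place of device tensors are all hypothetical.

```python
class ToyPagedKVCache:
    """Hypothetical toy model of a paged KV cache (NOT the TVM runtime object).

    Storage is split into fixed-size pages. Each sequence owns a list of
    page ids (its page table), so sequences of different lengths can share
    one pool without reserving contiguous memory per sequence.
    """

    def __init__(self, page_size, num_pages):
        self.page_size = page_size
        # One slot per token position inside each page.
        self.storage = [[None] * page_size for _ in range(num_pages)]
        # Free-page stack; reversed so that pop() hands out page 0 first.
        self.free_pages = list(range(num_pages - 1, -1, -1))
        self.page_table = {}  # seq_id -> list of page ids, in token order
        self.seq_len = {}     # seq_id -> number of cached tokens

    def add_sequence(self, seq_id):
        if seq_id in self.page_table:
            raise ValueError(f"sequence {seq_id} already exists")
        self.page_table[seq_id] = []
        self.seq_len[seq_id] = 0

    def append(self, seq_id, kv):
        """Cache the K/V entry of one new token for the given sequence."""
        n = self.seq_len[seq_id]
        if n % self.page_size == 0:
            # The last page is full (or there is none yet): take a new page.
            if not self.free_pages:
                raise MemoryError("no free pages left")
            self.page_table[seq_id].append(self.free_pages.pop())
        page = self.page_table[seq_id][n // self.page_size]
        self.storage[page][n % self.page_size] = kv
        self.seq_len[seq_id] = n + 1

    def gather(self, seq_id):
        """Return the cached entries of a sequence in token order."""
        pages = self.page_table[seq_id]
        return [
            self.storage[pages[i // self.page_size]][i % self.page_size]
            for i in range(self.seq_len[seq_id])
        ]

    def remove_sequence(self, seq_id):
        """Drop a finished sequence and return its pages to the pool."""
        self.free_pages.extend(self.page_table.pop(seq_id))
        del self.seq_len[seq_id]
```

In this sketch, appending to two sequences in interleaved order places their entries in different pages, and removing a sequence makes its pages immediately reusable by the others. A real implementation would additionally keep the values in device memory and expose the page table to the attention kernel.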

Comment thread src/runtime/relax_vm/lm_support.cc Outdated
@MasterJH5574 MasterJH5574 marked this pull request as draft October 11, 2023 15:15
@MasterJH5574
Contributor Author

Marking this as a draft while I update the docstrings and APIs. Sorry that it is not mature enough yet.

@MasterJH5574 MasterJH5574 force-pushed the unity-dev/2023-10-10-paged-kv-cache branch from 7671ddb to 4a5b263 Compare October 12, 2023 00:27
@MasterJH5574 MasterJH5574 marked this pull request as ready for review October 12, 2023 00:28
@MasterJH5574
Contributor Author

Documented the paged KV cache and updated some of the interfaces. It's ready for review now.

@MasterJH5574 MasterJH5574 force-pushed the unity-dev/2023-10-10-paged-kv-cache branch from 4a5b263 to 36da7a4 Compare October 12, 2023 00:30
Comment thread src/runtime/relax_vm/paged_kv_cache.cc Outdated
Comment thread src/runtime/relax_vm/paged_kv_cache.cc Outdated
Comment thread src/runtime/relax_vm/paged_kv_cache.cc
Comment thread src/runtime/relax_vm/paged_kv_cache.cc Outdated
Comment thread src/runtime/relax_vm/paged_kv_cache.cc Outdated
Comment thread src/runtime/relax_vm/paged_kv_cache.cc Outdated
Comment thread src/runtime/relax_vm/paged_kv_cache.cc Outdated
@MasterJH5574 MasterJH5574 force-pushed the unity-dev/2023-10-10-paged-kv-cache branch from 36da7a4 to 8ef57ba Compare October 12, 2023 18:29
Comment thread tests/python/relax/test_runtime_builtin_paged_attention_kv_cache.py Outdated
@MasterJH5574 MasterJH5574 force-pushed the unity-dev/2023-10-10-paged-kv-cache branch 2 times, most recently from 4c80ff8 to 8ce79b9 Compare October 13, 2023 18:46
This PR introduces the PagedKVCache object to Relax runtime
for the KV cache value management in batching settings.

One test file is included. Note that this file does not contain
the test of attention function/kernel. That part will be uploaded
and tested separately.
@MasterJH5574 MasterJH5574 force-pushed the unity-dev/2023-10-10-paged-kv-cache branch from 8ce79b9 to 6a5c618 Compare October 14, 2023 21:49
@tqchen tqchen merged commit c606fcf into apache:unity Oct 15, 2023

3 participants