Update: update lmcache version to v0.3.3 #2
Conversation
…#1037) * add LMCache ROCm installation procedure to the doc Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> * clean up Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> --------- Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
delete obsoleted dockerfile Co-authored-by: Samuel Shen <slshen@uchicago.edu> Co-authored-by: Martin Hickey <martin.hickey@ie.ibm.com> Co-authored-by: Shaoting <shaotingf@uchicago.edu>
Remove unnecessary build package dependency Library 'xxhash' is not required for package build. Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com> Co-authored-by: Shaoting <shaotingf@uchicago.edu>
Speedup Signed-off-by: Shaoting <shaotingf@uchicago.edu>
* checkpoint Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * checkpoint Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix import Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * update Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix unit tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix unit tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix unit test Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix unit test Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> --------- Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
fix: wrong key check in kv_controller.py Signed-off-by: wxsms <wxsms@foxmail.com>
* Speedup Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Fix dependency for unit test Signed-off-by: Shaoting <shaotingf@uchicago.edu> --------- Signed-off-by: Shaoting <shaotingf@uchicago.edu>
* checkpoint Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * checkpoint2 Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * checkpoint 4 Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix nixl config Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix top-level code path Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * have everything Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * update config Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix init Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix several bugs Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * runnable versio Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * 1p1d runnable Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix format Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix format Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add 2p2d example Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * minor fix Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add empty token check in cache engine (LMCache#982) Signed-off-by: lrq619 <lrq619@outlook.com> * [PD] Support tp for pd (LMCache#1039) * Add ports for tp Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Add memory obj Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Add memory obj line Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Refactor to_int_list Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Modify proxy Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Modify gitignore Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * delete log Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Clean up Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Update README Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Modify xpyd proxy Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * clean up log Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * CUDA device Signed-off-by: Shaoting Feng 
<shaotingf@uchicago.edu> * Refactor tp_rank Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Refactor receiver tp_rank Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * format Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> --------- Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * [PD] Refactor tp_rank inside sender (LMCache#1044) Refactor tp_rank inside sender Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * format Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Fix rpc_port Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Fix disagg_spec Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Fix rpc_rank Signed-off-by: Shaoting <shaotingf@uchicago.edu> --------- Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> Signed-off-by: lrq619 <lrq619@outlook.com> Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> Signed-off-by: Wei Cai <caiweivi@gmail.com> Signed-off-by: Shaoting <shaotingf@uchicago.edu> Co-authored-by: LaiRuiqi <58351056+lrq619@users.noreply.github.com> Co-authored-by: Shaoting <shaotingf@uchicago.edu>
Add mistralai/Mistral-Large-2407 information Signed-off-by: Jasmond Loh <Jasmond.Loh@hotmail.com> Co-authored-by: Shaoting <shaotingf@uchicago.edu>
hotfix Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
* Test Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Enable cancel/skip in progress builds Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Activate correct .venv Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Add gpu selections for each test Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Refactor scripts into CI directory Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Modify gpu memory utilization Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Cleanup useless scripts Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Add pipelines for cleanup and end-to-end tests Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Add maxfail=1 Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Fix typo: yaml to yml Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Lazy import NIXL Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Add end to end log upload Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Differentiate cpu and disk log Signed-off-by: Shaoting <shaotingf@uchicago.edu> * increase timeout for docker logs Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Add max model len to ensure successful startup Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Change to a light weight model to increase concurrency Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Add timeout Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Fix log check logic Signed-off-by: Shaoting <shaotingf@uchicago.edu> --------- Signed-off-by: Shaoting <shaotingf@uchicago.edu>
Signed-off-by: idellzheng <idellzheng@tencent.com>
…MCache#1083) docs: fix config file name consistency in KV cache sharing example Signed-off-by: Kay Yan <kay.yan@daocloud.io>
…ion from testcase, documentation and examples (LMCache#1085) Remove deprecated pipelined_backend configuration option from documentation, examples and testcase Signed-off-by: Kay Yan <kay.yan@daocloud.io> Co-authored-by: Shaoting <shaotingf@uchicago.edu>
* Fix integration test exit Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Separate cleanup script Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Add stpes dependency Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Add pre-exit hook Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Delete hook Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Use CID to clean up Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Remove delete buildkite key Signed-off-by: Shaoting <shaotingf@uchicago.edu> --------- Signed-off-by: Shaoting <shaotingf@uchicago.edu> Co-authored-by: Samuel Shen <102553648+sammshen@users.noreply.github.com>
* initial commit Signed-off-by: Samuel Shen <slshen@uchicago.edu> * fix serdes Signed-off-by: Samuel Shen <slshen@uchicago.edu> * fix serdes Signed-off-by: Samuel Shen <slshen@uchicago.edu> * fix code quality Signed-off-by: Samuel Shen <slshen@uchicago.edu> * fix code quality Signed-off-by: Samuel Shen <slshen@uchicago.edu> --------- Signed-off-by: Samuel Shen <slshen@uchicago.edu> Co-authored-by: Samuel Shen <slshen@uchicago.edu>
* Enable loopup server on all workers Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Remove incorrect unpin Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Add docker cleanup Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Refactor Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Fix client for loop Signed-off-by: Shaoting <shaotingf@uchicago.edu> --------- Signed-off-by: Shaoting <shaotingf@uchicago.edu>
Bumps [ossf/scorecard-action](https://github.com/ossf/scorecard-action) from 2.4.1 to 2.4.2. - [Release notes](https://github.com/ossf/scorecard-action/releases) - [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md) - [Commits](ossf/scorecard-action@f49aabe...05b42c6) --- updated-dependencies: - dependency-name: ossf/scorecard-action dependency-version: 2.4.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Martin Hickey <martin.hickey@ie.ibm.com>
Bumps [step-security/harden-runner](https://github.com/step-security/harden-runner) from 2.12.1 to 2.12.2. - [Release notes](https://github.com/step-security/harden-runner/releases) - [Commits](step-security/harden-runner@002fdce...6c439dc) --- updated-dependencies: - dependency-name: step-security/harden-runner dependency-version: 2.12.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Martin Hickey <martin.hickey@ie.ibm.com>
Refactor docs workflow Refactor docs workflow for consistency and security to be aligned with the other workflows in the project. Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com> Co-authored-by: siddhantray <siddhant.r98@gmail.com>
…1097) * Tighten harden runner security Tighten the harden runner security by setting allowed endpoints. Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com> * Enable docker and containers Workflows use containers and therefore need access. Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com> * Update runners in other publish jobs Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com> --------- Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
…che#1103) * Enable container access for score card workflow Score card workflow requires docker access. Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com> * Enable container access for publish to PyPi and TestPyPI Access required for publishing. Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com> --------- Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
…ack to blackhole (LMCache#764) * Introduce a remote probe thread to monitor remote and support fallback to blackhole Signed-off-by: baoloongmao <baoloongmao@tencent.com> * Revert the error handling for each request Signed-off-by: baoloongmao <baoloongmao@tencent.com> * Fix comment Signed-off-by: baoloongmao <baoloongmao@tencent.com> * Thread safe improvement Signed-off-by: baoloongmao <baoloongmao@tencent.com> * Improve output Signed-off-by: baoloongmao <baoloongmao@tencent.com> * Specify the concrete type of backend Signed-off-by: baoloongmao <baoloongmao@tencent.com> * fix circle import Signed-off-by: baoloongmao <baoloongmao@tencent.com> * Remove Chinese comments Signed-off-by: baoloongmao <baoloongmao@tencent.com> --------- Signed-off-by: baoloongmao <baoloongmao@tencent.com>
* added unit test for storage backend Signed-off-by: Wei Cai <caiweivi@gmail.com> [Core] Use a faster hash function (LMCache#1020) * perf: replace sha256 by xxhash Signed-off-by: Zhou Fang <fang.github@gmail.com> * build: add xxhash to requirements Signed-off-by: Zhou Fang <fang.github@gmail.com> * style: import Signed-off-by: Zhou Fang <fang.github@gmail.com> * build: add xxhash to common requirements Signed-off-by: Zhou Fang <fang.github@gmail.com> --------- Signed-off-by: Zhou Fang <fang.github@gmail.com> [DOC] [ROCm]: Update LMCache installation procedure for ROCm (LMCache#1037) * add LMCache ROCm installation procedure to the doc Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> * clean up Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> --------- Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com> [CD]: Clean up unnecessary manylinux base file (LMCache#1043) delete obsoleted dockerfile Co-authored-by: Samuel Shen <slshen@uchicago.edu> Co-authored-by: Martin Hickey <martin.hickey@ie.ibm.com> Co-authored-by: Shaoting <shaotingf@uchicago.edu> [Refactor]: Remove unnecessary build package dependency (LMCache#1059) Remove unnecessary build package dependency Library 'xxhash' is not required for package build. Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com> Co-authored-by: Shaoting <shaotingf@uchicago.edu> Signed-off-by: Wei Cai <caiweivi@gmail.com> * Change pin count Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Format Signed-off-by: Shaoting <shaotingf@uchicago.edu> * Update test_local_disk_backend.py * Update test_local_disk_backend.py * Update test_connector.py --------- Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com> Signed-off-by: Wei Cai <caiweivi@gmail.com> Signed-off-by: Shaoting <shaotingf@uchicago.edu> Co-authored-by: Shaoting <shaotingf@uchicago.edu> Co-authored-by: Samuel Shen <102553648+sammshen@users.noreply.github.com>
[Fix] Fix the retrieved token set for P2P mode Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
* conditional in 1p1d example Signed-off-by: Ubuntu <ubuntu@lccn02.novalocal> * fix other pd examples Signed-off-by: Ubuntu <ubuntu@lccn02.novalocal> --------- Signed-off-by: Ubuntu <ubuntu@lccn02.novalocal> Co-authored-by: Ubuntu <ubuntu@lccn02.novalocal>
* [Add] fix not implemented error in gds Signed-off-by: ApostaC <yihua98@uchicago.edu> * [fix] linter erros Signed-off-by: ApostaC <yihua98@uchicago.edu> --------- Signed-off-by: ApostaC <yihua98@uchicago.edu>
LMCache#1026) [Bugfix] Metadata file path missing suffix in GdsBackend and WekaGdsBackend Signed-off-by: An <pqqqan@foxmail.com>
* make pd and prefix compat Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix format Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * minor fix Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * minor fix Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * Remove deprecated assert for nixl and offloading Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Fix typo of init_cpu_memory_allocator Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Fix meta.fmt instead of fmt Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Mark disk and remote ssd as not support Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * fix bugs Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix bugs Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * Prevent pre-exit with cpu offloading Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Remove mis-initialize of cpu backend Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * Fix local cpu backend test Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> * The local_cpu backend is always created Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> --------- Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu> Co-authored-by: Shaoting Feng <shaotingf@uchicago.edu>
Pull Request Overview
This PR updates LMCache to version v0.3.3 and addresses version compatibility issues with storage backends. The update includes important changes to licensing, dependency management, and core functionality to ensure consistent caching behavior across distributed systems.
- Updates LMCache from an unspecified version to v0.3.3 with hash consistency improvements
- Standardizes licensing from full Apache-2.0 headers to SPDX license identifiers
- Removes version pinning for torch and related dependencies to improve vLLM compatibility
Reviewed Changes
Copilot reviewed 297 out of 329 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| Multiple test files | Remove deprecated pipelined_backend configuration parameter |
| License headers | Replace verbose Apache-2.0 headers with SPDX identifiers across all files |
| requirements/*.txt | Unpin torch versions and add version constraints for pytest compatibility |
| lmcache/v1/token_database.py | Major refactor to support vLLM hash functions and consistent cross-process caching |
| Storage backend files | Update interfaces and add new functionality for nixl, offload server, and monitoring |
| Multiple connector files | Add ping support and improve error handling for remote connections |
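The `lmcache/v1/token_database.py` row mentions consistent cross-process caching. A minimal sketch of why a stable hash matters, assuming a prefix-chained chunk hash: Python's built-in `hash()` is salted per process for `str` and `bytes`, so keys derived from it would not match across workers, whereas a `hashlib` digest does. The function name `chunk_prefix_hashes`, the chunk size, and the choice of SHA-256 are illustrative assumptions, not LMCache's actual implementation.

```python
import hashlib
import struct
from typing import List, Sequence


def chunk_prefix_hashes(token_ids: Sequence[int], chunk_size: int = 4) -> List[str]:
    """Return one hex digest per full chunk, each chained to the previous digest.

    Hypothetical helper for illustration only; trailing tokens that do not
    fill a whole chunk are ignored.
    """
    hashes: List[str] = []
    prev = b""  # digest of the preceding chunk (empty for the first chunk)
    n_full = len(token_ids) - len(token_ids) % chunk_size
    for start in range(0, n_full, chunk_size):
        chunk = token_ids[start:start + chunk_size]
        # Fixed-width little-endian encoding keeps the bytes identical
        # on every process and machine.
        payload = prev + struct.pack(f"<{len(chunk)}q", *chunk)
        digest = hashlib.sha256(payload).digest()
        hashes.append(digest.hex())
        prev = digest
    return hashes
```

Because each digest covers the previous one, two requests that share a token prefix produce the same leading keys regardless of which process computed them.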
| - change urls at `lmcache.Value.lmcache.config` | ||
| - `remote_url: mooncakestore://[master service url]/` | ||
| - `remote_url: mooncakestore://[master service url]:[master service port]/` | ||
| - `service port` must be included |
If this is a required value, it might be better to mark it explicitly as "required" in the Helm chart, or, if it has a default, to leave it out of the description.
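This describes how to set remote_url inside the config, so it is not a value in the Helm chart.
There is no way to give it a default or mark it as required there, so I only documented it in the README.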
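Since the port requirement can only be documented rather than enforced by the chart, a small check on the config side is one option. The sketch below validates a `remote_url` of the form `mooncakestore://[master service url]:[master service port]/` with the standard library. The function name `check_mooncake_remote_url` and the example host and port are hypothetical; nothing here is part of LMCache or Mooncake.

```python
from typing import Tuple
from urllib.parse import urlsplit


def check_mooncake_remote_url(remote_url: str) -> Tuple[str, int]:
    """Return (host, port) or raise ValueError if the URL is incomplete.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(remote_url)
    if parts.scheme != "mooncakestore":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("remote_url must include the master service url")
    if parts.port is None:
        # The README note above: the service port must be included.
        raise ValueError("remote_url must include the master service port")
    return parts.hostname, parts.port


# Example with a made-up service name and port:
host, port = check_mooncake_remote_url("mooncakestore://mooncake-master:50051/")
```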
| @@ -0,0 +1,32 @@ | |||
| # This is a YAML-formatted file. | |||
- The helm/values directory holds all the example values you added, right?
- In moreh-dev/dynamo, all the Helm examples have now been moved to https://github.com/moreh-dev/dynamo/tree/main/examples/helm
- Similarly, if there is a separate examples directory, it might be good to move them under it
insukim1994
left a comment
Sorry for the late review. I left comments on a few things I was curious about.
| ): | ||
| config = infinistore.ClientConfig( | ||
| host_addr=host, | ||
| host_addr=socket.gethostbyname(host), |
Was there a reason host could not be used as-is? (For example, was there an issue such as unnecessary characters being appended?)
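Sorry, I only just saw this review.
When a svc domain address is passed to infinistore, it tries to use it as-is without converting it to an IP, which causes a problem,
so it has to be resolved outside and passed in as host_addr.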
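The resolution step described above can be sketched in isolation. This is a minimal illustration assuming the caller wants an address string for `host_addr`; the helper name `resolve_host_addr` is hypothetical, and `socket.gethostbyname` only returns IPv4 addresses, so an IPv6-only service name would need `socket.getaddrinfo` instead.

```python
import ipaddress
import socket


def resolve_host_addr(host: str) -> str:
    """Return an IP address string for host, resolving DNS names if needed.

    Hypothetical helper for illustration only.
    """
    try:
        # Already an IP literal: pass it through without a DNS lookup.
        ipaddress.ip_address(host)
        return host
    except ValueError:
        # A name such as a Kubernetes service domain: resolve it to IPv4.
        return socket.gethostbyname(host)
```

Resolving once at client construction means a later change in the service's IP is not picked up until the client is recreated, which may or may not matter for a given deployment.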
| nodeAffinity: | ||
| requiredDuringSchedulingIgnoredDuringExecution: | ||
| nodeSelectorTerms: | ||
| {{ .Values.mooncake.master.nodeSelectorTerms | toYaml | indent 14 }} |
Instead of
`{{ .Values.mooncake.master.nodeSelectorTerms | toYaml | indent 14 }}`
it might be good to change it to `{{- .Values.mooncake.master.nodeSelectorTerms | toYaml | indent 14 }}` and align the indentation as follows:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
{{- .Values.mooncake.master.nodeSelectorTerms | toYaml | indent 14 }}
helm/mooncake/values.yaml
Outdated
| # value: "some-value" | ||
| # Resource allocation (CPU, memory, GPU, etc.) | ||
| resources: {} | ||
| # Service port settings |
Judging by the comment text, this describes what comes below, right? It looks like extra indentation was applied, so it might be good to align it.
helm/mooncake/values.yaml
Outdated
| # value: "some-value" | ||
| # Resource allocation (CPU, memory, GPU, etc.) | ||
| resources: {} | ||
| # Service port settings |
Same here: judging by the comment text, this describes what comes below, right? It looks like extra indentation was applied, so it might be good to align it.
helm-docs can generate a README.md file for each of the charts below, so adding those would be nice too:
Update LMCache to version v0.3.3
Things to look at
https://moreh.slack.com/lists/T01BLNAK88Z/F099BH8PEKU?record_id=Rec09C08W1SS1