[Feature]: AutoDeploy: Integrate with CppMambaHybridCacheManager #14320
Open
Labels
AutoDeploy<NV>: AutoDeploy Backend
Disaggregated serving<NV>: Deploying with separated, distributed components (params, kv-cache, compute). Arch & perf.
feature request: New feature or request. This includes new model, dtype, functionality support
Projects
Status
Backlog
🚀 The feature, motivation and pitch
Currently, disaggregated serving with Mamba models is supported only through the CppMambaHybridCacheManager. To support disaggregated serving of these models in AutoDeploy, AutoDeploy needs to integrate with this cache manager.
Alternatives
No response
Additional context
No response