[Feature]: AutoDeploy: Integrate with CppMambaHybridCacheManager #14320

@govind-ramnarayan

Description

🚀 The feature, motivation and pitch

Currently, disaggregated serving with Mamba models is supported only through the CppMambaHybridCacheManager. To support disaggregated serving of these models in AutoDeploy, AutoDeploy needs to integrate with this cache manager.

Alternatives

No response

Additional context

No response

Metadata

Labels

  • AutoDeploy<NV>: AutoDeploy Backend
  • Disaggregated serving<NV>: Deploying with separated, distributed components (params, kv-cache, compute). Arch & perf.
  • feature request: New feature or request. This includes new model, dtype, functionality support

Type

No type

Projects

Status

Backlog

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests