Is your feature request related to a problem? Please describe.
The NVCF autoscaler service already exists in the repository, where it is described as the service that monitors invocation and utilization patterns and horizontally scales NVCF functions. However, it is not yet integrated as a first-class install option in the self-managed stack.
Self-managed users need the same autoscaling capability available in NVCF production deployments, with clear deployment, configuration, and validation steps.
Describe the solution you'd like
Add function autoscaler support to the self-managed deployment path.
The implementation should cover:
- Helm chart or Helmfile integration.
- Required Cassandra schema and migrations.
- VictoriaMetrics backend configuration.
- NVCF API endpoint and authentication configuration.
- Scale-up, scale-down, and scale-to-zero validation.
- Documentation for the autoscaler algorithm.
Describe alternatives you've considered
Users can manually scale deployments or rely on Kubernetes in-cluster autoscaling. However, Kubernetes autoscaling only scales within a single cluster, whereas NVCF supports function-aware autoscaling across multiple GPU clusters.
Additional context
Suggested acceptance criteria:
- Autoscaler can be enabled in a self-managed deployment through documented values.
- Autoscaler reaches readiness with the default self-managed dependencies.
- A function can scale up under load and scale down when idle.
- Autoscaler logs, metrics, and health endpoints are documented.
- Failure modes for missing metrics, Cassandra, or API auth are documented.
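To make the scale-up, scale-down, and scale-to-zero criteria concrete, here is a minimal sketch of how a validation run could be evaluated. It is only an illustration: the `Sample` type, the phase labels (`baseline`, `load`, `idle`), and the `validate_trace` function are hypothetical names, not part of NVCF or the autoscaler. How instance counts are collected (NVCF API, metrics backend, or something else) is deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sample:
    """One observation taken during a validation run (hypothetical shape)."""
    phase: str      # assumed labels: "baseline", "load", or "idle"
    instances: int  # observed instance count for the function


def validate_trace(samples: List[Sample]) -> Dict[str, bool]:
    """Check an ordered trace of observations against the scaling criteria."""
    by_phase: Dict[str, List[int]] = {"baseline": [], "load": [], "idle": []}
    for sample in samples:
        if sample.phase not in by_phase:
            raise ValueError(f"unknown phase: {sample.phase!r}")
        by_phase[sample.phase].append(sample.instances)

    missing = [name for name, values in by_phase.items() if not values]
    if missing:
        raise ValueError(f"trace has no samples for: {', '.join(missing)}")

    baseline = by_phase["baseline"][-1]  # last count before load starts
    load_peak = max(by_phase["load"])    # highest count seen under load
    idle_final = by_phase["idle"][-1]    # last count after load stops

    return {
        "scale_up": load_peak > baseline,
        "scale_down": idle_final < load_peak,
        "scale_to_zero": idle_final == 0,
    }


# Example trace: one instance at rest, four under load, zero once idle.
example_trace = [
    Sample("baseline", 1),
    Sample("load", 1),
    Sample("load", 3),
    Sample("load", 4),
    Sample("idle", 4),
    Sample("idle", 2),
    Sample("idle", 0),
]
example_result = validate_trace(example_trace)
```

A check like this could back the acceptance criteria in CI or in a documented manual test. Thresholds and timing windows would depend on the autoscaler configuration and are not assumed here.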