Public Roadmap
Hi NVCF developers and users!
This roadmap summarizes the current NVCF direction for Q2 and Q3 2026. It is directional, not a commitment to ship every item listed. Scope and sequencing will change as we learn from contributors and users.
NVCF tracks public roadmap work with GitHub issues labeled `roadmap` and `no-stale`.
NVCF Platform
These features expand what is possible with NVCF: broader infrastructure flexibility, new platform capabilities, and more deployment options.
Q2 priorities (March - June 2026):
- Multi-cluster: Register multiple compute clusters with a single NVCF control plane.
- Task support: Run long-lived or asynchronous jobs on NVCF with lifecycle tracking, status visibility, and operational controls.
- Autoscaler: Automatic scaling for self-hosted deployments based on workload demand.
- Guides and documentation: Published guides for Model Express, Dynamo operator usage, and related workflows.
- Model Express guide: A published guide to using Model Express.
Q3 priorities (July - September 2026):
- Multi-account: Multi-account management so operators can create, select, and manage separate accounts for tenants, teams, or environments.
- Multi-region: Support deployment of multiple NVCF control planes.
- Self-hosted UI: A lightweight management UI through the NGC Lite experience.
Looking for contributions:
- Pluggable infrastructure: Swap in your own infrastructure components, such as Cassandra, DNS, and vault-compatible secret storage, instead of relying on defaults.
- Billing integrations: Usage metering, invoice generation, and billing provider integrations for self-hosted operators.
Invocation, Routing, and LLM Gateway
A purpose-built gateway layer with routing, scaling, and rate limiting designed for LLM workloads.
Q2 priorities (March - June 2026):
- LLM Gateway: A dedicated gateway for LLM traffic with request-level routing control, including KV-cache-aware routing, autoscaling, and rate limiting.
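To make the KV-cache-aware routing idea concrete, here is a minimal Python sketch of one common approach: send a request to the replica that already holds the longest matching prompt prefix, and break ties by load. It does not describe the planned NVCF gateway. `Replica`, `cached_prefix_len`, `pick_replica`, `record`, and the block size of 4 tokens are all hypothetical names and values chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical model replica; cached_prefixes stands in for its KV cache."""
    name: str
    cached_prefixes: set[tuple[int, ...]] = field(default_factory=set)
    in_flight: int = 0


def cached_prefix_len(replica: Replica, tokens: list[int], block: int = 4) -> int:
    """Length of the longest block-aligned prefix of tokens cached on the replica."""
    best = 0
    for end in range(block, len(tokens) + 1, block):
        if tuple(tokens[:end]) in replica.cached_prefixes:
            best = end
        else:
            break  # a longer prefix cannot be cached if a shorter one is missing
    return best


def pick_replica(replicas: list[Replica], tokens: list[int]) -> Replica:
    """Prefer the longest cached prefix; break ties by fewest in-flight requests."""
    return max(replicas, key=lambda r: (cached_prefix_len(r, tokens), -r.in_flight))


def record(replica: Replica, tokens: list[int], block: int = 4) -> None:
    """Mark every block-aligned prefix of tokens as cached on the replica."""
    for end in range(block, len(tokens) + 1, block):
        replica.cached_prefixes.add(tuple(tokens[:end]))
```

In this sketch a follow-up request that shares a prompt prefix with an earlier one lands on the same replica, even if that replica is slightly busier. That is the trade-off a cache-aware router makes against plain load balancing.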
Q3 priorities (July - September 2026):
- Q3 priorities are still being developed.
Looking for contributions:
- Authentication and authorization: Identity and access control integrations, such as Keycloak and Envoy, for securing endpoints and managing permissions.
- Vanity endpoint routing: Custom endpoint URLs and endpoint-pattern mapping for branded and multi-tenant deployments.
Storage, Caching, and SIS
High-performance storage and checkpoint support to keep up with workloads at scale.
Q2 priorities (March - June 2026):
- High-performance storage: Multi-node shared storage for workloads that need high-throughput or shared data access, including multi-cluster shared filesystem support.
- Checkpoint and restore: Exploratory work to support checkpoint and restore for NVCF functions to reduce GPU cold-start latency and improve recovery performance.
Q3 priorities (July - September 2026):
- Q3 priorities are still being developed.
Scheduling and Resource Flexibility
Making GPU resources more flexible so multiple workloads can share a single device and scheduling can scale with demand.
Q2 priorities (March - June 2026):
- Scheduler integration: KAI Scheduler support.
Q3 priorities (July - September 2026):
- Fractional GPU support: Dynamic Resource Allocation (DRA) based fractional GPU support, so multiple workloads can share GPU resources below a full device.
- GPU co-location: Time-sharing and co-location scenarios where multiple workloads safely share a single GPU.
Observability and Operations
Cluster-level observability and operational tooling for monitoring, upgrading, and scaling self-hosted deployments.
Q2 priorities (March - June 2026):
- Observability reference architecture: A reference architecture for self-hosted operators covering multi-cluster observability, node health, reliability, storage, and infrastructure sizing guidance.
- Release and upgrade path: Validate the NVCF release path and document supported upgrade flows.
- Infrastructure sizing: Better public guidance for infrastructure sizing at scale.
- Event ledger: Backend service that records and exposes timestamped lifecycle events for every NVCF function deployment.
- Bring Your Own Observability (BYOO): Stream NVCF function logs, metrics, and traces into any OpenTelemetry-compatible backend.
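As a rough picture of what the event ledger item above describes, the following Python sketch models an append-only, in-memory log of timestamped lifecycle events that can be queried per deployment. It is an illustration under stated assumptions, not the NVCF design. The class names, field names, and phase strings such as "requested" and "deploying" are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LifecycleEvent:
    """One immutable ledger entry (field names are assumptions)."""
    deployment_id: str
    phase: str
    timestamp: datetime


class EventLedger:
    """Append-only store of lifecycle events, kept in memory for the sketch."""

    def __init__(self) -> None:
        self._events: list[LifecycleEvent] = []

    def record(self, deployment_id: str, phase: str,
               timestamp: datetime | None = None) -> LifecycleEvent:
        """Append an event, stamping it with the current UTC time by default."""
        event = LifecycleEvent(deployment_id, phase,
                               timestamp or datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def history(self, deployment_id: str) -> list[LifecycleEvent]:
        """Return one deployment's events ordered by timestamp."""
        matching = (e for e in self._events if e.deployment_id == deployment_id)
        return sorted(matching, key=lambda e: e.timestamp)
```

Because `history` sorts by timestamp, events that arrive out of order still read back as a consistent timeline for each deployment.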
Looking for contributions:
- Observability vendor integrations: Example configurations, dashboards, and provider plugins for monitoring stacks like Datadog, Grafana, and Prometheus.
- Marketplace packaging: NVCF packaging for platforms like OpenShift Marketplace.
Agentic Workloads
Support for workflows that go beyond single-step GPU inference, combining GPU and CPU steps with multi-step orchestration.
Q3 priorities (July - September 2026):
- CPU-only functions: CPU-only function support through dynamic resource and instance type changes, with user-defined CPU and memory settings. This is the foundation for agentic workflows that mix GPU inference with CPU-based steps.
Looking for contributions:
- Agentic workflow integrations: Workflow harness integrations, including LangChain-style frameworks, so developers can orchestrate multi-step agentic workflows on NVCF.
- Reference architectures: Public examples and reference architectures demonstrating functions and tasks in a single workflow.
Contribution Areas
These are the areas where community contributions can have the most impact on NVCF. Pick what interests you, explore the open issues, and start a conversation. Make sure your PR is tied to an issue and follows the guidelines in `CONTRIBUTING.md`.
Look for issues labeled `good-first-issue` and `help-wanted` to find starting points, or open a new issue if you have an idea that is not tracked yet.
- Billing integrations: Usage metering, invoice generation, and billing provider integrations for self-hosted operators.
- Observability vendor integrations: Example configurations, dashboards, and provider plugins for monitoring stacks like Datadog, Grafana, and Prometheus.
- Agentic workflow harnesses: Integrations with orchestration frameworks like LangChain for multi-step agentic workflows.
- Authentication and authorization: Identity and access control integrations, such as Keycloak and Envoy, for securing endpoints and managing permissions.
- Vanity endpoint routing: Custom endpoint URLs and endpoint-pattern mapping for branded and multi-tenant deployments.
- Marketplace packaging: NVCF packaging for platforms like OpenShift Marketplace.
- Pluggable infrastructure: Swap in your own infrastructure components, such as Cassandra, DNS, and vault-compatible secret storage, instead of relying on defaults.
- Reference architectures: Public examples and reference architectures demonstrating functions and tasks in a single workflow.
Existing Tracking Issues