[Roadmap] NVCF Q2/Q3 2026 public roadmap #27

@athappa-nv

Public Roadmap

Hi NVCF developers and users!

This roadmap summarizes the current NVCF direction for Q2 and Q3 2026. It is directional, not a commitment to ship every item listed. Scope and sequencing will change as we learn from contributors and users.

NVCF tracks public roadmap work with GitHub issues labeled roadmap and no-stale.

NVCF Platform

These features expand what is possible with NVCF: broader infrastructure flexibility, new platform capabilities, and more deployment options.

Q2 priorities (March - June 2026):

  • Multi-cluster: Register multiple compute clusters with a single NVCF control plane.
  • Task support: Run long-lived or asynchronous jobs on NVCF with lifecycle tracking, status visibility, and operational controls.
  • Autoscaler: Automatic scaling for self-hosted deployments based on workload demand.
  • Guides and documentation: Published guides for Model Express, Dynamo operator usage, and related workflows.
  • Model Express guide: A published guide that shows users how to use Model Express.

Q3 priorities (July - September 2026):

  • Multi-account: Add supported multi-account management so operators can create, select, and manage separate accounts for tenants, teams, or environments.
  • Multi-region: Support deploying multiple NVCF control planes across regions.
  • Self-hosted UI: A lightweight management UI through the NGC Lite experience.

Looking for contributions:

  • Pluggable infrastructure: Swap in your own infrastructure components, such as Cassandra, DNS, and vault-compatible secret storage, instead of relying on defaults.
  • Billing integrations: Usage metering, invoice generation, and billing provider integrations for self-hosted operators.

Invocation, Routing, and LLM Gateway

A purpose-built gateway layer with routing, scaling, and rate limiting designed for LLM workloads.

Q2 priorities (March - June 2026):

  • LLM Gateway: A dedicated gateway for LLM traffic with request-level routing control, including KVCache-aware routing, autoscaling, and rate limiting.
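
The roadmap does not say how the gateway's rate limiting will work. As a rough illustration only, here is a token bucket, one common rate-limiting technique. The class name, parameters, and the per-request cost are assumptions made for this sketch, not part of any NVCF API.

```python
import time


class TokenBucket:
    """Hypothetical token-bucket limiter; not an NVCF component."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second (assumed unit)
        self.capacity = capacity    # maximum burst size
        self._clock = clock         # injectable clock, useful for testing
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if the request fits the budget."""
        now = self._clock()
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if cost <= self._tokens:
            self._tokens -= cost
            return True
        return False
```

A gateway could keep one bucket per caller and reject or queue a request when `allow()` returns False. For LLM traffic, `cost` could be set to an estimated token count instead of 1.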

Q3 priorities (July - September 2026):

  • Q3 priorities are still being developed.

Looking for contributions:

  • Authentication and authorization: Identity and access control integrations, such as Keycloak and Envoy, for securing endpoints and managing permissions.
  • Vanity endpoint routing: Custom endpoint URLs and endpoint-pattern mapping for branded and multi-tenant deployments.

Storage, Caching, and SIS

High-performance storage and checkpoint support to keep up with workloads at scale.

Q2 priorities (March - June 2026):

  • High-performance storage: Multi-node shared storage for workloads that need high-throughput or shared data access, including multi-cluster shared filesystem support.
  • Checkpoint and restore: Exploratory work to support checkpoint and restore for NVCF functions to reduce GPU cold-start latency and improve recovery performance.

Q3 priorities (July - September 2026):

  • Q3 priorities are still being developed.

Scheduling and Resource Flexibility

Making GPU resources more flexible so multiple workloads can share a single device and scheduling can scale with demand.

Q2 priorities (March - June 2026):

  • Scheduler integration: KAI Scheduler support.

Q3 priorities (July - September 2026):

  • Fractional GPU support: Fractional GPU support based on Dynamic Resource Allocation (DRA), so multiple workloads can each use less than a full GPU device.
  • GPU co-location: Time-sharing and co-location scenarios where multiple workloads safely share a single GPU.

Observability and Operations

Cluster-level observability and operational tooling for monitoring, upgrading, and scaling self-hosted deployments.

Q2 priorities (March - June 2026):

  • Observability reference architecture: A reference architecture for self-hosted operators covering multi-cluster observability, node health, reliability, storage, and infrastructure sizing guidance.
  • Release and upgrade path: Validate the NVCF release path and document supported upgrade flows.
  • Infrastructure sizing: Better public guidance for infrastructure sizing at scale.
  • Event ledger: Backend service that records and exposes timestamped lifecycle events for every NVCF function deployment.
  • Bring Your Own Observability (BYOO): Stream NVCF function logs, metrics, and traces into any OpenTelemetry-compatible backend.
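
To make the event ledger item above more concrete, here is a minimal in-memory sketch of an append-only record of timestamped lifecycle events that can be queried per deployment. The class names, field names, and event kinds are hypothetical. The roadmap does not define the ledger's schema or API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class LifecycleEvent:
    deployment_id: str   # hypothetical identifier for a function deployment
    kind: str            # hypothetical event name, e.g. "DEPLOYING"
    timestamp: datetime


@dataclass
class EventLedger:
    """Append-only, in-memory ledger (illustration only, no persistence)."""

    events: list = field(default_factory=list)

    def record(self, deployment_id, kind, timestamp=None):
        """Append one event; default the timestamp to the current UTC time."""
        event = LifecycleEvent(
            deployment_id, kind, timestamp or datetime.now(timezone.utc)
        )
        self.events.append(event)
        return event

    def history(self, deployment_id):
        """Return the events for one deployment, oldest first."""
        matching = [e for e in self.events if e.deployment_id == deployment_id]
        return sorted(matching, key=lambda e: e.timestamp)
```

Calling `ledger.record("dep-1", "DEPLOYING")` and later `ledger.history("dep-1")` would return that deployment's events in time order. A real backend service would also need durable storage and an exposed query interface.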

Looking for contributions:

  • Observability vendor integrations: Example configurations, dashboards, and provider plugins for monitoring stacks like Datadog, Grafana, and Prometheus.
  • Marketplace packaging: NVCF packaging for platforms like OpenShift Marketplace.

Agentic Workloads

Support for workflows that go beyond single-step GPU inference, combining GPU and CPU steps with multi-step orchestration.

Q3 priorities (July - September 2026):

  • CPU-only functions: CPU-only function support through dynamic resource and instance type changes, with user-defined CPU and memory settings. This is the foundation for agentic workflows that mix GPU inference with CPU-based steps.

Looking for contributions:

  • Agentic workflow integrations: Workflow harness integrations, including LangChain-style frameworks, so developers can orchestrate multi-step agentic workflows on NVCF.
  • Reference architectures: Public examples and reference architectures demonstrating functions and tasks in a single workflow.

Contribution Areas

These are the areas where community contributions can have the most impact on NVCF. Pick what interests you, explore the open issues, and start a conversation. Make sure your PR is tied to an issue and follows the guidelines from CONTRIBUTING.md.

Look for issues labeled good-first-issue and help-wanted to find starting points, or open a new issue if you have an idea that is not tracked yet.

  • Billing integrations: Usage metering, invoice generation, and billing provider integrations for self-hosted operators.
  • Observability vendor integrations: Example configurations, dashboards, and provider plugins for monitoring stacks like Datadog, Grafana, and Prometheus.
  • Agentic workflow harnesses: Integrations with orchestration frameworks like LangChain for multi-step agentic workflows.
  • Authentication and authorization: Identity and access control integrations, such as Keycloak and Envoy, for securing endpoints and managing permissions.
  • Vanity endpoint routing: Custom endpoint URLs and endpoint-pattern mapping for branded and multi-tenant deployments.
  • Marketplace packaging: NVCF packaging for platforms like OpenShift Marketplace.
  • Pluggable infrastructure: Swap in your own infrastructure components, such as Cassandra, DNS, and vault-compatible secret storage, instead of relying on defaults.
  • Reference architectures: Public examples and reference architectures demonstrating functions and tasks in a single workflow.

Existing Tracking Issues
