NVIDIA NIM Provider

M31 Autonomous (M31A) supports NVIDIA NIM (Inference Microservices) as a third LLM provider alongside OpenRouter and Zen.

Overview

Source: internal/provider/nvidia/client.go

NVIDIA NIM provides access to NVIDIA's hosted AI models via an OpenAI-compatible API. The integration includes:

Full LLMProvider interface implementation
Completion-only model filtering (skips non-chat models)
Retry logic with exponential backoff
Health check with latency classification
First-run wizard and settings UI integration

Configuration

API Key

The NVIDIA API key is resolved in this order:

1. Environment variable: M31A_NVIDIA_API_KEY
2. Standard fallback: NVIDIA_API_KEY
3. OS keychain: m31a/nvidia
4. Config file: provider.nvidia.api_key
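
The lookup order above can be sketched as a first-match-wins chain. This is a minimal illustration, not the code in internal/config/loader.go: the KeySources type, its fields, and the ResolveNvidiaKey function are hypothetical names, and treating an empty value as "not set" is an assumption.

```go
package main

// KeySources holds the lookups consulted during resolution. All three fields
// are hypothetical stand-ins for the real accessors.
type KeySources struct {
	Getenv    func(name string) string          // typically os.Getenv
	Keychain  func(entry string) (string, bool) // OS keychain lookup
	ConfigKey string                            // value of provider.nvidia.api_key
}

// ResolveNvidiaKey returns the first non-empty key in the documented order,
// plus a label naming the source it came from. Both results are empty when
// no source provides a key.
func ResolveNvidiaKey(src KeySources) (key, origin string) {
	if src.Getenv != nil {
		// The M31A-specific variable is checked before the standard fallback.
		for _, name := range []string{"M31A_NVIDIA_API_KEY", "NVIDIA_API_KEY"} {
			if v := src.Getenv(name); v != "" {
				return v, "env:" + name
			}
		}
	}
	if src.Keychain != nil {
		if v, ok := src.Keychain("m31a/nvidia"); ok && v != "" {
			return v, "keychain"
		}
	}
	if src.ConfigKey != "" {
		return src.ConfigKey, "config"
	}
	return "", ""
}
```

Passing os.Getenv as Getenv gives the real environment lookup; injecting the function keeps the order easy to test without touching process state.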

Config File

[provider]
default = "nvidia"

[provider.nvidia]
api_key = "nvapi-..."
nvidia_base_url = "https://integrate.api.nvidia.com/v1"

Environment Variables

Variable	Description
`M31A_NVIDIA_API_KEY`	Primary API key variable
`NVIDIA_API_KEY`	Fallback API key variable

Features

Model Catalog

Fetches available models from the NVIDIA NIM API. Completion-only models (e.g., codellama, starcoder) are automatically filtered out of the chat UI.
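
One way such a filter could work is a case-insensitive match on the model ID. The sketch below is only an illustration: the Model type, the hint list, and both function names are assumptions, and the actual client may identify completion-only models by a different rule. The two hints are taken from the examples named above.

```go
package main

import "strings"

// Model is a minimal stand-in for a catalog entry (hypothetical type).
type Model struct {
	ID string
}

// completionOnlyHints is an assumed denylist of ID fragments.
var completionOnlyHints = []string{"codellama", "starcoder"}

// IsCompletionOnly reports whether the model ID contains one of the hints,
// ignoring case.
func IsCompletionOnly(id string) bool {
	lower := strings.ToLower(id)
	for _, hint := range completionOnlyHints {
		if strings.Contains(lower, hint) {
			return true
		}
	}
	return false
}

// FilterChatModels returns the models that are not flagged as
// completion-only, preserving their original order.
func FilterChatModels(models []Model) []Model {
	out := make([]Model, 0, len(models))
	for _, m := range models {
		if !IsCompletionOnly(m.ID) {
			out = append(out, m)
		}
	}
	return out
}
```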

Streaming

Standard SSE streaming compatible with the OpenAI chat completions format. Supports text deltas, tool call chunks, and usage tracking.
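
For readers unfamiliar with the wire format, the sketch below parses text deltas from an OpenAI-style SSE stream: each event line is `data: ` followed by a JSON chunk, and the stream ends with `data: [DONE]`. ParseSSELine and CollectText are hypothetical helper names, not functions from the client, and tool call chunks and usage records are deliberately ignored here.

```go
package main

import (
	"bufio"
	"encoding/json"
	"io"
	"strings"
)

// chunk mirrors only the fields of a streaming chunk that this sketch reads.
type chunk struct {
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
	} `json:"choices"`
}

// ParseSSELine decodes a single line of the stream. Lines that are not
// "data:" fields (blank separators, comments) yield no text and no error.
func ParseSSELine(line string) (text string, done bool, err error) {
	if !strings.HasPrefix(line, "data:") {
		return "", false, nil
	}
	payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
	if payload == "[DONE]" {
		return "", true, nil
	}
	var c chunk
	if err := json.Unmarshal([]byte(payload), &c); err != nil {
		return "", false, err
	}
	for _, choice := range c.Choices {
		text += choice.Delta.Content
	}
	return text, false, nil
}

// CollectText reads the stream line by line and concatenates all text
// deltas until the [DONE] marker or end of input.
func CollectText(r io.Reader) (string, error) {
	var sb strings.Builder
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		text, done, err := ParseSSELine(sc.Text())
		if err != nil {
			return sb.String(), err
		}
		if done {
			break
		}
		sb.WriteString(text)
	}
	return sb.String(), sc.Err()
}
```

A real client would typically forward each delta to the UI as it arrives rather than buffering the full response.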

Retry Logic

Transient errors (HTTP 500, 502, 503) trigger automatic retry with exponential backoff (max 2 retries). Non-retryable errors (401 unauthorized, 402 payment required, 429 rate limited) are returned immediately.

Health Check

Health check latency is classified into three levels:

Classification	Latency	Meaning
Live	< 500 ms	Healthy
Slow	500 ms to 2 s	Degraded but functional
Degraded	> 2 s	May affect performance

Model Cache

Uses the shared provider model cache with TTL-based invalidation:

Setting	Default	Description
TTL	5 minutes	Fresh cache lifetime
Stale TTL	24 hours	Fallback cache lifetime

First-Run Setup

The first-run wizard includes NVIDIA NIM as a provider option with:

Provider-specific API key placeholder (nvapi-...)
NVIDIA key validation (checks nvapi- prefix)
Model list fetching and display
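
The prefix check mentioned above might look like the following. ValidateNvidiaKey is a hypothetical name, and trimming surrounding whitespace before checking is an assumption; the wizard may apply additional rules.

```go
package main

import (
	"errors"
	"strings"
)

// ValidateNvidiaKey performs a cheap format check before any network call:
// the key must be non-empty and start with "nvapi-".
func ValidateNvidiaKey(key string) error {
	key = strings.TrimSpace(key)
	if key == "" {
		return errors.New("API key is empty")
	}
	if !strings.HasPrefix(key, "nvapi-") {
		return errors.New(`NVIDIA API keys start with "nvapi-"`)
	}
	return nil
}
```

A check like this only catches obvious paste mistakes; whether the key is actually accepted is confirmed by the model list fetch that follows.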

Settings UI

The settings editor includes NVIDIA NIM in the 3-provider list:

Provider selection (OpenRouter, Zen, NVIDIA)
API key configuration with secure storage
Health check status display
Short name abbreviation: NVIDIA

Registration

Provider registration is centralized in internal/tui/provider_registration.go:

case "nvidia":
    return nvidia.New(apiKey, nvidia.Options{
        BaseURL:           cfg.Provider.NvidiaBaseURL,
        Version:           version,
        DefaultContextLen: types.DefaultContextLength,
    })

Source Files

File	Purpose
`internal/provider/nvidia/client.go`	NVIDIA NIM client implementation
`internal/config/types.go`	`NvidiaAPIKey`, `NvidiaBaseURL` config fields
`internal/config/loader.go`	API key resolution from env/keychain/config
`internal/tui/provider_registration.go`	Centralized provider factory
`internal/tui/firstrun_model.go`	First-run wizard integration
`internal/tui/settings_model.go`	Settings UI integration
