NVIDIA NIM Provider
M31 Autonomous (M31A) supports NVIDIA NIM (Inference Microservices) as a third LLM provider alongside OpenRouter and Zen.
Source: internal/provider/nvidia/client.go
NVIDIA NIM provides access to NVIDIA's hosted AI models via an OpenAI-compatible API. The integration includes:
- Full `LLMProvider` interface implementation
- Completion-only model filtering (skips non-chat models)
- Retry logic with exponential backoff
- Health check with latency classification
- First-run wizard and settings UI integration
The NVIDIA API key is resolved in this order:
- Environment variable: `M31A_NVIDIA_API_KEY`
- Standard fallback: `NVIDIA_API_KEY`
- OS keychain: `m31a/nvidia`
- Config file: `provider.nvidia.api_key`
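The resolution order above can be sketched as a single lookup function. This is a minimal illustration, not the code in `internal/config/loader.go`: the `keySources` type, the `resolveNvidiaKey` function, and the shape of the keychain callback are all hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"os"
)

// keySources bundles the lookups so the order can be exercised without
// touching the real environment or keychain. All names are hypothetical.
type keySources struct {
	Getenv    func(name string) string         // e.g. os.Getenv
	Keychain  func(item string) (string, bool) // stand-in for an OS keychain lookup
	ConfigKey string                           // value of provider.nvidia.api_key
}

// resolveNvidiaKey returns the first non-empty key and a label for its source.
func resolveNvidiaKey(s keySources) (key, source string) {
	// 1 and 2: environment variables, primary first.
	for _, name := range []string{"M31A_NVIDIA_API_KEY", "NVIDIA_API_KEY"} {
		if v := s.Getenv(name); v != "" {
			return v, "env:" + name
		}
	}
	// 3: OS keychain entry.
	if s.Keychain != nil {
		if v, ok := s.Keychain("m31a/nvidia"); ok && v != "" {
			return v, "keychain:m31a/nvidia"
		}
	}
	// 4: config file value.
	if s.ConfigKey != "" {
		return s.ConfigKey, "config:provider.nvidia.api_key"
	}
	return "", ""
}

func main() {
	key, source := resolveNvidiaKey(keySources{
		Getenv:   os.Getenv,
		Keychain: func(string) (string, bool) { return "", false }, // no keychain in this sketch
	})
	if key == "" {
		fmt.Println("no NVIDIA API key found")
		return
	}
	fmt.Println("using key from", source)
}
```

Passing the lookups in as functions keeps the precedence logic testable with fake environments.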
```toml
[provider]
default = "nvidia"

[provider.nvidia]
api_key = "nvapi-..."
nvidia_base_url = "https://integrate.api.nvidia.com/v1"
```

| Variable | Description |
|---|---|
| `M31A_NVIDIA_API_KEY` | Primary API key variable |
| `NVIDIA_API_KEY` | Fallback API key variable |
The client fetches the available models from the NVIDIA NIM API. Completion-only models (e.g., codellama, starcoder) are automatically filtered out of the chat UI.
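As a rough sketch of this filtering step, the example below drops models whose ID contains a known completion-only family name. The deny list, the function names, and the sample model IDs are assumptions made for illustration; the actual client may detect completion-only models differently.

```go
package main

import (
	"fmt"
	"strings"
)

// completionOnlyMarkers is a hypothetical deny list built from the two
// examples named in the text; it is not taken from the real client.
var completionOnlyMarkers = []string{"codellama", "starcoder"}

// isCompletionOnly reports whether a model ID matches a deny-list entry,
// ignoring case.
func isCompletionOnly(modelID string) bool {
	id := strings.ToLower(modelID)
	for _, marker := range completionOnlyMarkers {
		if strings.Contains(id, marker) {
			return true
		}
	}
	return false
}

// chatModels returns only the IDs that should appear in the chat UI.
func chatModels(ids []string) []string {
	kept := make([]string, 0, len(ids))
	for _, id := range ids {
		if !isCompletionOnly(id) {
			kept = append(kept, id)
		}
	}
	return kept
}

func main() {
	// Illustrative IDs only.
	ids := []string{"meta/llama-3.1-8b-instruct", "meta/codellama-70b", "bigcode/starcoder2-15b"}
	fmt.Println(chatModels(ids))
}
```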
Streaming uses standard SSE in the OpenAI chat completions format and supports text deltas, tool call chunks, and usage tracking.
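To make the stream format concrete, here is a minimal sketch that reads `data:` lines, stops at the `[DONE]` sentinel, and concatenates text deltas. It handles only the `choices[0].delta.content` field; tool call chunks and usage are left out. The `chunk` type and both function names are hypothetical and not taken from `client.go`.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// chunk mirrors the small subset of the streaming payload used here.
type chunk struct {
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
	} `json:"choices"`
}

// parseSSELine handles one line of the stream. It returns the text delta
// (possibly empty) and true once the "[DONE]" sentinel is seen.
func parseSSELine(line string) (string, bool, error) {
	if !strings.HasPrefix(line, "data:") {
		return "", false, nil // blank lines, comments, other SSE fields
	}
	payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
	if payload == "" {
		return "", false, nil
	}
	if payload == "[DONE]" {
		return "", true, nil
	}
	var c chunk
	if err := json.Unmarshal([]byte(payload), &c); err != nil {
		return "", false, err
	}
	if len(c.Choices) == 0 {
		return "", false, nil // e.g. a chunk that carries only usage
	}
	return c.Choices[0].Delta.Content, false, nil
}

// streamText concatenates all text deltas until [DONE] or end of input.
func streamText(r io.Reader) (string, error) {
	var out strings.Builder
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		delta, done, err := parseSSELine(sc.Text())
		if err != nil {
			return out.String(), err
		}
		if done {
			break
		}
		out.WriteString(delta)
	}
	return out.String(), sc.Err()
}

func main() {
	body := "data: {\"choices\":[{\"delta\":{\"content\":\"Hel\"}}]}\n\n" +
		"data: {\"choices\":[{\"delta\":{\"content\":\"lo\"}}]}\n\n" +
		"data: [DONE]\n\n"
	text, err := streamText(strings.NewReader(body))
	fmt.Println(text, err)
}
```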
Transient errors (HTTP 500, 502, 503) trigger automatic retry with exponential backoff (max 2 retries). Non-retryable errors (401 unauthorized, 402 payment required, 429 rate limited) are returned immediately.
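The retry policy described above might look like the following sketch. The status codes and the limit of 2 retries come from the text; the base delay, the doubling factor, and the names `doWithRetry` and `retryable` are assumptions. Transport-level errors are returned without retrying here, which the real client may handle differently.

```go
package main

import (
	"fmt"
	"time"
)

const maxRetries = 2 // at most 2 retries after the first attempt

// retryable reports whether a status code is one of the transient errors.
func retryable(status int) bool {
	switch status {
	case 500, 502, 503:
		return true
	}
	return false
}

// doWithRetry calls send until it returns a non-retryable status, an error,
// or the retry budget is spent. The delay doubles after each failed attempt.
// It returns the last status and the number of attempts made.
func doWithRetry(send func() (int, error), baseDelay time.Duration) (status, attempts int, err error) {
	delay := baseDelay
	for attempts = 1; ; attempts++ {
		status, err = send()
		if err != nil {
			return status, attempts, err
		}
		if !retryable(status) || attempts > maxRetries {
			return status, attempts, nil
		}
		time.Sleep(delay)
		delay *= 2
	}
}

func main() {
	codes := []int{503, 502, 200} // simulated responses
	i := 0
	send := func() (int, error) {
		c := codes[i]
		i++
		return c, nil
	}
	status, attempts, err := doWithRetry(send, 10*time.Millisecond)
	fmt.Println(status, attempts, err)
}
```

With this policy a 401, 402, or 429 response is returned after a single attempt, while a persistent 503 is returned after three attempts in total.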
Health check latency is classified into three levels:
| Classification | Latency | Meaning |
|---|---|---|
| Live | < 500ms | Healthy |
| Slow | 500ms to 2 seconds | Degraded but functional |
| Degraded | > 2 seconds | May affect performance |
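The thresholds in the table translate directly into a small classifier. This is a sketch with a hypothetical function name; the table does not say which level a latency of exactly 2 seconds falls into, so treating it as Degraded is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// classifyLatency maps a measured health check latency to one of the three
// levels. A latency of exactly 2s is treated as Degraded (an assumption).
func classifyLatency(d time.Duration) string {
	switch {
	case d < 500*time.Millisecond:
		return "Live"
	case d < 2*time.Second:
		return "Slow"
	default:
		return "Degraded"
	}
}

func main() {
	for _, d := range []time.Duration{120 * time.Millisecond, 900 * time.Millisecond, 3 * time.Second} {
		fmt.Printf("%v -> %s\n", d, classifyLatency(d))
	}
}
```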
Model lists are stored in the shared provider model cache, which uses TTL-based invalidation:
| Setting | Default | Description |
|---|---|---|
| TTL | 5 minutes | Fresh cache lifetime |
| Stale TTL | 24 hours | Fallback cache lifetime |
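One plausible reading of the two lifetimes is a serve-stale-on-error policy: entries younger than the TTL are served directly, and entries younger than the stale TTL are used only when a refetch fails. The sketch below illustrates that reading. It is an assumption about the cache's behavior, and `classifyAge`, `loadModels`, and `errFetch` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	freshTTL = 5 * time.Minute // fresh cache lifetime
	staleTTL = 24 * time.Hour  // fallback cache lifetime
)

// errFetch simulates a failed model list request in this sketch.
var errFetch = errors.New("fetch failed")

// classifyAge maps the age of a cache entry to "fresh", "stale", or "expired".
func classifyAge(age time.Duration) string {
	switch {
	case age < freshTTL:
		return "fresh"
	case age < staleTTL:
		return "stale"
	default:
		return "expired"
	}
}

// loadModels serves fresh entries directly, refetches otherwise, and falls
// back to a stale entry only when the refetch fails.
func loadModels(cached []string, age time.Duration, fetch func() ([]string, error)) ([]string, error) {
	state := classifyAge(age)
	if state == "fresh" {
		return cached, nil
	}
	models, err := fetch()
	if err == nil {
		return models, nil
	}
	if state == "stale" {
		return cached, nil
	}
	return nil, err
}

func main() {
	cached := []string{"model-a", "model-b"} // placeholder model IDs
	failing := func() ([]string, error) { return nil, errFetch }
	models, err := loadModels(cached, 2*time.Hour, failing)
	fmt.Println(models, err)
}
```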
The first-run wizard includes NVIDIA NIM as a provider option with:
- Provider-specific API key placeholder (`nvapi-...`)
- NVIDIA key validation (checks the `nvapi-` prefix)
- Model list fetching and display
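The prefix check mentioned above could be as small as the sketch below. The function name and error messages are hypothetical, and rejecting a key that consists of the prefix alone is an extra assumption beyond what the text states.

```go
package main

import (
	"fmt"
	"strings"
)

const nvidiaKeyPrefix = "nvapi-"

// validateNvidiaKey performs a format check only; it does not contact the API.
func validateNvidiaKey(key string) error {
	key = strings.TrimSpace(key)
	if !strings.HasPrefix(key, nvidiaKeyPrefix) {
		return fmt.Errorf("NVIDIA API keys are expected to start with %q", nvidiaKeyPrefix)
	}
	if len(key) == len(nvidiaKeyPrefix) {
		return fmt.Errorf("API key is empty after the %q prefix", nvidiaKeyPrefix)
	}
	return nil
}

func main() {
	for _, k := range []string{"nvapi-abc123", "sk-abc123"} {
		fmt.Printf("%q -> %v\n", k, validateNvidiaKey(k))
	}
}
```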
The settings editor includes NVIDIA NIM in the 3-provider list:
- Provider selection (OpenRouter, Zen, NVIDIA)
- API key configuration with secure storage
- Health check status display
- Short name abbreviation: `NVIDIA`
Provider registration is centralized in internal/tui/provider_registration.go:
```go
case "nvidia":
	return nvidia.New(apiKey, nvidia.Options{
		BaseURL:           cfg.Provider.NvidiaBaseURL,
		Version:           version,
		DefaultContextLen: types.DefaultContextLength,
	})
```

| File | Purpose |
|---|---|
| `internal/provider/nvidia/client.go` | NVIDIA NIM client implementation |
| `internal/config/types.go` | `NvidiaAPIKey`, `NvidiaBaseURL` config fields |
| `internal/config/loader.go` | API key resolution from env/keychain/config |
| `internal/tui/provider_registration.go` | Centralized provider factory |
| `internal/tui/firstrun_model.go` | First-run wizard integration |
| `internal/tui/settings_model.go` | Settings UI integration |