FEATURE: BYOM provider registration for generic local inference endpoints (non-Anthropic)

## Problem Statement

Copilot CLI supports BYOM providers since v1.0.32+, but the supported providers are limited to Anthropic-specific configurations. There is no configuration path for **generic local inference endpoints** (e.g., LM Studio, Ollama, llama.cpp) that serve OpenAI-compatible APIs.

This means:
- Local models cannot be registered as BYOM providers for main session tool calls
- Subagent tasks dispatched via `runSubagent` have no local routing path
- Extensions like VSCode-LMStudio-Bridge cannot expose local models to the Copilot ecosystem

## How Other Agents Handle This

### Opencode (anomalyco/opencode)
- Provider config system with explicit provider registration
- Supports llama.cpp, Ollama, and generic OpenAI-compatible endpoints
- Subagents inherit session model; provider configuration flows through all dispatch paths

### Claude Code (anthropics/claude-code)
- Defaults to cloud but overridable via `SG_AGENTIC_MODEL` env var
- Model selection is per-request, not session-bound
- Allows explicit model override for subagent tasks

### Codex (openai/codex)
- `ModelProvider` abstraction with routing layer (`models_endpoint.rs`)
- Supports multiple providers including Amazon Bedrock and local endpoints
- Clean separation between tool execution and model inference

## What We Need

1. **Generic BYOM provider registration**: Allow configuration of arbitrary OpenAI-compatible endpoints as BYOM providers
2. **Subagent model inheritance**: Ensure subagent tasks respect the session model when it is a local endpoint
3. **Provider priority/fallback**: Support tiered routing (local first, cloud fallback with warning)

## Security Implications

- **Data leakage**: Workspace context transmitted to cloud without user consent when local models are available
- **Cost opacity**: Untracked cloud API usage from subagent tasks
- **Trust erosion**: Silent cloud routing violates local-only expectations
- **Compliance risk**: Sensitive code may be processed by third-party inference services

## Related Issues

- github/copilot-cli#3565 — multiplier guard silently downgrades subagent model
- github/copilot-cli#3301 — local model feature request
- ian-morgan99/VSCode-LMStudio-Bridge#337 — bridge-level gap analysis
- ian-morgan99/VSCode-LMStudio-Bridge#336 — zero BYOM integration in bridge

## Acceptance Criteria

1. [ ] Generic OpenAI-compatible endpoints can be registered as BYOM providers
2. [ ] Subagent tasks respect session model when it is a local endpoint
3. [ ] Provider priority/fallback configuration is supported
4. [ ] Cost/privacy warnings display when cloud fallback triggers
5. [ ] Documentation updated with routing behavior and configuration options

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FEATURE: BYOM provider registration for generic local inference endpoints (non-Anthropic) #3624

Problem Statement

How Other Agents Handle This

Opencode (anomalyco/opencode)

Claude Code (anthropics/claude-code)

Codex (openai/codex)

What We Need

Security Implications

Related Issues

Acceptance Criteria

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

FEATURE: BYOM provider registration for generic local inference endpoints (non-Anthropic) #3624

Description

Problem Statement

How Other Agents Handle This

Opencode (anomalyco/opencode)

Claude Code (anthropics/claude-code)

Codex (openai/codex)

What We Need

Security Implications

Related Issues

Acceptance Criteria

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions