Summary
Expand the copilot engine tier in engine.sh from its current single-model configuration (o4-mini only) to support model chains similar to the Claude tier. The GitHub Models API provides access to 20+ models (including o4-mini, GPT-5 mini, MiniMax-M1, and Meta Llama 4 Scout) through the same API key and trust boundary. Adding COPILOT_*_MODEL_CHAIN variables would provide rate-limit resilience and enable per-tier model selection within GitHub-hosted infrastructure, which directly addresses Discussion #631's request to evaluate OpenAI, Llama, and additional providers.
Market Signal
The GitHub Models marketplace now hosts 20+ models from multiple providers (OpenAI, Meta, Microsoft, Mistral, MiniMax), accessible via a unified API. Custom agents and sub-agents reached GA across all Copilot plans in June 2026. The GitHub Models API uses the same authentication (GITHUB_TOKEN or a PAT with a Copilot subscription) and the same billing as existing Copilot usage. SWE-bench data (June 2026) shows Meta Llama 4 and MiniMax M3 matching or exceeding Claude Sonnet 4.5 on coding tasks at significantly lower cost. GPT-5 mini offers strong coding performance at lower rate limits than o4-mini.
User Signal
Discussion #631 specifically requests evaluating "additional vendors, models" including "OpenAI" and "Llama." engine.sh already supports COPILOT_API_MODEL as a configurable env var (line 107) but only configures a single model per tier. The copilot engine block (lines 96-119) has no model chain — all tiers use the same o4-mini. The Claude engine demonstrates the chain pattern (CLAUDE_TRIAGE_MODEL_CHAIN, etc.) with graceful rate-limit fallback. The comment "No in-engine chain for Copilot — single GitHub Models endpoint" (line 114) explicitly notes this as a known gap.
Technical Opportunity
The _claude_chain_invoke() pattern (line 410) provides the template: walk a comma-separated list of models, try each in sequence, fall back on rate-limit. A _copilot_chain_invoke() function would mirror this for the GitHub Models API via gh copilot --model. The COPILOT_API_MODEL env var already handles model selection; extending to COPILOT_TRIAGE_MODEL_CHAIN etc. fits the existing architecture.
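Candidate model chains:
openai/o4-mini,meta/llama-4-scout (fast, cost-efficient)
openai/o4-mini,openai/gpt-5-mini (reasoning-capable fallback)
Audit: openai/o4-mini (keep simple until chain proves reliable)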
All models are within the same GitHub trust boundary — no new API keys, accounts, or data-privacy considerations.
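The chain walk described under Technical Opportunity (try each model in a comma-separated list, move on only when one is rate-limited) could look roughly like the sketch below. It is not the code in engine.sh: `_copilot_invoke_one` is a hypothetical stand-in for the real single-model call, and the exit status 75 for "rate-limited" is an assumed convention. How the actual engine detects a rate limit is not shown in this discussion.

```bash
# Sketch only: _copilot_invoke_one and COPILOT_RC_RATE_LIMITED are hypothetical
# names, not taken from engine.sh.

# Exit status the stub uses to mean "rate-limited, try the next model" (assumption).
COPILOT_RC_RATE_LIMITED=75

# Placeholder for the real single-model call. Args: <model> <prompt>.
_copilot_invoke_one() {
  printf '[%s] %s\n' "$1" "$2"
}

# Try each model in a comma-separated chain. Args: <chain> <prompt>.
_copilot_chain_invoke() {
  local chain="$1" prompt="$2" model rc
  local -a models
  [ -n "$chain" ] || { echo "copilot: empty model chain" >&2; return 2; }
  IFS=',' read -r -a models <<< "$chain"
  for model in "${models[@]}"; do
    model="${model// /}"            # tolerate "a, b" as well as "a,b"
    [ -n "$model" ] || continue
    rc=0
    _copilot_invoke_one "$model" "$prompt" || rc=$?
    [ "$rc" -eq 0 ] && return 0
    # Only a rate limit moves on to the next model; other errors surface as-is.
    [ "$rc" -eq "$COPILOT_RC_RATE_LIMITED" ] || return "$rc"
    echo "copilot: $model rate-limited, trying next model" >&2
  done
  return "$COPILOT_RC_RATE_LIMITED"   # no model in the chain succeeded
}
```

With the placeholder in place, `_copilot_chain_invoke 'openai/o4-mini,meta/llama-4-scout' 'Review this diff'` stops at the first model. Returning hard failures immediately, instead of falling through, keeps a broken prompt or an authentication error from being hidden behind a later model's success.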
Assessment
| Dimension | Score | Rationale |
|---|---|---|
| Feasibility | med | Mirrors proven Claude chain architecture; requires testing each model's compatibility with existing review prompts |
| Impact | med | Improves copilot engine resilience; opens access to additional models within existing trust boundary; responds to #631 |
| Urgency | med | Enhancement to existing capability; not blocking but strategically valuable for provider diversification |
Adversarial Review
Strongest objection: GitHub Models API model availability and rate limits are opaque — there is no public SLA or guaranteed uptime for non-OpenAI models. Adding multiple models creates more failure modes and testing surface. The existing o4-mini is sufficient and battle-tested in the copilot engine.
Rebuttal: The engine.sh architecture already handles model chain fallback gracefully — if a model is unavailable or rate-limited, it tries the next in the chain. This is the same pattern that makes Claude chains robust. Adding models to the copilot chain does not increase fragility; it increases resilience by providing fallback options when o4-mini hits rate limits. The implementation mirrors proven Claude chain code, minimizing new test surface. GitHub Models uses the same trust boundary — no new API keys, accounts, or data-privacy considerations.
Suggested Next Step
Inventory models available on GitHub Models API that are suitable for code review tasks. Prototype a _copilot_chain_invoke() function mirroring _claude_chain_invoke(). Define per-tier chains (e.g., COPILOT_TRIAGE_MODEL_CHAIN='openai/o4-mini,meta/llama-4-scout'). Add pricing rows to model-pricing.tsv for each candidate model.
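As a starting point for the per-tier chains, one possible way to resolve a tier's chain while staying backward compatible with the existing single-model setup is sketched here. The helper name `_copilot_chain_for_tier` is hypothetical, and the final `openai/o4-mini` default is an assumption based on the current o4-mini-only configuration. Only the `COPILOT_*_MODEL_CHAIN` and `COPILOT_API_MODEL` variable names come from this proposal.

```bash
# Hypothetical helper: print the model chain for a tier (e.g. "triage").
# Lookup order: COPILOT_<TIER>_MODEL_CHAIN, then COPILOT_API_MODEL,
# then openai/o4-mini (assumed default).
_copilot_chain_for_tier() {
  local tier_uc var chain
  # Uppercase with tr so the sketch does not depend on bash 4 case expansion.
  tier_uc="$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
  var="COPILOT_${tier_uc}_MODEL_CHAIN"
  chain="${!var:-}"                 # indirect lookup of the per-tier variable
  printf '%s\n' "${chain:-${COPILOT_API_MODEL:-openai/o4-mini}}"
}
```

With `COPILOT_TRIAGE_MODEL_CHAIN='openai/o4-mini,meta/llama-4-scout'` set, `_copilot_chain_for_tier triage` prints that chain, while a tier with no chain configured falls back to `COPILOT_API_MODEL`, so existing deployments would behave as they do today.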