Preview estimated AI Credits, model routing, token usage, and runtime before running Copilot Auto / agent tasks #200824
Replies: 1 comment
Thank you for your interest in contributing to our community! We currently only accept discussions created through the GitHub UI using our provided discussion templates. Please re-submit your discussion by navigating to the appropriate category and using the template provided. This discussion has been closed because it was not submitted through the expected format. If you believe this was a mistake, please reach out to the maintainers.
Original: It seems to me that the GitHub team has plenty of examples of code bases of varying complexity, prompt replies, and GitHub Copilot output. Could some work be done to build a database for GitHub Copilot to reference, so that, based on solution complexity (described explicitly or determined implicitly by Copilot), it could estimate which model it would use, how many tokens the task would cost, and how long it would take to implement? This information would be invaluable when working out how much to budget for a particular feature or product, which could then inform budgets in the gh agent workflows.
GPT 5.5 Pro Refinement: I would like GitHub Copilot to provide a pre-flight estimate before running tasks that may consume significant GitHub AI Credits, especially when using Auto Model Selection, Copilot Chat agent mode, Copilot cloud agent, or GitHub Agentic Workflows.
Today, Copilot can select models automatically and GitHub’s billing model is based on token usage, including input, output, and cached tokens, converted into GitHub AI Credits. That makes sense as a usage-based model, but it creates a planning problem: before I ask Copilot to implement a feature, fix a bug across several files, generate tests, or run an agent workflow, I do not have a practical way to estimate how expensive that request is likely to be.
The feature I am asking for is not an exact quote or a guarantee; a confidence-banded estimate would already be extremely useful. Before starting a task, Copilot could show something like:
Estimated task profile: medium complexity
Likely model route: Auto, expected to use [model family or model tier]
Estimated input tokens: 40k–80k
Estimated output tokens: 5k–15k
Estimated cached tokens: 10k–30k
Estimated GitHub AI Credits: 120–260
Estimated runtime: 8–20 minutes
Confidence: medium
Main cost drivers: repository size, number of files likely to be inspected, test generation, multi-step planning, tool use
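To illustrate how the token ranges above could roll up into a credit range, here is a minimal Python sketch. The per-1,000-token rates, the `Range` type, and `estimate_credits` are all hypothetical placeholders, not GitHub's actual pricing or API, so the resulting band will not match the 120–260 figure in the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A low/high band, e.g. for tokens or credits."""
    low: float
    high: float


# ASSUMPTION: made-up credits per 1,000 tokens, for illustration only.
HYPOTHETICAL_RATES_PER_1K = {"input": 1.0, "output": 5.0, "cached": 0.25}


def estimate_credits(tokens: dict[str, Range],
                     rates: dict[str, float] = HYPOTHETICAL_RATES_PER_1K) -> Range:
    """Convert per-category token bands into a single credit band."""
    low = sum(band.low / 1000 * rates[kind] for kind, band in tokens.items())
    high = sum(band.high / 1000 * rates[kind] for kind, band in tokens.items())
    return Range(low, high)


# Token bands taken from the example estimate above.
tokens = {
    "input": Range(40_000, 80_000),
    "output": Range(5_000, 15_000),
    "cached": Range(10_000, 30_000),
}
credits = estimate_credits(tokens)
print(f"Estimated credits: {credits.low:.1f} to {credits.high:.1f}")
```

Summing the low ends and the high ends separately gives a conservative band; a real estimator would presumably also account for which model the Auto route selects.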
This would help developers and teams make better decisions before spending budget. It would be especially useful for organizations using Copilot Business or Copilot Enterprise where AI Credits are pooled, budgets exist at user, organization, cost-center, and enterprise levels, and users may be blocked after a budget is exhausted. A pre-flight estimate would reduce surprise spend, reduce failed or blocked agent sessions, and help teams decide whether a task should be handled by Chat, cloud agent, a cheaper model, a higher-capability model, or a human developer.
A strong MVP could be:
Add a “Preview cost and runtime” option before executing a Copilot agent task.
Show estimated model route, token range, AI Credit range, elapsed-time range, and confidence level.
Show the top cost drivers in plain language.
Let users set a per-task maximum, such as “do not run if estimated usage exceeds 500 AI Credits.”
After completion, show estimated vs actual model usage, tokens, AI Credits, runtime, and files changed so users can learn and refine future prompts.
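The "estimated vs actual" step in the list above could be as simple as the following Python sketch. The function name `compare` and the actual values (310 credits, 14 minutes) are invented for illustration; only the estimated ranges come from the example earlier in this post.

```python
def compare(label: str, low: float, high: float, actual: float) -> str:
    """Describe where an actual value landed relative to its estimated band."""
    if actual < low:
        verdict = "below estimate"
    elif actual > high:
        verdict = "above estimate"
    else:
        verdict = "within estimate"
    return f"{label}: estimated {low}-{high}, actual {actual} ({verdict})"


# ASSUMPTION: the actual values here are made up to show both outcomes.
report = [
    compare("AI Credits", 120, 260, 310),
    compare("Runtime (min)", 8, 20, 14),
]
for line in report:
    print(line)
```

Showing the verdict next to the numbers is what would let users learn which kinds of prompts tend to overrun.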
For GitHub Agentic Workflows, this could also be exposed as workflow metadata or a dry-run mode, for example:
copilot:
  estimate: true
  max_ai_credits: 500
  max_runtime_minutes: 30
  on_estimate_exceeded: require_approval
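A rough Python sketch of how a runner might evaluate those limits before starting a task. The `EstimatePolicy` class and `decide` function are hypothetical, and the field names simply mirror the proposed keys above; none of this is an existing GitHub API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EstimatePolicy:
    """Mirrors the proposed workflow keys (hypothetical, not a real schema)."""
    max_ai_credits: float
    max_runtime_minutes: float
    on_estimate_exceeded: str = "require_approval"


def decide(policy: EstimatePolicy,
           est_credits_high: float,
           est_runtime_high_minutes: float) -> str:
    """Return "run" if the estimate fits the limits, else the configured action."""
    exceeded = (
        est_credits_high > policy.max_ai_credits
        or est_runtime_high_minutes > policy.max_runtime_minutes
    )
    return policy.on_estimate_exceeded if exceeded else "run"


policy = EstimatePolicy(max_ai_credits=500, max_runtime_minutes=30)
print(decide(policy, est_credits_high=260, est_runtime_high_minutes=20))
print(decide(policy, est_credits_high=620, est_runtime_high_minutes=20))
```

Checking the upper end of each estimated range is a deliberately cautious choice here; a team with more tolerance for overruns could compare against the midpoint instead.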
This would make Copilot much easier to use in budgeted engineering processes. Product managers and engineering leads could estimate whether a feature is appropriate for Copilot, compare AI-assisted implementation cost across tasks, set sensible budgets, and decide which workflows should require approval.
I understand that exact prediction may be difficult because Auto Model Selection depends on task complexity, model availability, system health, cache behavior, tool orchestration, and how many iterations are required. But even a calibrated estimate with low/medium/high confidence would be much better than no visibility before execution.
GitHub likely already has many of the signals needed to power this: repository size, indexed context size, language mix, dependency graph, number of relevant files, historical task telemetry, prompt complexity, likely tool calls, prior token usage by model, and actual usage reporting after completion. The estimate could be based on anonymized aggregate patterns and local repository metadata without exposing private code or guaranteeing a specific model.
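One simple way such an estimate could be derived from historical telemetry is a percentile band over past usage for similar tasks. The Python sketch below is only an assumption about how this might work: `estimate_band`, the sample-count threshold, and the spread cutoff are all invented for illustration, and the history values are synthetic.

```python
from statistics import median, quantiles


def estimate_band(history: list[float], min_samples: int = 20) -> dict:
    """Build a 10th-90th percentile band and a coarse confidence label."""
    if len(history) < 2:
        raise ValueError("need at least two historical observations")
    deciles = quantiles(history, n=10, method="inclusive")  # 9 cut points
    low, high = deciles[0], deciles[-1]
    mid = median(history)
    # Relative width of the band; a narrow band suggests a more reliable estimate.
    spread = (high - low) / mid if mid else float("inf")
    if len(history) >= min_samples and spread < 1.0:
        confidence = "high"
    elif len(history) >= min_samples // 2:
        confidence = "medium"
    else:
        confidence = "low"
    return {"low": low, "median": mid, "high": high, "confidence": confidence}


# ASSUMPTION: synthetic credit usage for 20 past tasks of similar shape.
past_credits = [float(v) for v in range(100, 300, 10)]
band = estimate_band(past_credits)
print(band)
```

With only a handful of comparable past tasks the label drops to "low", which matches the idea that the estimate should state its own confidence rather than pretend to be exact.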
This would make usage-based billing feel more predictable, make Auto Model Selection easier to trust, and make Copilot agents more practical for teams that need to manage engineering budgets.