microsoft · ntrogh · May 26, 2026
diff --git a/build/sitemap.xml b/build/sitemap.xml
@@ -610,6 +610,11 @@
         <changefreq>weekly</changefreq>
         <priority>0.8</priority>
     </url>
+    <url>
+        <loc>https://code.visualstudio.com/docs/copilot/guides/optimize-usage</loc>
+        <changefreq>weekly</changefreq>
+        <priority>0.8</priority>
+    </url>
     <url>
         <loc>https://code.visualstudio.com/docs/copilot/guides/customize-copilot-guide</loc>
         <changefreq>weekly</changefreq>

diff --git a/docs/copilot/agents/agent-tools.md b/docs/copilot/agents/agent-tools.md
@@ -97,7 +97,7 @@ When you select the **Autopilot** permission level, the agent behaves differentl
 Autopilot is available in the Chat view when the `setting(chat.autopilot.enabled)` setting is enabled (on by default).
 
 > [!NOTE]
-> Autopilot uses premium requests in the same way that these are used when you are working in the standard interactive interface. This means that as the agent continues to work autonomously, it can consume multiple requests.
+> Autopilot consumes AI credits in the same way as the standard interactive interface. Learn more about [usage-based billing](https://docs.github.com/en/copilot/concepts/billing/usage-based-billing-for-individuals).
 
 ## Tool approval
 

diff --git a/docs/copilot/agents/agents-tutorial.md b/docs/copilot/agents/agents-tutorial.md
@@ -18,7 +18,10 @@ Keywords:
 This tutorial walks you through using different types of agents in Visual Studio Code. You build a todo app from scratch, add a theme toggle, and redesign the layout by delegating work across local, plan, background, and cloud agents.
 
 > [!TIP]
-> If you don't yet have a Copilot subscription, you can use Copilot for free by signing up for the [Copilot Free plan](https://github.com/github-copilot/signup) and get a monthly limit of inline suggestions and chat interactions.
+> If you don't yet have a Copilot subscription, you can use Copilot for free by signing up for the [Copilot Free plan](https://github.com/github-copilot/signup), which includes a monthly allowance of inline suggestions and AI credits.
+
+> [!IMPORTANT]
+> **Starting April 20, 2026**, new sign-ups for Copilot Pro, Copilot Pro+, Max, and Student plans are temporarily paused.
 
 <div class="docs-action" data-show-in-doc="false" data-show-in-sidebar="true" title="Test web apps with browser agent tools">
 Use browser agent tools to build and automatically test web applications.

diff --git a/docs/copilot/agents/third-party-agents.md b/docs/copilot/agents/third-party-agents.md
@@ -104,7 +104,7 @@ To disable the OpenAI Codex agent, disable or uninstall the [OpenAI Codex](https
 * A Copilot Pro+ subscription for authentication
 * For local sessions, the [OpenAI Codex](https://marketplace.visualstudio.com/items?itemName=openai.chatgpt) extension
 
-OpenAI Codex in VS Code enables you to use your Copilot Pro+ subscription to authenticate and access Codex without additional setup. Get more information about [GitHub Copilot billing and premium requests](https://docs.github.com/en/copilot/concepts/billing/copilot-requests) in the GitHub documentation.
+OpenAI Codex in VS Code enables you to use your Copilot Pro+ subscription to authenticate and access Codex without additional setup. Get more information about [GitHub Copilot usage-based billing](https://docs.github.com/en/copilot/concepts/billing/usage-based-billing-for-individuals) in the GitHub documentation.
 
 ### Start a Codex session
 

diff --git a/docs/copilot/ai-powered-suggestions.md b/docs/copilot/ai-powered-suggestions.md
@@ -25,10 +25,7 @@ Follow a hands-on tutorial to build your first app with AI in VS Code.
 ## Prerequisites
 
 * Visual Studio Code installed on your machine. Follow these steps to [set up VS Code](/docs/setup/setup-overview.md).
-* Access to a GitHub Copilot subscription. Follow these steps to [set up GitHub Copilot](/docs/copilot/setup.md). You can set up Copilot Free to get a monthly limit of inline suggestions and chat interactions.
-
-> [!IMPORTANT]
-> **Starting April 20, 2026**, new sign-ups for Copilot Pro, Copilot Pro+, and student plans are temporarily paused. Additionally, we are tightening weekly usage limits. See [GitHub Copilot usage limits](https://docs.github.com/copilot/concepts/usage-limits).
+* Access to a GitHub Copilot subscription. Follow these steps to [set up GitHub Copilot](/docs/copilot/setup.md). You can set up Copilot Free to get a monthly allowance of inline suggestions and AI credits.
 
 ## Ghost text suggestions
 

diff --git a/docs/copilot/best-practices.md b/docs/copilot/best-practices.md
@@ -42,7 +42,7 @@ AI in VS Code offers several interaction modes. Choosing the right one for the t
 
 | Tool | Best for | Example |
 |------|----------|---------|
-| [Inline suggestions](/docs/copilot/ai-powered-suggestions.md) | Staying in the flow while writing code | Code completions, variable names, boilerplate |
+| [Inline suggestions](/docs/copilot/ai-powered-suggestions.md) | Staying in the flow while writing code | Inline suggestions, variable names, boilerplate |
 | [Ask (chat)](/docs/copilot/chat/copilot-chat.md) | Questions, brainstorming, exploring ideas | "How does authentication work in this project?" |
 | [Inline chat](/docs/copilot/chat/inline-chat.md) | Targeted, in-place edits without switching context | Refactoring a function, adding error handling |
 | [Agents](/docs/copilot/agents/overview.md) | Multi-file changes that require autonomous planning and tool use | Implementing a feature end-to-end |
@@ -133,11 +133,13 @@ Each AI model has different strengths. Some are better at reasoning, others exce
 
 * **Use BYOK for additional control.** Bring your own API key for more model choices and hosting options.
 
+* **Consider credit consumption.** More capable models consume more [AI credits](/docs/copilot/concepts/language-models.md#ai-credits-and-model-costs) per token. Auto model selection balances quality and cost automatically. For more tips, see [optimize AI credit usage](/docs/copilot/guides/optimize-usage.md).
+
 For more information, see [selecting AI models](/docs/copilot/customization/language-models.md) and [available models for Copilot Chat](https://docs.github.com/en/copilot/using-github-copilot/ai-models/changing-the-ai-model-for-copilot-chat).
 
 ## Plan first, then implement
 
-For complex changes that span multiple files, separate planning from implementation. This approach prevents the AI from solving the wrong problem.
+For complex changes that span multiple files, separate planning from implementation. This approach prevents the AI from solving the wrong problem and avoids spending [AI credits](/docs/copilot/concepts/language-models.md#ai-credits-and-model-costs) on code that needs to be thrown away.
 
 1. **Explore.** Use ask mode or a subagent to read the relevant code and understand how it works before making changes.
 1. **Plan.** Use the [Plan agent](/docs/copilot/agents/planning.md) to create a structured implementation plan. Review and refine the plan before executing.
@@ -164,19 +166,21 @@ For more information, see [GitHub Copilot security](/docs/copilot/security.md) a
 
 AI responses might degrade as the conversation fills with irrelevant context. Manage your sessions proactively.
 
-* **Start new sessions for unrelated tasks.** Don't keep piling unrelated questions into one conversation. Context pollution reduces response quality.
+* **Start new sessions for unrelated tasks.** Don't keep piling unrelated questions into one conversation. Context pollution reduces response quality and wastes tokens on irrelevant history.
 
 * **Remove irrelevant history.** Delete past questions and responses that are no longer relevant, or start a fresh session.
 
-* **Compact context.** Use [/compact](/docs/copilot/chat/copilot-chat-context.md#context-compaction) and provide instructions to selectively compact the context and retain only the most relevant information.
+* **Compact context.** Use [/compact](/docs/copilot/chat/copilot-chat-context.md#context-compaction) and provide instructions to selectively compact the context and retain only the most relevant information. Compacting reduces the tokens sent with each subsequent request, which helps [manage AI credit usage](/docs/copilot/guides/optimize-usage.md).
 
 * **Use subagents for investigation.** Hint the AI to perform research and exploration in isolation by using [subagents](/docs/copilot/agents/subagents.md) so the findings don't clutter your main context.
 
 * **Choose the right session type.** Use local sessions for quick tasks on your current code that need your immediate attention, background tasks for tasks that can run locally and isolated from your main context, or cloud sessions that can benefit from team-collaboration.
 
 * **Scale with parallel sessions.** Run multiple sessions in parallel for independent tasks to save time and keep contexts separate. You can have multiple sessions running at once, across local, background, and cloud environments, and switch between them via the [sessions list](/docs/copilot/chat/chat-sessions.md#sessions-list) in VS Code.
 
-For more information, see [session management](/docs/copilot/chat/chat-sessions.md) and [workspace indexing](/docs/copilot/reference/workspace-context.md).
+* **Fork instead of re-prompting.** Use [`/fork`](/docs/copilot/chat/chat-sessions.md#fork-a-chat-session) to explore alternatives without losing context, instead of starting over and re-establishing context from scratch.
+
+For more information, see [session management](/docs/copilot/chat/chat-sessions.md), [workspace indexing](/docs/copilot/reference/workspace-context.md), and [optimize AI credit usage](/docs/copilot/guides/optimize-usage.md).
 
 ## Work with large codebases
 

diff --git a/docs/copilot/chat/copilot-chat-context.md b/docs/copilot/chat/copilot-chat-context.md
@@ -133,7 +133,7 @@ As you send more requests in a conversation, the control updates to reflect the
 
 ## Context compaction
 
-As a conversation grows, the accumulated messages and context can fill up the model's context window. Context compaction summarizes the conversation history to free up space, so you can continue working in the same session without losing important details.
+As a conversation grows, the accumulated messages and context can fill up the model's context window. Context compaction summarizes the conversation history to free up space, so you can continue working in the same session without losing important details. Compacting also reduces the number of tokens sent with each subsequent request, which helps manage [AI credit consumption](/docs/copilot/guides/optimize-usage.md).
 
 ### Automatic compaction
 

diff --git a/docs/copilot/chat/copilot-chat.md b/docs/copilot/chat/copilot-chat.md
@@ -19,8 +19,8 @@ Follow a hands-on tutorial to experience local, background, and cloud agents in
 
 * Access to [GitHub Copilot](/docs/copilot/setup.md). If you don't have a subscription, you can use Copilot for free by signing up for the [Copilot Free plan](https://github.com/github-copilot/signup).
 
-    > [!IMPORTANT]
-    > **Starting April 20, 2026**, new sign-ups for Copilot Pro, Copilot Pro+, and student plans are temporarily paused. Additionally, we are tightening weekly usage limits. See [GitHub Copilot usage limits](https://docs.github.com/copilot/concepts/usage-limits).
+> [!IMPORTANT]
+> **Starting April 20, 2026**, new sign-ups for Copilot Pro, Copilot Pro+, Max, and Student plans are temporarily paused.
 
 ## Access chat in VS Code
 

diff --git a/docs/copilot/concepts/language-models.md b/docs/copilot/concepts/language-models.md
@@ -72,13 +72,15 @@ You can switch models at any time, based on your needs for a particular task. Fo
 
 Auto model selection combines two systems to route each request to the optimal model. One system tracks real-time model health and availability, while the other evaluates task complexity. Together, they match each task to the model that can solve it most efficiently, reserving higher-cost reasoning models for problems that need them and routing simpler tasks to faster models.
 
-Auto selects from multiple models and respects your organization's [model access settings](https://docs.github.com/en/copilot/how-tos/use-ai-models/configure-access-to-ai-models). Auto won't select models that have a premium request multiplier greater than 1x, models excluded by administrator policies, or models restricted by data-residency policies. If none of the preferred models are available or you run out of premium requests, auto falls back to a model at 0x multiplier.
+Auto selects from multiple models and respects your organization's [model access settings](https://docs.github.com/en/copilot/how-tos/use-ai-models/configure-access-to-ai-models). Auto won't select models excluded by administrator policies or models restricted by data-residency policies.
 
 For more details, see [About Copilot auto model selection](https://docs.github.com/en/copilot/concepts/auto-model-selection) in the GitHub documentation.
 
-### Premium requests and multipliers
+### AI credits and model costs
 
-Different models consume premium requests at different rates, expressed as a multiplier. For example, a model with a 2x multiplier uses two premium requests per interaction. When you use auto model selection, VS Code applies a variable [model multiplier](https://docs.github.com/en/copilot/concepts/billing/copilot-requests#model-multipliers) based on the selected model. If you are on a paid Copilot plan, auto applies a 10% multiplier discount.
+Each Copilot plan includes a monthly allowance of [AI credits](https://docs.github.com/en/copilot/concepts/billing/usage-based-billing-for-individuals). Different models consume AI credits at different rates, based on the model and the number of tokens processed. More capable models cost more per token, while lighter models extend your usage further. When you use auto model selection, VS Code routes each request to an efficient model that balances quality and cost.
+
+Other factors also affect credit consumption, such as [thinking effort](/docs/copilot/customization/language-models.md#configure-thinking-effort) (higher effort produces more thinking tokens), context window size, and tool usage. For practical tips on reducing credit consumption, see [optimize AI credit usage](/docs/copilot/guides/optimize-usage.md).
 
 Learn how to [choose and configure language models](/docs/copilot/customization/language-models.md) in VS Code.
 

diff --git a/docs/copilot/concepts/tools.md b/docs/copilot/concepts/tools.md
@@ -43,6 +43,7 @@ Use the **Configure Tools** button in the chat input field to enable or disable
 Limiting the available tools can help in several ways:
 
 * **Preserve context**: every tool call produces output that consumes space in the [context window](/docs/copilot/concepts/language-models.md#context-window). Fewer tools means the agent is less likely to make unnecessary calls that fill up the context.
+* **Reduce credit consumption**: unnecessary tool calls increase token usage and consume more [AI credits](/docs/copilot/concepts/language-models.md#ai-credits-and-model-costs). Disabling tools you don't need for a task helps keep costs down.
 * **Get more relevant results**: when fewer tools are available, the agent focuses on the most appropriate ones rather than choosing from a large set.
 * **Improve performance**: a smaller tool set reduces the decision space for the model, which can speed up responses.
 

diff --git a/docs/copilot/copilot-cloud-agent.md b/docs/copilot/copilot-cloud-agent.md
@@ -45,7 +45,10 @@ Ensure you are signed into the GitHub Pull Request extension with the correct Gi
 You can also manage coding agent sessions from a dedicated chat editor and view a **Chat Sessions** view by enabling the experimental setting `setting(chat.agentSessionsViewLocation)`.
 
 > [!TIP]
-> If you don't have Copilot access yet, you can sign up for the [Copilot Free plan](https://github.com/features/copilot/plans) to get a monthly limit of interactions.
+> If you don't have Copilot access yet, you can sign up for the [Copilot Free plan](https://github.com/features/copilot/plans) to get a monthly allowance of inline suggestions and AI credits.
+
+> [!IMPORTANT]
+> **Starting April 20, 2026**, new sign-ups for Copilot Pro, Copilot Pro+, Max, and Student plans are temporarily paused.
 
 ## Assign work to Copilot cloud agent in VS Code
 
@@ -274,7 +277,7 @@ You can monitor progress through the session logs accessible from the pull reque
 
 ### What security protections does Copilot cloud agent have?
 
-Copilot cloud agent includes built-in security protections and operates within GitHub's security framework. For detailed information about security measures, permissions, and branch protection compatibility, see the [GitHub Copilot cloud agent security documentation](https://docs.github.com/en/copilot/concepts/about-copilot-coding-agent#built-in-security-protections).
+Copilot cloud agent includes built-in security protections and operates within GitHub's security framework. For detailed information about security measures, permissions, and branch protection compatibility, see the [GitHub Copilot cloud agent security documentation](https://docs.github.com/en/copilot/concepts/agents/cloud-agent/risks-and-mitigations).
 
 ### Can I extend Copilot cloud agent with external tools?
 

diff --git a/docs/copilot/customization/hooks.md b/docs/copilot/customization/hooks.md
@@ -449,7 +449,7 @@ The `Stop` hook can prevent the agent from stopping:
 | `reason` | string | Required when decision is `"block"`. Tells the agent why it should continue. |
 
 > [!IMPORTANT]
-> When a `Stop` hook blocks the agent from stopping, the agent continues running and the additional turns consume [premium requests](https://docs.github.com/en/copilot/managing-copilot/monitoring-usage-and-entitlements/about-premium-requests). Always check the `stop_hook_active` field to prevent the agent from running indefinitely.
+> When a `Stop` hook blocks the agent from stopping, the agent continues running and the additional turns consume [AI credits](https://docs.github.com/en/copilot/concepts/billing/usage-based-billing-for-individuals). Always check the `stop_hook_active` field to prevent the agent from running indefinitely.
 
 ## SubagentStart
 

diff --git a/docs/copilot/customization/language-models.md b/docs/copilot/customization/language-models.md
@@ -42,6 +42,9 @@ Some models support configurable thinking effort, which controls how much reason
 
 By default, VS Code sets recommended effort levels and has adaptive reasoning enabled, where the model dynamically determines how much to think based on the complexity of each request. For most use cases, the defaults work well.
 
+> [!TIP]
+> Higher thinking effort produces more thinking tokens, which increases [AI credit](/docs/copilot/concepts/language-models.md#ai-credits-and-model-costs) consumption. Only increase thinking effort for genuinely complex tasks. Learn more about [optimizing AI credit usage](/docs/copilot/guides/optimize-usage.md).
+
 To configure the thinking effort:
 
 1. Open the model picker in the chat input field and select a reasoning model.
@@ -68,9 +71,6 @@ To use auto model selection, select **Auto** from the model picker in chat. You
 
 ![Screenshot of a chat response, showing the selected model on hover.](../images/language-models/chat-response-selected-model.png)
 
-> [!IMPORTANT]
-> **Starting April 20, 2026**, new sign-ups for Copilot Pro, Copilot Pro+, and student plans are temporarily paused. Additionally, we are tightening weekly usage limits. If you hit a weekly limit and you have premium requests remaining, you can continue using Copilot with auto model selection. See [GitHub Copilot usage limits](https://docs.github.com/copilot/concepts/usage-limits).
-
 ## Manage language models
 
 You can use the language models editor to view all available models, choose which models are shown in the model picker, and add more models by adding from built-in providers or from extension-provided model providers.