App Version
v3.23.6
API Provider
VS Code Language Model API
Model Used
Claude 3.7 Sonnet, Claude Sonnet 4, etc. (instead of GPT-4.1)
Roo Code Task Links (Optional)
Hi Roo Code team,
I've encountered an issue where using the VS Code LM API in Roo Code seems to exhaust the GitHub Copilot chat request quota significantly faster than using GitHub Copilot directly in VS Code.
Observations
Generating longer chat responses (e.g., explanations, refactor suggestions) in RooCode leads to rapid quota exhaustion.
The same tasks and prompts in native GitHub Copilot with Chat Agent Auto Fix are more quota-efficient.
This suggests that Roo Code may be sending more tokens or making more chat invocations per request.
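One plausible (unconfirmed) contributor is per-request prompt overhead: an agent-style client typically prepends a large system prompt and tool definitions to every model call, so the same user question costs far more tokens per invocation than a bare chat prompt. The sketch below is only a toy illustration of that effect using a crude characters-per-token estimate; `estimate_tokens` and the 8,000-character placeholder system prompt are invented for illustration and do not reflect Roo Code's actual prompts.

```python
# Toy illustration (assumption, not Roo Code internals): the same user prompt
# costs far more tokens when an agent framework wraps it in a large system
# prompt and tool definitions.

def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)

user_prompt = "Generate a unit test for the parse_config function."

# A minimal chat client sends little beyond the prompt itself; an agent
# framework may bundle kilobytes of instructions with every single call.
bare_request = user_prompt
agent_request = ("SYSTEM: " + "x" * 8000) + "\n" + user_prompt  # placeholder system prompt

print(estimate_tokens(bare_request))   # small
print(estimate_tokens(agent_request))  # much larger, on every request
```

Under this (hypothetical) overhead, every follow-up in an agent loop re-pays the wrapper cost, which would compound quota consumption across a multi-step task.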
🔁 Steps to Reproduce
- Connect GitHub Copilot to Roo Code using the VS Code LM API.
- Perform one of the coding scenarios below (Case A or Case B).
- Monitor how quickly the quota is consumed over several similar requests.
- Repeat the same process in native GitHub Copilot chat.
- Compare quota consumption.
Case A: Writing a Unit Test
- Create a new Go/JavaScript/Python file.
- Write a function with non-trivial logic, or use an existing project that already contains one.
- Ask Copilot (via Roo Code) to generate a unit test for that function.
- Repeat the same prompt in native GitHub Copilot (VS Code chat sidebar).
Case B: Starting an App from Scratch
- Open a new file and prompt: "Create a basic Express.js server" or "Generate a REST API in Go with routing and middleware".
- Observe the response length and token usage from Roo Code.
- Try the same request using GitHub Copilot directly.
💥 Outcome Summary
Expected chat quota usage similar to GitHub Copilot in VS Code, but observed much faster quota exhaustion when using Roo Code via the VS Code LM API.
📄 Relevant Logs or Errors (Optional)
Here's an example of a long chat response in GitHub Copilot that consumes less quota: the quota is only deducted again when a "continue" or follow-up iteration is triggered.

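The gap could also come from invocation count rather than prompt size: an agentic workflow issues a separate model call per step (read a file, apply an edit, verify), and each call may be billed as one chat request, while a plain chat UI deducts a single request per user turn. The loop below is a hypothetical simulation of that accounting; `run_agent_task` and the simulated responses are invented for illustration and are not Roo Code's implementation.

```python
# Hypothetical sketch (not Roo Code internals): an agentic client keeps
# calling the model until the task is done, so a single user task can
# deduct several chat-quota requests.

def run_agent_task(model_call, max_steps: int = 10) -> int:
    """Drive a tool-use loop; return how many model invocations were made."""
    calls = 0
    done = False
    while not done and calls < max_steps:
        reply = model_call()
        calls += 1
        done = reply.get("done", False)
    return calls

# Simulated model: finishes after three rounds (read file, edit, verify).
responses = iter([{"done": False}, {"done": False}, {"done": True}])
agent_calls = run_agent_task(lambda: next(responses))
print(agent_calls)  # prints 3: three quota deductions for one user task
```

If each iteration in such a loop counts against the Copilot chat quota, that alone would explain faster exhaustion for identical prompts, independent of token usage.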