App Version
v3.23.6
API Provider
VS Code Language Model API
Model Used
Claude 3.7 Sonnet, Claude Sonnet 4, etc. (instead of GPT-4.1)
Roo Code Task Links (Optional)
Hi Roo Code team,
I've encountered an issue where using the VS Code LM API in Roo Code seems to exhaust the GitHub Copilot chat request quota significantly faster than using GitHub Copilot directly in VS Code.
Observations
Generating longer chat responses (e.g., explanations, refactor suggestions) in RooCode leads to rapid quota exhaustion.
The same tasks and prompts in native GitHub Copilot with Chat Agent Auto Fix are more quota-efficient.
This suggests that Roo Code may be sending more tokens or making more chat invocations per request.
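One plausible (unconfirmed) contributor is per-request prompt overhead: an agent-style client typically prepends a large system prompt and tool definitions to every model call, so the same user question costs far more tokens per invocation than a bare chat prompt. The sketch below is only a toy illustration of that effect using a crude characters-per-token estimate; `estimate_tokens` and the 8,000-character placeholder system prompt are invented for illustration and do not reflect Roo Code's actual prompts.

```python
# Toy illustration (assumption, not Roo Code internals): the same user prompt
# costs far more tokens when an agent framework wraps it in a large system
# prompt and tool definitions.

def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)

user_prompt = "Generate a unit test for the parse_config function."

# A minimal chat client sends little beyond the prompt itself; an agent
# framework may bundle kilobytes of instructions with every single call.
bare_request = user_prompt
agent_request = ("SYSTEM: " + "x" * 8000) + "\n" + user_prompt  # placeholder system prompt

print(estimate_tokens(bare_request))   # small
print(estimate_tokens(agent_request))  # much larger, on every request
```

Under this (hypothetical) overhead, every follow-up in an agent loop re-pays the wrapper cost, which would compound quota consumption across a multi-step task.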
🔁 Steps to Reproduce
- Connect GitHub Copilot to Roo Code using the VS Code LM API.
- Perform one of the coding scenarios below (Case A or Case B).
- Monitor how quickly the quota is consumed over several similar requests.
- Repeat the same process in native GitHub Copilot chat.
- Compare quota consumption.
Case A: Writing a Unit Test
- Create a new Go/JavaScript/Python file.
- Write a function with non-trivial logic, or use an existing project that already contains one.
- Ask Copilot (via Roo Code) to generate a unit test for that function.
- Repeat the same prompt in native GitHub Copilot (VS Code chat sidebar).
Case B: Starting an App from Scratch
- Open a new file and prompt: "Create a basic Express.js server" or "Generate a REST API in Go with routing and middleware".
- Observe the response length and token usage from Roo Code.
- Try the same request using GitHub Copilot directly.
💥 Outcome Summary
Expected chat quota usage similar to GitHub Copilot in VS Code, but observed much faster quota exhaustion when using Roo Code via the VS Code LM API.
📄 Relevant Logs or Errors (Optional)
Here's an example of a long chat response in GitHub Copilot that consumes less quota: the quota is only deducted again when a "continue" or follow-up iteration is triggered.

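The gap could also come from invocation count rather than prompt size: an agentic workflow issues a separate model call per step (read a file, apply an edit, verify), and each call may be billed as one chat request, while a plain chat UI deducts a single request per user turn. The loop below is a hypothetical simulation of that accounting; `run_agent_task` and the simulated responses are invented for illustration and are not Roo Code's implementation.

```python
# Hypothetical sketch (not Roo Code internals): an agentic client keeps
# calling the model until the task is done, so a single user task can
# deduct several chat-quota requests.

def run_agent_task(model_call, max_steps: int = 10) -> int:
    """Drive a tool-use loop; return how many model invocations were made."""
    calls = 0
    done = False
    while not done and calls < max_steps:
        reply = model_call()
        calls += 1
        done = reply.get("done", False)
    return calls

# Simulated model: finishes after three rounds (read file, edit, verify).
responses = iter([{"done": False}, {"done": False}, {"done": True}])
agent_calls = run_agent_task(lambda: next(responses))
print(agent_calls)  # prints 3: three quota deductions for one user task
```

If each iteration in such a loop counts against the Copilot chat quota, that alone would explain faster exhaustion for identical prompts, independent of token usage.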