Summary
Foundry Toolkit for VS Code is unable to show models in the model catalog. Prompt/agent discovery works, but model discovery repeatedly hits 429 TooManyRequests / QuotaExceeded from both the Azure Foundry catalog and the GitHub model catalog APIs.
Environment
- Product: Foundry Toolkit for VS Code
- Extension version: 1.4.2
- VS Code: VS Code Insiders
- OS: Windows
- Local service: Inference.Service.Agent.exe
- Local API observed: http://localhost:5272/foundry/list
- Azure Foundry catalog region in error: eastus2
Reproduction Steps
- Sign in to Azure from VS Code.
- Open Foundry Toolkit.
- Open or refresh the model catalog/model list.
- Observe that models do not appear correctly.
- Check Foundry Toolkit.log.
Expected Behavior
Foundry Toolkit should load the model catalog without repeatedly triggering throttling. If a catalog request is throttled, it should honor Retry-After, back off, and avoid duplicate concurrent refreshes.
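The expected behavior above could be sketched roughly as follows. This is an illustration only, not the toolkit's actual implementation: the names fetch_with_backoff and SingleFlightRefresher, the tuple shape returned by fetch, and the retry limits are all hypothetical. The sketch waits at least as long as the server's Retry-After value, applies exponential backoff with a little jitter, and coalesces concurrent refresh calls into a single in-flight request.

```python
import random
import threading
from concurrent.futures import Future
from typing import Callable, List, Optional, Tuple

# Hypothetical shape of one fetch attempt:
# (status_code, retry_after_seconds_or_None, models_or_None)
FetchResult = Tuple[int, Optional[float], Optional[List[str]]]


def fetch_with_backoff(
    fetch: Callable[[], FetchResult],
    sleep: Callable[[float], None],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
) -> List[str]:
    """Call fetch() until it succeeds, waiting between throttled (429) attempts."""
    for attempt in range(max_attempts):
        status, retry_after, models = fetch()
        if status == 200 and models is not None:
            return models
        if status != 429:
            raise RuntimeError(f"catalog request failed with status {status}")
        if attempt == max_attempts - 1:
            break
        # Exponential backoff, never shorter than the server's Retry-After,
        # plus a small random jitter so clients do not retry in lockstep.
        backoff = min(max_delay, base_delay * (2 ** attempt))
        sleep(max(retry_after or 0.0, backoff) + random.uniform(0.0, 0.25))
    raise RuntimeError("catalog request still throttled after retries")


class SingleFlightRefresher:
    """Coalesces concurrent refresh() calls into one in-flight catalog request."""

    def __init__(self, do_refresh: Callable[[], List[str]]) -> None:
        self._do_refresh = do_refresh
        self._lock = threading.Lock()
        self._inflight: Optional[Future] = None

    def refresh(self) -> List[str]:
        with self._lock:
            future = self._inflight
            leader = future is None
            if leader:
                future = Future()
                self._inflight = future
        if leader:
            # Only the first caller performs the request; others wait on it.
            try:
                future.set_result(self._do_refresh())
            except BaseException as exc:
                future.set_exception(exc)
            finally:
                with self._lock:
                    self._inflight = None
        return future.result()
```

In this sketch, a UI refresh handler would call SingleFlightRefresher.refresh(), with do_refresh wrapping fetch_with_backoff, so repeated clicks or overlapping timers result in one catalog request rather than several.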
Actual Behavior
Foundry Toolkit repeatedly calls the local /foundry/list endpoint, which then attempts to fetch model catalog data. The catalog calls are throttled with 429 TooManyRequests / QuotaExceeded, and models are not shown correctly.
Prompt/agent discovery appears to work, but model catalog discovery fails.
Error Details
Observed repeated errors in Foundry Toolkit.log:
Exception fetching models from Azure Foundry catalog
System.Exception: Failed: Fetching model list from Foundry Catalog
---> System.Net.Http.HttpRequestException: Received too many requests in a short amount of time. Retry again after 1 seconds.
"statusCode": 429
"code": "UserError"
"message": "Received too many requests in a short amount of time. Retry again after 1 seconds."
"innerError": {
"code": "QuotaExceeded",
"innerError": {
"code": "TooManyRequests"
}
}
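
For triage, the nested error above can be unpacked programmatically. The following is a minimal sketch that assumes the logged fields belong to a single JSON object shaped like the excerpt; the payload dict is reconstructed from the log lines, and the helper names error_codes and retry_after_from_message are hypothetical. It walks the innerError chain to collect every code and pulls the wait time out of the message text, which is useful when no Retry-After header is available.

```python
import re
from typing import Any, Dict, List, Optional

# Reconstructed from the log excerpt; the real payload may carry more fields.
payload: Dict[str, Any] = {
    "statusCode": 429,
    "code": "UserError",
    "message": "Received too many requests in a short amount of time. "
               "Retry again after 1 seconds.",
    "innerError": {
        "code": "QuotaExceeded",
        "innerError": {"code": "TooManyRequests"},
    },
}


def error_codes(error: Dict[str, Any]) -> List[str]:
    """Collect the 'code' values from the outermost error down the innerError chain."""
    codes: List[str] = []
    node: Any = error
    while isinstance(node, dict):
        code = node.get("code")
        if isinstance(code, str):
            codes.append(code)
        node = node.get("innerError")
    return codes


def retry_after_from_message(message: str) -> Optional[float]:
    """Extract the wait time in seconds from the throttling message, if present."""
    match = re.search(r"Retry again after (\d+(?:\.\d+)?) seconds?", message)
    return float(match.group(1)) if match else None


codes = error_codes(payload)
wait_seconds = retry_after_from_message(payload["message"])
```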
