fix(core): sanitize SSE-corrupted JSON and domain strings in error classification #21702
…assification

SSE serialization can inject stray commas into the JSON error response body, causing JSON.parse() to fail and domain validation to miss corrupted domain strings. This is a mitigation until the root cause in server-side SSE chunking or client-side SSE parsing can be diagnosed and fixed.

- Add sanitizeJsonString() to handle comma-whitespace-comma patterns in JSON
- Add isCloudCodeDomain() to strip non-alphanumeric chars from domain strings
- Add test cases for both SSE-corrupted JSON parsing and domain sanitization
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a client-side mitigation to address issues arising from SSE stream corruption, which was causing Google API 429 errors to be misclassified. By sanitizing both JSON error bodies and domain strings before processing, the system can now correctly identify quota exhaustion and validation requirements, ensuring the appropriate fallback mechanisms (like AI credits) are triggered rather than entering infinite retry loops. This significantly improves the robustness of error handling for API responses.
Code Review
This pull request introduces a client-side mitigation for SSE stream corruption that can lead to incorrect error classification. The changes involve sanitizing corrupted JSON strings by removing duplicate commas and cleaning up domain strings before validation. New tests are added to cover these sanitization cases. My main feedback is to improve the robustness of the JSON sanitization logic to handle more complex corruption patterns, such as multiple consecutive commas.
Apply the regex replacement in a loop so that patterns like ',,,' are fully collapsed, not just reduced to ',,' on a single pass.
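The reviewer's suggestion could be sketched roughly as follows. This is an illustration rather than the code in the PR: `collapseStrayCommas` is a hypothetical name, and the replacement string `',$1'` (keep one comma plus the captured whitespace) is an assumption, since the PR description only quotes the regex `/,(\s*),/g`.

```typescript
// Hypothetical helper (not the PR's implementation): repeatedly collapse
// comma-whitespace-comma runs until the string stops changing.
function collapseStrayCommas(input: string): string {
  // Regex quoted in the PR description; the replacement below is an assumption.
  const pattern = /,(\s*),/g;
  let previous: string;
  let current = input;
  do {
    previous = current;
    // Keep the first comma and any whitespace, drop the second comma.
    current = current.replace(pattern, ',$1');
  } while (current !== previous);
  return current;
}

// A single pass consumes commas in pairs, so ',,,' would only shrink to ',,'.
const singlePass = ',,,'.replace(/,(\s*),/g, ',$1');
const looped = collapseStrayCommas(',,,');
```

A single-pass alternative would be a regex such as `/,(?:\s*,)+/g`, which matches the whole run at once. Either way, the pattern also matches commas inside JSON string values, so it is probably best applied only to bodies that are already known to be failing to parse.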
✅ Patch workflow(s) dispatched successfully! 📋 Details:
🔗 Track Progress:
🚀 Patch PR Created! 📋 Patch Details:
📝 Next Steps:
🔗 Track Progress:
🚀 Patch Release Started! 📋 Release Details:
⏳ Status: The patch release is now running. You'll receive another update when it completes. 🔗 Track Progress:
/patch preview
✅ Patch Release Complete! 📦 Release Details:
🎉 Status: Your patch has been successfully released and published to npm! 📝 What's Available:
🔗 Links:
Summary
Mitigate SSE stream corruption that causes 429 QUOTA_EXHAUSTED errors to be incorrectly classified as retryable, leading to unnecessary retry loops instead of triggering the AI credits fallback flow.
This is a client-side mitigation until the root cause — either in server-side SSE chunking or client-side SSE stream parsing — can be diagnosed and fixed.
Details
When the API returns a 429 error with `alt=sse`, the JSON error body can arrive with a stray comma injected at a line boundary. The observed corruption is a `comma-whitespace-comma` pattern, which causes two cascading failures:

- JSON parsing failure: `JSON.parse()` fails on the corrupted body, so `parseGoogleApiError` returns `null`. Without structured error details, `classifyGoogleError` falls through to treating the 429 as a generic `RetryableQuotaError`.
- Domain validation failure: Even when parsing eventually succeeds through an alternate code path, the extracted domain string contains a trailing comma (`"cloudcode-pa.googleapis.com,"`), which fails the `CLOUDCODE_DOMAINS.includes()` exact-match check.

The combined effect: `TerminalQuotaError` is never thrown, the AI credits fallback UI never triggers, and the client enters a retry loop.
Changes
- `googleErrors.ts`: Add `sanitizeJsonString()` that collapses `comma-whitespace-comma` patterns (regex `/,(\s*),/g`) before all `JSON.parse()` calls in the error parsing pipeline.
- `googleQuotaErrors.ts`: Add `isCloudCodeDomain()` that strips non-alphanumeric characters (except `.` and `-`) from domain strings before comparing against `CLOUDCODE_DOMAINS`.

Related Issues
Related: #21704
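For readers without the diff open, the two helpers named under Changes might look roughly like the sketch below. It is a reconstruction from the description only, not the merged code: the function bodies, the `',$1'` replacement, the contents of `CLOUDCODE_DOMAINS` (only `cloudcode-pa.googleapis.com` is mentioned in this PR), and the sample error body are all assumptions made for illustration.

```typescript
// Assumed contents: the real list lives in googleQuotaErrors.ts and may differ.
const CLOUDCODE_DOMAINS: readonly string[] = ['cloudcode-pa.googleapis.com'];

// Sketch of the JSON mitigation: collapse comma-whitespace-comma into a
// single comma (keeping the whitespace) before handing the body to JSON.parse().
function sanitizeJsonString(raw: string): string {
  return raw.replace(/,(\s*),/g, ',$1');
}

// Sketch of the domain mitigation: drop every character that is not
// alphanumeric, '.', or '-', then fall back to the exact-match check.
function isCloudCodeDomain(domain: string): boolean {
  const cleaned = domain.replace(/[^a-zA-Z0-9.-]/g, '');
  return CLOUDCODE_DOMAINS.includes(cleaned);
}

// Made-up error body with a stray comma injected at a line boundary.
const corruptedBody = '{"error":{"code":429,\n,"message":"quota exceeded"}}';
const parsed = JSON.parse(sanitizeJsonString(corruptedBody));

// The trailing comma from the corrupted stream no longer defeats the check.
const matches = isCloudCodeDomain('cloudcode-pa.googleapis.com,');
```

Sanitizing at the comparison site keeps `CLOUDCODE_DOMAINS` itself an exact allowlist, so a string that is merely similar to a listed domain still fails the check after cleaning.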
How to Validate
npm test -w @google/gemini-cli-core -- src/utils/googleErrors.test.ts src/utils/googleQuotaErrors.test.ts
- should parse a gaxios error with SSE-corrupted JSON containing stray commas
- should parse a gaxios error with SSE-corrupted JSON in response.data
- should return TerminalQuotaError for Cloud Code QUOTA_EXHAUSTED with SSE-corrupted domain
- should return ValidationRequiredError with SSE-corrupted domain

Pre-Merge Checklist