fix(core): decode API error messages from raw bytes to readable text #23283

Open
cRAN-cg wants to merge 15 commits into google-gemini:main from cRAN-cg:fix/api-error-byte-decode-19851

Conversation

@cRAN-cg cRAN-cg commented Mar 20, 2026

Summary

  • Fixes API error messages displayed as raw comma-separated byte values (e.g. 91,123,10,32,32,34,101,114,...) instead of human-readable text when the @google/genai SDK's ApiError contains a response body coerced from Uint8Array via .toString()
  • Adds decodeByteCodedString() utility that detects comma-separated byte patterns and decodes them to UTF-8
  • Applies byte-decoding in errorParsing.ts, errors.ts, and googleErrors.ts before displaying or parsing error messages
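
The symptom in the first bullet can be reproduced in isolation. The following is a minimal sketch, not the SDK's actual code path; `body`, `garbled`, and `recovered` are illustrative names. It shows that coercing a `Uint8Array` to a string produces comma-separated decimal byte values, and that parsing those values back and passing them through `TextDecoder` recovers the original text.

```typescript
// Illustrative only: reproduces the symptom, not the SDK's real code path.
const body: Uint8Array = new TextEncoder().encode('{"error":"quota"}');

// Uint8Array.prototype.toString() joins the elements with commas, so the
// JSON body turns into a list of decimal byte values.
const garbled: string = String(body);

// Reverse the coercion: parse each decimal value, then decode as UTF-8.
const bytes = Uint8Array.from(garbled.split(','), (part) => Number(part));
const recovered: string = new TextDecoder('utf-8').decode(bytes);

console.log(garbled);
console.log(recovered);
```

The real utility also has to decide whether a string is byte-coded in the first place, which is where the heuristics discussed in the review threads come in.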

Fixes #19851

Test plan

  • Added unit tests in errors.test.ts for decodeByteCodedString() — pure byte strings, prefixed byte strings, normal strings unchanged, empty strings
  • Added unit tests in errors.test.ts for getErrorMessage() — byte-coded ApiError messages, prefixed byte messages
  • Added unit tests in googleErrors.test.ts for parseGoogleApiError() — byte-coded error parsing, status-prefixed byte messages
  • All 53 existing + new tests pass (errors.test.ts: 36, googleErrors.test.ts: 17)

Detect comma-separated byte values from Uint8Array.toString() in
API error messages and decode them to UTF-8 text.

Fixes google-gemini#19851
@cRAN-cg cRAN-cg requested a review from a team as a code owner March 20, 2026 19:01
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug where API error messages, particularly those originating from the @google/genai SDK, were presented to users as unreadable byte sequences. By introducing a robust byte-decoding utility and integrating it into the core error parsing mechanisms, the change significantly enhances the clarity and usability of error messages, improving the developer and user experience when encountering API failures.

Highlights

  • API Error Message Decoding: Fixed an issue where API error messages from the @google/genai SDK were displayed as raw, comma-separated byte values instead of human-readable text.
  • New Utility Function: Introduced a decodeByteCodedString() utility to detect and decode byte-coded strings into UTF-8 text, handling cases with and without prefixes.
  • Error Handling Integration: Applied the new byte-decoding logic within errorParsing.ts, errors.ts, and googleErrors.ts to ensure all relevant error messages are properly formatted and readable.
  • Comprehensive Testing: Added extensive unit tests for decodeByteCodedString() and its integration into getErrorMessage() and parseGoogleApiError(), covering various byte-coded and normal string scenarios.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses an issue where API error messages were being displayed as raw byte strings. The core change introduces utility functions to detect and decode these byte-coded strings into readable text, and applies this decoding in various error handling paths. The added unit tests are comprehensive and cover the new functionality well.

My main feedback is regarding code duplication. The new byte-decoding logic has been implemented identically in two separate files (errors.ts and googleErrors.ts). I've left comments suggesting this logic be extracted into a shared utility file to improve maintainability and avoid future inconsistencies.

Comment thread packages/core/src/utils/errors.ts Outdated
Comment on lines 59 to 121
/**
 * Detects if a string looks like comma-separated byte values (e.g. from a
 * Uint8Array that was coerced to string via its default toString()) and
 * decodes it to readable UTF-8 text.
 *
 * Example input: "91,123,10,32,32,34,101,114,114,111,114,34,58,32,123,10"
 * Example output: '[{\n  "error": {\n'
 */
export function decodeByteCodedString(value: string): string {
  if (!value || !value.includes(',')) {
    return value;
  }

  // The message may have a prefix like "got status: 429 Too Many Requests. "
  // followed by the byte-coded body. Try to find where the byte codes start.
  const decoded = tryDecodeBytes(value);
  if (decoded !== null) {
    return decoded;
  }

  // Try splitting on ". " to find a prefix + byte-coded body
  const dotIndex = value.lastIndexOf('. ');
  if (dotIndex !== -1) {
    const prefix = value.substring(0, dotIndex + 2);
    const rest = value.substring(dotIndex + 2);
    const decodedRest = tryDecodeBytes(rest);
    if (decodedRest !== null) {
      return prefix + decodedRest;
    }
  }

  return value;
}

/**
 * Attempts to decode a string of comma-separated byte values into UTF-8 text.
 * Returns the decoded string, or null if the input doesn't look like byte codes.
 */
function tryDecodeBytes(value: string): string | null {
  const parts = value.split(',');
  // Require at least a few parts to avoid false positives on normal text with commas
  if (parts.length < 4) {
    return null;
  }
  const bytes: number[] = [];
  for (const part of parts) {
    const trimmed = part.trim();
    // Each part must be a non-negative integer in the byte range 0-255
    if (!/^\d{1,3}$/.test(trimmed)) {
      return null;
    }
    const num = Number(trimmed);
    // Reject out-of-range values and zero-padded values such as "007"
    if (num > 255 || (trimmed.length > 1 && trimmed.startsWith('0'))) {
      return null;
    }
    bytes.push(num);
  }
  try {
    return new TextDecoder('utf-8').decode(new Uint8Array(bytes));
  } catch {
    return null;
  }
}

Contributor

high

There is significant code duplication between the newly added decodeByteCodedString and tryDecodeBytes functions here, and the tryDecodeByteCodedMessage and tryDecodeByteString functions in packages/core/src/utils/googleErrors.ts. The logic is identical.

To improve maintainability and avoid having to fix bugs in two places, this shared logic should be extracted into a new, single utility file (e.g., packages/core/src/utils/byteDecoder.ts). Both errors.ts and googleErrors.ts can then import and use this common utility, which would eliminate the duplicated code.

Given that errors.ts already depends on googleErrors.ts, placing the new utility in a separate file is necessary to prevent circular dependencies.

Author

Please re-review; I have resolved the comments.

Author

/gemini review

Contributor

Thank you for addressing the code duplication. I've reviewed the changes, and the extraction of decodeByteCodedString and tryDecodeBytes into packages/core/src/utils/byteDecoder.ts successfully resolves the issue. The relevant files now correctly import and utilize this shared utility, improving maintainability and preventing circular dependencies as suggested.

Comment thread packages/core/src/utils/googleErrors.ts Outdated
Comment on lines +387 to +446
/**
 * Detects if a string looks like comma-separated byte values (e.g. from a
 * Uint8Array.toString()) and decodes it to readable UTF-8 text. This handles
 * API error responses where the body is returned as raw byte codes instead of
 * decoded text.
 *
 * The message may have a prefix (e.g., "got status: 429 Too Many Requests. ")
 * followed by the byte-coded body.
 */
function tryDecodeByteCodedMessage(value: string): string {
  if (!value || !value.includes(',')) {
    return value;
  }

  const decoded = tryDecodeByteString(value);
  if (decoded !== null) {
    return decoded;
  }

  // Try splitting on ". " to find a prefix + byte-coded body
  const dotIndex = value.lastIndexOf('. ');
  if (dotIndex !== -1) {
    const prefix = value.substring(0, dotIndex + 2);
    const rest = value.substring(dotIndex + 2);
    const decodedRest = tryDecodeByteString(rest);
    if (decodedRest !== null) {
      return prefix + decodedRest;
    }
  }

  return value;
}

/**
 * Attempts to decode a string of comma-separated byte values into UTF-8 text.
 * Returns the decoded string, or null if the input doesn't look like byte codes.
 */
function tryDecodeByteString(value: string): string | null {
  const parts = value.split(',');
  if (parts.length < 4) {
    return null;
  }
  const bytes: number[] = [];
  for (const part of parts) {
    const trimmed = part.trim();
    if (!/^\d{1,3}$/.test(trimmed)) {
      return null;
    }
    const num = Number(trimmed);
    if (num > 255 || (trimmed.length > 1 && trimmed.startsWith('0'))) {
      return null;
    }
    bytes.push(num);
  }
  try {
    return new TextDecoder('utf-8').decode(new Uint8Array(bytes));
  } catch {
    return null;
  }
}

Contributor

high

The new functions tryDecodeByteCodedMessage and tryDecodeByteString are identical to decodeByteCodedString and tryDecodeBytes in packages/core/src/utils/errors.ts. This code duplication can lead to maintenance issues in the future.

I recommend extracting this logic into a new, shared utility file. This new file can then be imported by both googleErrors.ts and errors.ts, ensuring the logic is defined in a single place. This will make the code easier to maintain and prevent potential inconsistencies.

Author

Please re-review; I have resolved the comments.

Author

/gemini review

Contributor

The code duplication issue has been successfully resolved. The logic for decoding byte-coded messages has been extracted into the new packages/core/src/utils/byteDecoder.ts utility file, and errorParsing.ts, errors.ts, and googleErrors.ts now correctly import and utilize this shared functionality. This is a great improvement for maintainability.

@gemini-cli gemini-cli bot added the priority/p1, area/core, and help wanted labels Mar 20, 2026
cRAN-cg added 4 commits March 21, 2026 09:36
Move decodeByteCodedString and tryDecodeBytes into a dedicated
byteDecoder.ts module, eliminating identical implementations in
errors.ts and googleErrors.ts. Both files now import from the
shared utility, preventing future divergence.

Fixes google-gemini#19851
@cRAN-cg
Author

cRAN-cg commented Mar 21, 2026

/assign

@cRAN-cg
Author

cRAN-cg commented Mar 21, 2026

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces a new decodeByteCodedString utility to convert comma-separated byte strings (often from Uint8Array.toString()) into readable UTF-8 text, including cases with non-byte-coded prefixes. This utility is integrated into the core error parsing functions (parseAndFormatApiError, getErrorMessage, parseGoogleApiError) to improve the readability of error messages, and new tests were added to cover these decoding scenarios. A review comment highlights that the current prefix detection logic in decodeByteCodedString is too specific and brittle, suggesting a more robust regex-based approach to handle diverse prefix formats.

Comment thread packages/core/src/utils/byteDecoder.ts Outdated
Comment on lines +29 to +39
// Try splitting on ". " to find a prefix + byte-coded body
// (e.g., "got status: 429 Too Many Requests. 91,123,10,...")
const dotIndex = value.lastIndexOf('. ');
if (dotIndex !== -1) {
  const prefix = value.substring(0, dotIndex + 2);
  const rest = value.substring(dotIndex + 2);
  const decodedRest = tryDecodeBytes(rest);
  if (decodedRest !== null) {
    return prefix + decodedRest;
  }
}

Contributor

high

The current logic for detecting a prefix by searching for ". " is too specific and brittle. It will fail to decode byte-coded strings that have different prefixes, such as "API Error: 91,123,...".

A more robust approach would be to use a regular expression to find a potential byte sequence at the end of the string. This would handle various prefix formats without being tied to a specific separator.

Suggested change
- // Try splitting on ". " to find a prefix + byte-coded body
- // (e.g., "got status: 429 Too Many Requests. 91,123,10,...")
- const dotIndex = value.lastIndexOf('. ');
- if (dotIndex !== -1) {
-   const prefix = value.substring(0, dotIndex + 2);
-   const rest = value.substring(dotIndex + 2);
-   const decodedRest = tryDecodeBytes(rest);
-   if (decodedRest !== null) {
-     return prefix + decodedRest;
-   }
- }
+ // If the whole string isn't bytes, it might have a prefix. Try to find
+ // a byte-like sequence at the end of the string.
+ const match = value.match(/((?:\d{1,3},)+\d{1,3})$/);
+ if (match) {
+   const byteString = match[1];
+   const decodedRest = tryDecodeBytes(byteString);
+   if (decodedRest !== null) {
+     const prefix = value.substring(0, value.length - byteString.length);
+     return prefix + decodedRest;
+   }
+ }
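
To see how the regex in the suggestion above behaves with different prefixes, here is a self-contained sketch. `tryDecodeBytesSketch` and `decodeTrailingBytes` are hypothetical stand-ins written for this illustration, not the functions in `byteDecoder.ts`; the validation is simplified and the sample messages are made up.

```typescript
// Hypothetical, simplified stand-in for the PR's tryDecodeBytes.
function tryDecodeBytesSketch(value: string): string | null {
  const parts = value.split(',');
  if (parts.length < 4) return null;
  const bytes: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const num = Number(part);
    if (num > 255) return null;
    bytes.push(num);
  }
  return new TextDecoder('utf-8').decode(new Uint8Array(bytes));
}

// Finds a comma-separated number run anchored at the end of the string and
// decodes only that run, keeping whatever precedes it as the prefix.
function decodeTrailingBytes(value: string): string {
  const match = value.match(/((?:\d{1,3},)+\d{1,3})$/);
  const byteString = match?.[1];
  if (byteString === undefined) return value;
  const decoded = tryDecodeBytesSketch(byteString);
  if (decoded === null) return value;
  return value.substring(0, value.length - byteString.length) + decoded;
}

// Made-up sample messages with two different prefix styles.
const encoded = String(new TextEncoder().encode('{"code":429}'));
const withColonPrefix = decodeTrailingBytes(`API Error: ${encoded}`);
const withDotPrefix = decodeTrailingBytes(
  `got status: 429 Too Many Requests. ${encoded}`,
);
// Ordinary text with commas does not end in a number run, so it is untouched.
const plain = decodeTrailingBytes('Retry in 5, 10, or 15 seconds');

console.log(withColonPrefix, withDotPrefix, plain);
```

One limitation of this sketch: a prefix that ends in a number immediately followed by a comma (for example `code=429,123,34,...`) is absorbed into the match, and the out-of-range value then causes the whole run to be rejected.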

cRAN-cg added 3 commits March 21, 2026 21:41
… ". "

The ". " split was too brittle and missed prefixes like "API Error: "
or "status 429 ". Use a regex to match a trailing comma-separated
number sequence, handling any prefix format.
- Raise minimum byte count from 8 to 16 (smallest real JSON error
  is 14+ bytes)
- Use TextDecoder({ fatal: true }) to reject invalid UTF-8 sequences
- Add isPrintableText check to reject decoded output dominated by
  control characters
- Add tests for invalid UTF-8, control char rejection, and threshold
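
The hardening steps listed in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the code in the PR: `tryDecodeStrict` and `isPrintableTextSketch` are hypothetical names, and the 10% control-character cutoff is an assumption, since the commit message only says that output "dominated by control characters" is rejected.

```typescript
const MIN_BYTE_COUNT = 16; // threshold named in the commit message

// Assumed heuristic: accept text when at most 10% of its UTF-16 code units
// are control characters other than tab, line feed, and carriage return.
function isPrintableTextSketch(text: string): boolean {
  if (text.length === 0) return false;
  let control = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    const isWhitespace = code === 9 || code === 10 || code === 13;
    if ((code < 32 && !isWhitespace) || code === 127) control++;
  }
  return control / text.length <= 0.1;
}

function tryDecodeStrict(value: string): string | null {
  const parts = value.split(',');
  if (parts.length < MIN_BYTE_COUNT) return null;
  const bytes: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const num = Number(part);
    if (num > 255) return null;
    bytes.push(num);
  }
  try {
    // fatal: true makes decode() throw on malformed UTF-8 instead of
    // silently substituting U+FFFD replacement characters.
    const decoder = new TextDecoder('utf-8', { fatal: true });
    const text = decoder.decode(new Uint8Array(bytes));
    return isPrintableTextSketch(text) ? text : null;
  } catch {
    return null;
  }
}

const valid = tryDecodeStrict(
  String(new TextEncoder().encode('{"error":{"code":429}}')),
);
const tooShort = tryDecodeStrict('72,105,33,10'); // only 4 values
const invalidUtf8 = tryDecodeStrict(new Array<number>(16).fill(255).join(','));
const controlOnly = tryDecodeStrict(new Array<number>(16).fill(1).join(','));

console.log(valid, tooShort, invalidUtf8, controlOnly);
```

Each check targets a different false positive: the length floor guards against short numeric lists in ordinary text, the fatal decoder against byte runs that are not UTF-8, and the printable-text check against runs that decode cleanly but produce no readable content.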
@cRAN-cg
Author

cRAN-cg commented Mar 21, 2026

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a robust solution for decoding API error messages that are incorrectly formatted as comma-separated byte strings. The new decodeByteCodedString utility is well-designed with several heuristics to prevent false positives and is accompanied by a comprehensive suite of unit tests. The utility is correctly integrated into the existing error handling pathways in errorParsing.ts, errors.ts, and googleErrors.ts, ensuring that these garbled error messages are now presented in a human-readable format. The changes are of high quality and effectively address the issue.

Note: Security Review is unavailable for this PR.

Development

Successfully merging this pull request may close these issues.

API error messages displayed as raw byte codes instead of decoded text when model capacity is exhausted