
fix(at-command): prevent stack overflow from regex backtracking on large inputs #27580

Open
Sauravdas007 wants to merge 2 commits into google-gemini:main from Sauravdas007:main

Conversation

@Sauravdas007

Summary

Fixes #27539

This PR replaces the regex-based @ command parser with an iterative scanner to prevent catastrophic backtracking when processing large pasted inputs.


Root Cause

The previous implementation relied on a complex regular expression:

(?:(?:"(?:[^"]*)")|(?:\\.|[^ \t\n\r,;!?()\[\]{}.]|\.(?!$|[ \t\n\r])))+

Under large malformed inputs (logs, JSON dumps, session histories, etc.) containing @-prefixed content, the overlapping alternations inside a greedy repetition could trigger excessive backtracking, eventually causing:

RangeError: Maximum call stack size exceeded
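The overlap can be seen on small inputs. The sketch below wraps the quoted expression in a hypothetical helper, `legacyMatch`, and assumes the real pattern is anchored on a leading `@` (only the body of the expression is shown above). It does not try to reproduce the stack overflow, which depends on input size and engine limits.

```typescript
// Assumption: the production pattern starts with "@"; the PR description
// quotes only the repeated group that follows it.
const LEGACY_AT_PATH =
  /@(?:(?:"(?:[^"]*)")|(?:\\.|[^ \t\n\r,;!?()\[\]{}.]|\.(?!$|[ \t\n\r])))+/;

// Hypothetical helper: return the first @ reference the legacy regex finds.
function legacyMatch(input: string): string | null {
  const m = LEGACY_AT_PATH.exec(input);
  return m ? m[0] : null;
}

// A double quote can be consumed by the quoted-string branch or, as an
// ordinary character, by the negated class. A backslash has the same two
// options (escape branch or negated class). Those are the choice points the
// engine may have to revisit inside the greedy repetition.
console.log(legacyMatch('Please check @foo.txt.'));  // "@foo.txt"
console.log(legacyMatch('@foo"bar baz'));            // '@foo"bar'
console.log(legacyMatch('open @"my file.txt" now')); // '@"my file.txt"'
```

The first two results are the behaviors any replacement scanner needs to preserve: a sentence-final period is left out, and an unclosed quote is treated as a regular character.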

Changes

  • Replaced regex-based parsing with a deterministic character-by-character scanner.

  • Added explicit handling for:

    • quoted paths
    • escaped characters
    • standalone @
    • delimiter termination
  • Preserved existing output structure and command categorization behavior.

Benefits

  • Eliminates regex backtracking risk.
  • Linear-time parsing behavior.
  • Improved stability when users paste large logs or session exports.
  • Easier to reason about and maintain than the previous regex.

Testing

  • Existing at-command test suite executed.
  • Reproduced large pasted-input scenario that previously triggered stack overflow.
  • Verified parser behavior for quoted paths, escaped characters, and standard file references.
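A large-paste check of the kind described above could look like the following sketch. It is not the PR's test code: `endOfReference` is a simplified stand-in scanner (delimiters and backslash escapes only), `splitAtReferences` is a hypothetical name, and the `'text'` part type is an assumption. Only the `{ type: 'atPath', content }` shape appears in the diff.

```typescript
type Part = { type: 'text' | 'atPath'; content: string };

const STOP = /[ \t\n\r,;!?()[\]{}]/;

// Stand-in scanner (not the PR's code): return the index just past one
// @ reference that starts at `at`.
function endOfReference(query: string, at: number): number {
  let i = at + 1;
  while (i < query.length && !STOP.test(query.charAt(i))) {
    // Skip the character after a backslash so escaped spaces stay in the path.
    i += query.charAt(i) === '\\' && i + 1 < query.length ? 2 : 1;
  }
  return i;
}

// Split the input into text and atPath parts in a single left-to-right pass.
function splitAtReferences(query: string): Part[] {
  const parts: Part[] = [];
  let cursor = 0;
  while (cursor < query.length) {
    const at = query.indexOf('@', cursor);
    if (at === -1) break;
    const end = endOfReference(query, at);
    if (at > cursor) parts.push({ type: 'text', content: query.slice(cursor, at) });
    // A standalone "@" has nothing after it and is kept as plain text.
    parts.push({ type: end - at > 1 ? 'atPath' : 'text', content: query.slice(at, end) });
    cursor = end;
  }
  if (cursor < query.length) parts.push({ type: 'text', content: query.slice(cursor) });
  return parts;
}

// Pasted-log style input: JSON-like noise with @-prefixed tokens on every line.
const line = '{"user":"@alice","level":"info"} see @src/index.ts\n';
const big = line.repeat(20000);
const parts = splitAtReferences(big);
console.log(parts.filter((p) => p.type === 'atPath').length); // 40000
```

Because the cursor only moves forward and each character is visited a bounded number of times, the pass does no backtracking and its cost grows linearly with the input length.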

adding state-machine version of parseAllAtCommands()
to resolve // regex error
@Sauravdas007 Sauravdas007 requested a review from a team as a code owner May 30, 2026 13:59
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical performance issue where complex regular expressions caused stack overflow errors during the processing of large or malformed inputs. By transitioning to an iterative, manual scanning approach, the system now handles large pastes with linear-time complexity, ensuring stability and reliability for the '@' command parsing logic.

Highlights

  • Regex Replacement: Replaced the regex-based parser with a deterministic character-by-character scanner to eliminate catastrophic backtracking.
  • Performance and Stability: Achieved linear-time parsing complexity, preventing 'Maximum call stack size exceeded' errors when processing large inputs.
  • Feature Parity: Maintained support for quoted paths, escaped characters, and standard delimiter termination while preserving existing output structures.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request replaces the regex-based parsing of @ commands in atCommandProcessor.ts with a manual iterative character scanner to better handle quoted paths and escaped characters. However, the new scanner introduces two correctness regressions: it incorrectly includes trailing periods in file paths and swallows the rest of the line when encountering unclosed quotes. A code suggestion has been provided to address these issues by checking for closing quotes and explicitly breaking on trailing periods.

Comment on lines +120 to +144
      let inQuotes = false;

      while (i < query.length) {
        const ch = query[i];

        // Handle quoted paths
        if (ch === '"') {
          inQuotes = !inQuotes;
          i++;
          continue;
        }

        // Handle escaped characters
        if (ch === '\\' && i + 1 < query.length) {
          i += 2;
          continue;
        }

-       // We strip the @ before unescaping so that unescapePath can handle quoted paths correctly on Windows.
-       const atPath = '@' + unescapePath(fullMatch.substring(1));
-       parts.push({ type: 'atPath', content: atPath });
        // Stop at delimiters when not inside quotes
        if (!inQuotes && /[ \t\n\r,;!?()[\]{}]/.test(ch)) {
          break;
        }

        i++;
      }

high

The new iterative scanner introduces two correctness regressions compared to the original regex parser:

  1. Trailing Periods: In the original regex, a period . was only matched if it was not followed by the end of the string or whitespace (\.(?!$|[ \t\n\r])). This prevented trailing periods at the end of sentences (e.g., Please check @foo.txt.) from being incorrectly included in the file path. The new scanner incorrectly includes trailing periods, which breaks path resolution.
  2. Unclosed Quotes: If a quote is unclosed (e.g., @foo"bar baz), the original regex would treat the quote as a regular character and stop at the space delimiter. The new scanner sets inQuotes = true and swallows the rest of the line, including delimiters.
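Both regressions can be reproduced in isolation. The sketch below wraps the loop under review in a hypothetical function, `scanWithQuoteToggle`, and assumes scanning starts at the character after `@`. The surrounding code in atCommandProcessor.ts is not shown in this thread, so the wrapper and `refWithToggle` are illustrative only.

```typescript
const DELIMITER = /[ \t\n\r,;!?()[\]{}]/;

// Hypothetical wrapper around the reviewed loop: `start` is the index of "@",
// and the return value is the index where the reference ends.
function scanWithQuoteToggle(query: string, start: number): number {
  let i = start + 1;
  let inQuotes = false;
  while (i < query.length) {
    const ch = query.charAt(i);
    if (ch === '"') {
      // Every quote flips the flag, whether or not a closing quote exists.
      inQuotes = !inQuotes;
      i++;
      continue;
    }
    if (ch === '\\' && i + 1 < query.length) {
      i += 2;
      continue;
    }
    if (!inQuotes && DELIMITER.test(ch)) break;
    i++;
  }
  return i;
}

// Hypothetical helper: extract the first @ reference from a string.
function refWithToggle(query: string): string {
  const at = query.indexOf('@');
  return at === -1 ? '' : query.slice(at, scanWithQuoteToggle(query, at));
}

console.log(refWithToggle('Please check @foo.txt.')); // "@foo.txt." (period kept)
console.log(refWithToggle('@foo"bar baz, next'));     // '@foo"bar baz, next' (rest of input swallowed)
```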

We can resolve both issues elegantly by checking for a closing quote using indexOf (to avoid swallowing delimiters on unclosed quotes) and explicitly breaking on trailing periods.

      while (i < query.length) {
        const ch = query[i];

        // Handle quoted paths
        if (ch === '"') {
          const closingQuoteIndex = query.indexOf('"', i + 1);
          if (closingQuoteIndex !== -1) {
            i = closingQuoteIndex + 1;
            continue;
          } else {
            i++;
            continue;
          }
        }

        // Handle escaped characters
        if (ch === '\\' && i + 1 < query.length) {
          i += 2;
          continue;
        }

        // Stop at delimiters
        if (/[ \t\n\r,;!?()[\]{}]/.test(ch)) {
          break;
        }

        // Stop at trailing periods
        if (ch === '.' && (i + 1 === query.length || /[ \t\n\r]/.test(query[i + 1]))) {
          break;
        }

        i++;
      }
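To make the suggested loop easy to try on its own, it can be wrapped in a function. This is a sketch under assumptions: `scanAtReference` and `firstReference` are hypothetical names, scanning is assumed to start right after the `@`, and the delimiter class is written with a single-escaped `]` so that the braces stay inside the character class.

```typescript
const DELIMITER = /[ \t\n\r,;!?()[\]{}]/;
const WHITESPACE = /[ \t\n\r]/;

// Hypothetical wrapper: `start` is the index of "@"; returns the end index.
function scanAtReference(query: string, start: number): number {
  let i = start + 1;
  while (i < query.length) {
    const ch = query.charAt(i);
    if (ch === '"') {
      // Jump past a closed quoted segment; treat an unclosed quote as a
      // regular character so later delimiters still terminate the path.
      const closing = query.indexOf('"', i + 1);
      i = closing !== -1 ? closing + 1 : i + 1;
      continue;
    }
    if (ch === '\\' && i + 1 < query.length) {
      i += 2; // keep the escaped character, whatever it is
      continue;
    }
    if (DELIMITER.test(ch)) break;
    // A period ends the path only when it is followed by whitespace or the
    // end of the input, mirroring the lookahead in the original regex.
    if (ch === '.' && (i + 1 === query.length || WHITESPACE.test(query.charAt(i + 1)))) break;
    i++;
  }
  return i;
}

// Hypothetical helper: extract the first @ reference from a string.
function firstReference(query: string): string {
  const at = query.indexOf('@');
  return at === -1 ? '' : query.slice(at, scanAtReference(query, at));
}

console.log(firstReference('Please check @foo.txt.'));  // "@foo.txt"
console.log(firstReference('@foo"bar baz'));            // '@foo"bar'
console.log(firstReference('open @"my file.txt" now')); // '@"my file.txt"'
```

On these inputs the wrapper returns the same references as the original regex, which is the parity the review asks for.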


Labels

  • area/core: Issues related to User Interface, OS Support, Core Functionality
  • priority/p1: Important and should be addressed in the near term.


Development

Successfully merging this pull request may close these issues.

RangeError: Maximum call stack size exceeded
