feat!: treat * inside quotes as glob, support \* escape for literal asterisk #157

Merged
fohte merged 7 commits into main from fohte/quoted-literal-glob
Mar 15, 2026

@fohte fohte commented Mar 6, 2026

Why

  • * inside quotes ('...' / "...") is treated as QuotedLiteral (exact match only), not as a glob wildcard
    • Pattern npx -c 'renovate-config-validator *' does not match inputs like renovate-config-validator foo.json because * is not expanded as a glob
    • Quotes should only act as grouping for tokens containing spaces, but currently they also suppress glob expansion

What

  • Allow * inside quotes to act as a glob wildcard, and introduce \* escape syntax for matching a literal * character
    • Remove LexToken::QuotedLiteral and PatternToken::QuotedLiteral, unifying them into Literal
    • Add backslash escape support to the lexer and matcher
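
The intended semantics can be illustrated with a minimal sketch. This is not the matcher from this PR: `Unit`, `parse_units`, and `sketch_glob_match` are hypothetical names, and the sketch assumes that only `\*` is an escape sequence and that a pattern must match the whole input.

```rust
/// One unit of a pattern after escape processing (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Unit {
    /// An unescaped `*`: matches zero or more characters.
    Star,
    /// Any other character, including a `*` written as `\*`.
    Char(char),
}

/// Split a pattern into units, turning `\*` into a literal asterisk.
fn parse_units(pattern: &str) -> Vec<Unit> {
    let mut units = Vec::new();
    let mut chars = pattern.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\\' && chars.peek() == Some(&'*') {
            chars.next(); // consume the escaped asterisk
            units.push(Unit::Char('*'));
        } else if c == '*' {
            units.push(Unit::Star);
        } else {
            units.push(Unit::Char(c));
        }
    }
    units
}

/// Return true when `input` matches `pattern` in full.
fn sketch_glob_match(pattern: &str, input: &str) -> bool {
    let units = parse_units(pattern);
    let text: Vec<char> = input.chars().collect();
    let (mut p, mut t) = (0usize, 0usize);
    // Position of the most recent star and the input index it was tried at.
    let mut backtrack: Option<(usize, usize)> = None;
    while t < text.len() {
        if p < units.len() && units[p] == Unit::Star {
            backtrack = Some((p, t));
            p += 1;
        } else if p < units.len() && units[p] == Unit::Char(text[t]) {
            p += 1;
            t += 1;
        } else if let Some((sp, st)) = backtrack {
            // Let the last star absorb one more character and retry.
            backtrack = Some((sp, st + 1));
            p = sp + 1;
            t = st + 1;
        } else {
            return false;
        }
    }
    // Any remaining stars may match the empty string.
    units[p..].iter().all(|u| *u == Unit::Star)
}
```

Under these assumptions, `renovate-config-validator *` matches `renovate-config-validator foo.json` whether or not the pattern was quoted, while `a\*b` matches only the input `a*b`.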

…terisk

Quoted strings (`'...'` / `"..."`) previously suppressed glob expansion
via the `QuotedLiteral` token type, making `*` inside quotes match only
the literal asterisk character. This prevented patterns like
`'renovate-config-validator *'` from matching arbitrary file arguments.

Change quotes to act as grouping only (preserving spaces in a single
token) while allowing `*` to work as a glob wildcard. Users who need a
literal `*` can now escape it with `\*`, consistent with gitignore and
editorconfig conventions.

- Remove `LexToken::QuotedLiteral` and `PatternToken::QuotedLiteral`
- Add backslash escape support in lexer (`consume_until`, `consume_word`)
- Update matcher (`literal_matches`, `glob_match`) to handle `\*` as
  literal asterisk via `has_unescaped_glob`, `unescape`,
  `split_on_unescaped_glob` helpers
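
As a rough illustration of what the three helpers named above might do, here is a sketch. The function names come from the commit message, but the signatures and bodies are assumptions, not the code from this PR. In particular, the sketch assumes that `\*` is the only escape sequence and that any other backslash is kept as a literal character.

```rust
/// True when `s` contains a `*` that is not written as `\*`.
fn has_unescaped_glob(s: &str) -> bool {
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\\' && chars.peek() == Some(&'*') {
            chars.next(); // escaped asterisk, not a wildcard
        } else if c == '*' {
            return true;
        }
    }
    false
}

/// Replace every `\*` with `*`; all other characters are kept,
/// including a lone or trailing backslash.
fn unescape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\\' && chars.peek() == Some(&'*') {
            chars.next();
            out.push('*');
        } else {
            out.push(c);
        }
    }
    out
}

/// Split `s` at each unescaped `*`, returning the unescaped literal
/// segments between the wildcards.
fn split_on_unescaped_glob(s: &str) -> Vec<String> {
    let mut parts = Vec::new();
    let mut current = String::new();
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\\' && chars.peek() == Some(&'*') {
            chars.next();
            current.push('*');
        } else if c == '*' {
            // A wildcard ends the current literal segment.
            parts.push(std::mem::take(&mut current));
        } else {
            current.push(c);
        }
    }
    parts.push(current);
    parts
}
```

A matcher could use `has_unescaped_glob` to choose between an exact comparison against `unescape(pattern)` and a segment-by-segment search over the output of `split_on_unescaped_glob`.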

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the pattern matching capabilities by reinterpreting how wildcards behave within quoted strings. Previously, * in quotes was treated as a literal, limiting the expressiveness of patterns. Now, * inside quotes correctly acts as a glob, allowing for more flexible and intuitive matching, particularly for commands with arguments that might contain wildcards. To maintain the ability to match a literal asterisk, a new \* escape mechanism has been implemented, providing a clear way to differentiate between glob and literal interpretations. This change streamlines the pattern matching logic by unifying token types and improves the overall usability and power of the system.

Highlights

  • Globbing in Quoted Strings: The behavior of the * wildcard within quoted strings ("..." or '...') has been changed. Previously, * was treated as a literal character, but now it functions as a glob wildcard, matching zero or more arbitrary characters.
  • Literal Asterisk Escape: To match a literal * character (preventing glob expansion), a new escape sequence \* has been introduced. This escape works both inside and outside quoted strings.
  • Simplified Tokenization: The distinct LexToken::QuotedLiteral and PatternToken::QuotedLiteral types have been removed. Quoted strings are now uniformly treated as Literal tokens, with quotes serving solely for grouping tokens that contain spaces.
  • Lexer and Matcher Updates: The lexer and pattern matcher have been updated to correctly process backslash escapes and to implement the new globbing behavior for * within literal tokens, including those originating from quoted strings.
  • Documentation and Test Coverage: The pattern matching documentation has been updated to reflect these changes, and new integration tests have been added to verify the correct behavior of globbing and escaping within quoted patterns.

Changelog

  • docs/src/content/docs/architecture/pattern-matching.md
    • Updated the description of the Literal token to clarify that quotes now act as grouping only.
    • Revised the matching behavior description for Literal pattern tokens to include glob and escape handling.
  • docs/src/content/docs/pattern-syntax/overview.md
    • Modified the description for 'Quoted literal' to explain that quotes are for grouping only and * is still a glob, with \* used for literal asterisks.
  • docs/src/content/docs/pattern-syntax/wildcards.md
    • Rewrote the section on quoted literals to explain the new behavior where * is a glob and \* is for escaping.
    • Added new YAML examples demonstrating the use of \* for literal asterisks and * for globbing within quotes.
  • src/rules/pattern_lexer.rs
    • Removed the LexToken::QuotedLiteral enum variant.
    • Modified the tokenize function to process quoted strings as LexToken::Literal.
    • Implemented backslash escape handling within the consume_word and consume_until functions.
    • Updated unit tests for quoted strings to align with the new glob and escape processing.
  • src/rules/pattern_matcher.rs
    • Removed PatternToken::QuotedLiteral handling from match_tokens_core and extract_placeholder_all functions.
    • Adjusted optional_flags_absent to only consider PatternToken::Literal.
    • Enhanced literal_matches to correctly identify unescaped glob wildcards and handle backslash escapes.
    • Introduced new helper functions: has_unescaped_glob, unescape, split_on_unescaped_glob, and ends_with_unescaped_glob to manage glob and escape logic.
    • Removed PatternToken::QuotedLiteral from match_single_token.
    • Updated unit tests for quoted literal matching to reflect the new glob and escape behavior.
  • src/rules/pattern_parser.rs
    • Removed the PatternToken::QuotedLiteral enum variant.
    • Updated build_pattern_from_tokens and build_pattern_tokens to map former QuotedLiteral instances to PatternToken::Literal.
    • Removed the mapping for LexToken::QuotedLiteral in lex_to_pattern_value.
    • Adjusted unit tests to reflect the removal of QuotedLiteral.
  • tests/integration/config_to_rule_evaluation.rs
    • Updated integration tests to validate the new behavior of * as a glob within quotes.
    • Added new test cases to specifically verify \* for literal asterisk matching.
    • Included tests for quoted glob patterns containing spaces to cover the primary use case.

gemini-code-assist[bot]

This comment was marked as resolved.

…glob

# Conflicts:
#	src/rules/pattern_matcher.rs
#	tests/integration/config_to_rule_evaluation.rs

codecov Bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 93.02326% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.82%. Comparing base (225ccda) to head (d823f33).
⚠️ Report is 6 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/rules/pattern_lexer.rs | 91.89% | 3 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #157      +/-   ##
==========================================
+ Coverage   89.81%   89.82%   +0.01%     
==========================================
  Files          50       50              
  Lines       10325    10345      +20     
==========================================
+ Hits         9273     9292      +19     
- Misses       1052     1053       +1     

| Flag | Coverage Δ |
|---|---|
| Linux | 89.66% <90.69%> (-0.05%) ⬇️ |
| macOS | 91.00% <93.02%> (+0.02%) ⬆️ |

Remove redundant `\*` escape example from "Quotes and Glob" section
that duplicated the example already shown in "Escaping `*` with
Backslash". Consolidate `quoted_star_acts_as_glob` and
`escaped_star_is_literal` integration tests into a single
parameterized `quoted_and_escaped_star_matching` test per the
repository's rstest style guide.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

`formatdoc!` is more concise and idiomatic for inline string
interpolation with indoc-style formatting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

devin-ai-integration[bot]

This comment was marked as resolved.

Merge origin/main into the quoted-literal-glob branch, resolving
conflicts in the restructured pattern_matcher module (now split into
mod.rs, token_matching.rs, flag_utils.rs) and updated integration
tests. Remove QuotedLiteral from the new submodules and update
property-based tests to reflect the new semantics (quotes are
grouping only, `*` acts as glob, `\*` for literal).

…dling

Two bugs found by Devin review:

1. Merging QuotedLiteral into Literal caused should_consume_as_value to
   reject quoted flag-like values (e.g., "-v" in `grep -e "-v" *`),
   preventing them from being consumed as flag values. Restore
   LexToken::QuotedLiteral so the parser can distinguish quoted tokens
   from bare literals for flag-value association, while still converting
   them to PatternToken::Literal (glob-enabled) for matching.

2. consume_until treated `\` + closing delimiter as an escape sequence,
   eating the closing quote and causing "unclosed quote" errors for
   patterns like `cmd 'hello\\'`. Fix by not escaping when the next
   character is the closing delimiter.
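
Both fixes can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: the enum and function names come from the commit messages, but the signatures, the "starts with `-`" rule in `should_consume_as_value`, and the helper `to_pattern_token` are hypothetical.

```rust
/// Lexer-level token: quoted and bare literals stay distinct here.
#[derive(Debug, PartialEq)]
enum LexToken {
    Literal(String),
    QuotedLiteral(String),
}

/// Matcher-level token: both lexer variants become one glob-enabled literal.
#[derive(Debug, PartialEq)]
enum PatternToken {
    Literal(String),
}

/// Decide whether the token after a flag may be consumed as its value.
/// Assumption: a bare token starting with `-` looks like another flag,
/// while a quoted token is always accepted as a value.
fn should_consume_as_value(token: &LexToken) -> bool {
    match token {
        LexToken::QuotedLiteral(_) => true,
        LexToken::Literal(s) => !s.starts_with('-'),
    }
}

/// Hypothetical conversion step: the quoted/bare distinction is dropped
/// only after flag-value association has been decided.
fn to_pattern_token(token: LexToken) -> PatternToken {
    match token {
        LexToken::Literal(s) | LexToken::QuotedLiteral(s) => PatternToken::Literal(s),
    }
}

/// Read the body of a quoted string. `rest` starts just after the opening
/// quote. Returns the raw body (escapes kept for the matcher) and the number
/// of bytes consumed including the closing delimiter, or `None` if unclosed.
fn consume_until(rest: &str, delimiter: char) -> Option<(String, usize)> {
    let mut body = String::new();
    let mut chars = rest.char_indices().peekable();
    while let Some((idx, c)) = chars.next() {
        if c == delimiter {
            return Some((body, idx + c.len_utf8()));
        }
        body.push(c);
        if c == '\\' {
            // Take the escaped character along with the backslash, unless it
            // is the closing delimiter, which must still end the string.
            if let Some(&(_, next)) = chars.peek() {
                if next != delimiter {
                    body.push(next);
                    chars.next();
                }
            }
        }
    }
    None
}
```

With this shape, a quoted `"-v"` is still accepted as a flag value, and a backslash directly before the closing quote no longer hides that quote from the lexer.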

devin-ai-integration[bot]

This comment was marked as resolved.

…ackslash

The backslash handler in consume_word unconditionally consumed the next
character, even when it was a word-boundary character (space, tab,
brackets, quotes). This caused patterns containing a backslash followed
by a space-delimited token to be tokenized as a single word instead of
two separate tokens.

Also fix unescape_and_match to preserve trailing backslashes as literal
instead of silently dropping them, and correct the test command in
backslash_before_closing_quote to use a shell-escaped backslash.
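
The word-boundary fix might look roughly like the sketch below. The signature of `consume_word`, the helper `is_word_boundary`, and the exact set of boundary characters (in particular which brackets count) are assumptions made for illustration, not the code from this PR.

```rust
/// Assumption: these characters end a bare word.
fn is_word_boundary(c: char) -> bool {
    matches!(c, ' ' | '\t' | '[' | ']' | '\'' | '"')
}

/// Read one bare word from the start of `input`.
/// Returns the word and the unread remainder.
fn consume_word(input: &str) -> (&str, &str) {
    let mut chars = input.char_indices().peekable();
    let mut end = input.len();
    while let Some((idx, c)) = chars.next() {
        if is_word_boundary(c) {
            end = idx;
            break;
        }
        if c == '\\' {
            // Keep the escaped character inside the word, but never let a
            // backslash swallow a boundary character.
            if let Some(&(_, next)) = chars.peek() {
                if !is_word_boundary(next) {
                    chars.next();
                }
            }
        }
    }
    input.split_at(end)
}
```

Here a backslash followed by a space leaves the space unconsumed, so the next token starts at the boundary, and the backslash itself stays in the word as a literal character.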

@fohte fohte merged commit 23c7789 into main Mar 15, 2026
10 checks passed
@fohte fohte deleted the fohte/quoted-literal-glob branch March 15, 2026 07:42
@fohte-bot fohte-bot Bot mentioned this pull request Mar 15, 2026