grep: support \s and \S in BRE mode #51

Closed
wondr-wclabs wants to merge 1 commit into uutils:main from wondr-wclabs:codex/bre-space-shorthands
Conversation

@wondr-wclabs
Contributor

Closes #31.

This enables Oniguruma's whitespace escape operator for RegexMode::Basic so GNU BRE extensions \s and \S are recognized in the default -G mode. The change is intentionally scoped to BRE: fixed-string mode should stay literal, ERE already uses Syntax::gnu_regex(), and PCRE has its own syntax path.

I chose the Onig syntax flag rather than pre-processing patterns because it keeps escaping semantics inside the regex engine and avoids hand-maintaining a second parser for these shorthands. The new test cases live in the existing bre_gnu_extensions coverage alongside other GNU BRE extensions.
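
The mode scoping described above can be illustrated with a small, self-contained sketch. It is not the PR's diff: only `RegexMode::Basic` is named in the description, so the other variant names and the helper `needs_whitespace_escape_flag` are hypothetical. In the real change, a `true` result would correspond to switching on Oniguruma's whitespace escape operator (`ONIG_SYN_OP_ESC_S_WHITE_SPACE` in the C header) while building the BRE syntax.

```rust
/// Hypothetical mirror of the matcher modes mentioned in the description.
/// Only `Basic` is confirmed by the PR text; the other names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RegexMode {
    Fixed,    // -F: pattern is a literal string
    Basic,    // -G: default mode, POSIX BRE plus GNU extensions
    Extended, // -E: already built from a GNU regex syntax
    Perl,     // -P: separate syntax path
}

/// Returns true when the syntax for this mode needs the extra
/// whitespace-escape operator (`\s` / `\S`) switched on.
/// Hypothetical helper for illustration, not a function from uu_grep.
fn needs_whitespace_escape_flag(mode: RegexMode) -> bool {
    match mode {
        // The change is scoped to BRE only.
        RegexMode::Basic => true,
        // Fixed strings must stay literal, so `\s` is never an operator.
        RegexMode::Fixed => false,
        // ERE and PCRE are left untouched by this change.
        RegexMode::Extended | RegexMode::Perl => false,
    }
}
```

Calling `needs_whitespace_escape_flag(RegexMode::Basic)` returns `true`, and every other mode returns `false`. Spelling out each arm, rather than using a catch-all, means a newly added mode would force an explicit decision at compile time.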

Validation run locally:

  • cargo fmt --all -- --check
  • cargo test bre_gnu_extensions -- --nocapture
  • printf 'a b\nxy\n' | cargo run --quiet -- -e '\s' -> a b
  • printf 'aS b\n \nx\n' | cargo run --quiet -- -c '\S' -> 2
  • cargo test
  • cargo test --no-fail-fast
  • cargo clippy --all-targets --workspace -p uu_grep -- -D warnings
  • git diff --check
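
The two `printf` checks above can be cross-checked against a plain-Rust model of the expected results. This is a sketch of the expectation only, not the grep implementation: the function names are hypothetical, and `char::is_whitespace` (Unicode White_Space) is used as an approximation of what `\s` matches, which is equivalent for the ASCII inputs used here.

```rust
/// Model of `grep -e '\s'`: keep lines containing at least one
/// whitespace character. Hypothetical helper for illustration.
fn lines_matching_space(input: &str) -> Vec<&str> {
    input
        .lines()
        .filter(|line| line.chars().any(char::is_whitespace))
        .collect()
}

/// Model of `grep -c '\S'`: count lines containing at least one
/// non-whitespace character. Hypothetical helper for illustration.
fn count_lines_matching_non_space(input: &str) -> usize {
    input
        .lines()
        .filter(|line| line.chars().any(|c| !c.is_whitespace()))
        .count()
}
```

With the same inputs as the manual checks, `lines_matching_space("a b\nxy\n")` yields only `"a b"`, and `count_lines_matching_non_space("aS b\n \nx\n")` yields `2`, because the line holding a single space has no non-whitespace character.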

@wondr-wclabs
Contributor Author

Closing this as a duplicate of #37. While re-checking the issue list after publishing, I saw that #37 already targets the same GNU BRE compatibility gap tracked in #31. Leaving two PRs open against the same issue would add review noise without a distinct behavior change, so this one should not stay open.

Development

Successfully merging this pull request may close these issues.

\s and \S are not honored in basic-regexp (-G) mode like GNU