grep: honor GNU buffer anchors #56
Merging this PR will not alter performance
#[test]
fn gnu_buffer_anchors() {
    let (_s, mut c) = ucmd();
    c.args(&[r"\`c"])
        .pipe_in("cat\nscat\ndog\n")
        .succeeds()
        .stdout_only("cat\n");

    let (_s, mut c) = ucmd();
    c.args(&[r"t\'"])
        .pipe_in("cat\ntar\ndog\n")
        .succeeds()
        .stdout_only("cat\n");

    let (_s, mut c) = ucmd();
    c.args(&["-E", r"\`c"])
        .pipe_in("cat\nscat\ndog\n")
        .succeeds()
        .stdout_only("cat\n");

    let (_s, mut c) = ucmd();
    c.args(&["-E", r"t\'"])
        .pipe_in("cat\ntar\ndog\n")
        .succeeds()
        .stdout_only("cat\n");
}
This seems like it could be expressed in just two tests, no?
Agreed. I collapsed this to two command invocations: one BRE case and one ERE case. Each pattern now covers both GNU buffer anchors in one command:
BRE: \`c\|r\'
ERE: \`c|r\'
That keeps the mode distinction explicit while avoiding four near-identical command setups. Focused validation after the change: cargo test --test test_grep gnu_buffer_anchors.
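To illustrate why one alternation pattern per mode can exercise both anchors, here is a minimal std-only sketch. It is not the Oniguruma-backed matcher from this PR: `parse_alt`, `alt_matches`, and `select_lines` are hypothetical helpers that only understand alternatives of the form "literal with optional \` prefix and \' suffix", and the input used in the usage note is an assumed example, not the one from the collapsed test.

```rust
/// One alternative of the toy pattern: a literal with optional GNU buffer anchors.
struct Alt<'p> {
    at_start: bool, // alternative began with \`
    at_end: bool,   // alternative ended with \'
    literal: &'p str,
}

/// Split the anchors off a single alternative (hypothetical helper).
fn parse_alt(alt: &str) -> Alt<'_> {
    let (at_start, rest) = match alt.strip_prefix("\\`") {
        Some(rest) => (true, rest),
        None => (false, alt),
    };
    let (at_end, literal) = match rest.strip_suffix("\\'") {
        Some(literal) => (true, literal),
        None => (false, rest),
    };
    Alt { at_start, at_end, literal }
}

/// Each line is treated as its own buffer, so the anchors act like line anchors.
fn alt_matches(alt: &Alt, line: &str) -> bool {
    match (alt.at_start, alt.at_end) {
        (true, true) => line == alt.literal,
        (true, false) => line.starts_with(alt.literal),
        (false, true) => line.ends_with(alt.literal),
        (false, false) => line.contains(alt.literal),
    }
}

/// `sep` is the alternation token: "\\|" for BRE, "|" for ERE.
fn select_lines<'a>(pattern: &str, sep: &str, input: &'a str) -> Vec<&'a str> {
    let alts: Vec<Alt> = pattern.split(sep).map(parse_alt).collect();
    input
        .lines()
        .filter(|line| alts.iter().any(|alt| alt_matches(alt, line)))
        .collect()
}
```

With the assumed input "cat\ntar\ndog\n", calling `select_lines(r"\`c\|r\'", r"\|", input)` for BRE or `select_lines(r"\`c|r\'", "|", input)` for ERE selects `cat` through the start anchor and `tar` through the end anchor, so a single command per mode touches both operators.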
e20d12f to 0ec43c8
I also rebased this branch onto current After the rebase, this PR is back to the intended scope: enabling Oniguruma's GNU buffer-anchor operator for BRE/ERE plus the two compact anchor tests. Local validation on the cleaned branch:
lhecker left a comment
Thanks, anthropic/claude-sonnet-4-6! 🥲
Fixes #33.
GNU grep recognizes \` and \' as start/end buffer anchors in both BRE and ERE mode. In this implementation, grep searches one record/line at a time, so those GNU buffer anchors behave like the record-local start/end anchors for the cases in this issue. The current Syntax::grep() and Syntax::gnu_regex() setup did not enable Oniguruma's GNU buffer-anchor operator, so the escapes were treated as literal backtick/apostrophe characters.

This enables SYNTAX_OPERATOR_ESC_GNU_BUF_ANCHOR only for RegexMode::Basic and RegexMode::Extended. I left Fixed untouched because escapes are literals there, and left Perl untouched because -P should follow the PCRE-style syntax path rather than GNU BRE/ERE extensions.

Validation:
- cargo fmt --all -- --check
- cargo test
- cargo clippy --all-targets --workspace -puu_grep -- -D warnings
- git diff --check
- printf 'cat\ndog\n' | cargo run --quiet -- -e "t\\'" now prints cat
- printf 'cat\ndog\n' | cargo run --quiet -- -e '\`c' now prints cat
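
The mode gating described above can be sketched as a small predicate. This is an illustration only: the `RegexMode` enum below is a local stand-in whose variant names follow the description, and `gnu_buf_anchors_enabled` is a hypothetical helper, not a function from this PR or from the Oniguruma bindings.

```rust
/// Local stand-in for the regex modes named in the description (assumption).
#[derive(Clone, Copy, Debug, PartialEq)]
enum RegexMode {
    Basic,    // BRE, the default
    Extended, // ERE, selected with -E
    Fixed,    // fixed strings: escapes are literals
    Perl,     // -P: follows the PCRE-style syntax path
}

/// Hypothetical helper: only BRE and ERE opt in to the GNU buffer anchors.
fn gnu_buf_anchors_enabled(mode: RegexMode) -> bool {
    matches!(mode, RegexMode::Basic | RegexMode::Extended)
}
```

Keeping the decision in one predicate makes the scope of the change easy to audit: `Fixed` and `Perl` fall through to `false`, so their existing behavior is not affected.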