\s and \S are not honored in basic-regexp (-G) mode like GNU #31

@sylvestre

GNU grep supports the shorthands \s (whitespace) and \S (non-whitespace) as extensions in basic regular expressions (the default, -G). uu_grep only recognises them under -E; in BRE it treats \s/\S as the literal characters s/S. \w/\W are honored in both modes, so the gap is specific to \s/\S under BRE.
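For reference, GNU grep documents these shorthands as synonyms for POSIX bracket expressions, so the expected BRE behaviour can be written as a simple mapping. The sketch below is illustrative only: `SHORTHANDS` and `expand_shorthands` are hypothetical names, not uu_grep code, and the function ignores the special case of a backslash inside a bracket expression.

```python
# Mapping documented by GNU grep: each shorthand is a synonym for a
# POSIX bracket expression.
SHORTHANDS = {
    "s": "[[:space:]]",
    "S": "[^[:space:]]",
    "w": "[_[:alnum:]]",
    "W": "[^_[:alnum:]]",
}


def expand_shorthands(pattern: str) -> str:
    """Rewrite \\s, \\S, \\w, \\W into their bracket-expression synonyms.

    Hypothetical helper. Other escapes (including an escaped backslash)
    are copied through unchanged. Simplification: backslashes inside an
    existing [...] expression are not treated specially.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # Known shorthand: emit the class. Anything else: keep the pair.
            out.append(SHORTHANDS.get(nxt, ch + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(expand_shorthands(r"a\sb"))
```

Writing the bracket expression directly (for example `[[:space:]]` instead of `\s`) may serve as a portable spelling in the meantime, though whether uu_grep matches GNU for those expressions is an assumption not verified in this report.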

Found by the differential fuzzer (fuzz_grep).

Rust (incorrect)

$ printf 'a b\nxy\n' | ./target/release/grep -e '\s'
# Output: (none) — '\s' is treated as the literal letter 's'
# Exit code: 1

GNU (correct)

$ printf 'a b\nxy\n' | LC_ALL=C /usr/bin/grep -e '\s'
a b
# Exit code: 0

With -E both agree, and \w/\W agree in both modes. The count form makes the gap obvious. For the three-line input aS b / (a whitespace-only line) / x, grep -c '\S' prints 2 in GNU but 1 in uu_grep, and grep -c '\s' prints 2 in GNU but 0 in uu_grep.
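The counts above can be reproduced with a small model. This is a sketch under stated assumptions: Python's `re` module stands in for the whitespace-class reading of `\s`/`\S` (it is not GNU's regex engine), a plain literal `s`/`S` stands in for the behaviour reported for uu_grep, and the middle input line is assumed to be a single space. `LINES` and `count_matching` are hypothetical names.

```python
import re

# Assumed input: three lines, the middle one taken to be a single space.
LINES = ["aS b", " ", "x"]


def count_matching(pattern: str, lines=LINES) -> int:
    """Like grep -c: count the lines that contain at least one match."""
    rx = re.compile(pattern)
    return sum(1 for line in lines if rx.search(line))


# Whitespace-class interpretation (the behaviour reported for GNU grep).
expected = {r"\S": count_matching(r"\S"), r"\s": count_matching(r"\s")}

# Literal-letter interpretation (the behaviour reported for uu_grep in BRE).
literal = {r"\S": count_matching("S"), r"\s": count_matching("s")}

print("class interpretation:  ", expected)
print("literal interpretation:", literal)
```

Under these assumptions the class interpretation yields 2 and 2, and the literal interpretation yields 1 and 0, matching the counts reported above.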
