handles --binary-files=without-match on invalid-UTF-8 lines differently from GNU (treats a matching line as binary and suppresses it) #9

@sylvestre

When --binary-files=without-match (a.k.a. -I) is given and a matching line
contains an invalid UTF-8 byte, uu_grep classifies the file as binary and
suppresses all output, exiting 1 (no match). GNU grep, under LC_ALL=C, does
not treat a lone high byte like \x9d as binary: it prints the matching line and
exits 0.

Rust (incorrect)

$ printf 'a\x9db\n' | ./target/release/grep --binary-files=without-match a
# Output: (none)
# Exit code: 1

GNU (correct)

$ printf 'a\x9db\n' | LC_ALL=C /usr/bin/grep --binary-files=without-match a
# Output: a\x9db
# Exit code: 0

Confirmed against GNU grep 3.11. The same divergence occurs with a file argument
(grep --binary-files=without-match a file), not just stdin.
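
For reference, here is a minimal sketch of the classification rule that the GNU output above seems to imply. It is based on my reading of the GNU grep manual, which describes binary data as null bytes or bytes improperly encoded for the current locale. The function name `looks_binary` and the `utf8_locale` flag are hypothetical and are not taken from the uu_grep source. The sketch also assumes that no byte counts as an encoding error in the C locale.

```rust
/// Hypothetical helper (not from uu_grep): decide whether a buffer should be
/// treated as binary, given whether the active locale uses UTF-8.
fn looks_binary(buf: &[u8], utf8_locale: bool) -> bool {
    // A NUL byte marks the data as binary regardless of locale.
    if buf.contains(&0) {
        return true;
    }
    // Assumption: invalid UTF-8 is an encoding error only in a UTF-8 locale.
    // In the C locale every byte is taken to be a valid character.
    utf8_locale && std::str::from_utf8(buf).is_err()
}

fn main() {
    let line = b"a\x9db\n";
    // C locale: the lone 0x9d byte does not make the input binary.
    assert!(!looks_binary(line, false));
    // UTF-8 locale: the same byte is an encoding error.
    assert!(looks_binary(line, true));
    // A NUL byte is binary in either case.
    assert!(looks_binary(b"a\0b\n", false));
    println!("ok");
}
```

Under this rule, `-I` would suppress the `a\x9db` line only in a UTF-8 locale, which would match the GNU result shown above for `LC_ALL=C`.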
