Skip to content

cksum: Match GNU behavior for digest length errors in --check mode#11499

Merged
cakebaker merged 2 commits intouutils:mainfrom
RenjiSann:cksum-untagged-mismatch
Mar 27, 2026
Merged

cksum: Match GNU behavior for digest length errors in --check mode#11499
cakebaker merged 2 commits intouutils:mainfrom
RenjiSann:cksum-untagged-mismatch

Conversation

@RenjiSann
Copy link
Copy Markdown
Collaborator

This PR fixes #11202, as well as a few other related issues

@RenjiSann RenjiSann requested a review from sylvestre March 26, 2026 00:50
@RenjiSann RenjiSann self-assigned this Mar 26, 2026
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skip an intermittent issue tests/cut/bounded-memory (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/symlink (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq bot commented Mar 26, 2026

Merging this PR will not alter performance

✅ 300 untouched benchmarks
⏩ 46 skipped benchmarks1


Comparing RenjiSann:cksum-untagged-mismatch (6723393) with main (32b93b6)

Open in CodSpeed

Footnotes

  1. 46 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Copy link
Copy Markdown
Contributor

@AldanTanneo AldanTanneo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lots of confusing naming regarding bit/byte length, see comments

also a minor correctness issue

(AlgoKind::Shake128 | AlgoKind::Shake256, Some(bit_len)) => Some(bit_len.div_ceil(8)),
(AlgoKind::Shake128, None) => Some(sum::Shake128::DEFAULT_BIT_SIZE.div_ceil(8)),
(AlgoKind::Shake256, None) => Some(sum::Shake256::DEFAULT_BIT_SIZE.div_ceil(8)),
(AlgoKind::Blake2b | AlgoKind::Blake3, Some(byte_len)) => Some(byte_len * 8),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

digest_char_length_hint now stores a bit length instead of a byte length hint? the name should probably change then.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

return None;
}

let byte_len_hint = bit_len_hint.map(|n| n / 8);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the division should probably be a .div_ceil(8), since for instance 4 bits still take a full byte to be represented in a truncated XOF output.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, thanks 👍

// If the algorithm is SHA2 and no length hint was provided, compute it
// from the size of the expected checksum.
let algo_byte_len = algo_byte_len
.or_else(|| (algo_kind == AlgoKind::Sha2).then(|| expected_checksum.len() * 8));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong variable name? why do we multiply by 8 if this is a byte length

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should change the name, but when algo is Blake, this variable holds bytes 😅

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#11500 is direly needed 😭

/// When checking untagged format lines, non-XOF non-legacy algorithms
/// should report "improperly formatted lines" if the digest length isn't
/// equivalent to this.
pub fn expected_digest_byte_len(self) -> Option<usize> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong function name: returns a bit length, not a byte length.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed, thanks

@RenjiSann RenjiSann force-pushed the cksum-untagged-mismatch branch from ac866ea to 59977fb Compare March 26, 2026 09:45
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skip an intermittent issue tests/tty/tty-eof (fails in this run but passes in the 'main' branch)
Note: The gnu test tests/expand/bounded-memory is now being skipped but was previously passing.

@RenjiSann RenjiSann requested a review from AldanTanneo March 26, 2026 11:03
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skip an intermittent issue tests/cut/bounded-memory (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/pr/bounded-memory (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/tail/tail-n0f is now being skipped but was previously passing.
Congrats! The gnu test tests/cp/link-heap is now passing!
Note: The gnu test tests/misc/write-errors was skipped on 'main' but is now failing.

@RenjiSann RenjiSann force-pushed the cksum-untagged-mismatch branch from 67e1369 to 4ebde21 Compare March 26, 2026 15:04
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skipping an intermittent issue tests/cut/bounded-memory (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/tail/tail-n0f is now passing!

Copy link
Copy Markdown
Contributor

@AldanTanneo AldanTanneo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now (modulo the upcoming bit/byte uniformization)

Comment on lines +609 to +626
fn test_check_sha_tagged_no_hint() {
// When checking a tagged line with SHA2 but with missing length, guess
// from the size of the digest, and raise "improperly formatted" if the
// size isn't a valid SHA length.

let (at, mut ucmd) = at_and_ucmd!();
at.touch("a");
at.touch("b");

let checksum = "SHA2 (a) = d14a028c2a3a2bc9476102bb288234c415a2b01f828ea62ac5b3e42f";
let invalid = "SHA2 (b) = xxxx";

ucmd.arg("-c")
.pipe_in(format!("{checksum}\n{invalid}"))
.succeeds()
.stdout_contains("a: OK")
.stderr_contains("WARNING: 1 line is improperly formatted");
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test is incorrect because I get a failure when running the command manually with GNU cksum:

$ touch a b
$ printf "SHA2 (a) = d14a028c2a3a2bc9476102bb288234c415a2b01f828ea62ac5b3e42f\nSHA2 (b) = xxxx" | cksum -c
cksum: 'standard input': no properly formatted checksum lines found
$ echo $?
1

If I replace SHA2 with SHA224, however, I get the same result as in the test:

$ printf "SHA224 (a) = d14a028c2a3a2bc9476102bb288234c415a2b01f828ea62ac5b3e42f\nSHA224 (b) = xxxx" | cksum -c
a: OK
cksum: WARNING: 1 line is improperly formatted

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a mistake somewhere when testing. I'm investigating

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like I hallaucinated this feature, I pushed a fix


let (at, mut ucmd) = at_and_ucmd!();
at.touch("null");
at.write("one", "A");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why you create this file as it isn't used afterwards.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was meant to be the filename used in the invalid checksum, thanks for noticing it !
I changed the test a bit to use both files, and I renamed them to a and b.

.pipe_in(format!("{invalid_checksum}\n{good_checksum}"))
.succeeds()
.stdout_contains("null: OK")
.stdout_does_not_contain("one: FAILED")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous comment.

@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skip an intermittent issue tests/pr/bounded-memory (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/tail/tail-n0f is now being skipped but was previously passing.
Congrats! The gnu test tests/tail/pipe-f is now passing!

@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skip an intermittent issue tests/cut/bounded-memory (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/pr/bounded-memory (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/cp/link-heap is now being skipped but was previously passing.
Note: The gnu test tests/printf/printf-surprise is now being skipped but was previously passing.
Note: The gnu test tests/rm/many-dir-entries-vs-OOM is now being skipped but was previously passing.
Congrats! The gnu test tests/tail/pipe-f is now passing!

Copy link
Copy Markdown
Contributor

@cakebaker cakebaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good. Can you please do a rebase and reduce the number of commits?

@RenjiSann RenjiSann force-pushed the cksum-untagged-mismatch branch from c0c831f to 6723393 Compare March 27, 2026 14:01
@RenjiSann
Copy link
Copy Markdown
Collaborator Author

It looks good. Can you please do a rebase and reduce the number of commits?

Did it, and rebased at the same time.

@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skip an intermittent issue tests/date/date-locale-hour (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tty/tty-eof (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/date/resolution (passes in this run but fails in the 'main' branch)

@cakebaker cakebaker merged commit e104fc8 into uutils:main Mar 27, 2026
161 of 162 checks passed
@cakebaker
Copy link
Copy Markdown
Contributor

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sha256sum/sha1sum fail if file contains mixed-hash types

3 participants