Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explanation of ”Comparison result is always the same“ in PR is technically correct, but unclear #11744

Open
ryao opened this issue Dec 19, 2022 · 3 comments
Labels
acknowledged GitHub staff acknowledges this issue C++ question Further information is requested

Comments

@ryao
Copy link

ryao commented Dec 19, 2022

libarchive/libarchive#1818

Here is a succinct version of the code involved:

	size_t		 min_frame_size;
	size_t		 max_frame_size;
...
		if (min_frame_size < 0 ||
		    min_frame_size >= (intmax_t)SIZE_MAX) {
...
}
...
		if (max_frame_size < 1024 ||
		    max_frame_size >= (intmax_t)SIZE_MAX) {
...

In a PR that a colleague had opened, CodeQL correctly flagged both min_frame_size >= (intmax_t)SIZE_MAX and max_frame_size >= (intmax_t)SIZE_MAX and gave technically correct descriptions of the problem.

However, the colleague who saw this thought it had flagged a bug, but was wrong in explaining why. More specifically, when he told a number of us about this, we initially thought that CodeQL just got lucky by having a false positive on something that needed correction, but not for the reason CodeQL originally found it.

The explanations offered in the PR were:

  • Comparison is always true because min_frame_size >= 0.
  • Comparison is always true because max_frame_size >= 1024.

I went back and forth from initially believing him, to thinking that CodeQL was right, to thinking I was wrong to say that CodeQL was right when my initial explanation that CodeQL was right was found to be wrong to confirming that CodeQL was right. Honestly, I felt that I had made a spectacle of myself midway through this process when my initial explanation that CodeQL was right turned out to be incorrect.

The reason why CodeQL is right is that on a 32-bit system, SIZE_MAX is 4294967295U. Cast this to a signed integer, and you are comparing against -1.

On a 32-bit system, his code was doing min_frame_size >= -1 and max_frame_size >= -1, which are always true. The same is true for 64-bit systems, although with the higher magnitude value 2^64 - 1 becoming -1.

When the first condition is false, then we know that min_frame_size >= 0 in the first one or max_frame_size >= 1024 in the second, so CodeQL was correct to point this out. However, without a more detailed explanation, it is easy for programmers to misunderstand the report.

I had already been aware of this since I read about this exact bug from something published by pvs-studio.com a month ago, but I could not remember it at first and after remembering that I had previously seen something on that site that resembled this, I could not find it when I looked. Had I actually found that explanation, it would have saved me some minor grief from thinking that I had made a spectacle of myself by digging deeper into it.

I suspect that there is a better explanation provided in the code scanning reports that only those with write access to the repository can see, but even if that is a better explanation, it is not helpful here because the contributor who opened the PR that CodeQL flagged cannot see that. This might be less of a problem if #11021 were fixed.

@ryao ryao added the question Further information is requested label Dec 19, 2022
@ryao
Copy link
Author

ryao commented Dec 19, 2022

It would have been better if the explanations were:

  • Comparison is always true because min_frame_size >= 0 and typecast causes comparison to become min_frame_size >= -1.
  • Comparison is always true because max_frame_size >= 1024 and typecast causes comparison to become max_frame_size >= -1.

@jketema
Copy link
Contributor

jketema commented Dec 19, 2022

Hi @ryao. Thanks for your observations. I agree with you that the alert message can be much improved.

I suspect that there is a better explanation provided in the code scanning reports that only those with write access to the repository can see

This is not the case. For every query we just return a location and a message (and you already found those) and for some queries this is enriched with one or more paths that tell you were data is coming from. The latter does not apply to the query at hand and, in fact, the explanations you mention are all there is.

I filed an internal ticket for improving the query explanation.

@ryao
Copy link
Author

ryao commented Dec 19, 2022

@jketema Thanks for the prompt reply and for letting me know about the code scanning report saying nothing else. That saved me from forking to check out of curiosity.

That said, should this be marked with the acknowledged and C++ labels?

@jketema jketema added the C++ label Dec 19, 2022
@hmakholm hmakholm added the acknowledged GitHub staff acknowledges this issue label Apr 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acknowledged GitHub staff acknowledges this issue C++ question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants