Skip to content

Conversation

@ihsinme
Copy link
Contributor

@ihsinme ihsinme commented Nov 28, 2020

Good afternoon.
This is my first query. Therefore, I am ready for criticism and correction. Hope for hints.

This error occurs quite often and can have consequences when working with memory, when the size of a piece is calculated as a difference.

into experimental this reveals a dangerous comparison
@jbj
Copy link
Contributor

jbj commented Nov 30, 2020

Thanks for the contribution! I'm curious to see the results of this query in practice.

Our contribution guidelines for experimental are in https://github.com/github/codeql/blob/main/CONTRIBUTING.md, and the one I'm most interested in is step 5: The query must have at least one true positive result on some revision of a real project.

To get you started, I've set the query running on 97 real-world projects: https://lgtm.com/query/240395435018513940/. You can use that URL to see the results and to modify the query and re-run it on those same projects. I've made one change to your query already, replacing getType() with getFullyConverted().getUnspecifiedType() to fix false positives like this one.

Please look through the results, revise the query as necessary, and report on how many good and bad results there are. A good result is one that the project maintainers are likely to fix. As you look over the results, I recommend that you consider whether the query and/or its results are applicable for a bounty from GitHub Security Lab: https://securitylab.github.com/bounties.

@geoffw0
Copy link
Contributor

geoffw0 commented Nov 30, 2020

This is an interesting query.

A quick sample of results looks promising. There are some of the form @jbj spotted:

(int)(unsigned_value1 - unsigned_value2) >= 0

where the cast is 'too late' in that the subtraction has already been performed on unsigned numbers, but both unsigned overflows and the conversion back to signed are well defined and probably produce the correct result in the end. We should exclude these cases.

For the > 0 cases there's a risk that in some cases negative numbers were never expected and this is just a (correct) alternative way of writing != 0. If this proves to be a common issue in practice we might be able to make improvements using range analysis.

For the >= 0 case I'm keen to see any result that is both correct and not already detected by cpp/unsigned-comparison-zero.

I ran some statistics comparing this query to cpp/constant-comparison and cpp/unsigned-comparison-zero, to my surprise in this small run the query has quite a lot of results but no overlaps with the other two queries.

@ihsinme ihsinme closed this Nov 30, 2020
@ihsinme
Copy link
Contributor Author

ihsinme commented Nov 30, 2020

I'm sorry I'm probably too worried.

@ihsinme ihsinme reopened this Nov 30, 2020
@ihsinme
Copy link
Contributor Author

ihsinme commented Nov 30, 2020

again sorry for accidentally closing.

Thanks for the contribution! I'm curious to see the results of this query in practice.

thanks for your recommendations, I'm working on them now.

Our contribution guidelines for experimental are in https://github.com/github/codeql/blob/main/CONTRIBUTING.md, and the one I'm most interested in is step 5: The query must have at least one true positive result on some revision of a real project.

I want to draw your attention to the fact that I have one positive experience with this error. openssl/openssl#13515.

To get you started, I've set the query running on 97 real-world projects: https://lgtm.com/query/240395435018513940/. You can use that URL to see the results and to modify the query and re-run it on those same projects. I've made one change to your query already, replacing getType() with getFullyConverted().getUnspecifiedType() to fix false positives like this one.

I also plan to select some of the triggers from https://lgtm.com/query/240395435018513940/ and suggest a fix for them, but it will take time.

Please look through the results, revise the query as necessary, and report on how many good and bad results there are. A good result is one that the project maintainers are likely to fix. As you look over the results, I recommend that you consider whether the query and/or its results are applicable for a bounty from GitHub Security Lab: https://securitylab.github.com/bounties.

thanks for the hint. I ask you to clarify whether I should wait for the acceptance of PR or can I send an application now.

@ihsinme
Copy link
Contributor Author

ihsinme commented Nov 30, 2020

This is an interesting query.

thanks.

A quick sample of results looks promising. There are some of the form @jbj spotted:

(int)(unsigned_value1 - unsigned_value2) >= 0

where the cast is 'too late' in that the subtraction has already been performed on unsigned numbers, but both unsigned overflows and the conversion back to signed are well defined and probably produce the correct result in the end. We should exclude these cases.

I agree that this code will work correctly. I will think about how to exclude it.

For the > 0 cases there's a risk that in some cases negative numbers were never expected and this is just a (correct) alternative way of writing != 0. If this proves to be a common issue in practice we might be able to make improvements using range analysis.

For the >= 0 case I'm keen to see any result that is both correct and not already detected by cpp/unsigned-comparison-zero.

absolutely right. but I suggested that this exception be included in the description. since in my opinion, the use of ranges does not guarantee the elimination of false detection.

I ran some statistics comparing this query to cpp/constant-comparison and cpp/unsigned-comparison-zero, to my surprise in this small run the query has quite a lot of results but no overlaps with the other two queries.

I will definitely watch it.

@jbj
Copy link
Contributor

jbj commented Nov 30, 2020

I want to draw your attention to the fact that I have one positive experience with this error. openssl/openssl#13515.

If this is detected by the query, that's good enough for experimental.

I agree that this code will work correctly. I will think about how to exclude it.

It's excluded by the getType change I made to your query.

I ask you to clarify whether I should wait for the acceptance of PR or can I send an application now.

If you're going for The Bug Slayer (discover a new vulnerability), my understanding is that you need to go through the CVE process for a specific bug first and then open an issue. If you're going for All for one, one for all (add a new query), you should open an issue now.

@ihsinme
Copy link
Contributor Author

ihsinme commented Dec 1, 2020

analyzing project data, I created 7 PRs in different projects. I hope to demonstrate their result soon.

I would like to separately note the situation in which PR will probably not be accepted. it is the use of macros for comparison. like in the project radareorg / cutter #define R_MAX (x, y) (((x)> (y))? (x) :( y)). I would not call it a false detection.

@jbj
Copy link
Contributor

jbj commented Dec 1, 2020

It's a pattern we often see that seemingly nonsensical code comes out of macro expansions. Here's an example of how to exclude it:

. Requires
import semmle.code.cpp.commons.Exclusions

@ihsinme
Copy link
Contributor Author

ihsinme commented Dec 2, 2020

good afternoon.
sorry for the little delay with the analysis report.

  1. as I said above, I sent 7 PRs for projects. the results are as follows.
    https://gitlab.com/gnutls/gnutls/-/merge_requests/1364 took PR
    fix invalid unsigned arithmetic. videolan/vlc#118 agreed, thanked but indicated that he does not accept patches through the git hub
    fix invalid unsigned arithmetic. libretro/libretro-uae#351 agreed, but indicated that the error was not in his code and the project was no longer supported
    fix invalid unsigned arithmetic. tesseract-ocr/tesseract#3163 during the analysis, I found that the expression is always greater than zero, but for some reason I started refactoring.

the following projects still haven't responded.
alibaba/AliSQL#106
spring/spring#549
ryanmelt/qtbindings#171

  1. relatively

For the > 0 cases there's a risk that in some cases negative numbers were never expected and this is just a (correct) alternative way of writing != 0. If this proves to be a common issue in practice we might be able to make improvements using range analysis.
I still ask you to consider the possibility of leaving this false detection, indicating it in the help file. in my opinion, this can be useful for researchers, as evidenced by the tesseract-ocr experience indirectly. at the same time I am doubtful about the possibilities of the ranges to exclude this situation completely.

  1. as I already wrote, the willingness to make changes to macro comparisons is questionable.
    therefore, I tend to exclude them from analysis using @jbj's hint.

  2. relatively

For the >= 0 case I'm keen to see any result that is both correct and not already detected by cpp/unsigned-comparison-zero.
I didn’t find such a response in manual analysis, but I draw your attention to the fact that in the help file I also describe this situation.

  1. regarding statistic analysis, I am in the process. but I do not think that it will affect this PR.

at the end I ask you to consider that this is my first experience working with you. and I would be grateful for a hint in my next steps.

@jbj
Copy link
Contributor

jbj commented Dec 3, 2020

Thanks, @ihsinme! The next step is to get this PR merged. All that's formally required is that you autoformat the QL code: in VSCode, press Ctrl+Shift+P and type Format Document<Enter>. I also strongly encourage you to incorporate the changes discussed on this PR: exclude macros and replace getType() with getFullyConverted().getUnspecifiedType().

You've opened github/securitylab#208 to get a bounty for getting the query merged, and the ambition level there is up to you. You can create follow-up PRs to make the query more production-ready, following the steps in https://github.com/github/codeql/blob/main/docs/supported-queries.md. The query in its current state might already be eligible for a bounty (that's up to Security Lab, not me), but a more production-ready query would be eligible for a higher bounty. You can indicate on github/securitylab#208 that you're done improving the query. Until then, we'll continue to review any PRs you make to improve the query.

@ihsinme
Copy link
Contributor Author

ihsinme commented Dec 4, 2020

First of all, I am grateful to you for your benevolent attitude towards the newcomer. it really matters to me.
secondly, I corrected my PR, if something is wrong, tell me.
thirdly, unfortunately I write in notepad and I don't have VSCode. if I can do the alignment differently, tell me.

jbj
jbj previously approved these changes Dec 4, 2020
Copy link
Contributor

@jbj jbj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

The formatting looks good to me, so I've triggered the CI tests. When they pass, I'm happy for this PR to be merged.

jbj added 2 commits December 4, 2020 13:25
Tweak whitespace, also in the alert message.
@jbj jbj merged commit bc340e2 into github:main Dec 4, 2020
@rdmarsh2
Copy link
Contributor

rdmarsh2 commented Dec 4, 2020

The autoformatter can also be run via the cli: codeql query format -i <file> will format and overwrite the file, other options can be used to print the formatted result to standard out or to a different file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants