-
Notifications
You must be signed in to change notification settings - Fork 1.7k
CPP: Add query for CWE-369: Divide By Zero. #10431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cpp/ql/src/experimental/Security/CWE/CWE-369/DivideByZeroUsingReturnValue.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/experimental/Security/CWE/CWE-369/DivideByZeroUsingReturnValue.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/experimental/Security/CWE/CWE-369/DivideByZeroUsingReturnValue.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/experimental/Security/CWE/CWE-369/DivideByZeroUsingReturnValue.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/experimental/Security/CWE/CWE-369/DivideByZeroUsingReturnValue.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/experimental/Security/CWE/CWE-369/DivideByZeroUsingReturnValue.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/experimental/Security/CWE/CWE-369/DivideByZeroUsingReturnValue.ql
Fixed
Show fixed
Hide fixed
cpp/ql/src/experimental/Security/CWE/CWE-369/DivideByZeroUsingReturnValue.ql
Fixed
Show fixed
Hide fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ihsinme, sorry it took me a while to getting around to looking at this. I agree that there's a performance problem that needs to be fixed here - in the vast majority of projects I tried this query on, it timed out. I was able to produce a handful of promising results, however.
cpp/ql/src/experimental/Security/CWE/CWE-369/DivideByZeroUsingReturnValue.ql
Outdated
Show resolved
Hide resolved
@geoffw0, thanks for the idea, I fixed it. it would be great if you could give advice on how to compare the speed of different sections of code, different predicates, functions, or at least different queries. Expr getMulDivOperand(Expr e1) {
result = e1 or
result = e1.(MulExpr).getAnOperand() or
result = e1.(DivExpr).getLeftOperand()
}
...
getMulDivOperand(globalValueNumber(div.getRV()).getAnExpr()) = findVal and is the slowest and I don't know if it will work faster if I change it to Expr getMulDivOperand(Expr e1) {
result = globalValueNumber(e1).getAnExpr() or
result = globalValueNumber(e1).getAnExpr().(MulExpr).getAnOperand() or
result = globalValueNumber(e1).getAnExpr().(DivExpr).getLeftOperand()
}
...
getMulDivOperand(div.getRV()) = findVal and in general, it is very difficult for me to understand how the optimizer will behave in a given situation, but simply comparing the execution time is not the best idea (there is caching and the next run may be faster than the previous one) Of course, I can split this query into 2 or more and they will work better, but it's not very pretty. |
There's a document about troubleshooting performance issues here: https://codeql.github.com/docs/writing-codeql-queries/troubleshooting-query-performance/ That said its tricky, there tends to be quite a lot of wobble in timings and QL is optimized globally so a change can have a large and unexpected effect on something else far away. There is usually something we can ultimately 'blame' and fix, and usually its a predicate that's generating a very large set of results when it doesn't need to (such as a 'cartesian product'). You can look at the evaluator log (in particular the "Most expensive predicates" table) for some hints about where to start. You can access this after running a query in VSCode by right-clicking on the query run in the query history, "Show Evaluator Log (Summary Text)".
Not necessarily. We generally only split queries when that is clearer or more useful to the end user, not for performance reasons. |
My intuition is that your first version of |
@geoffw0 i made some adjustments, please rescan the code. tell me what this line Thank you. |
I don't know, I saw something similar in the evaluator log myself. I'll see if I can find someone who can explain it... |
Hi again @ihsinme, Apparently when a predicate is large and complicated (particularly lots of negation, aggregates or nested disjunction) the evaluator sometimes breaks it up, and the component parts don't always get given useful names. That's where None of this is really a problem in itself, but if you break up some of your bigger predicates you might get more directly useful output from the evaluator log. |
Hi @geoffw0.
Thanks. |
Global value numbering is expensive to compute but shouldn't be causing a timeout by itself - and should be cached once its been computed for one query on any particular project.
I don't understand what you're trying to say here.
It may be that in each place where Have you tried the middle road of making it:
and replacing each use of
Its not so simple as one thing calling another. Predicates may be inlined into other predicates, evaluated differently relative to different contexts, broken into parts by the optimizer etc. You can look through the log file for areas that took a long time (clock time) and then look for |
@geoffw0 thanks for the direction, if I can interpret the entry however, the caching issue remains, ie. the first time it returns some values, but the second time they are nulled or changed. |
That depends how your running your queries. In VSCode, you can press control+shift+P and select the command "CodeQL: Clear Cache". If you're running from the command line I believe there are arguments to do it. |
@geoffw0 look please. I think I fixed the problem with speed. |
@geoffw0 i hope you have time to look at my corrections. in addition, we can discuss the addition of |
Yep, sorry, I'll take a look today... |
This is definitely an improvement. I tried the query on several projects locally, all of which timed out before the change, and now one of them finishes (the others still time out). 👍 It appears much of the remaining slowness is caused by a large evaluation of
to:
which tells the optimizer to figure out the right-hand-side of the equality first, then feed it into the left (and not vice-versa). Please keep use of There still appears to be a problem with the way
I'm not sure if this helps or makes no difference. Certainly doesn't seem to do any harm. |
thanks for the idea i will think about it |
hi @geoffw0. fixed predicate |
@geoffw0 I'm afraid you won't see it. |
I know, I feel a bit sad about it given the amount I've used the platform. Still, code scanning lives on... This branch is still timing out on a little over half of the runs. I feel like we're getting closer but we're not quite there. |
I improved the query, now all projects from my collection are scanned. |
@geoffw0 take a look, please |
Hi @ihsinme - the new version is definitely an improvement on smaller projects but still doesn't work well at all on medium-to-large projects. It looks like the issue has to do with the liberal use of global value numbering in this query. I have struggled to find any significant improvements without just removing some of these GVN uses. However I am not sure which ones are essential for getting good results from this query and which ones are not. |
@geoffw0 thanks for the answer. |
The query finished on a little under half the projects I ran it on. Some (arbitrary) examples where it didn't finish: |
Good afternoon @geoffw0 . |
That sounds like a good tradeoff. Feel free to leave comments describing the removed logic in case we decide to repair it in future. |
@geoffw0 could you tell me if the current version of the request meets the requirements for speed. Thank you. |
Hi @ihsinme , yes, this is performing nicely now and completed on all the projects I tried it on! Please do update the tests then I'll run the PR checks. |
@geoffw0 i fixed the test suite. |
QHelp previews: cpp/ql/src/experimental/Security/CWE/CWE-369/DivideByZeroUsingReturnValue.qhelpDivide by zero using return valuePossible cases of division by zero when using the return value from functions. ExampleThe following example shows the use of a function with an error when using the return value and without an error. ...
a = getc(f);
if (a < 123) ret = 123/a; // BAD
...
if (a != 0) ret = 123/a; // GOOD
...
References
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests passed, performance is good, I'm happy with this PR!
Thank you for your contribution.
@geoffw0 This PR is not merged, can I do anything? |
Sorry, I had approved it, I thought you could merge it. Merged. |
good afternoon.
this query should find situations where the return value from some function might become part of the denominator and result in a division by zero.
this error is widely represented in projects.
detecting:
CVE-2018-18190
.it is worth noting that the detection would be much wider, for example, it would cover colors such as
CVE-2021-34069
ifGlobalValueNumbering
could work withFieldAccess
.I also ask for help in optimizing this query, after trying to make it look laconic, I got extremely poor performance, while quite trivial changes, such as moving
msg
to a common place or removingdivFn
, lead to an even greater slowdown.It is difficult for me to move without having an understanding of the work of the optimizer, so I ask for any advice. if you don’t have them, I’ll have to move back, moving away from the laconic look.
Thanks.