-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Java: Add models for flow- and taint-preserving functions in Commons ObjectUtils #5344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Java: Add models for flow- and taint-preserving functions in Commons ObjectUtils #5344
Conversation
exists(DataFlow::Node src, DataFlow::Node sink, TaintFlowConf conf | conf.hasFlow(src, sink) | | ||
sink.getLocation() = location and | ||
element = sink.toString() and | ||
value = "y" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also match the empty string ""
here, see #5347
// but we don't support value-preserving varargs functions at the moment. | ||
"org.apache.commons.lang3;ObjectUtils;false;clone;;;Argument;ReturnValue;value", | ||
"org.apache.commons.lang3;ObjectUtils;false;cloneIfPossible;;;Argument;ReturnValue;value", | ||
"org.apache.commons.lang3;ObjectUtils;false;CONST;;;Argument;ReturnValue;value", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also match CONST_BYTE
and CONST_SHORT
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe these won't work due to a type mismatch? Not 100% sure though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean they won't work for dataflow because CodeQL does not support dataflow with different types? Would it at least be good then to model them as taint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They should work just fine. The type system that's used for pruning data flow allows for implicit casts, it effectively does not distinguish between the different numeric primitive types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
Please elaborate. |
I tried defining value-preserving edges from the actual arguments to the implicit varargs array created at the callsite, and from the array to the return value, but no flow resulted, presumably due to type differences between the array and its elements? |
18af0ff
to
80b0ab7
Compare
@aschackmull applied the above comments, believe this is ready |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably had some internal discussion about this, but in case varargs arguments should be covered with value
, the following review comments mention the necessary changes.
Please ignore them if flow from a varargs argument should still be treated as taint
.
Though there is some ambiguity for dataflow then, e.g. consider this (contrived) example:
public static Object[] copy(Object[]... args) {
return args.clone();
}
public static Object[] get(Object[]... args) {
return args[r.nextInt(args.length)];
}
Unless there is some special CSV model syntax, for one of the methods dataflow cannot be modeled correctly.
Maybe it would make sense to introduce a syntax similar to Argument[...]
to indicate that the input / output is an argument of the implicit varargs array?
E.g. for the above example code:
org.example;MyClass;false;copy;;;Argument;ReturnValue;value
org.example;MyClass;false;get;;;Argument[...];ReturnValue;value
override predicate row(string row) { | ||
row = | ||
[ | ||
// Note all the functions annotated with `taint` flow really should have `value` flow, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment might now be redundant.
"org.apache.commons.lang3;ObjectUtils;false;CONST_BYTE;;;Argument;ReturnValue;value", | ||
"org.apache.commons.lang3;ObjectUtils;false;CONST_SHORT;;;Argument;ReturnValue;value", | ||
"org.apache.commons.lang3;ObjectUtils;false;defaultIfNull;;;Argument;ReturnValue;value", | ||
"org.apache.commons.lang3;ObjectUtils;false;firstNonNull;;;Argument;ReturnValue;taint", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"org.apache.commons.lang3;ObjectUtils;false;firstNonNull;;;Argument;ReturnValue;taint", | |
"org.apache.commons.lang3;ObjectUtils;false;firstNonNull;;;Argument;ReturnValue;value", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this has to be "taint" a while yet until we support proper specification of array contents. The argument is an array (possibly implicitly created), but the method does not return that exact array, so this is not value-preserving from Argument
to ReturnValue
. The proper value-preserving spec is from Content[Array] of Argument
to ReturnValue
, but that isn't supported yet.
"org.apache.commons.lang3;ObjectUtils;false;max;;;Argument;ReturnValue;taint", | ||
"org.apache.commons.lang3;ObjectUtils;false;median;;;Argument;ReturnValue;taint", | ||
"org.apache.commons.lang3;ObjectUtils;false;min;;;Argument;ReturnValue;taint", | ||
"org.apache.commons.lang3;ObjectUtils;false;mode;;;Argument;ReturnValue;taint", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"org.apache.commons.lang3;ObjectUtils;false;max;;;Argument;ReturnValue;taint", | |
"org.apache.commons.lang3;ObjectUtils;false;median;;;Argument;ReturnValue;taint", | |
"org.apache.commons.lang3;ObjectUtils;false;min;;;Argument;ReturnValue;taint", | |
"org.apache.commons.lang3;ObjectUtils;false;mode;;;Argument;ReturnValue;taint", | |
"org.apache.commons.lang3;ObjectUtils;false;max;;;Argument;ReturnValue;value", | |
"org.apache.commons.lang3;ObjectUtils;false;median;;;Argument;ReturnValue;value", | |
"org.apache.commons.lang3;ObjectUtils;false;min;;;Argument;ReturnValue;value", | |
"org.apache.commons.lang3;ObjectUtils;false;mode;;;Argument;ReturnValue;value", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. Same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but needs rebase to solve conflict.
…Utils. These should all be value-preserving, but we don't support value-preserving varargs methods yet.
Co-authored-by: Marcono1234 <Marcono1234@users.noreply.github.com>
04614d4
to
82a000b
Compare
@aschackmull rebased. |
|
||
override predicate hasActualResult(Location location, string element, string tag, string value) { | ||
tag = "hasTaintFlow" and | ||
exists(DataFlow::Node src, DataFlow::Node sink, Conf conf | conf.hasFlow(src, sink) | | ||
exists(DataFlow::Node src, DataFlow::Node sink, TaintFlowConf conf | conf.hasFlow(src, sink) | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well add and not any(ValueFlowConf conf | conf.hasFlow(src, sink))
to make the expected output shorter - there's no need to report taint-flow if we have value-flow as that will always be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
These should all be value-preserving, but we don't support value-preserving varargs methods yet.