Skip to content

Conversation

smowton
Copy link
Contributor

@smowton smowton commented Mar 5, 2021

These should all be value-preserving, but we don't support value-preserving varargs methods yet.

@smowton smowton requested a review from a team as a code owner March 5, 2021 18:04
exists(DataFlow::Node src, DataFlow::Node sink, TaintFlowConf conf | conf.hasFlow(src, sink) |
sink.getLocation() = location and
element = sink.toString() and
value = "y"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also match the empty string "" here, see #5347

// but we don't support value-preserving varargs functions at the moment.
"org.apache.commons.lang3;ObjectUtils;false;clone;;;Argument;ReturnValue;value",
"org.apache.commons.lang3;ObjectUtils;false;cloneIfPossible;;;Argument;ReturnValue;value",
"org.apache.commons.lang3;ObjectUtils;false;CONST;;;Argument;ReturnValue;value",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also match CONST_BYTE and CONST_SHORT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe these won't work due to a type mismatch? Not 100% sure though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean they won't work for dataflow because CodeQL does not support dataflow with different types? Would it at least be good then to model them as taint?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They should work just fine. The type system that's used for pruning data flow allows for implicit casts, it effectively does not distinguish between the different numeric primitive types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@aschackmull
Copy link
Contributor

but we don't support value-preserving varargs methods yet.

Please elaborate.

@smowton
Copy link
Contributor Author

smowton commented Mar 9, 2021

but we don't support value-preserving varargs methods yet.

Please elaborate.

I tried defining value-preserving edges from the actual arguments to the implicit varargs array created at the callsite, and from the array to the return value, but no flow resulted, presumably due to type differences between the array and its elements?

@smowton smowton force-pushed the smowton/feature/commons-object-utils branch from 18af0ff to 80b0ab7 Compare March 9, 2021 15:55
@smowton
Copy link
Contributor Author

smowton commented Mar 9, 2021

@aschackmull applied the above comments, believe this is ready

Copy link
Contributor

@Marcono1234 Marcono1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably had some internal discussion about this, but in case varargs arguments should be covered with value, the following review comments mention the necessary changes.
Please ignore them if flow from a varargs argument should still be treated as taint.

Though there is some ambiguity for dataflow then, e.g. consider this (contrived) example:

public static Object[] copy(Object[]... args) {
    return args.clone();
}

public static Object[] get(Object[]... args) {
    return args[r.nextInt(args.length)];
}

Unless there is some special CSV model syntax, for one of the methods dataflow cannot be modeled correctly.
Maybe it would make sense to introduce a syntax similar to Argument[...] to indicate that the input / output is an argument of the implicit varargs array?
E.g. for the above example code:

org.example;MyClass;false;copy;;;Argument;ReturnValue;value
org.example;MyClass;false;get;;;Argument[...];ReturnValue;value

override predicate row(string row) {
row =
[
// Note all the functions annotated with `taint` flow really should have `value` flow,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment might now be redundant.

"org.apache.commons.lang3;ObjectUtils;false;CONST_BYTE;;;Argument;ReturnValue;value",
"org.apache.commons.lang3;ObjectUtils;false;CONST_SHORT;;;Argument;ReturnValue;value",
"org.apache.commons.lang3;ObjectUtils;false;defaultIfNull;;;Argument;ReturnValue;value",
"org.apache.commons.lang3;ObjectUtils;false;firstNonNull;;;Argument;ReturnValue;taint",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"org.apache.commons.lang3;ObjectUtils;false;firstNonNull;;;Argument;ReturnValue;taint",
"org.apache.commons.lang3;ObjectUtils;false;firstNonNull;;;Argument;ReturnValue;value",

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this has to be "taint" a while yet until we support proper specification of array contents. The argument is an array (possibly implicitly created), but the method does not return that exact array, so this is not value-preserving from Argument to ReturnValue. The proper value-preserving spec is from Content[Array] of Argument to ReturnValue, but that isn't supported yet.

Comment on lines +397 to +420
"org.apache.commons.lang3;ObjectUtils;false;max;;;Argument;ReturnValue;taint",
"org.apache.commons.lang3;ObjectUtils;false;median;;;Argument;ReturnValue;taint",
"org.apache.commons.lang3;ObjectUtils;false;min;;;Argument;ReturnValue;taint",
"org.apache.commons.lang3;ObjectUtils;false;mode;;;Argument;ReturnValue;taint",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"org.apache.commons.lang3;ObjectUtils;false;max;;;Argument;ReturnValue;taint",
"org.apache.commons.lang3;ObjectUtils;false;median;;;Argument;ReturnValue;taint",
"org.apache.commons.lang3;ObjectUtils;false;min;;;Argument;ReturnValue;taint",
"org.apache.commons.lang3;ObjectUtils;false;mode;;;Argument;ReturnValue;taint",
"org.apache.commons.lang3;ObjectUtils;false;max;;;Argument;ReturnValue;value",
"org.apache.commons.lang3;ObjectUtils;false;median;;;Argument;ReturnValue;value",
"org.apache.commons.lang3;ObjectUtils;false;min;;;Argument;ReturnValue;value",
"org.apache.commons.lang3;ObjectUtils;false;mode;;;Argument;ReturnValue;value",

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Same as above.

Copy link
Contributor

@aschackmull aschackmull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but needs rebase to solve conflict.

smowton and others added 4 commits March 11, 2021 16:22
…Utils.

These should all be value-preserving, but we don't support value-preserving varargs methods yet.
Co-authored-by: Marcono1234 <Marcono1234@users.noreply.github.com>
@smowton smowton force-pushed the smowton/feature/commons-object-utils branch from 04614d4 to 82a000b Compare March 11, 2021 16:23
@smowton
Copy link
Contributor Author

smowton commented Mar 11, 2021

@aschackmull rebased.


override predicate hasActualResult(Location location, string element, string tag, string value) {
tag = "hasTaintFlow" and
exists(DataFlow::Node src, DataFlow::Node sink, Conf conf | conf.hasFlow(src, sink) |
exists(DataFlow::Node src, DataFlow::Node sink, TaintFlowConf conf | conf.hasFlow(src, sink) |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well add and not any(ValueFlowConf conf | conf.hasFlow(src, sink)) to make the expected output shorter - there's no need to report taint-flow if we have value-flow as that will always be the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@smowton smowton changed the title Add models for flow- and taint-preserving functions in Commons ObjectUtils Java: Add models for flow- and taint-preserving functions in Commons ObjectUtils Mar 12, 2021
@aschackmull aschackmull merged commit c9786df into github:main Mar 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants