Skip to content

Conversation

smowton
Copy link
Contributor

@smowton smowton commented Feb 9, 2021

No description provided.

@smowton smowton requested a review from a team February 9, 2021 14:23
@smowton smowton changed the title Add support for Apache Commons Lang StringUtils Java: Add support for Apache Commons Lang StringUtils Feb 9, 2021
}

override predicate returnsTaintFrom(int arg) {
arg = [0 .. getNumberOfParameters()] and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
arg = [0 .. getNumberOfParameters()] and
arg = [0 .. getNumberOfParameters() - 1] and

Not sure if this makes a difference, but since the index is 0-based, 1 has to be subtracted from the number of parameters.

Comment on lines 168 to 172
private Type getAnExcludedParameterType() {
result instanceof PrimitiveType or
result.(RefType).hasQualifiedName("java.nio.charset", "Charset") or
result.(RefType).hasQualifiedName("java.util", "Locale")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also want to exclude the primitive array types (besides char[]) in the join methods?

Comment on lines 174 to 193
bindingset[arg, methodName]
private predicate isExcludedParameter(string methodName, int arg) {
methodName.matches(["appendIfMissing%", "prependIfMissing%"]) and arg = [2, 3]
or
methodName.matches(["remove%", "split%", "substring%", "strip%"]) and arg != 0
or
methodName.matches(["chomp", "getBytes", "replace%", "toString", "unwrap"]) and arg = 1
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This predicate and getAnExcludedParameterType don't depend on this. Would it be better to make them top-level predicates?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, no need to make a spurious cartesian product like this (although the optimiser is likely able to fix this problem). Definitely make getAnExcludedParameterType top-level. As for this one, you could instead change methodName to this.getName(), that would also remove the need for bindingset.

void test() throws Exception {

// All these calls should convey taint to `sink` except as noted.
sink(StringUtils.abbreviate(taint(), 0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These sort of data flow tests quickly become unwieldy - perhaps it's time to slowly start using the inline test expectation library?

@smowton smowton force-pushed the smowton/feature/commons-stringutils branch 2 times, most recently from 247f098 to 1b80a97 Compare February 11, 2021 17:52
@smowton
Copy link
Contributor Author

smowton commented Feb 11, 2021

@aschackmull @joefarebrother all suggestions applied, including writing the binding for inline-expectations vs. Java.

(
this.getName().matches(["appendIfMissing%", "prependIfMissing%"]) and arg = [2, 3]
or
this.getName().matches(["remove%", "split%", "substring%", "strip%"]) and arg != 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we replace arg != 0 with arg = [1 .. getNumberOfParameters() - 1] then we don't need the first line of this predicate since then arg will be bound in all the disjuncts.

@aschackmull
Copy link
Contributor

The inline expectations library copy should be added in config/identical-files.json.

SafeJoinArrayType() { getName().regexpMatch("byte|double|float|int|long|short") }
}

private class ApacheStringUtilsTaintPreservingMethod extends TaintPreservingCallable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this might fit nicer in java/ql/src/semmle/code/java/frameworks/apache/Lang.qll. See also #5134

@smowton smowton force-pushed the smowton/feature/commons-stringutils branch from 1b80a97 to 985766d Compare February 12, 2021 15:00
@smowton
Copy link
Contributor Author

smowton commented Feb 12, 2021

@aschackmull all changes applied

@smowton smowton force-pushed the smowton/feature/commons-stringutils branch 2 times, most recently from 5fea772 to ae83a93 Compare February 15, 2021 14:30
@smowton
Copy link
Contributor Author

smowton commented Feb 15, 2021

@aschackmull changes applied except as noted

Comment on lines 73 to 75
private class SafeJoinArrayType extends PrimitiveType {
SafeJoinArrayType() { getName().regexpMatch("byte|double|float|int|long|short") }
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private class SafeJoinArrayType extends PrimitiveType {
SafeJoinArrayType() { getName().regexpMatch("byte|double|float|int|long|short") }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh. Mind approving so I can merge once this is done?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't approve just yet, as there are still changes I expect (and any push will remove approval). Also, could you please keep the review fixes in their own commits? Force-pushes make review difficult. (Force-pushing when rebasing is fine, though).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK will do in future. This change is pushed already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Protip btw, you can click the words "force-pushed" to see the diff before vs. after the push

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, but in my experience that doesn't always work very well. It seems like what you get is somewhat ill-defined and a best-effort approach at most.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. clicking on the second-to-last "force-pushed" shows me several unrelated changes, possibly related to a rebase, but if that's then mixed with actual changes then those parts are very hard to separate.

@smowton smowton force-pushed the smowton/feature/commons-stringutils branch from ae83a93 to 9a667dc Compare February 16, 2021 12:24
aschackmull
aschackmull previously approved these changes Feb 16, 2021
@aschackmull aschackmull merged commit 5188ad1 into github:main Feb 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants