Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JS: expand the js/exception-xss to handle more types of exceptional flow #2512

Merged
merged 16 commits into from
Dec 20, 2019

Conversation

erik-krogh
Copy link
Contributor

I expanded the scope of the js/exception-xss query.

The following new exceptional flows are considered:

  • Accessing a tainted field name on null/undefined.
    E.g: try {null[tainted]} catch(err) {..}
  • Callback functions with an error argument.
    E.g: foo(tainted, (err, v) => ..);
  • Exceptional flow from promise executor to catch handler.
    E.g: new Promise((resolve, reject) => {throw tainted}).catch((err) => {..})

The results are still largely untested.
I'm decently confident in the performance of the current version, but I haven't tested it yet.

I put a cache modifier on the isAdditionalFlowStep predicate as I otherwise got a massive performance penalty on some benchmarks (oldj/SwitchHosts in particular).
For some reason the Configuration::BarrierGuardNode::internalBlocks_dispred#ffb predicate began taking a LOT of time (the query is 4X slower on some benchmarks without the cache modifier).

Additionally I've added (and used) the getEnclosingCall predicate.
This predicate allows me to support the following pattern where the tainted value is inside some object/array when it escapes to an unknown function.

try {
   unknown({foo: tainted});
} catch(err) {
   element.innerHtml = err;
}

@erik-krogh erik-krogh added the JS label Dec 11, 2019
@erik-krogh erik-krogh requested a review from a team as a code owner December 11, 2019 10:33
@@ -65,16 +125,20 @@ module ExceptionXss {

override predicate isSanitizer(DataFlow::Node node) { node instanceof Xss::Shared::Sanitizer }

cached
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't immediately make sense to me: this predicate is only used in one query, so caching it is not useful. I suspect the performance benefits you were observing were due to cached preventing harmful magic. Could you try replacing this with pragma[nomagic]? If that has the same performance benefits, I would suggest using it instead since it more clearly communicates the intention.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding pragma[nomagic] on isAdditionalFlowStep does not fix the performance.

I also tried to add pragma[nomagic] to internalBlocks, and that got me from a ~4X slowdown to a ~2X slowdown.

I'll experiment some more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, there are a few other pragmas that cached implies, notably pragma[noinline].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a whole lot of benchmarking I found that adding pragma[noinline,nomagic] on Configuration::internalBlocks fixes the performance.
Adding just pragma[noinline] or pragma[nomagic] is not enough, both are needed to avoid a performance regression on oldj/SwitchHosts.

But with this change we will need a larger performance evaluation before we can merge.

Copy link
Contributor

@esbena esbena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am very uncertain about the getEnclosingCall predicate, both in terms of performance and precision. Can we perhaps delay merging that feature? It seems independent of the rest from the features of this PR.

@erik-krogh
Copy link
Contributor Author

I got a performance result (that includes the now removed getEnclosingCall).
And performance looks somewhat OK.

Copy link
Contributor

@esbena esbena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, with getEnclosingCall gone, I have some hope that the pragma[xyz] can go away at the end of this PR. Lets iterate once more on the error parameter characteristic first though.

@erik-krogh
Copy link
Contributor Author

Thanks, with getEnclosingCall gone, I have some hope that the pragma[xyz] can go away.

The pragma can go away now, I just did the experiment.

Copy link
Contributor

@esbena esbena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for pivoting and arguing with myself. I would like to get a second pair of eyes on the reachability analysis.

Copy link
Contributor

@esbena esbena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting close. My only comments now are about visibility, style and documentation.

)
}

DataFlow::Node getErrorParam() { result = errorParameter }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing docstring.

DataFlow::Node getErrorParam() { result = errorParameter }
}

// `someFunction(.. <pred> .., (<result>, value) => {...}).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing docstring.

)
}

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be private.

This predicates makes up a verbose alias for getExceptionTarget.
How about renaming this predicate to getExceptionTarget, and explaining in the docstring how it adds additional cases to the builtin syntactic Expr::getExceptionTarget?
Alternatively we should just inline this predicate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of renaming, and then explaining how it adds cases.

}

new Promise(resolve => inner(foo, resolve)).catch((e) => {
$('myId').html(e); // NOT OK
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor.
This line is definitely not indented correctly.
The are other similar indentation inconsistencies in this file.

Can you autoformat the entire file, and update the expected output?

/**
* A data flow edge from the exceptional return of the promise executor to the promise catch handler.
*/
class PromiseExceptionalStep extends DataFlow::AdditionalFlowStep {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this step really feature complete enough yet? Perhaps we should just keep it private, and make a note of the features that we are aware that it is missing:

  • support for flow of exceptions to the catch block of an await
  • support for flow of exceptions to the first catch-handler in a chain of promise-steps: new Promise(res, rej).then(..).then(..).catch(e => FLOWS_HERE).then(...).catch(e => BUT_NOT_HERE)

Also, since this is a default taint step for all TaintTracking::Configuration, I want to make sure that we have done an evaluation with --suite security.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put in a comment about what is missing.
I'll maybe return to it at some and add more flow edges.

I'll get an evaluation going.

erik-krogh and others added 3 commits December 16, 2019 13:32
Copy link
Collaborator

@max-schaefer max-schaefer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor suggestions, but overall this looks fairly uncontroversial. 👍

javascript/ql/src/semmle/javascript/StandardLibrary.qll Outdated Show resolved Hide resolved
/**
* A data flow edge from the exceptional return of the promise executor to the promise catch handler.
* This only adds an edge from the exceptional return of the promise Executor and to a `.catch()` handler.
* Missing are (at least):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like something that should be put into an issue, not a code comment.

or
exists(DataFlow::PropRef prop |
node = DataFlow::valueNode(prop.getPropertyNameExpr()) and
isNullOrUndefined(prop.getBase().analyze().getAType())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I'm not sure about this condition. Our type inference is very conservative, and in particular is likely to infer undefined as a possible type of just about everything (apart from constants and sometimes SSA variables).

I think you may want to strengthen this to forex(InferredType t | t = prop.getBase().analyze().getAType() | isNullOrUndefined(t)) to only cover those cases where we are quite sure the base is null-ish.

(Note that we don't want to use forall, since expressions in dead code sometimes don't have any types inferred for them.)

Comment on lines 102 to 103
* Gets the data-flow node where exceptions thrown by this expression will
* propagate if this expression causes an exception to be thrown.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence is a bit repetitive. How about shortening to "Gets the data-flow node to which any exceptions thrown by this expression will propagate"?

erik-krogh and others added 2 commits December 17, 2019 13:05
Co-Authored-By: Max Schaefer <54907921+max-schaefer@users.noreply.github.com>
Copy link
Collaborator

@max-schaefer max-schaefer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but we should probably do another (small) performance evaluation.

@erik-krogh
Copy link
Contributor Author

LGTM, but we should probably do another (small) performance evaluation.

I already had a larger performance evaluation running.
It looks somewhat OK to me.

@max-schaefer
Copy link
Collaborator

OK, that looks fairly healthy. Could you retry nuclide and JazminServer? Looks like that should be quick, and it should give us an idea of whether the reported slowdowns are likely to be flukes.

Also, what were the query failures on bwip-js about? And what about the build failures on babel?

@erik-krogh
Copy link
Contributor Author

OK, that looks fairly healthy. Could you retry nuclide and JazminServer? Looks like that should be quick, and it should give us an idea of whether the reported slowdowns are likely to be flukes.

The slowdowns seem to be flukes.
https://git.semmle.com/erik/dist-compare-reports/tree/profiling-erik-krogh.northeurope.cloudapp.azure.com_1576690361008

Also, what were the query failures on bwip-js about?

CWE-807/DifferentKindsComparisonBypass.ql ran into a timeout.
But the exact same timeout happened on both master and moarExceptions.

And what about the build failures on babel?

That has nothing to do with this PR.
The build failure comes from a new file in babel that we cannot parse.
(I'm creating an issue).

The following exception is thrown:

Exception while extracting /home/erik/dev/dist-compare/working/repositories/babel/packages/babel-parser/test/fixtures/core/categorized/invalid-assignment-pattern-2/input.js.
com.semmle.util.exception.CatastrophicError: Assignment patterns should not appear in the AST.
   at com.semmle.js.ast.DefaultVisitor.visit(DefaultVisitor.java:101)

@max-schaefer
Copy link
Collaborator

CWE-807/DifferentKindsComparisonBypass.ql ran into a timeout.
But the exact same timeout happened on both master and moarExceptions.

That is very concerning. Can you file an issue please?

@erik-krogh
Copy link
Contributor Author

@esbena can I get an approve?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants