Skip to content

Conversation

erik-krogh
Copy link
Contributor

Now the property searchParams is modeled using a pseudo-property.

Additionally I use another pseudo-property to model that calling a getter on a URLSearchParams retrieves the parsed parameters (see hiddenUrlPseudoProperty and its uses).

This means that the previous behavior is preserved, and tracking of the properties now work interprocedurally.

E.g. with the previous local handling of searchParams we didn't track flow out of this return (found using this query).

@erik-krogh erik-krogh added JS Awaiting evaluation Do not merge yet, this PR is waiting for an evaluation to finish labels Feb 4, 2020
Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the code a bit difficult to read so for now I'll just try an relay back my understanding of the code, so we can catch any misunderstandings.

The pseudo-property represents two things lumped into a single property (which is fine):

  1. the search parameters of a URL object
  2. the keys and values of a Map-like object (URLSearchParams)

So a property read x.searchParams is modelled as a load-store step of the pseudo-property in order to transition from case 1 to case 2 when accessing the search parameters of a URL object.

There's then a load step out of the pseudo-property when calling get().


Assuming I got that right, I like the approach. I'd like it if you could make the code a little more accessible, though. Perhaps also add support for handle fragment data while you're at it 👍

*/
predicate isUrlSearchParams(DataFlow::SourceNode params, DataFlow::Node input) {
private predicate isUrlSearchParams(DataFlow::SourceNode params, DataFlow::Node input) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a deprecated alias for this for backwards compatibility (even if it doesn't have the original behavior anymore, deprecation warnings are better than compilation errors).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll just remove the private modifier.
The name is still fitting for the behavior, so I don't feel like adding yet another alias.

override predicate step(DataFlow::Node pred, DataFlow::Node succ) {
pred = source and succ = this
isUrlSearchParams(succ, pred)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not bound in here (and likewise for the other member predicates)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see two ways of solving this.

  1. Do the same as StringManipulationTaintStep: extend DataFlow::ValueNode and just bind succ = this.
  2. Split out the taint-steps into multiple classes, and add a characteristic predicates for each of the new classes. (Will become even more verbose).

I went with 1) for now.

/**
* Holds if the property `prop` should be copied from the object `pred` to the object `succ`.
*
* This step is used to copy a value the value of our pseudo-property that can later be accessed using a `get` or `getAll` call.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* This step is used to copy a value the value of our pseudo-property that can later be accessed using a `get` or `getAll` call.
* This step is used to copy the value of our pseudo-property that can later be accessed using a `get` or `getAll` call.

*/
override predicate loadStoreStep(DataFlow::Node pred, DataFlow::Node succ, string prop) {
prop = hiddenUrlPseudoProperty() and
exists(DataFlow::PropRead write | write = succ |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this PropRead in a variable called write? 😕

@erik-krogh
Copy link
Contributor Author

I found the code a bit difficult to read so for now I'll just try an relay back my understanding of the code, so we can catch any misunderstandings.

The pseudo-property represents two things lumped into a single property (which is fine):

  1. the search parameters of a URL object
  2. the keys and values of a Map-like object (URLSearchParams)

So a property read x.searchParams is modelled as a load-store step of the pseudo-property in order to transition from case 1 to case 2 when accessing the search parameters of a URL object.

There's then a load step out of the pseudo-property when calling get().

Assuming I got that right, I like the approach. I'd like it if you could make the code a little more accessible, though. Perhaps also add support for handle fragment data while you're at it

Your understanding is correct.

I had the two cases lumped into the same pseudo-property because a load-store step only supported loading and storing the same property name.
I changed it such that a load-store step can load one property and store another.
I'm thereby better able to distinguish the two cases in the code.

I'll look into fragment data.

@erik-krogh
Copy link
Contributor Author

An evaluation was uneventful.

@erik-krogh erik-krogh marked this pull request as ready for review February 6, 2020 12:16
@erik-krogh erik-krogh requested a review from a team as a code owner February 6, 2020 12:16
@erik-krogh erik-krogh removed the Awaiting evaluation Do not merge yet, this PR is waiting for an evaluation to finish label Feb 6, 2020
@asgerf
Copy link
Contributor

asgerf commented Feb 20, 2020

Sorry for not following up on this earlier.

I like the solution, but there are a few issues that mean we're not realizing the full benefit of this change. Overall I'd like us to be able to flag this sample vuln:

function getUrl() {
    return new URL(document.location);
}
$(getUrl().hash.substring(1)); // NOT OK

There are two issues with this at the moment:

  • The .hash property is a sanitizer in the Xss query, which is very restrictive, but was necessary for avoiding FPs due to variations of the $(location.hash) pattern. This was one of the main motivations to do more precise tracking of URLs. Due to this sanitizer, we still don't flag this vuln:

  • Barriers/sanitizers currently block flow even when the tracked value is inside a property. Depending on how we resolve the above issue, this may or may not become a problem. I've discussed it with @max-schaefer and I'll experiment with changing this.

The PR would be good to land IMO, but I'd like to have some more data to verify that we're doing the right thing. I'll experiment a little bit to see what the best solution is so we can get this PR landed.

@erik-krogh
Copy link
Contributor Author

Overall I'd like us to be able to flag this sample vuln:

function getUrl() {
    return new URL(document.location);
}
$(getUrl().hash.substring(1)); // NOT OK

If I merge in #2919 the above example will be flagged, and e.g. $(window.location.hash) is still not flagged.

But only if the source has flowlabel "data" (The source in the Xss query has flowlabel "taint").

Here are some examples of new flow-edges, they are not really interesting. (The results might be better once #2919 hits LGTM).

I'll fix the merge conflict after #2919 has been merged, and I might also do another evaluation at that point.

@erik-krogh
Copy link
Contributor Author

I did a new evaluation.
Performance looks ok, but no new results.

Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add this to DomBasedXss::Configuration we should be able to handle the hash example:

override predicate isAdditionalLoadStoreStep(
  DataFlow::Node pred, DataFlow::Node succ, string predProp, string succProp
) {
  exists(DataFlow::PropRead read |
    pred = read.getBase() and
    succ = read and
    read.getPropertyName() = "hash" and
    predProp = "hash" and
    succProp = "$UrlSuffix"
  )
}

override predicate isAdditionalLoadStep(DataFlow::Node pred, DataFlow::Node succ, string prop) {
  exists(DataFlow::MethodCallNode call, string name |
    name = "substr" or name = "substring" or name = "slice"
  |
    call.getMethodName() = name and
    not call.getArgument(0).getIntValue() = 0 and
    pred = call.getReceiver() and
    succ = call and
    prop = "$UrlSuffix"
  )
}

(this is why I wanted sanitizers to not block objects)

But only if the source has flowlabel "data"

This would make it ignore all sanitizers, due to #2919, so it's kind of a no-go.

I'm a little sad that we haven't been able to find any concrete results from this (not for lack of trying), but it seems reasonably safe. We should probably avoid sinking more time into this until we have some motivating (real) examples on hand.

@@ -0,0 +1,3 @@
nodes
edges
#select
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove .actual file

@erik-krogh
Copy link
Contributor Author

If we add this to DomBasedXss::Configuration we should be able to handle the hash example:

I'll get that in there, then I'll run one last evaluation based on that.

@erik-krogh
Copy link
Contributor Author

If we add this to DomBasedXss::Configuration we should be able to handle the hash example:

I'll get that in there, then I'll run one last evaluation based on that.

Here is an evaluation on many benchmarks just with Xss.ql.
Still no new results, but performance is good.

@semmle-qlci semmle-qlci merged commit 0feb7f8 into github:master Mar 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants