JS: using pseudo-properties to model URL parsing #2761
I found the code a bit difficult to read, so for now I'll just try to relay back my understanding of it so we can catch any misunderstandings.
The pseudo-property represents two things lumped into a single property (which is fine):
- the search parameters of a URL object
- the keys and values of a Map-like object (URLSearchParams)
So a property read `x.searchParams` is modelled as a load-store step of the pseudo-property in order to transition from case 1 to case 2 when accessing the search parameters of a URL object.
There's then a load step out of the pseudo-property when calling `get()`.
Assuming I got that right, I like the approach. I'd like it if you could make the code a little more accessible, though. Perhaps also add support for handling fragment data while you're at it 👍
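For readers less familiar with the API being modelled, here is a minimal JavaScript sketch of the runtime behaviour behind the two steps described above. The example URL and the variable names are made up for illustration. Only `URL`, `searchParams`, `get` and `getAll` are the standard API.

```javascript
// Hypothetical input; in a real page this would typically come from document.location.
const url = new URL("https://example.com/search?q=first&q=second");

// Case 1 -> case 2: reading `searchParams` on the URL object yields a
// Map-like URLSearchParams object holding the parsed keys and values.
const params = url.searchParams;

// Loading out of the Map-like object: `get` returns the first value for a key,
// `getAll` returns every value, and a missing key gives null.
const first = params.get("q");
const all = params.getAll("q");
const missing = params.get("nope");

console.log(first, all, missing);
```

If the URL is attacker-controlled, every value read this way is too, which is why both `get` and `getAll` are treated as reads of the same hidden property.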
*/ | ||
predicate isUrlSearchParams(DataFlow::SourceNode params, DataFlow::Node input) { | ||
private predicate isUrlSearchParams(DataFlow::SourceNode params, DataFlow::Node input) { |
We should add a deprecated alias for this for backwards compatibility (even if it doesn't have the original behavior anymore, deprecation warnings are better than compilation errors).
I think I'll just remove the `private` modifier.
The name is still fitting for the behavior, so I don't feel like adding yet another alias.
override predicate step(DataFlow::Node pred, DataFlow::Node succ) { | ||
pred = source and succ = this | ||
isUrlSearchParams(succ, pred) |
`this` is not bound in here (and likewise for the other member predicates)
I see two ways of solving this.
1. Do the same as `StringManipulationTaintStep`: extend `DataFlow::ValueNode` and just bind `succ = this`.
2. Split out the taint-steps into multiple classes, and add a characteristic predicate for each of the new classes. (Will become even more verbose).
I went with 1) for now.
/** | ||
* Holds if the property `prop` should be copied from the object `pred` to the object `succ`. | ||
* | ||
* This step is used to copy a value the value of our pseudo-property that can later be accessed using a `get` or `getAll` call. |
* This step is used to copy a value the value of our pseudo-property that can later be accessed using a `get` or `getAll` call. | |
* This step is used to copy the value of our pseudo-property that can later be accessed using a `get` or `getAll` call. |
*/ | ||
override predicate loadStoreStep(DataFlow::Node pred, DataFlow::Node succ, string prop) { | ||
prop = hiddenUrlPseudoProperty() and | ||
exists(DataFlow::PropRead write | write = succ | |
Why is this `PropRead` in a variable called `write`? 😕
…different properties
Your understanding is correct. I had the two cases lumped into the same pseudo-property because a load-store step only supported loading and storing the same property name. I'll look into fragment data.
An evaluation was uneventful.
Sorry for not following up on this earlier. I like the solution, but there are a few issues that mean we're not realizing the full benefit of this change. Overall I'd like us to be able to flag this sample vuln:
function getUrl() {
  return new URL(document.location);
}
$(getUrl().hash.substring(1)); // NOT OK
There are two issues with this at the moment:
The PR would be good to land IMO, but I'd like to have some more data to verify that we're doing the right thing. I'll experiment a little bit to see what the best solution is so we can get this PR landed.
If I merge in #2919 the above example will be flagged, and e.g. But only if the source has flowlabel "data". Here are some examples of new flow-edges; they are not really interesting. (The results might be better once #2919 hits LGTM.) I'll fix the merge conflict after #2919 has been merged, and I might also do another evaluation at that point.
I did a new evaluation.
If we add this to `DomBasedXss::Configuration` we should be able to handle the `hash` example:
override predicate isAdditionalLoadStoreStep(
  DataFlow::Node pred, DataFlow::Node succ, string predProp, string succProp
) {
  exists(DataFlow::PropRead read |
    pred = read.getBase() and
    succ = read and
    read.getPropertyName() = "hash" and
    predProp = "hash" and
    succProp = "$UrlSuffix"
  )
}

override predicate isAdditionalLoadStep(DataFlow::Node pred, DataFlow::Node succ, string prop) {
  exists(DataFlow::MethodCallNode call, string name |
    name = "substr" or name = "substring" or name = "slice"
  |
    call.getMethodName() = name and
    not call.getArgument(0).getIntValue() = 0 and
    pred = call.getReceiver() and
    succ = call and
    prop = "$UrlSuffix"
  )
}
(this is why I wanted sanitizers to not block objects)
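As a reference point for the two steps above, here is a small JavaScript sketch of the string behaviour they seem to target. The URL is made up, and the reading of the `not ... = 0` guard (a start index of 0 leaves the leading `#` in place, so nothing is "loaded" out of the suffix) is my assumption rather than something stated in this thread.

```javascript
// Hypothetical URL, standing in for document.location in a browser.
const url = new URL("https://example.com/page#section-1");

// `hash` includes the leading "#".
const hash = url.hash;

// Any of the three modelled methods with a non-zero start drops the "#"
// and leaves only the fragment payload.
const viaSubstring = hash.substring(1);
const viaSlice = hash.slice(1);
const viaSubstr = hash.substr(1); // legacy method, still listed in the step

// A start index of 0 returns the string unchanged, "#" included.
const unchanged = hash.substring(0);

console.log({ hash, viaSubstring, viaSlice, viaSubstr, unchanged });
```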
> But only if the source has flowlabel "data"
This would make it ignore all sanitizers, due to #2919, so it's kind of a no-go.
I'm a little sad that we haven't been able to find any concrete results from this (not for lack of trying), but it seems reasonably safe. We should probably avoid sinking more time into this until we have some motivating (real) examples on hand.
@@ -0,0 +1,3 @@ | |||
nodes | |||
edges | |||
#select |
Remove `.actual` file
I'll get that in there, then I'll run one last evaluation based on that.
Here is an evaluation on many benchmarks just with
Now the property `searchParams` is modeled using a pseudo-property.
Additionally I use another pseudo-property to model that calling a getter on a `URLSearchParams` retrieves the parsed parameters (see `hiddenUrlPseudoProperty` and its uses).
This means that the previous behavior is preserved, and tracking of the properties now works interprocedurally.
E.g. with the previous local handling of `searchParams` we didn't track flow out of this return (found using this query).
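A minimal JavaScript sketch of the kind of code this affects, assuming a helper that returns the `URLSearchParams` object to a caller in another function. The names `getParams` and `readRedirectTarget` and the URL are hypothetical and not taken from the linked result.

```javascript
// Hypothetical helper: the URLSearchParams object is created here and returned.
function getParams(href) {
  return new URL(href).searchParams;
}

// Hypothetical caller: the parsed value is only read out in a different function,
// so an analysis has to follow the object across the return.
function readRedirectTarget(href) {
  return getParams(href).get("next");
}

const target = readRedirectTarget("https://example.com/login?next=%2Fhome");
console.log(target);
```

With purely local handling, the connection between the `new URL(...)` in the helper and the `get` call in the caller is lost. Carrying the data in a pseudo-property on the returned object is what lets the flow survive the call boundary.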