Skip to content

Conversation

RasmusWL
Copy link
Member

@RasmusWL RasmusWL commented Mar 26, 2020

When running ql/python/ql/src/Security/CWE-079/ReflectedXss.ql against the database for flask.

Iitially there were 10 million result-tuples for iterable_unpacking_descent.

With this change, we're down to roughly 2100.

Fixes the problem introduced in #2700, and disabled in #2973

When running ql/python/ql/src/Security/CWE-079/ReflectedXss.ql against the
database for flask.

Iitially there were 10 million result-tuples for iterable_unpacking_descent.

With this change, we're down to roughly 2100,
@RasmusWL
Copy link
Member Author

@RasmusWL RasmusWL marked this pull request as ready for review March 27, 2020 12:23
@RasmusWL RasmusWL requested a review from a team as a code owner March 27, 2020 12:23
@RasmusWL RasmusWL requested review from BekaValentine and tausbn and removed request for a team March 27, 2020 12:23
Copy link
Contributor

@tausbn tausbn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor comment/question, otherwise I think this is okay. I stared at it for a while, and I couldn't see any obvious problems (but I'm not sure I trust that there are none based on this! 😬).
I'm curious to see how the performance tests pan out.
I would also like to see the RA generated/executed on a large snapshot for the predicates involved (i.e. the main predicate and the two helper predicates).

result = parent_kind.getMember()
or
// recursive case
result = taint_at_depth(parent_kind.getMember(), depth-1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have a depth > 1 constraint (so that the three disjuncts are indeed disjoint)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed! This won't change the results, just the executions that's that we have to evaluate 👍

@RasmusWL
Copy link
Member Author

I'm curious to see how the performance tests pan out.

https://jenkins.internal.semmle.com/job/Changes/job/Python-Differences/33/ looks fine. Are you asking for a dist-compare besides this?

I would also like to see the RA generated/executed on a large snapshot for the predicates involved (i.e. the main predicate and the two helper predicates).

Alright. Is saltstack/salt large enough for you with ~600k lines of python code? and I think I need some guidance on the RA generated/executed since I wouldn't know exactly which parts to pick from the query logs.

Copy link
Contributor

@BekaValentine BekaValentine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I especially like the long comment with an explanatory example, it helps put the implementation into a context of intention.

@RasmusWL
Copy link
Member Author

I ran ReflectedXss.ql on the current snapshot for saltstack/salt, and I didn't see any problems. (this I the same query I used doing my other tests)

I created a gist with the query logs, but that only shows the first 8 thousand lines, so you need to look at the raw results to see all 56k lines 😉

Looking at all the places it says iterable_unpacking_descent I didn't see anything of concern.

Is this sort of what you were asking for @tausbn? -- when you said:

I would also like to see the RA generated/executed on a large snapshot for the predicates involved (i.e. the main predicate and the two helper predicates).

@tausbn
Copy link
Contributor

tausbn commented Mar 30, 2020

Alright. Is saltstack/salt large enough for you with ~600k lines of python code? and I think I need some guidance on the RA generated/executed since I wouldn't know exactly which parts to pick from the query logs.

Sure, salt should be plenty big. What I meant by the RA would amount to:

  1. Running an appropriate query, generating a log with tuple counts (since this has to do with taint tracking, I imagine most taint tracking queries would do).
  2. Looking for the place in the log where the iterable_unpacking_descent and taint_at_depth predicates are evaluated, to see whether they result in a "reasonable" amount of tuples, whether there are any bad join orders or cartesian products, etc. Basically this amounts to searching for those names until you find "tuple counts for " followed by the name of the predicate.

For 1. you'll most likely want to use a recent version of VSCode, with the most recent improvements to log handling. (Or you could also just run it manually.)

@tausbn
Copy link
Contributor

tausbn commented Mar 30, 2020

Oh, hah. I was in the process of writing my comment when you added yours. I'll take a quick look at the results.

Copy link
Contributor

@tausbn tausbn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RA and tuple counts look fine to me. Merging.

@tausbn tausbn merged commit b4fbfa0 into github:master Mar 30, 2020
@RasmusWL RasmusWL deleted the python-fix-iterable-unpacking-taint-CP branch March 30, 2020 11:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants