Python: Use `Query.qll` suffix for dataflow configuration definitions #8511

RasmusWL · 2022-03-21T13:37:35Z

Adopting the standard setup of defining the data flow configurations for our path-problem queries in a file called `...Query.qll`, so it's very obvious that this file should only be used by the query (there is a CI check for this on our own codebase).

Like JS did in #6450

When adopting this, I wanted us to do the same as other languages in terms of setup. For simple queries that only require a single data-flow configuration, all languages define that in the `...Query.qll` file. JS/Ruby always use the name `Configuration` for this configuration, which is also the most straightforward thing for us to do.

- JS: .ql file, Query.qll file
- Ruby: .ql file, Query.qll file
- C#: .ql file, Query.qll file
- Java:
  - for the command injection query that I checked for other languages, they use a setup with a private configuration, and a predicate that exposes the results: .ql file, Query.qll file
  - but for something like XSLT injection, they do like the rest, exposing the configuration from the Query.qll file, and using that directly in the .ql file, Query.qll file
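
A minimal sketch of the single-configuration layout described above, using the class-based `TaintTracking::Configuration` API in use at the time of this PR. The file name `ExampleInjectionQuery.qll`, the `Sink` class, and the `"ExampleInjection"` string are hypothetical and chosen only for illustration; they are not taken from any of the linked files.

```ql
/** Sketch of a hypothetical `ExampleInjectionQuery.qll`; only the query itself should import it. */

private import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
import semmle.python.dataflow.new.RemoteFlowSources

/**
 * A sink for the hypothetical "example injection" query.
 * Defined inline to keep the sketch self-contained; in a real library it
 * would normally live in a separate `...Customizations.qll` file.
 */
abstract class Sink extends DataFlow::Node { }

/** The single taint-tracking configuration, named `Configuration` as JS/Ruby do. */
class Configuration extends TaintTracking::Configuration {
  Configuration() { this = "ExampleInjection" }

  override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }

  override predicate isSink(DataFlow::Node sink) { sink instanceof Sink }
}
```

The corresponding `.ql` file would import this module together with `DataFlow::PathGraph`, and select from `Configuration config, DataFlow::PathNode source, DataFlow::PathNode sink` where `config.hasFlowPath(source, sink)`.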

However, it's not quite as obvious how we should handle cases where we need multiple configurations, currently ServerSideRequestForgeryQuery.qll, WeakSensitiveDataHashingQuery.qll, and LdapInjectionQuery.qll. I noted that LdapInjectionQuery.qll actually used a very different pattern than the two others. Instead of defining separate modules for each configuration, they just had different names within the same module 🤔

Honestly, the approach in `LdapInjectionQuery.qll` seems simpler and requires less boilerplate code, so I will suggest adopting that approach for `ServerSideRequestForgeryQuery.qll`. For `WeakSensitiveDataHashingQuery.qll` I really do think that having the two separate ql-modules makes sense, since they use different sinks/sources, and could have different sanitizers (exposing them as top-level configurations means that we have to rewrite `source instanceof Source` to `source instanceof WeakSensitiveDataHashingCustomizations::NormalHashFunction::Source` 😠)
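
To illustrate the "different names within the same module" approach, here is a rough sketch of two configurations living side by side in one `...Query.qll` file. This does not reproduce the actual contents of `LdapInjectionQuery.qll`; the names `FirstSink`, `SecondSink`, `FirstConfiguration`, and `SecondConfiguration` are assumptions made up for the example.

```ql
/** Sketch of a hypothetical `...Query.qll` file that exposes two configurations. */

private import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
import semmle.python.dataflow.new.RemoteFlowSources

/** Hypothetical sink kinds; real ones would come from a `...Customizations.qll` file. */
abstract class FirstSink extends DataFlow::Node { }

abstract class SecondSink extends DataFlow::Node { }

/** Tracks remote input to the first kind of sink. */
class FirstConfiguration extends TaintTracking::Configuration {
  // Each configuration needs its own unique string value.
  FirstConfiguration() { this = "ExampleFirst" }

  override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }

  override predicate isSink(DataFlow::Node sink) { sink instanceof FirstSink }
}

/** Tracks remote input to the second kind of sink. */
class SecondConfiguration extends TaintTracking::Configuration {
  SecondConfiguration() { this = "ExampleSecond" }

  override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }

  override predicate isSink(DataFlow::Node sink) { sink instanceof SecondSink }
}
```

Both classes share the file's imports, so no wrapper modules are needed. The trade-off is that sources, sinks, and sanitizers for the two configurations must have distinct names, which is why separate ql-modules can still be the better fit when the two configurations differ a lot.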

According to `git grep "extends TaintTracking::Configuration" **/*Query.qll | cut -d ':' -f 1 | uniq -c | grep -v "1 "`, the only file that has more than one `extends TaintTracking::Configuration` is `csharp/ql/lib/semmle/code/csharp/security/dataflow/UnsafeDeserializationQuery.qll`; all 3 are used directly in the query. So not too much inspiration to match against.

yoff

LGTM
It took me a few iterations and jumping between the single-commit-view and the all-commits-view, but I think the commit structure actually made it quite easy to understand the transformation (I just had to check that nothing was missed) 👍

RasmusWL · 2022-04-06T09:59:14Z

> It took me a few iterations and jumping between the single-commit-view and the all-commits-view, but I think the commit structure actually made it quite easy to understand the transformation (I just had to check that nothing was missed)

After doing this I wasn't quite sure if I had made the right choice, so that makes me happy ☺️

github-actions bot added the Python label Mar 21, 2022

RasmusWL force-pushed the use-query-suffix branch from baa1046 to f2e3b7e Compare March 21, 2022 13:44

RasmusWL added 3 commits March 21, 2022 14:53

Python: Adopt Query.qll suffix for dataflow config defs

1bf8fa6

This commit in itself makes everything break, but should make it easy to follow the overall changes being made.

Python: Re-introduce old dataflow configs .qll files

0125aea

and move all the old deprecated aliases to that file. We now have a situation where all queries should work as they did before, and we just have these new Query.qll files that contain the implementation. (deprecation comes later)

Python: Autoformat

db86a18

RasmusWL force-pushed the use-query-suffix branch from f2e3b7e to db86a18 Compare March 21, 2022 13:54

RasmusWL added 3 commits March 21, 2022 15:03

Python: Deprecate old non-Query.qll dataflow defs

695553b

Python: ReflectedXSS -> ReflectedXss for new Query file

b8dee25

So we stick to the naming conventions. This rename is OK, since the new file was only just introduced in this PR.

Python: Add change-note

978ef05

github-actions bot added the documentation label Mar 21, 2022

Python: Update path-injection .expected

88184ba

AHA! This change happened because we are no longer importing all the old deprecated implementation.

RasmusWL marked this pull request as ready for review March 21, 2022 21:33

RasmusWL requested a review from a team as a code owner March 21, 2022 21:33

yoff approved these changes Apr 6, 2022

RasmusWL merged commit 4d2a3b3 into github:main Apr 6, 2022

RasmusWL deleted the use-query-suffix branch April 6, 2022 09:59

pholleran mentioned this pull request Jul 18, 2022

FINOS Security Scanning - Software Project Contribution and Onboarding finos/community#201

