Python: Use Query.qll suffix for dataflow configuration definitions
#8511
This adopts the standard setup of defining the data-flow configurations for our path-problem queries in a file called `...Query.qll`, so it's very obvious that this file should only be used by the query (there is a CI check for this on our own codebase), like JS did in #6450.
When adopting this, I wanted us to use the same setup as other languages. For simple queries that only require a single data-flow configuration, all languages define it in the `...Query.qll` file. JS/Ruby always use the name `Configuration` for this configuration, which is also the most straightforward thing for us to do. Example: .ql file, Query.qll file.
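As a rough illustration of the single-configuration layout described above, here is a minimal sketch of what such a file might look like. The file name `ExampleInjectionQuery.qll`, the module `ExampleInjectionCustomizations::ExampleInjection`, and its `Source`/`Sink`/`Sanitizer` classes are hypothetical placeholders, not files from this PR.

```ql
/**
 * Hypothetical file: ExampleInjectionQuery.qll
 *
 * Provides the taint-tracking configuration for a single path-problem query.
 * Only the matching .ql file is meant to import this file.
 */

private import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
// Assumption: a sibling customizations file that defines `Source`, `Sink`
// and `Sanitizer` inside a module named `ExampleInjection`.
import ExampleInjectionCustomizations::ExampleInjection

/** The single configuration of the query, always named `Configuration`. */
class Configuration extends TaintTracking::Configuration {
  Configuration() { this = "ExampleInjection" }

  override predicate isSource(DataFlow::Node source) { source instanceof Source }

  override predicate isSink(DataFlow::Node sink) { sink instanceof Sink }

  override predicate isSanitizer(DataFlow::Node node) { node instanceof Sanitizer }
}
```

Under these assumptions, the query file would import `ExampleInjectionQuery` and `DataFlow::PathGraph`, then select paths with `from Configuration config, DataFlow::PathNode source, DataFlow::PathNode sink where config.hasFlowPath(source, sink)`.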
However, it's not quite as obvious how we should handle cases where we need multiple configurations, which currently means `ServerSideRequestForgeryQuery.qll`, `WeakSensitiveDataHashingQuery.qll`, and `LdapInjectionQuery.qll`. I noted that `LdapInjectionQuery.qll` actually used a very different pattern than the other two: instead of defining a separate module for each configuration, it just gave the configurations different names within the same module 🤔

Honestly, the approach in `LdapInjectionQuery.qll` seems simpler and requires less boilerplate code, so I suggest adopting that approach for `ServerSideRequestForgeryQuery.qll`. For `WeakSensitiveDataHashingQuery.qll`, I really do think that having the two separate QL modules makes sense, since they use different sources/sinks and could have different sanitizers (exposing them as top-level configurations would mean rewriting `source instanceof Source` to `source instanceof WeakSensitiveDataHashingCustomizations::NormalHashFunction::Source` 😠).

According to `git grep "extends TaintTracking::Configuration" **/*Query.qll | cut -d ':' -f 1 | uniq -c | grep -v "1 "`, the only file that has more than one `extends TaintTracking::Configuration` is `csharp/ql/lib/semmle/code/csharp/security/dataflow/UnsafeDeserializationQuery.qll`, where all 3 are used directly in the query. So there is not much inspiration to match against.
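To make the trade-off between the two multi-configuration layouts more concrete, here is a hedged sketch of both side by side. All names here (`ExampleCustomizations`, its modules `KindA` and `KindB`, and the class and module names below) are hypothetical and chosen only for illustration; they are not the names used in the files mentioned above.

```ql
// Hypothetical file: ExampleQuery.qll
private import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
// Assumption: ExampleCustomizations.qll defines modules `KindA` and `KindB`,
// each with its own `Source` and `Sink` classes.
private import ExampleCustomizations

// Layout 1: distinctly named configurations at the top level of the file.
// Less boilerplate, but every Source/Sink reference must be qualified.
class KindAConfiguration extends TaintTracking::Configuration {
  KindAConfiguration() { this = "KindAConfiguration" }

  override predicate isSource(DataFlow::Node source) { source instanceof KindA::Source }

  override predicate isSink(DataFlow::Node sink) { sink instanceof KindA::Sink }
}

// Layout 2: one QL module per configuration. Each module imports its own
// customizations, so the unqualified names `Source` and `Sink` still work.
module KindBFlow {
  import ExampleCustomizations::KindB

  class Configuration extends TaintTracking::Configuration {
    Configuration() { this = "KindBFlow::Configuration" }

    override predicate isSource(DataFlow::Node source) { source instanceof Source }

    override predicate isSink(DataFlow::Node sink) { sink instanceof Sink }
  }
}
```

In this sketch a query would refer to `KindAConfiguration` directly for the first layout and to `KindBFlow::Configuration` for the second. A real file would normally pick one layout rather than mix both.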