-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
python: Document MaD format #16789
python: Document MaD format #16789
Conversation
- add a few tests reflecting the documentation - make the mentioned sink-kinds have an effect on relevant queries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good docs-wise, thank you! Only some really minor comments
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Outdated
Show resolved
Hide resolved
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Outdated
Show resolved
Hide resolved
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Outdated
Show resolved
Hide resolved
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Outdated
Show resolved
Hide resolved
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Outdated
Show resolved
Hide resolved
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Outdated
Show resolved
Hide resolved
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Outdated
Show resolved
Hide resolved
Co-authored-by: Ben Ahmady <32935794+subatoi@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really good stuff! I have made a bunch of comments/suggestions, but overall I think this is really solid. 👍
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Outdated
Show resolved
Hide resolved
data: | ||
- ["invoke.Context", "invoke", "Member[context].Member[Context].Instance"] | ||
|
||
- The first column, **"invoke.Context"**, is the name of the type to reach. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The first column, **"invoke.Context"**, is the name of the type to reach. | |
- The first column, **"invoke.Context"**, is the name of the type that we're defining. |
I found "type to reach" to be very vague and confusing. Perhaps this is a better formulation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not feel that we are defining this type, though. In cases where we make up a name, say to model some internal concept, I would say we are defining a type. But here, we are simply describing how one can obtain (an instance of) the type. Given that this can be used in various contexts, I actually like that it is a bit vague.
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Outdated
Show resolved
Hide resolved
data: | ||
- [ | ||
"django.db.models.FileField!", | ||
"Call.Argument[upload_to:].Parameter[1,filenam:]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Call.Argument[upload_to:].Parameter[1,filenam:]", | |
"Call.Argument[upload_to:].Parameter[1,filename:]", |
How do we know that the argument is named filename
, though? This depends on how the user defined the helper function, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but that convention is documented (here, link is also included in our docs). Given the way the documentation is written, I would expect many people to call the parameter 'filename' and I think a good model should recognise that parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also slightly confused about this part. I think it makes more sense to only have Parameter[1]
-- no matter the naming convention, that should still cover everything we care to model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no matter the naming convention, that should still cover everything we care to model.
That may be (i.e. if people define the call-back with name-only parameters, we do not care about that case). Still, I think the example is more illustrative and usable as a reference the way it is now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That may be (i.e. if people define the call-back with name-only parameters, we do not care about that case).
if they define it with keyword only parameters, it wouldn't work, since the arguments are passed positionally (src from django)
Still, I think the example is more illustrative and usable as a reference the way it is now.
I agree it's pretty cool if we had an example where we could show off Parameter[<int>,<keyword>:]
. However, I don't find this to be a convincing example since the filename:
part doesn't add any value. I fear that people reading this will get confused and think they ALWAYS NEED to add the name of the parameter, but maybe I'm reading too much into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if they define it with keyword only parameters, it wouldn't work, since the arguments are passed positionally (src from django)
Hm, in that case it should be removed, thanks for digging that up.
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Outdated
Show resolved
Hide resolved
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Outdated
Show resolved
Hide resolved
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Show resolved
Hide resolved
|
||
|
||
- Since we're adding flow through a function call, we add a tuple to the **summaryModel** extensible predicate. | ||
- The first column, **"builtins"**, begins the search for relevant calls at references to the **builtin** package. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... And also otherwise undefined globals with specific names. Perhaps it would be better to not mention the builtins
package explicitly, and just say it finds references to built-in names? (Which includes builtins.whatever
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, in the end it became somewhat different...see if you like.
docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst
Outdated
Show resolved
Hide resolved
Co-authored-by: Taus <tausbn@github.com>
- fix wording on `builtins` - add named argument/parameter access path components
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM 👍
Minor detail around code snippets. I've made one suggestion, but you will need to apply it throughout the document
In this example, we'll show how to add flow through calls to `re.compile`. | ||
`re.compile` returns a compiled regular expression for efficient evaluation, but the pattern to be compiled is stored in the `pattern` attribute of the resulting object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While reading through the rendered version, it became clear to me that reStructured text requires double backticks for code snippets (docs).
In this example, we'll show how to add flow through calls to `re.compile`. | |
`re.compile` returns a compiled regular expression for efficient evaluation, but the pattern to be compiled is stored in the `pattern` attribute of the resulting object. | |
In this example, we'll show how to add flow through calls to ``re.compile``. | |
``re.compile`` returns a compiled regular expression for efficient evaluation, but the pattern to be compiled is stored in the ``pattern`` attribute of the resulting object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, it should be replaced everywhere now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite what I asked for.
If you look at the rendered version (before, after) the change looks like:
Before
After
However, both of them still only use single backticks, meaning the code snippets are not rendered properly.
The previous version looked better to me, so I would recommend reverting cd7031a and 9d62c21 (the double-qoutes in the examples made them easier to read for me)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite what I asked for.
Indeed, I realised once I started looking at the generated artifact 🤦 I should have commented then, so you did not have to pull it out. Thanks for doing that, though. I will get the double back-tics in, and revert to single quotes in headers to align with the other documentation pages (e.g. the one for javascript)). About the double quotes, I removed them on request, but that was mainly a request for consistency, so perhaps they should be in in a consistent way...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will get the double back-tics in, and revert to single quotes in headers
👍
About the double quotes, I removed them on request, but that was mainly a request for consistency, so perhaps they should be in in a consistent way...
Ah, I see they are present in the JS docs as well, so 🤷
In this example, we'll show how to add flow through calls to `re.compile`. | ||
`re.compile` returns a compiled regular expression for efficient evaluation, but the pattern to be compiled is stored in the `pattern` attribute of the resulting object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite what I asked for.
If you look at the rendered version (before, after) the change looks like:
Before
After
However, both of them still only use single backticks, meaning the code snippets are not rendered properly.
The previous version looked better to me, so I would recommend reverting cd7031a and 9d62c21 (the double-qoutes in the examples made them easier to read for me)
data: | ||
- [ | ||
"django.db.models.FileField!", | ||
"Call.Argument[upload_to:].Parameter[1,filenam:]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That may be (i.e. if people define the call-back with name-only parameters, we do not care about that case).
if they define it with keyword only parameters, it wouldn't work, since the arguments are passed positionally (src from django)
Still, I think the example is more illustrative and usable as a reference the way it is now.
I agree it's pretty cool if we had an example where we could show off Parameter[<int>,<keyword>:]
. However, I don't find this to be a convincing example since the filename:
part doesn't add any value. I fear that people reading this will get confused and think they ALWAYS NEED to add the name of the parameter, but maybe I'm reading too much into it.
259dbf4
to
6524b8e
Compare
Alright, my quote edits had made a mess. I did a small rebase, things should look reasonable now. I am looking forward to seeing the generated artifacts... :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small typo fix in the change note, otherwise this looks good to me! 👍
--- | ||
category: minorAnalysis | ||
--- | ||
* A number of Python queries now support sinks defined vi data extensions. The format of data extensions for Python has been documented. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* A number of Python queries now support sinks defined vi data extensions. The format of data extensions for Python has been documented. | |
* A number of Python queries now support sinks defined using data extensions. The format of data extensions for Python has been documented. |
Typo, but also maybe a more plain word here will suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I just saw the approval and not this comment..
Adds documentation for the Python Models as Data format. The structure is heavily based on the JavaScript documentation. Both because that document is quite nice and because it will be easier for users to navigate if the structure is similar.
In addition, this PR