-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Python: Add MarkupSafe
model
#6092
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The other approach felt a bit too much like specifying magic strings that you had to get right. (crossing your fingers that no-one writes `HTML` instead of `html`)
Since expectation tests had so many changes from ConceptsTest, I'm going to do the changes for that on in a separate commit. The important part is the changes to taint-tracking, which is highlighted in this commit.
Problematic part is ```codeql /** A escape from string format with `markupsafe.Markup` as the format string. */ private class MarkupEscapeFromStringFormat extends MarkupSafeEscape, Markup::StringFormat { override DataFlow::Node getAnInput() { result in [this.getArg(_), this.getArgByName(_)] and not result = Markup::instance() } override DataFlow::Node getOutput() { result = this } } ``` since the char-pred still holds even if `getAnInput` has no results... I will say that doing it this way feels kinda dirty, and we _could_ fix this by including the logic in `getAnInput` in the char-pred as well. But as I see it, that would just lead to a lot of code duplication, which isn't very nice.
result = API::moduleImport("markupsafe").getMember("Markup") | ||
or | ||
result = API::moduleImport("flask").getMember("Markup") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
result = API::moduleImport("markupsafe").getMember("Markup") | |
or | |
result = API::moduleImport("flask").getMember("Markup") | |
result = API::moduleImport(["markupsafe", "flask"]).getMember("Markup") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although you certainly could, I think having these one separate lines makes the code easier to quickly scan with your eyes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right :), thought that too with this block.
Co-authored-by: Jorge <46056498+jorgectf@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of questions, but apart from that this looks good to me. 👍
exists(range.getAnInput()) and | ||
exists(range.getOutput()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure what the function of these two lines is. What would go wrong if there was a "half-defined" instance of Escaping
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problematic part is that we do override predicate isSanitizer(DataFlow::Node node) { node = any(HtmlEscaping esc).getOutput() }
in the XSS configuration, and then the example from the commit message of 498703f
class StringFormat extends Markup::InstanceSource, DataFlow::CallCfgNode { | ||
StringFormat() { | ||
exists(DataFlow::AttrRead attr | this.getFunction() = attr | | ||
attr.getAttributeName() = "format" and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we plan on supporting %
-style formatting?
Also, would this be a place to use `MethodCallNode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MethodCallNode
for sure 👍 I didn't initially plan on support %
-style formatting, but we can do so :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c270817 adds %-style formatting 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool stuff!
The escaping concept is inspired by what we have in Go, but following the same naming as the encoding/decoding concepts we already have.
Here are some of the interesting decisions around this PR, that are not highlighted as part of the commits/code.
default taint steps for escaping?
I considered adding escaping for other kinds as additional taint steps, so for example in our SQL injection query, if you do escaping for HTML, that would still be a valid taint step. I opted not to do this, since knowing whether some kind of escaping will make user-input safe for SQL injection is sort of a tricky question... what if you applied a chain of escapers, such as
js_escape(html_escape(ldap_escape(user_input)))
, will it then be safe for SQL? I guess the worst that could happen is that we end up with false-positives, with this approach. So if you think this is a really good idea that we should totally do, let me know!isSanitizingStep
Also see the note about not having
isSanitizingStep
. I've talked with @aschackmull about solving this, but we haven't quite agreed on how to actually do it yet. So for now, we can get this PR merged with the less-than-ideal behavior, and fix it down the line 👍