Python: Add MarkupSafe model #6092

Merged
merged 13 commits into from
Jun 30, 2021

RasmusWL
Member

The escaping concept is inspired by what we have in Go, but follows the same naming as the encoding/decoding concepts we already have.

Here are some of the interesting decisions around this PR that are not highlighted as part of the commits/code.

default taint steps for escaping?

I considered treating escaping of other kinds as additional taint steps, so that, for example, HTML escaping would still be a valid taint step in our SQL injection query. I opted not to do this, since knowing whether some kind of escaping makes user input safe for SQL injection is a tricky question. What if you applied a chain of escapers, such as js_escape(html_escape(ldap_escape(user_input)))? Would the result then be safe for SQL? I guess the worst that could happen with this approach is that we end up with false positives. So if you think this is a really good idea that we should totally do, let me know!
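To make the SQL question concrete, here is a minimal Python sketch (not part of the PR) that uses only the standard library. The table, its contents, and the payload are made up for illustration, and the query is built by string concatenation on purpose. It suggests why output that has been HTML-escaped may still be worth tracking as tainted in a SQL injection query.

```python
import html
import sqlite3

# Hypothetical in-memory table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

# A payload without any HTML-special characters passes through html.escape unchanged.
user_input = "1 OR 1=1"
escaped = html.escape(user_input)

# Deliberately vulnerable: the "escaped" value is concatenated into the query.
query = "SELECT name FROM users WHERE id = " + escaped
rows = conn.execute(query).fetchall()

print(escaped == user_input)  # the HTML escaper did nothing to this payload
print(len(rows))              # every row matches, not just the one with id 1
```

In this sketch the escaper is a no-op for the payload, so keeping the flow as a taint step would report a true positive here. For other payloads or escaper chains the answer may differ, which is the tricky part described above.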

isSanitizingStep

Also see the note about not having isSanitizingStep. I've talked with @aschackmull about solving this, but we haven't quite agreed on how to actually do it yet. So for now, we can get this PR merged with the less-than-ideal behavior, and fix it down the line 👍

RasmusWL added 6 commits June 16, 2021 19:09
The other approach felt a bit too much like specifying magic strings
that you had to get right. (crossing your fingers that no-one writes
`HTML` instead of `html`)
Since expectation tests had so many changes from ConceptsTest, I'm going
to do the changes for that one in a separate commit. The important part
is the changes to taint-tracking, which are highlighted in this commit.
The problematic part is

```codeql
  /** An escape from string format with `markupsafe.Markup` as the format string. */
  private class MarkupEscapeFromStringFormat extends MarkupSafeEscape, Markup::StringFormat {
    override DataFlow::Node getAnInput() {
      result in [this.getArg(_), this.getArgByName(_)] and
      not result = Markup::instance()
    }

    override DataFlow::Node getOutput() { result = this }
  }
```

since the char-pred still holds even if `getAnInput` has no results...

I will say that doing it this way feels kinda dirty, and we _could_ fix
this by including the logic in `getAnInput` in the char-pred as well.
But as I see it, that would just lead to a lot of code duplication,
which isn't very nice.
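As a rough illustration of the concern (my reading of the commit message, not code from the PR), the following Python sketch models escaping calls as plain records. `EscapeCall`, `sanitized_outputs`, and the example expressions are all hypothetical names. It only mimics the idea that a call can have an output while having no escaped inputs, and that a guard requiring both avoids treating such an output as sanitized.

```python
from dataclasses import dataclass, field


@dataclass
class EscapeCall:
    """Toy stand-in for a modelled escaping call (hypothetical, not CodeQL API)."""
    output: str
    inputs: list = field(default_factory=list)


def sanitized_outputs(calls, require_input):
    """Return the outputs that a sanitizer check would treat as safe.

    With require_input=False, every modelled call counts, even one that
    escaped nothing. With require_input=True, only calls that actually
    have at least one escaped input count.
    """
    return {c.output for c in calls if c.inputs or not require_input}


calls = [
    # One argument gets escaped: a fully defined escaping.
    EscapeCall(output="Markup('<b>{}</b>').format(name)", inputs=["name"]),
    # No arguments at all: there is an output, but nothing was escaped.
    EscapeCall(output="Markup(tmpl).format()", inputs=[]),
]

without_guard = sanitized_outputs(calls, require_input=False)
with_guard = sanitized_outputs(calls, require_input=True)
print(sorted(without_guard))
print(sorted(with_guard))
```

Without the guard, the second call's output would also be treated as sanitized, even though no value went through escaping.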
Comment on lines +25 to +27
result = API::moduleImport("markupsafe").getMember("Markup")
or
result = API::moduleImport("flask").getMember("Markup")
Contributor

Suggested change
- result = API::moduleImport("markupsafe").getMember("Markup")
- or
- result = API::moduleImport("flask").getMember("Markup")
+ result = API::moduleImport(["markupsafe", "flask"]).getMember("Markup")

Member Author

Although you certainly could, I think having these on separate lines makes the code easier to quickly scan with your eyes.

Contributor

You are right :), thought that too with this block.

Co-authored-by: Jorge <46056498+jorgectf@users.noreply.github.com>
tausbn
tausbn previously approved these changes Jun 30, 2021
Contributor

tausbn left a comment

A couple of questions, but apart from that this looks good to me. 👍

Comment on lines +309 to +310
exists(range.getAnInput()) and
exists(range.getOutput())
Contributor

I'm not quite sure what the function of these two lines is. What would go wrong if there was a "half-defined" instance of Escaping?

Member Author

The problematic part is that we do `override predicate isSanitizer(DataFlow::Node node) { node = any(HtmlEscaping esc).getOutput() }` in the XSS configuration, and then the example from the commit message of 498703f

class StringFormat extends Markup::InstanceSource, DataFlow::CallCfgNode {
StringFormat() {
exists(DataFlow::AttrRead attr | this.getFunction() = attr |
attr.getAttributeName() = "format" and
Contributor

Do we plan on supporting %-style formatting?

Also, would this be a place to use `MethodCallNode`?

Member Author

MethodCallNode for sure 👍 I didn't initially plan on supporting %-style formatting, but we can do so :)

Member Author

c270817 adds %-style formatting 👍
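For readers unfamiliar with the Python behavior being modelled, here is a small sketch of the two formatting styles. `ToyMarkup` and `_escape` are hypothetical stand-ins written with the standard library only. They are not the real `markupsafe.Markup` and they skip details such as mapping arguments to `%`. The sketch only mimics the idea that arguments to `.format` and `%` get HTML-escaped, unless they are already markup.

```python
import html


def _escape(value):
    """Escape a value unless it is already a ToyMarkup instance."""
    if isinstance(value, ToyMarkup):
        return value
    return ToyMarkup(html.escape(str(value)))


class ToyMarkup(str):
    """Toy stand-in for a 'safe string' type (hypothetical, simplified)."""

    def format(self, *args, **kwargs):
        # str.format-style: escape positional and keyword arguments first.
        safe_args = [_escape(a) for a in args]
        safe_kwargs = {k: _escape(v) for k, v in kwargs.items()}
        return ToyMarkup(super().format(*safe_args, **safe_kwargs))

    def __mod__(self, arg):
        # %-style: escape a single argument or each element of a tuple.
        if isinstance(arg, tuple):
            safe = tuple(_escape(a) for a in arg)
        else:
            safe = _escape(arg)
        return ToyMarkup(super().__mod__(safe))


via_format = ToyMarkup("<b>{}</b>").format("<script>")
via_percent = ToyMarkup("<b>%s</b>") % "<script>"
already_markup = ToyMarkup("<p>%s</p>") % ToyMarkup("<em>hi</em>")
print(via_format)
print(via_percent)
print(already_markup)
```

The last case corresponds to the `not result = Markup::instance()` condition in the model: an argument that is already markup is passed through without escaping, so it is not counted as an escaped input.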

Contributor

tausbn left a comment

Cool stuff!

:shipit:

@tausbn tausbn merged commit fc71a64 into github:main Jun 30, 2021
@RasmusWL RasmusWL deleted the markupsafe-modeling branch June 30, 2021 14:16