Skip to content

Conversation

asgerf
Copy link
Contributor

@asgerf asgerf commented Aug 3, 2021

Adds support template files in the following dialects: EJS, Mustache, Handlebars, Nunjucks, Hogan, and Swig.

I've tried to keep the commits self-contained, but it's a large PR so it would help to have a high-level overview before diving in.

New sinks recognized

For an EJS snippet like,

<h1><%- title %></h1>

we now recognize title as an XSS sink, since <%- ... %> performs raw interpolation in EJS. For example, it would be vulnerable if instantiated like:

res.render('mytemplate', { title: req.query.title })

Even when a "safe" interpolation tag is used (one that escapes the output), there are a few cases where we recognize it as a code injection sink:

  • In a code attribute, HTML entities are expanded prior to JS parsing, so the escaping is not effective, so we treat these as sinks.
  • In a script tag, HTML entities are not expanded, which usually prevents breaking out of string literals, but if not safely wrapped in a string literal, we treat it as a sink.
  • In the context of an AngularJS template, AngularJS will interpret the output, so we treat these as sinks as well.

Extracting placeholder tags

To support placeholders with non-trivial expressions, like <%- foo.bar %>, we extract the contents of such placeholders as JS with the same dialect we use for Angular2 expressions. (This dialect has internally been renamed to "angular-like template").

The regular JS parser has also been updated to recognize placeholder tags occuring in plain JS context, so we can parse script tags that contain such placeholders. For example, we now get flow from userdata to the innerHTML assignment here:

let data = <%= userdata %>;
elm.innerHTML = data.html;

Wiring up template instantiations and template files

To see what file is rendered by res.render('foo/mytemplate', { ... }), we have to match up the foo/mytemplate string with a file.

Here we use a heuristic algorithm which:

  • Finds all files with the appropriate path suffix, such as foo/mytemplate.html, foo/mytemplate.ejs, or foo/mytemplate/index.html.
  • From these candidates, we pick the one with the lowest common ancestor in the file hierarchy.

For example, given these files,

  • src/lib/app.js containing a call res.render('foo/mytemplate')
  • src/lib/views/foo/mytemplate.html
  • src/lib/mytemplate.html
  • src/docs/foo/mytemplate.html

The src/lib/mytemplate.html file fails the initial suffix test (foo/mytemplate).
The remaining template files have the following lowest common ancestor with the referencing file (src/lib/app.js):

  • src/lib/views/mytemplate.html -> src/lib/
  • src/docs/mytemplate.html -> src/

Since the first one has the lowest common ancestor (or equivalently, the longest common prefix) that one is picked as the target.

Evaluations

@asgerf asgerf added JS depends on internal PR This PR should only be merged in sync with an internal Semmle PR JS:changes-sources-or-sinks Changes taint sources/sinks for the JS analysis labels Aug 3, 2021
@asgerf asgerf force-pushed the js/handlebars-extraction branch 2 times, most recently from 2d9cf44 to 42fe70d Compare August 5, 2021 09:51
@asgerf asgerf removed the depends on internal PR This PR should only be merged in sync with an internal Semmle PR label Aug 10, 2021
@asgerf asgerf force-pushed the js/handlebars-extraction branch from 42fe70d to 6e8a47a Compare August 10, 2021 10:17
@asgerf asgerf force-pushed the js/handlebars-extraction branch from 6e8a47a to 2da40b8 Compare August 16, 2021 09:35
@asgerf asgerf marked this pull request as ready for review August 16, 2021 12:26
@asgerf asgerf requested a review from a team as a code owner August 16, 2021 12:26
Copy link
Contributor

@erik-krogh erik-krogh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, that's a mouthful.

Here is a partial review from trying to go through commit-by-commit.

Comment on lines 639 to 648
/** The `include` function, seen as an API node, so we can treat it as a template instantiation. */
private class IncludeFunctionAsEntryPoint extends API::EntryPoint {
IncludeFunctionAsEntryPoint() { this = "IncludeFunctionAsEntryPoint" }

override DataFlow::SourceNode getAUse() {
result = any(TemplatePlaceholderTag tag).getInnerTopLevel().getAVariableUse("include")
}

override DataFlow::Node getARhs() { none() }
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that this class is needed, because the tests fail if I remove it.

But I'm not sure where it's needed.
I can't get it to fit with the API::Node usage in getTemplateInput, but maybe that's me missing something.
Could you elaborate in which class/predicate this API::EntryPoint is needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed for EjsIncludeCallInTemplate. By ensuring that the callee has an API node, the arguments to call have API nodes as well, which is needed in getTemplateParamsNode, which is needed in getTemplateInput.

Using EjsIncludeCallInTemplate directly in the definition of the entry point leads to negative recursion, unfortunately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, could you add a reference to EjsIncludeCallInTemplate in the qldoc?

asgerf and others added 2 commits August 24, 2021 14:16
Co-authored-by: Erik Krogh Kristensen <erik-krogh@github.com>
erik-krogh
erik-krogh previously approved these changes Aug 24, 2021
Copy link
Contributor

@erik-krogh erik-krogh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Impressive work!

@codeql-ci codeql-ci merged commit 170a069 into github:main Aug 25, 2021
@asgerf
Copy link
Contributor Author

asgerf commented Aug 25, 2021

Thanks for powering through another mega PR @erik-krogh! ✨

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation JS:changes-sources-or-sinks Changes taint sources/sinks for the JS analysis JS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants