This repository was archived by the owner on Jan 5, 2023. It is now read-only.

@max-schaefer
Contributor

This PR got started with me trying to understand why tainting function arguments didn't seem to work. The fix for that turned out to be simple; it's in the first commit.

When trying to model taint propagation through writers, I quickly noticed that I often wanted a sort of "backwards" taint propagation: many writer APIs involve wrapping some basic writer bw into a higher-level writer w, and it's quite natural to expect that taint propagates from w back to bw.
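A minimal Go sketch of the wrapping scenario, as an illustration only (it is not code from this PR, and the helper name `wrapAndWrite` is hypothetical). Data is written only to the higher-level writer `w`, yet it ends up in the basic writer `bw`, which is why taint on `w` should be expected to reach `bw`:

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// wrapAndWrite is a hypothetical helper: it wraps a basic writer bw in a
// higher-level writer w, writes data through w only, and returns whatever
// ended up in bw.
func wrapAndWrite(data string) string {
	var bw bytes.Buffer       // the basic writer
	w := bufio.NewWriter(&bw) // the higher-level writer wrapping bw
	fmt.Fprint(w, data)       // data is written to w, never directly to bw
	w.Flush()                 // flushing pushes the buffered data into bw
	return bw.String()
}

func main() {
	fmt.Println(wrapAndWrite("secret")) // prints "secret"
}
```

In taint-tracking terms, the result of `bufio.NewWriter` acts as an input and its argument as an output, which is the "backwards" direction described above.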

To model this nicely, I introduced a new kind of FunctionInput that corresponds to the result of a function. It looks a little weird at first, but is nicely symmetric with the other cases we already support. With a little more API modeling we can then cover a false negative discussed in #108.

Finally, while working on our standard-library modeling I noticed that our models of Printf and friends as sinks for clear-text logging were broken due to a typo, which I fixed.

An evaluation shows no performance regressions, but quite a few new results from the fix to clear-text logging.

I looked at a few and they didn't look a priori implausible. Nevertheless, we should consider carefully whether we want all those new results. If not, it's easy enough to remove the newly added sinks (and the change note).

@ghost ghost left a comment

LGTM sans a minor nit.

pun = result and
init = pun.(DataFlow::SsaNode).getInit()
|
index = -1 and

I am not sure I understand this well yet, but why does the index start from -1?

Contributor Author

If there is more than one result, we number them starting from zero. If there is only a single result, we number it as -1. It's an implementation detail that isn't exposed in the API, so nothing to worry about.
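The numbering convention can be sketched in Go as follows. This is only an illustration of the scheme described in the comment above, not the CodeQL library's implementation, and the function name `resultIndices` is hypothetical:

```go
package main

import "fmt"

// resultIndices is a hypothetical helper that returns the indices used to
// number a function's results under the convention described above:
// a single result is numbered -1, while multiple results are numbered
// from zero.
func resultIndices(numResults int) []int {
	if numResults == 1 {
		return []int{-1}
	}
	indices := make([]int, 0, numResults)
	for i := 0; i < numResults; i++ {
		indices = append(indices, i)
	}
	return indices
}

func main() {
	fmt.Println(resultIndices(1)) // prints "[-1]"
	fmt.Println(resultIndices(2)) // prints "[0 1]"
}
```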

SmtpData() {
// func (c *Client) Data() (io.WriteCloser, error)
- exists(Method data, DataFlow::CallNode write, DataFlow::Node writer, int i |
+ exists(Method data |

Since we are changing this, we should also change the ResponseBody class in HTTP.qll.

private class ResponseBody extends HTTP::ResponseBody::Range, DataFlow::ArgumentNode {
  int arg;
  ResponseBody() {
    exists(DataFlow::CallNode call |
      call.getTarget().(Method).implements("net/http", "ResponseWriter", "Write") and
      arg = 0
      or
      (
        call.getTarget().hasQualifiedName("fmt", "Fprintf")
        or
        call.getTarget().hasQualifiedName("io", "WriteString")
      ) and
      call.getArgument(0).getType().hasQualifiedName("net/http", "ResponseWriter") and
      arg >= 1
    |
      this = call.getArgument(arg)
    )
  }

Contributor Author

In principle I agree, but in practice that is a little more difficult to change, since that class needs to refer both to the original source of the data and to the writer it is written to.

Yes, I missed that. This can now be resolved.

@ghost ghost mentioned this pull request May 5, 2020
Contributor

@sauyon sauyon left a comment

Two minor comments.

I took a look at a few of the results, and I think we should probably exclude Fprint calls that aren't printing to stderr or a file. At least some of them seem like true positives though.

or
preupd instanceof ArgumentNode and
- mutableType(preupd.getType())
+ mutableType(preupd.getType().getUnderlyingType())
Contributor

I guess this isn't so important since it's only used here, but maybe mutableType should do this inside the predicate?

Sscanner() { this.hasQualifiedName("fmt", ["Sscan", "Sscanf", "Sscanln"]) }

override predicate hasTaintFlow(FunctionInput input, FunctionOutput output) {
input.isParameter(0) and output.isParameter(any(int i | i > 0))
Contributor

I don't think the format string for Sscanf (argument 1) should be considered an output.
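A short Go sketch of the argument roles in `fmt.Sscanf`, as an illustration only (the helper name `parse` is hypothetical). Argument 0 is the input, argument 1 is the format string, and only the pointer arguments from index 2 onwards are written to:

```go
package main

import "fmt"

// parse is a hypothetical helper showing which Sscanf arguments receive
// data from the input string. It returns the scanned values together with
// the format string, which Sscanf only reads.
func parse(input string) (string, int, string, error) {
	format := "%s %d" // argument 1: read by Sscanf, never written
	var name string   // argument 2: written by Sscanf
	var age int       // argument 3: written by Sscanf
	_, err := fmt.Sscanf(input, format, &name, &age)
	return name, age, format, err
}

func main() {
	name, age, _, err := parse("alice 30")
	fmt.Println(name, age, err) // prints "alice 30 <nil>"
}
```

For `Sscan` and `Sscanln`, which take no format string, the written-to arguments start at index 1 instead.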

@max-schaefer
Contributor Author

Thank you for the review, @sauyon. I have addressed your suggestions and made an issue about improving the Fprint* sinks.

@sauyon sauyon merged commit 164149b into github:master May 6, 2020
@max-schaefer max-schaefer deleted the fix-argument-post-update-nodes branch August 28, 2020 06:36