-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ear taint checker #316
Ear taint checker #316
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Requesting review from @timothy-king. |
internal/pkg/earpointer/taint.go
Outdated
// For a function, transitively get the functions called within this function. | ||
// Argument "depth" controls the depth of the call chain. | ||
// For example, return {g1,g2,g3} for "func f(){ g1(); g2() }, func g1(){ g3() }". | ||
func getCalleeFunctions(fn *ssa.Function, result map[*ssa.Function]bool, depth uint) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the evidence we need a depth parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned before, this is to control how far we go when searching the call chain. In the future CLs it will be used to control the searching of other cases. For example, one interesting case is to chase the chain of callers: f2->g2->h2.
func f2() string {
a := new T{} // taint source
return fmt.Printf(a)
}
func g2() {
return f2(a)
}
func h2() {
x := g2()
sinkf(x) // taint sink
}
Another interesting case is:
func f3() string {
a := new T{} // taint source
return fmt.Printf(a)
}
func g3(x string) {
sinkf(x) // taint sink
}
func h3() {
x := f3()
g3(x)
}
So connecting a source and a sink may span many functions. This depth
is used to limit how many functions we should consider when connecting a source and a sink. Ideally we should not use such a limit, but since EAR uses aggressive merging, the FP rate may be high without applying this limit. Too many references may be involved as well if we don't bound it. If later on we have experimental data justifying that we don't need such a limit, then we can remove it.
I recall that HawkEye uses a similar limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the evidence at the moment is 1) theoretical 2) following practices from similar projects. Is this a fair summary? These are not terrible reasons, but they are not overwhelming either.
If later on we have experimental data justifying that we don't need such a limit, then we can remove it.
It is easy to go from no limit to adding one too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believed that I've done some tests by disabling the limit or setting a large limit, there were more FPs showing up.
Anyway, this can be gotten around by using a large limit so I don't think that it is a fundamental issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I notice that the tests that were present in #310 are not in this PR. Are you planning to add those in a follow-up PR?
Forgot to add it last time. Now it is there. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broadly, LGTM. My primary concerns are about globals and some nits about map
s.
Fixes #<issue_number_goes_here>