Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
To discuss: Issues and prioritization for a '1.0' #62
To discuss: Issues and prioritization for a '1.0' #62
Changes from all commits
a00d7bd
8dfc421
de513f4
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to handle the following cases:
func
s stored instruct
fields, e.g.func
s passed as function parameters:In theory, since
func
s are first class in Go, the actual range of cases we could potentially handle is much wider, e.g. a user could put functions in amap[string]func()
, store functions in local variables, etc.There are some cases we already handle, e.g.:
But it is quite easy to throw our analysis off in this case:
My opinion: I think we should handle
func
s stored instruct
fields, because that seems to be a fairly common pattern. The other cases seem more outlandish and seem like they would require some kind of cross-function or even whole-program analysis, so I think for a 1.0 they are out of scope.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We currently rely on the static dispatch in our call matching. In effect, we cover only case (a) of the four kinds of Call described in the docs.
I think we should make a reasonable effort to cover all four cases. (b) seems achievable in our current framework, and (c) seems a bit out-of-scope. Dynamic dispatch in case (d) is probably a whole can of worms. We could rely on
pointer.PointsTo
to get a set of possible functions, but that predicates whole-program analysis.I don't think the fact that a given
func
is stored in a struct field will make the problem any simpler. I'd be inclined to put this at a "post-1.0 but preferably 1.1" sort of target.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rereading this, I think well-defined specifiers catch both of these cases. It's just a difference of a sink being identified by a
ValueSpecifier
rather than aCallSpecifier
. Pseudo-config in yaml for readability and ignoring implementation details:[edit:] We still have the aforementioned issue of currently only following statically-linked calls. I'm just saying I think the problem is a lot more tractable than I originally thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it makes sense to identify a sink/sanitizer based on the return value. Could you give an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure either. I was just throwing stuff out there.
All the instances that come to my mind are specific runtime values, like a buffer writer returning non-zero bytes written. But we won't see that in SSA.
Probably best to scratch from the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For context, we already handle aliases e.g.
type MySource = core.Source
, but not named types e.g.type MySource core.Source
. (Interestingly, in the short investigation I just did, the SSA was exactly the same for both cases).Do we want named types to also count as "aliases"? Playing devil's advocate: what if a user defines a named type
type SafeSource core.Source
and uses it to explicitly differentiate between sources that have and have not been properly sanitized?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opinion: I think we should include named types. The only difference between the two types would be the method set associated, which to me suggests that the kind of data stored is reasonably similar. Also, mentioned below, I think it would be the easier way to handle the fact that field tags are naturally inherited in a way that a source specified by type path and name would not be.
Although...
I hadn't thought of that. I think I might prefer an edge-case false-positive to a systematic false-negative, though.
Also and alternatively: this could be enabled/disabled in configuration. And/or we could allow a user to include a safe-list for types that should not be considered sources, though that feels like it's getting a bit over-elaborate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you make a good case for named type inference.
The scenario I proposed is sufficiently unlikely that we should probably wait until we actually hit it before we think too deeply about how to handle it.
Also, as a user I think I would expect a named type that aliases a Source to also be considered a Source. Of course, if it weren't, it wouldn't be a big deal: I would just need to specify it in the config. But if I forget to do that, then I might think everything is fine when in reality named type Sources are reaching sinks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume propagation/sanitization involving collections would fall under here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm...
I think propagation in general should be part of 1.0, and I believe you've made strides to cover most collections and are working on the remaining?
I think propagation and sanitization of specific elements within a collection might be a
Beyond 1.0
target.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. Important distinction here. Yes, propagation/sanitization of specific elements is what I had in mind.
I think for now all that's left is
chan
s.