Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/exp/vulndb/govulncheck: type propagation can produce confusing results #47121

Open
rolandshoemaker opened this issue Jul 10, 2021 · 12 comments
Open

Comments

@rolandshoemaker
Copy link
Member

@rolandshoemaker rolandshoemaker commented Jul 10, 2021

Type propagation appears to taint some functions in the call graph, causing them to be reported as vulnerable, sometimes shadowing the actual results.

The example below is a quite basic example of this. Because VulnerableImplementation flows from EntryB through EntryA, the latter is reported as vulnerable, shadowing EntryB, since we try to only show a single finding per vulnerability. Removing EntryB causes nothing to be reported.

I suspect this will also lead to more confusing results, where it is less obvious how the type is being propagated, tainting a function.

cc @zpavlinovic

package example

import "vuln"

type OrdinaryInterface interface {
	F()
}

func EntryA(oi OrdinaryInterface) {
	oi.F()
}

func EntryB() {
	var oi OrdinaryInterface
	oi = &vuln.VulnerableImplementation{}
	EntryA(oi)
}
package vuln

type VulnerableImplementation struct {}

func (vi *VulnerableImplementation) F() {}
[
    {
        "id": "GO-EXAMPLE",
        "package": {
            "name": "vuln",
            "ecosystem": "Go"
        },
        "ecosystem_specific": {
            "symbols": [
                "VulnerableImplementation.F"
            ]
        }
    }
]
Trace:
vuln.VulnerableImplementation.F (/Users/roland/vulncheck-permissive/main.go:10:6)
example.OrdinaryInterface.F(...) [approx. resolved to (*vuln.VulnerableImplementation).F] (/Users/roland/vulncheck-permissive/main.go:10:6)
example.EntryA(...) (/Users/roland/vulncheck-permissive/main.go:9:6)

Vulnerabilities:
vuln ()
@timothy-king
Copy link
Contributor

@timothy-king timothy-king commented Jul 14, 2021

@timothy-king (cc'ing myself to follow along and make my email filters happy)

Loading

@zpavlinovic
Copy link
Contributor

@zpavlinovic zpavlinovic commented Jul 16, 2021

(Partly discussed offline)

The issue happens because of VTA. VTA sees that EntryB calls EntryA with a specific input and hence concludes that vuln.VulnerableImplementation flows to oi of EntryA; oi.F thus resolves to vuln.VulnerableImplementation.F.

There are several potential ways to go around this issue:

  1. Run VTA separately for subprograms starting from EntryA and EntryB. Might be hard to figure out how to split entry points while having a manageable number of partitions.
  2. Not consider EntryA as an entry point since it is called by EntryB.
  3. Have VTA spit out more information on how things flow around, then use this to figure out that the reported trace depends on EntryB, and then prefer the trace that starts with EntryB. This is a more invasive change.

Loading

@timothy-king
Copy link
Contributor

@timothy-king timothy-king commented Jul 16, 2021

It is correct to report EntryA->F-> VulnerableImplementation, but it is just a lot more clear to users to say EntryB->EntryA->F-> VulnerableImplementation. The latter has enough info to become unambiguous, while the former is unclear why io can be a *VulnerableImplementation. Maybe this is mostly a ranking problem?

Some form of 3 seems the most appealing to me in the long run. Conceptually we could remove the "approx" in the report by locally knowing that the set of implementations of oi is {*vuln.VulnerableImplementation} when called from EntryB. There are a lot of choices of how to do that though: trace validation, k-1 context sensitivity, applying some kind of summary of EntryA, etc. All of these are more invasive.

A fourth option could also be to make the report of the approx step clearer. (Probably requires tracking more data too.)

Loading

@rolandshoemaker
Copy link
Member Author

@rolandshoemaker rolandshoemaker commented Jul 20, 2021

It is correct to report EntryA->F-> VulnerableImplementation, but it is just a lot more clear to users to say EntryB->EntryA->F-> VulnerableImplementation.

I'm not sure this is correct, as shown by the fact that nothing is reported if EntryB is removed.

This may touch on a previous decision we came to (as far as I remember), which was that if an entry point can take an interface which matches a vulnerable implementation, but does not actually receive that type anywhere in the analyzed code, we ignore it.

Loading

@rolandshoemaker
Copy link
Member Author

@rolandshoemaker rolandshoemaker commented Jul 20, 2021

There are several potential ways to go around this issue:

  1. Run VTA separately for subprograms starting from EntryA and EntryB. Might be hard to figure out how to split entry points while having a manageable number of partitions.
  2. Not consider EntryA as an entry point since it is called by EntryB.
  3. Have VTA spit out more information on how things flow around, then use this to figure out that the reported trace depends on EntryB, and then prefer the trace that starts with EntryB. This is a more invasive change.

I like (2) in theory, but I think in more complex applications it won't be as valuable. For instance in a case where we don't have this strange type propagation and there is a vulnerability in EntryA that has nothing to do with an interface, we probably do want to report the smaller chain.

It seems like some combination of (1) and (2) might be viable, where entry points that contain other entry points as subtrees of their call graphs have VTA run independently, so that they aren't contaminating each others results? So for example in this case we'd slice the graph such that VTA is run twice on (EntryB->EntryA->F->...) and (EntryA->F).

Loading

@timothy-king
Copy link
Contributor

@timothy-king timothy-king commented Jul 20, 2021

I'm not sure this is correct, as shown by the fact that nothing is reported if EntryB is removed.

Not sure if I agree so let's try to figure out where the disagreement is.

We need to allow flows of interfaces and func types. There is no guarantee this is doable in a single call sequence. Suppose we added:

func EntryC() OrdinaryInterface {
	var oi OrdinaryInterface
	oi = &vuln.VulnerableImplementation{}
	return oi
}

This is vulnerable and again goes through EntryA->F-> VulnerableImplementation. We can try to say EntryC -> {Back to the user} -> EntryA-> .... Eventually these will get hard to explain. And we are approximating so we will be wrong in some cases. I think it is reasonable to say EntryA->F-> VulnerableImplementation is a valid report. It is not a desirable one, but I don't think we are wrong to give it.

What do you think a correct report should be for EntryC? (Maybe there is a something I am missing?)

Loading

@rolandshoemaker
Copy link
Member Author

@rolandshoemaker rolandshoemaker commented Jul 20, 2021

As I see it (and is currently implemented) EntryC in your example is not vulnerable, because it does not call the vulnerable symbol F. EntryA(EntryC()) (or some more complicated flow) is vulnerable, but only if somebody actually calls it.

We could say "this could be vulnerable, but only if you do something that you're not currently doing", but this seems to go against the whole point of call graph analysis, and VTA in particular. We know it isn't being called, so why create noise?

I think this issue shows VTA is working as intended, but we're just shadowing the real result because the EntryA call chain is a super set of the EntryB call chain, which is a signal we've used to find the smallest chain but is, in this particular case, actually making the results more confusing. Whether we should report entry points as vulnerable without an existing call chain to a vulnerable symbol seems like a different discussion.

Loading

@timothy-king
Copy link
Contributor

@timothy-king timothy-king commented Jul 20, 2021

I think I understand. Here is my attempt at a summary:

For a package P, report when an entry point of P, E, may reach a vulnerable function V (without assuming there are any additional external function calls).

Basically the idea is that example.EntryA(example.EntryC()) is only a problem for a user of example if it happens. So thus we do not report it to example and instead report it to that user. The expectation would be that a user of example reports it if it happens. There is a certain internal consistency here that is quite nice.

I am not sure the author of example would prefer this. Do they want to be told if they run govulncheck example?

Thoughts?

Tangents:

  • We still provide sound guarantees in term of reachability for binaries as the main package always includes calling the init() function and then main(). Or model that main() calls init() within ssa as the first instruction, etc.
  • The bit in () could be formalized in terms of starting from env[var |-> an empty set of types] for each variable location var.

Loading

@rolandshoemaker
Copy link
Member Author

@rolandshoemaker rolandshoemaker commented Jul 20, 2021

I am not sure the author of example would prefer this. Do they want to be told if they run govulncheck example?

This is a complex question, for which I think there are two different cases, whether they import the vulnerable module/package (vuln in my initial example) or not.

I think the former is obviously no, it would require govulncheck to retrieve all possible vulnerabilities and check for vulnerable interfaces, which seems counterproductive to what we are trying to do.

The latter is more complex, there are a lot of things we could do to determine whether it is possible (i.e. by checking for things like your EntryC example where returned types contain vulnerable symbols, etc) but I think these end up adding significantly more complexity to the implementation than we really want. I think something like https://golang.org/cl/335171 simplifies this problem, by surfacing that while a vulnerable symbol is not reachable, the vulnerable module/package is being imported (allowing the developer to determine whether they are comfortable with the problem or not). This surfacing could be made more explicit via analysis, i.e. allowing us to say vulnerable symbol is reachable, if type X is passed to entry point Y, but I'm not sure how complex doing this would be/if that complexity would be worthwhile.

Loading

@zpavlinovic
Copy link
Contributor

@zpavlinovic zpavlinovic commented Jul 21, 2021

The vulnerability starting from EntryB will not be reported since we do not visit a function more than once to avoid analyzing too many paths. That is, we do a BFS search so we'll analyze EntryA when we start with entry points and when we see EntryA called from EntryB, we'll skip analyzing EntryA again.

As @rolandshoemaker suggested, the two less invasive options we have right now are based on splitting entry points into two groups: one that are called from other entry points and those that are not. Then, either

  1. do the search for both groups separately

or even

  1. run in parallel both the VTA and the search for both entry sets.

I've already experimented with 2) and there is some performance penalty but the implementation I have can be made faster IMO.

Loading

@timothy-king
Copy link
Contributor

@timothy-king timothy-king commented Jul 21, 2021

That is, we do a BFS search so we'll analyze EntryA when we start with entry points and when we see EntryA called from EntryB, we'll skip analyzing EntryA again.

Can graph based heuristics suffice if this is the main problem? Something like start BFS from each node starting from a topologically sorting of the entry points [according to some call graph]. Or weighted search and charge based on the topological order?

Loading

@zpavlinovic
Copy link
Contributor

@zpavlinovic zpavlinovic commented Jul 23, 2021

Lets call entry points that are called by other entry points secondary entry points just to refer to them more easily.

  1. If we do some notion of sorting, the secondary entry points will be effectively disregarded since we visit a function only once. This will solve the problem in this issue but it will introduce another one as @rolandshoemaker pointed out: sometimes we will show a trace A -> B -> ... where both A and B are entry points but A is actually not that important for the vuln use.

IMO, this is ok for now since it is still a valid and clear trace that just might contain some redundancy.

  1. I am not exactly sure what would a weighted search look like. Are weights used as a criteria for revisiting functions in certain cases?

  2. The cleanest but more expensive approach would be to split the entry points into two groups of primary and secondary entry points and run goaudit in parallel on both of them. This way we analyze entry points in isolation from each other.

Loading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants