Support graphs and graph traversals #46
Comments
This is an interesting suggestion that's worth considering. The SATE format has a notion of linking related results, which is another example of this kind of relationship. Their solution was simply to provide a universal identifier, allowing results to store lists of UIDs that point to other items. You're proposing something more expressive, obviously. Besides the option of annotating a node/entity in the graph ('stack frame'), it's interesting to consider whether we annotate the edges instead, with most nodes simply being a code location. A 'calls into' edge could then point to a common location referenced by other entities. All of this resurrects a historical question with SARIF: whether locations themselves should be stored as a distinct set of things (like we do with the 'files' object), with results and other things referencing items in the set. That would be a disruptive change to the current spec, but it would also support what you've proposed here. |
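To make the "locations as a distinct set" idea concrete, here is a minimal sketch in Python. Every field name in it (`uri`, `line`, `location`, `target`) and the file name `example.c` are assumptions made for illustration; none of this is taken from the SARIF schema. It only shows how two entities can reference one shared table entry by index.

```python
def intern_location(table, index, uri, line):
    """Return the position of (uri, line) in table, appending it on first use."""
    key = (uri, line)
    if key not in index:
        index[key] = len(table)
        table.append({"uri": uri, "line": line})
    return index[key]


locations = []  # shared table, similar in spirit to the 'files' object
seen = {}       # (uri, line) -> index into locations

# A node and an edge that point at the same place share one table entry.
frame = {"message": "a() calls e()",
         "location": intern_location(locations, seen, "example.c", 9)}
edge = {"kind": "calls into",
        "target": intern_location(locations, seen, "example.c", 9)}
bug = {"message": "bug in e()",
       "location": intern_location(locations, seen, "example.c", 18)}


def resolve(ref):
    """Turn an index stored on a result or edge back into a location record."""
    return locations[ref]
```

The cost of this layout is that every consumer has to resolve an index before it can show a location, which is part of why it would be a disruptive change.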
More expressive in that it is a graph structure, not a flat list. Less expressive in that each vertex in the graph can only be one structure: a location and a message (plus whatever other fields are appropriate), not a UID referencing anything else in the SARIF file. The graph structure I have in mind replaces/generalizes structures within the result object; it does not allow edges/references beyond the result object. I think that would be an orthogonal feature.
I would note that (as I understand it) this question is orthogonal to using a graph structure.
|
Different code analysis tools look at vertices and edges. If we went down this path would it make sense to be able to express either? |
Andrew (@DerSaidin), do you have time to comment on this thread in general? How would a graph improve our ability to express a stack? Imagine a graph that represents a control flow graph, how would we represent a flow through it? Can you provide clarifying examples that show the promise of what you're proposing? |
Stack Example
void b() {
    zzz1(); // Dependency1
}
void a() {
    b();
    zzz2(); // Dependency2
    c();
    zzz4(); // Dependency4
    e();
}
void c() {
    d();
}
void d() {
    zzz3(); // Dependency3
}
void e() {
    zzz5(); // Dependency5 // Bug location
}
Without a graph
With the current SARIF proposal...
... We can have a list of frames going linearly up the stack, starting at Bug ...
This might be displayed something like...
As a graph
With a graph, we can capture the location of all the dependencies accurately (The word "Frame" may be a little misused here, but it is convenient)...
Note that the nodes in this graph have different types/tags. Some are tagged as frames, some are tagged as leaf dependencies. And we can take advantage of the additional structure to display it with context...
For example, Dependency3 might be a write to a variable passed in from a(). The graph could also have an edge e.g. from Dependency3 to Dependency5. This is a key difference between static and dynamic tools - a dynamic callstack is strictly linear, but a static analysis context is not (and not really a "stack").
Graphs of other Things
The codeFlows and stacks are specific cases of graphs. What if I want to show a callgraph? A graph structure is generalized, and applicable to other things. |
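The stack example above can be sketched as a small tagged graph in Python. The node IDs, the tag names ("frame", "dependency") and the edge kinds ("contains", "flows-to") are assumptions chosen for illustration, not anything defined by SARIF. The sketch shows two things: an indented display with context, and the flat stack that can still be derived from the same data.

```python
nodes = {
    "a": {"tag": "frame", "message": "a()"},
    "b": {"tag": "frame", "message": "b()"},
    "c": {"tag": "frame", "message": "c()"},
    "d": {"tag": "frame", "message": "d()"},
    "e": {"tag": "frame", "message": "e()"},
    "dep1": {"tag": "dependency", "message": "zzz1() Dependency1"},
    "dep2": {"tag": "dependency", "message": "zzz2() Dependency2"},
    "dep3": {"tag": "dependency", "message": "zzz3() Dependency3"},
    "dep4": {"tag": "dependency", "message": "zzz4() Dependency4"},
    "dep5": {"tag": "dependency", "message": "zzz5() Dependency5 (bug location)"},
}

# (source, target, kind); "contains" edges follow source order in each function.
edges = [
    ("a", "b", "contains"), ("b", "dep1", "contains"),
    ("a", "dep2", "contains"),
    ("a", "c", "contains"), ("c", "d", "contains"), ("d", "dep3", "contains"),
    ("a", "dep4", "contains"),
    ("a", "e", "contains"), ("e", "dep5", "contains"),
    ("dep3", "dep5", "flows-to"),  # the extra, non-stack edge mentioned above
]


def children(node_id, kind="contains"):
    """Targets of edges of the given kind leaving node_id, in declaration order."""
    return [dst for src, dst, k in edges if src == node_id and k == kind]


def render(node_id, depth=0):
    """Indented, depth-first listing of the 'contains' tree below node_id."""
    node = nodes[node_id]
    lines = ["  " * depth + f"[{node['tag']}] {node['message']}"]
    for child in children(node_id):
        lines.extend(render(child, depth + 1))
    return lines


def linear_stack(leaf):
    """Walk 'contains' edges upward from leaf: the flat list a plain stack holds."""
    parent = {dst: src for src, dst, k in edges if k == "contains"}
    stack = []
    node = leaf
    while node in parent:
        node = parent[node]
        stack.append(node)
    return stack
```

Calling `print("\n".join(render("a")))` lists every frame and dependency under a(), while `linear_stack("dep5")` keeps only the frames on the way up from the bug location.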
The idea of having graphs associated with the result object is definitely an interesting one. However, I'm not sure that it replaces or generalizes the existing elements. I think the current elements (
As an example, consider a call graph, with the vertices representing functions and edges representing calls. A result could represent a path through that call graph as a
@DerSaidin In your example you have what is essentially a call graph, but the structured display you provide is actually a path through that graph. I think that structured display can therefore be encoded as a
I think the question here is whether a graph representation makes sense in addition to the existing path representation. One area I think it could be useful is to help SARIF viewers provide a more sophisticated path display mechanism, in particular when:
There may also be cases where a graph is useful independently of a path for a result, but I don't have a concrete example. |
I have opened a separate issue (#71) related to allowing edge properties on codeFlows. |
I like the distinction between paths through a graph and the graph itself. It is true that paths are usually what gets displayed. If there were graphs in addition to paths, then I think the paths should refer to nodes of a graph. At that point the paths are essentially part of the graph. I think graph vs path is like vector graphics vs bitmap. Vector graphics carry more structured information and detail, and you can render various bitmap views from them.
As always, it is a trade-off: complexity vs flexibility. |
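The relationship between a graph and the paths that refer to its nodes can be sketched as follows. This is an illustrative Python sketch only: the call graph is the one implied by the stack example earlier in the thread, and the representation (an adjacency dict, paths as lists of node IDs) is an assumption, not a proposed SARIF encoding.

```python
# Call graph of the stack example: function -> functions it calls.
call_graph = {
    "a": ["b", "c", "e"],
    "b": [],
    "c": ["d"],
    "d": [],
    "e": [],
}


def is_valid_path(graph, path):
    """True when path is non-empty, names only known nodes, and follows edges."""
    if not path or any(node not in graph for node in path):
        return False
    return all(dst in graph[src] for src, dst in zip(path, path[1:]))


def all_paths(graph, start, goal, prefix=()):
    """Enumerate simple paths from start to goal; each one is a 'rendered view'."""
    prefix = prefix + (start,)
    if start == goal:
        return [list(prefix)]
    found = []
    for nxt in graph[start]:
        if nxt not in prefix:  # skip nodes already on this path (avoids cycles)
            found.extend(all_paths(graph, nxt, goal, prefix))
    return found
```

A path such as `["a", "c", "d"]` carries no information of its own beyond node IDs, so a viewer can check it against the graph with `is_valid_path` or regenerate candidate paths with `all_paths`.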
As per most recent TC discussion, we are excluding this feature from this SARIF release. Larry will add some additional detail from notes taken during that session. |
The upshot of our discussion in the last TC meeting was that
|
Closed by 20f0042. |
There are many graph structures in static analysis which are useful to preserve in results. For example:
Several of the existing properties of result could be abstracted and generalized in this manner:
Note that all of these have properties in common: location, message. These would be the vertices in the graph (in the case of stacks, the stackFrame objects are the vertices and the stacks object provides some of the graph edges/structure).
This would also allow the format to support other information which generally fits into a graph.
The existence of the codeFlows and stacks properties shows the desire for this generalization/extensibility. What other similar properties will be wanted in the future that are not currently specified?
Each vertex would need some tag to identify what it means (i.e. this vertex is a stackFrame, this vertex is a value flow at an addition) and how vertices are expected to fit together (a stackFrame cannot flow into an addition, these should not appear in the same graph).
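One way to picture the tagging rule is a small consistency check, sketched here in Python. The tag names, the grouping of tags into "families", and the function name `check_graph` are all hypothetical; the sketch assumes that two vertices may be connected only when their tags belong to the same family, and that a tag not listed in the table forms its own family.

```python
# Hypothetical grouping of vertex tags into families that may share a graph.
TAG_FAMILIES = {
    "stackFrame": "stack",
    "dependency": "stack",
    "valueFlow": "dataflow",
}


def family(tag):
    """Family of a tag; a tag not listed above is treated as its own family."""
    return TAG_FAMILIES.get(tag, tag)


def check_graph(vertices, edges):
    """Return a list of problems; an empty list means the graph is consistent."""
    problems = []
    for src, dst in edges:
        if src not in vertices or dst not in vertices:
            problems.append(f"{src}->{dst}: edge refers to an unknown vertex")
            continue
        src_tag = vertices[src]["tag"]
        dst_tag = vertices[dst]["tag"]
        if family(src_tag) != family(dst_tag):
            problems.append(f"{src}->{dst}: {src_tag} cannot connect to {dst_tag}")
    return problems


vertices = {
    "f1": {"tag": "stackFrame", "message": "a()"},
    "f2": {"tag": "stackFrame", "message": "e()"},
    "v1": {"tag": "valueFlow", "message": "x + y"},
}
```

With this data, `check_graph(vertices, [("f1", "f2")])` reports nothing, while `check_graph(vertices, [("f1", "v1")])` flags the stackFrame-to-valueFlow edge.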
Tools doing their own graphs (not specified in SARIF) could still have a graph of vertices with a location and a message and their own meaning.