How to make a call graph for a function in C/C++? #8376

Open
nrb547 opened this issue Mar 8, 2022 · 2 comments
Labels
question Further information is requested

Comments

@nrb547

nrb547 commented Mar 8, 2022

I'd like to write a query that produces a call graph for a given function,
preferably as a picture or similar graphical representation. If that is
not possible, a chain of possible calls would do.

I already checked the CodeQL documentation, which seems to have more information
for Java, and I haven't been able to get a call graph out of this C++ module:
https://codeql.github.com/codeql-standard-libraries/cpp/semmle/code/cpp/pointsto/CallGraph.qll/module.CallGraph.html

I would appreciate any guidance.

nrb547 added the question label Mar 8, 2022
@rdmarsh2
Contributor

rdmarsh2 commented Mar 8, 2022

I assume what you're looking for is the tree of callers/callees reachable from a given function? If you don't need to handle function pointers and virtual dispatch, the easiest option is to use codeql database analyze --format dot or --format dgml to run a query like the following:

/**
 * @kind graph
 */

import cpp

from Function caller, Function callee, Function root
where
  root.hasQualifiedName("namespace1::namespace2", "Class1", "func") and
  caller.calls(callee) and
  (
    root.calls*(caller)
    or
    callee.calls*(root)
  )
select caller, callee

Depending on the format you specify, that will give either a dot text file you can render with Graphviz tools, or a DGML file that Visual Studio can open with the optional DGML Viewer component.

If you do need to resolve function pointers and virtual dispatch, I'd recommend the resolveCall predicate here: https://codeql.github.com/codeql-standard-libraries/cpp/semmle/code/cpp/ir/dataflow/ResolveCall.qll/module.ResolveCall.html.
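To make the two reachability conditions in the where clause of the query above concrete, here is a minimal Python sketch on a made-up call graph. It only models what the query selects, not how CodeQL evaluates it; the function names and the CALLS edges are hypothetical.

```python
from collections import defaultdict


def reachable(edges, start):
    """Return every node reachable from `start` in zero or more steps (like `calls*`)."""
    successors = defaultdict(set)
    for src, dst in edges:
        successors[src].add(dst)
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in successors[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def select_edges(edges, root, include_callers=True):
    """Model of the query: keep an edge if its caller is below root or its callee is above root."""
    below = reachable(edges, root)  # models root.calls*(caller)
    reversed_edges = {(dst, src) for src, dst in edges}
    above = reachable(reversed_edges, root)  # models callee.calls*(root)
    return {
        (caller, callee)
        for caller, callee in edges
        if caller in below or (include_callers and callee in above)
    }


# Hypothetical toy call graph: (caller, callee) pairs.
CALLS = {
    ("main", "init"),
    ("main", "func"),
    ("func", "helper"),
    ("helper", "log"),
    ("init", "log"),
    ("other", "unrelated"),
}
```

On this toy graph, a root that nothing calls (main) gets no extra edges from the second condition, so dropping it leaves the result unchanged. For a root in the middle of the graph (func), the second condition adds the edge leading into it from main.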

@nrb547
Author

nrb547 commented Mar 9, 2022

Thanks, I tried this out and it works well for small codebases. On larger codebases
like the Linux kernel, however, it is very slow: the query was still computing and I never got any results.

> If you do need to resolve function pointers and virtual dispatch, I'd recommend the resolveCall predicate here: https://codeql.github.com/codeql-standard-libraries/cpp/semmle/code/cpp/ir/dataflow/ResolveCall.qll/module.ResolveCall.html.

I replaced calls/calls* respectively with allCalls/allCalls* from
https://codeql.github.com/codeql-standard-libraries/cpp/semmle/code/cpp/pointsto/CallGraph.qll/module.CallGraph.html
and it recognizes function pointers correctly, so this is working as expected.

One problem I have is the speed of this part:

    root.calls*(caller)
    or
    callee.calls*(root)

What was your reason behind using callee.calls*(root)? I removed it to increase speed and the result is the same.

Another problem is keeping the call graph from getting too large.
One idea was to limit the scope to functions defined in the same file as the root function.
This works, but it misses a few semantically important functions that the code uses.

So another thing I tried was to limit the call graph to all functions called in the same file as the root function.

void foo(void) // foo is root, and foo is defined in main.c
{
    bar(); // bar is defined in main.c, so it is included in the result
    test(); // test is defined in test.c, but it is in the result because the FunctionCall occurs in main.c 
}

What I did was declare the callee as a FunctionCall instead of a Function and check
that the call occurs in main.c (the same file as root).
However, this is very slow.

/**
 * @kind graph
 */

import cpp
import semmle.code.cpp.pointsto.CallGraph

from Function caller, FunctionCall callee, Function root
where
  root.hasName("main")
  and allCalls(caller, callee.getTarget())
  and allCalls*(root, caller)
  and callee.getLocation().getFile() = root.getFile()
select caller, callee.getTarget()
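
The difference between the two file-based filters described above can be modelled on a toy example. This is a rough Python sketch, not CodeQL: the DEFINED_IN and CALL_SITES tables are made-up data mirroring the foo/bar/test example, the extra function deep is hypothetical, and reachability from the root is left out for brevity.

```python
# Hypothetical data: the file that defines each function.
DEFINED_IN = {"foo": "main.c", "bar": "main.c", "test": "test.c", "deep": "test.c"}

# Hypothetical call sites: (caller, callee, file containing the call expression).
CALL_SITES = [
    ("foo", "bar", "main.c"),
    ("foo", "test", "main.c"),
    ("test", "deep", "test.c"),
]


def edges_by_definition_file(root):
    """Keep edges whose callee is defined in the same file as the root."""
    root_file = DEFINED_IN[root]
    return {
        (caller, callee)
        for caller, callee, _site in CALL_SITES
        if DEFINED_IN[callee] == root_file
    }


def edges_by_call_site_file(root):
    """Keep edges whose call expression occurs in the same file as the root."""
    root_file = DEFINED_IN[root]
    return {
        (caller, callee)
        for caller, callee, site in CALL_SITES
        if site == root_file
    }
```

With foo as the root, the definition-based filter keeps only the edge to bar, while the call-site filter also keeps the edge to test because that call is written in main.c. Neither keeps the call from test to deep, which happens entirely in test.c.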
