# Call Graph Construction using Scalpel

A call graph depicts the calling relationships between subroutines in a computer program. It is an essential component of most static analyses and can be leveraged to build more sophisticated applications such as profiling, vulnerability propagation analysis, and refactoring. Scalpel provides an interface for constructing call graphs; it is a wrapper around the call-graph module in PyCG.
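
To make the idea concrete before turning to Scalpel, here is a deliberately naive sketch that builds a caller-to-callees mapping for a toy program using only the standard-library `ast` module. The function name `naive_call_graph` and the toy source are hypothetical, introduced purely for illustration. The sketch only matches calls to simple names inside top-level functions; it is not how PyCG or Scalpel resolve calls, which also handle classes, methods, and imports.

```python
import ast

# Toy program to analyze (hypothetical example, not part of Scalpel).
SOURCE = '''
def helper():
    return 1

def compute():
    return helper() + helper()

def main():
    print(compute())
'''


def naive_call_graph(source):
    """Map each top-level function name to the set of simple names it calls."""
    graph = {}
    for func in ast.parse(source).body:
        if not isinstance(func, ast.FunctionDef):
            continue
        # Register the function even if it calls nothing, so leaves appear as nodes.
        callees = graph.setdefault(func.name, set())
        for node in ast.walk(func):
            # Only direct calls to a bare name, e.g. helper(); attribute calls are ignored.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                callees.add(node.func.id)
    return graph


toy_graph = naive_call_graph(SOURCE)
print(toy_graph)
```

Each key is a node and each set holds the outgoing edges, which is the same shape as the call graph Scalpel returns later in this tutorial.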



## Let's install Scalpel first

Run the following command in your virtual environment to install Scalpel.
```console
pip install python-scalpel
```

Let's now import required modules.

```python
import json
from pprint import pprint
from scalpel.call_graph.pycg import CallGraphGenerator, formats
```

We use the **CallGraphGenerator** class to generate the call graph. In the generated call graph, each node is a function, a class, or a module, and each edge represents a calling relationship between two nodes.

The example package has the following folder structure.

```
-example_pkg
    -main.py
    -sub_folder1
        -module1.py
        -module2.py
    -sub_folder2
```

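The **CallGraphGenerator** object analyzes the package and produces the call graph in a few method calls: `analyze()` runs the analysis and `output()` returns the result as a dictionary that maps each node to the set of nodes it calls.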

```python
cg_generator = CallGraphGenerator(["./cg_example_pkg/main.py"], "cg_example_pkg")
cg_generator.analyze()
cg = cg_generator.output()
print(cg)
```

```
{'main': {'sub_folder1.module2.Module2', 'sub_folder1.module1.Module1.add', 'sub_folder1.module2.Module2.minus', 'sub_folder1.module1.Module1'}, 'sub_folder1.module1.Module1': set(), 'sub_folder1.module1.Module1.add': set(), 'sub_folder1.module2.Module2': set(), 'sub_folder1.module2.Module2.minus': set()}
```
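
Because the result maps each caller to a set of callees, it can be traversed like any adjacency structure. As a minimal sketch, the snippet below copies the output shown above into a literal (so it runs without Scalpel installed) and collects everything reachable from `main` with a breadth-first search. The helper `reachable_from` is a hypothetical name introduced here, not part of Scalpel's API.

```python
from collections import deque

# Call graph copied from the printed output above: caller -> set of callees.
cg = {
    "main": {
        "sub_folder1.module2.Module2",
        "sub_folder1.module1.Module1.add",
        "sub_folder1.module2.Module2.minus",
        "sub_folder1.module1.Module1",
    },
    "sub_folder1.module1.Module1": set(),
    "sub_folder1.module1.Module1.add": set(),
    "sub_folder1.module2.Module2": set(),
    "sub_folder1.module2.Module2.minus": set(),
}


def reachable_from(graph, start):
    """Return every node reachable from `start` by following call edges (BFS)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Nodes missing from the graph are treated as having no callees.
        for callee in graph.get(node, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


reachable = reachable_from(cg, "main")
print(sorted(reachable))
```

In this small example the reachable set is simply the direct callees of `main`, since none of them call anything further; on a deeper graph the same function would follow transitive calls as well.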


The call graph generator can also return every function call as a caller-callee pair via **output_edges**.

```python
edges = cg_generator.output_edges()
edges
```

```
[['main', 'sub_folder1.module2.Module2'],
 ['main', 'sub_folder1.module1.Module1.add'],
 ['main', 'sub_folder1.module2.Module2.minus'],
 ['main', 'sub_folder1.module1.Module1']]
```
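
An edge list like this is a convenient starting point for visualization. The sketch below, which assumes nothing beyond the edge list printed above, turns it into Graphviz DOT text using plain string formatting. The function `edges_to_dot` is a hypothetical helper, not something Scalpel provides, and rendering the result requires Graphviz to be installed separately.

```python
# Edge list copied from the output above: [caller, callee] pairs.
edges = [
    ["main", "sub_folder1.module2.Module2"],
    ["main", "sub_folder1.module1.Module1.add"],
    ["main", "sub_folder1.module2.Module2.minus"],
    ["main", "sub_folder1.module1.Module1"],
]


def edges_to_dot(edge_list, name="call_graph"):
    """Render [caller, callee] pairs as a Graphviz DOT digraph."""
    lines = [f"digraph {name} {{"]
    for caller, callee in edge_list:
        # Quote node names because they contain dots.
        lines.append(f'    "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


dot_text = edges_to_dot(edges)
print(dot_text)
```

The resulting text can be saved to a `.dot` file and rendered with the `dot` command-line tool.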

We can also list the internal and external modules in the call graph. Internal modules are those defined within the package. Because **internal_mods** is a dictionary, we use **pprint** to print it in a more readable layout.

```python
internal_mods = cg_generator.output_internal_mods()
pprint(internal_mods)
```

```
{'main': {'filename': 'main.py',
          'methods': {'main': {'first': 1, 'last': 7, 'name': 'main'}}}}
```
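
The nested layout (module, then `filename` and `methods`, then per-method `first`, `last`, and `name`) can be flattened into rows for reporting. The following sketch works on a literal copy of the output above; `summarize_methods` is a hypothetical helper, and the interpretation of `first` and `last` as the first and last source lines of a method is an assumption based on the printed values.

```python
# Literal copy of the internal-module dictionary printed above.
internal_mods = {
    "main": {
        "filename": "main.py",
        "methods": {"main": {"first": 1, "last": 7, "name": "main"}},
    }
}


def summarize_methods(mods):
    """Flatten the module dictionary into (module, filename, method, line_count) rows."""
    rows = []
    for mod_name, info in mods.items():
        for method in info["methods"].values():
            first, last = method["first"], method["last"]
            # Guard the arithmetic in case an entry has no line information.
            length = None if first is None or last is None else last - first + 1
            rows.append((mod_name, info["filename"], method["name"], length))
    return rows


for row in summarize_methods(internal_mods):
    print(row)  # prints ('main', 'main.py', 'main', 7)
```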


Now let's look at the external modules in the source package. External modules are Python functions or modules that are called from the current package but defined outside it.

```python
external_mods = cg_generator.output_external_mods()
pprint(external_mods)
```

```
{'sub_folder1': {'filename': None,
                 'methods': {'sub_folder1': {'first': None,
                                             'last': None,
                                             'name': 'sub_folder1'},
                             'sub_folder1.module1.Module1': {'first': None,
                                                             'last': None,
                                                             'name': 'sub_folder1.module1.Module1'},
                             'sub_folder1.module1.Module1.add': {'first': None,
                                                                 'last': None,
                                                                 'name': 'sub_folder1.module1.Module1.add'},
                             'sub_folder1.module2.Module2': {'first': None,
                                                             'last': None,
                                                             'name': 'sub_folder1.module2.Module2'},
```
     

The generated call graph can also be formatted with **formats.Simple**, which replaces the sets of callees with lists so that the result can be stored as JSON.

```python
formatter = formats.Simple(cg_generator)
simple_cg = formatter.generate()  # generate once and reuse the result
print(simple_cg)

store_output = False  # set to True to write the call graph to disk
if store_output:
    with open("example_results.json", "w") as f:
        json.dump(simple_cg, f)
```

```
{'main': ['sub_folder1.module2.Module2', 'sub_folder1.module1.Module1.add', 'sub_folder1.module2.Module2.minus', 'sub_folder1.module1.Module1'], 'sub_folder1.module1.Module1': [], 'sub_folder1.module1.Module1.add': [], 'sub_folder1.module2.Module2': [], 'sub_folder1.module2.Module2.minus': []}
```
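
The list-based form matters because Python's `json` module cannot serialize sets. As a small sketch using a literal copy of the formatted output above, the snippet below serializes the graph, loads it back, and converts the callee lists to sets again for fast membership tests. The variable names are illustrative only.

```python
import json

# Literal copy of the formatted call graph printed above: caller -> list of callees.
simple_cg = {
    "main": [
        "sub_folder1.module2.Module2",
        "sub_folder1.module1.Module1.add",
        "sub_folder1.module2.Module2.minus",
        "sub_folder1.module1.Module1",
    ],
    "sub_folder1.module1.Module1": [],
    "sub_folder1.module1.Module1.add": [],
    "sub_folder1.module2.Module2": [],
    "sub_folder1.module2.Module2.minus": [],
}

# Round-trip through JSON text, as would happen when writing and re-reading a file.
serialized = json.dumps(simple_cg, indent=2, sort_keys=True)
restored = json.loads(serialized)

# Rebuild the set-based adjacency structure from the loaded lists.
cg_sets = {caller: set(callees) for caller, callees in restored.items()}
print("sub_folder1.module1.Module1.add" in cg_sets["main"])  # prints True
```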
