This guide will get you started using the various commands for Data Gravity Insights (DGI). If you would like to contribute to the project you can Set up your Development Environment. Regular users can just follow the steps below:
If this is your first time using Tackle Data Gravity Insights you must first install some prerequisite software. Here are the instructions to install the Prerequisites
You can use the Docker image of dgi
(preferred) or you can install it locally.
We provide a Docker image with DGI installed for users that don't have a Python 3.9 environment in which to run DGI. Using the Docker image will allow you to run DGI anywhere you have Docker. Since you will need Docker to run Neo4j, this is gives you a compete "containerized" installation.
You can pull down the latest image from the GitHub Container Registry with the following command:
docker pull quay.io/konveyor/data-gravity-insights:latest
This will load the image into your local Docker image cache so that it is ready for any time you want to run it.
In order to have the Docker version behave like the local version you should create the following alias
in your .bashrc
or .zshrc
file (or what ever shell you are using).
alias dgi="docker run --rm -it --link neo4j -v $(pwd):/data quay.io/konveyor/data-gravity-insights:latest"
This command will remove the container once the command completes with: --rm
, create an interactive terminal with: -it
, link to the Neo4j container with: --link neo4j
, share your current working folder as /data
inside of the container with: -v $(pwd):/data
, and start the DGI container with: quay.io/konveyor/data-gravity-insights:latest
. The default is to return the help unless you specify parameters.
Note: The entry point for this image is the dgi
command so you only need to add parameters to run it.
There are two ways to install the dgi
command line interface:
You can install dgi
globally into your system packages as root with:
sudo pip install -U tackle-dgi
This will make the dgi
command globally available. You can then run it from anywhere on your computer.
If you do not want to install it system wide you can install dgi
locally with:
pip install -U tackle-dgi
This will install the dgi
command locally under your home folder in a hidden folder called: ~/.local/bin
. If you choose this approach, you must add this folder to your PATH
with:
export PATH=$PATH:$HOME/.local/bin
Regardless of how you installed dgi
(using either the Docker image or installing locally) you will need an instance of Neo4j to store the graphs that dgi
creates. You can start one up in a container using docker
or podman
(to use podman
just substitute podman
for docker
in the command below).
docker run -d --name neo4j \
-p 7474:7474 \
-p 7687:7687 \
-v neo4j:/data \
-e NEO4J_AUTH="neo4j/konveyor" \
docker.io/neo4j:4.4.14
You must set an environment variable to let dgi
know where to find this neo4j container.
export NEO4J_BOLT_URL="neo4j://neo4j:konveyor@localhost:7687"
You can now use the dgi
command to load information about your application into the graph database. We start with dgi --help
. This should produce:
Usage: dgi [OPTIONS] COMMAND [ARGS]...
Tackle Data Gravity Insights
Options:
-n, --neo4j-bolt TEXT Neo4j Bolt URL
-q, --quiet Be more quiet
-c, --clear Clear graph before loading [default: True]
--help Show this message and exit.
Commands:
c2g This command loads Code dependencies into the graph
s2g This command parses SQL schema DDL into a graph
tx2g This command loads DiVA database transactions into a graph
This is a demonstration of the usage of DGI. For this, we'll use daytrader7 as an example. Feel free to follow along with your own application and your personal directories. But, keep track of the directories where the application source code and the built jar/war/ear reside and replace them appropriately below.
Let's download a copy of our sample application and build it. If you already have your own application ear file you can skip this step.
-
First let's create a folder called
demo
and change into it to extract the code to:mkdir demo cd demo
-
If you have your own Java application, you can skip this step. If you want to use the DayTrader7 demo, you must load and extract the demo DayTrader7 application using
wget
(if you don't havewget
you can install it here: install wget):wget -c https://github.com/WASdev/sample.daytrader7/archive/refs/tags/v1.4.tar.gz -O - | tar -xvz -C .
-
Now build the application
docker run --rm -v $(pwd)/sample.daytrader7-1.4:/build docker.io/maven:3.8.4-openjdk-8-slim mvn --file=/build/pom.xml install
This will create an EAR file called
daytrader-ee7-1.0-SNAPSHOT.ear
insample.daytrader7-1.4/daytrader-ee7/target
directory.
In this step, we'll run code2graph to populate the graph with various static code interaction features pertaining to object/dataflow dependencies and their respective lifecycle information.
Code2graph uses the output from a tool called DOOP.
-
First let's prepare an input and output folders for
doop
calleddoop-input
anddoop-output
respectively and copy the generateddaytrader-ee7-1.0-SNAPSHOT.ear
file in thedoop-input
folder. You should already be in thedemo
folder from the previous steps before making these directories:mkdir doop-input mkdir doop-output cp sample.daytrader7-1.4/daytrader-ee7/target/daytrader-ee7-1.0-SNAPSHOT.ear doop-input
Note: if you are using your own application ear file, copy it into the
doop-input
folderJust to double check, you should see the DayTrader ear or your ear file in the
doop-input
folder:$ ls doop-input/ daytrader-ee7-1.0-SNAPSHOT.ear
-
Next, we'll run DOOP to process the compiled
*.jar
files.For ease of use, DOOP has been pre-compiled and hosted as a docker image at
quay.io/rkrsn/doop-main
. We'll use that for this demo.docker run -it --rm -v $(pwd)/doop-input:/root/doop-data/input -v $(pwd)/doop-output:/root/doop-data/output quay.io/rkrsn/doop-main:latest rundoop
Note: Running DOOP may roughly takes 5-6 mins
Let's review what we have done so far:
-
We used the
doop-input/
folder to store the the compiled jars, wars, and ears -
We used a new folder
doop-output
to save all the information (formatted as *.csv files) gathered from DOOP.
-
-
After gathering the data with DOOP, we'll now run code2graph to synthesize DOOP output into a graph stored on neo4j.
The syntax for code2graph can be see with
dgi c2g --help
. Below:$ dgi c2g Usage: dgi c2g [OPTIONS] This command loads Code dependencies into the graph Options: -i, --input DIRECTORY DOOP output facts directory. [required] -a, --abstraction [class|method|full] The level of abstraction to use when building the graph [default: class] --help Show this message and exit.
We'll run code2graph by pointing it to the doop generated facts from the step above:
dgi --clear c2g --abstraction=class --input=doop-output
Note that we could have passed in [class|method|full] as the abstraction. If you decide to run with the
method
orfull
level of abstraction, make sure you use the same abstraction level when running withtx2g
as well.After successful completion, you should see:
$ dgi --clear c2g --abstraction=class --input=doop-output code2graph generator started... Verbose mode: ON Building Graph... [INFO] Populating heap carried dependencies edges 100%|█████████████████████| 7138/7138 [01:37<00:00, 72.92it/s] [INFO] Populating dataflow edges 100%|█████████████████████| 5022/5022 [01:31<00:00, 54.99it/s] [INFO] Populating call-return dependencies edges 100%|█████████████████████| 7052/7052 [02:26<00:00, 48.30it/s] [INFO] Populating entrypoints code2graph build complete
To run scheme to graph, use dgi [OPTIONS] s2g --input=<path/to/ddl>
. For this demo, we have a sample DDL for daytrader at demo/schema2graph-samples/daytrader-orcale.ddl
, let us use that:
dgi --clear s2g --input=./sample.daytrader7-1.4/daytrader-ee7-web/src/main/webapp/dbscripts/oracle/Table.ddl
This should give us:
Clearing graph...
Building Graph...
Processing schema tables:
100%|████████████████████| 12/12 [00:00<00:00, 69.40it/s]
0it [00:00, ?it/s]
Processing foreign keys:
Graph build complete
Here we'll first use Tackle-DiVA to infer transaction traces from the source code. DiVA is available as a docker image, so we just need to run DiVA by pointing to the source code directory and the desired output directory (for which we'll user the demo folder again).
-
Run the following command to get the transaction traces from DiVA:
mkdir tx2graph-output docker run --rm -v $(pwd)/sample.daytrader7-1.4:/app -v $(pwd)/tx2graph-output:/diva-distribution/output quay.io/konveyor/tackle-diva
This should output 6 files in the
./tx2graph-output
folder. One of these will be a json file calledtransaction.json
with all the transactions. -
We'll now run DGI's
tx2g
command to populate the graph with SQL tables, transaction data, and their relationships to the code.dgi --clear tx2g --input=./tx2graph-output/transaction.json
After a successful run, you'll see:
Verbose mode: ON [INFO] Clear flag detected... Deleting pre-existing SQLTable nodes. Building Graph... [INFO] Populating transactions 100%|████████████████████| 158/158 [00:01<00:00, 125.73it/s] Graph build complete
We'll save the graph generated so far locally for further analysis. This enables us to use a free version of Neo4J Bloom to interact with the graph.
-
First we stop the neo4j container because the data can't be backup while Neo4J is running.
docker stop neo4j
-
Then, we'll use
neo4j-admin
command to dump the DB.docker run --rm -v neo4j:/data -v $(pwd):/var/dump neo4j bin/neo4j-admin dump --to=/var/dump/DGI.dump
You'll now find a
DGI.dump
file, which has the entire DB with code2graph, schema2graph, and tx2graph in your current folder. -
If you want to continue to use
dgi
with Neo4J don't forget to restart Neo4J:docker start neo4j
In order to explore the neo4j graph, visit http://localhost:7474/browser/. Then,
-
Under connect URL, select
neo4j://
and enter:localhost:7687
-
Under username, enter:
neo4j
-
Under password, enter:
konveyor
This should bring you to the browser page where you can explore the DGI graph.