We propose GraspDB, which, to the best of our knowledge, is the first black-box approach for identifying bugs related to write operations in graph database systems.
Requirements:
- Java 11
- Maven (version 4.0.0)
- The graph database engines that you want to test (currently supported: Neo4j v5.6.0, RedisGraph v2.10.9, Memgraph v2.7.0, and AgensGraph v2.13.1 and above)
All experiments were conducted on a computer with an Intel i5-8400 CPU, 16 GB of memory, and Windows 11. The requirements above must be met before running GraspDB.
- src
- main/java/org.example.GraspDB
- common: infrastructure
- CypherTransform: the mutator that GraspDB uses to generate mutated queries
- parsercypher: the Cypher parser generated by ANTLR
- cypher
- ast/standard_ast: implementation of the AST structure
- algorithm: algorithm used by GraspDB
- gen: generators for queries, graphs, patterns, expressions
- condition
- GuidedConditionGenerator.java: the condition generator for GraspDB
- graph
- SlidingGraphGenerator.java: the graph generator for GraspDB
- expr
- RandomExpressionGenerator.java: the expression generator for GraspDB
- condition
- oracle: oracles
- DifferentialNonEmptyBranchOracle.java: the oracle used by GraspDB
- neo4j: support for the Neo4j database (schema, connection, options)
- redisGraph: support for the RedisGraph database (schema, connection, options)
- memGraph: support for the Memgraph database (schema, connection, options)
- agensGraph: support for the AgensGraph database (schema, connection, options)
- main/java/org.example.GraspDB
- out: the executable jar file GraspDB.jar
Note that two instances of each target database must be created for testing, because our test case pairs involve write operations, which may change the database state. For example, we can create two instances of Neo4j version 5.6.0 by running the following commands at the command line:
docker run --restart always --name neo4j_1 -p 7473:7473 -p 7474:7474 -p 7687:7687 -d neo4j:5.6.0
docker run --restart always --name neo4j_2 -p 7475:7473 -p 7476:7474 -p 7688:7687 -d neo4j:5.6.0
The "--restart always" parameter restarts the container after a database crash, so that the crash does not stop the test. The "--name" parameter specifies the name of the container. The "-p" parameter specifies the port mapping. The "-d" parameter runs the container in the background. Finally, "neo4j:5.6.0" specifies the name and version of the Docker image.
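The port layout of the two commands above can also be generated instead of typed by hand. The following is a minimal sketch, assuming a hypothetical helper named neo4j_run_cmd (it is not part of GraspDB). It only prints the command, so the result can be inspected before any container is started.

```shell
#!/bin/sh
# neo4j_run_cmd is a hypothetical helper, not part of GraspDB.
# It prints (does not run) the docker command for Neo4j instance IDX (1 or 2).
# The port layout follows the two commands above: the second instance shifts
# the 7473/7474 host ports by 2 and the Bolt host port (7687) by 1.
neo4j_run_cmd() {
    idx=$1
    version=$2
    https_port=$((7473 + 2 * (idx - 1)))
    http_port=$((7474 + 2 * (idx - 1)))
    bolt_port=$((7687 + (idx - 1)))
    echo "docker run --restart always --name neo4j_${idx} -p ${https_port}:7473 -p ${http_port}:7474 -p ${bolt_port}:7687 -d neo4j:${version}"
}

neo4j_run_cmd 1 5.6.0
neo4j_run_cmd 2 5.6.0
```

Piping a printed line to sh, for example `neo4j_run_cmd 1 5.6.0 | sh`, would start the corresponding container.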
For example, if you want to test Neo4j as above, the configuration for both database instances is specified in the config.json file:
{
"neo4j@first":{
"port": 7687,
"host": "localhost",
"username": "neo4j",
"password": "GraspDB"
},
"neo4j@second":{
"port": 7688,
"host": "localhost",
"username": "neo4j",
"password": "GraspDB"
}
}
The config.json file records the information needed to connect to and access the databases. A key such as "neo4j@first" tells GraspDB which database instance the entry describes. The "host" and "port" fields give the host and port number on which the database service is running. The "username" and "password" fields specify the credentials for the connection; some databases, such as Memgraph, do not require authentication.
Note that you must change the password before the first connection to Neo4j.
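Because every target database needs both an "@first" and an "@second" entry, a quick check of config.json before a long run can catch a missing entry early. The following is a minimal sketch, assuming a hypothetical helper named has_both_instances (it is not part of GraspDB). It performs a plain text search rather than a JSON parse, so it does not validate ports or credentials.

```shell
#!/bin/sh
# has_both_instances is a hypothetical helper, not part of GraspDB.
# It returns 0 if FILE mentions both "<db>@first" and "<db>@second".
# This is a plain text search, not a JSON parse, so it only catches missing entries.
has_both_instances() {
    file=$1
    db=$2
    grep -q "\"${db}@first\"" "$file" && grep -q "\"${db}@second\"" "$file"
}

# Example: check the Neo4j entries if a config.json exists in the current directory.
if [ -f config.json ]; then
    if has_both_instances config.json neo4j; then
        echo "config.json has both neo4j entries"
    else
        echo "config.json is missing a neo4j entry" >&2
    fi
fi
```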
Generally GraspDB can be configured and executed using the following command:
java -jar GraspDB.jar --[database_option1] --[database_option2] composite
The supported database options are listed below (the relation-removed and graph-state-oracle options are used for RQ2 in our paper):
--num-tries <num-tries> // the number of graphs to generate
--num-queries <num-queries> // the number of queries generated for each graph
--database-instance <database name> // the name of database instance, such as neo4j, memgraph, redisgraph
--relation-removed <mutation rule> // which mutation rule to remove: 1 for AWC, 2 for MWC, 3 for MRC, 0 for none
--graph-state-oracle <check or not> // whether to check the graph state as an oracle: 0 for true (check), 1 for false (do not check)
For example, we can run the following command to start testing Neo4j:
java -jar GraspDB.jar --num-tries 5000 --num-queries 100 --database-instance neo4j composite
Testing then begins: GraspDB generates 5000 graphs and, for each graph, 100 pairs of test cases.
Potential bugs are recorded in GraspDB/log.txt. All graph database instances and test case pairs are recorded in the logs directory for reproduction. Note that GraspDB/log.txt must be removed before the next round of testing begins. Inspect the log information to reproduce and validate the reported bugs.
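If the reports from a previous round should be kept, the log can be moved aside instead of deleted. The following is a minimal sketch, assuming a hypothetical helper named archive_log (it is not part of GraspDB). It renames the file with a timestamp suffix, which also clears GraspDB/log.txt for the next round.

```shell
#!/bin/sh
# archive_log is a hypothetical helper, not part of GraspDB.
# It renames LOG (e.g. GraspDB/log.txt) to LOG with a timestamp suffix,
# e.g. GraspDB/log-20240131-120000.txt, so the original path is free again.
archive_log() {
    log=$1
    [ -f "$log" ] || return 0   # nothing to archive before the first round
    mv "$log" "${log%.txt}-$(date +%Y%m%d-%H%M%S).txt"
}

archive_log GraspDB/log.txt
```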
The list of databases currently supported by GraspDB is as follows:
neo4j
redisgraph
memgraph
agensgraph
Testing the other databases is similar to testing Neo4j.
Step1: Run the docker images of target database(Neo4j, Memgraph, Redisgraph, Agensgraph) with target versions. (all commands are listed here)
docker run --restart always --name neo4j_1 -p 7473:7473 -p 7474:7474 -p 7687:7687 -d neo4j:[version]
docker run --restart always --name neo4j_2 -p 7475:7473 -p 7476:7474 -p 7688:7687 -d neo4j:[version]
docker run --restart always --name mem_1 -p 7689:7687 -d memgraph:[version]
docker run --restart always --name mem_2 -p 7690:7687 -d memgraph:[version]
docker run --restart always --name redis_1 -p 6379:6379 -d redisgraph:[version]
docker run --restart always --name redis_2 -p 6380:6379 -d redisgraph:[version]
docker run --restart always --name agens_1 -p 5432:5432 -d agensgraph:[version]
docker run --restart always --name agens_2 -p 5433:5432 -d agensgraph:[version]
{
"neo4j@first": {
"port": 7687,
"host": "localhost",
"username": "neo4j",
"password": "GraspDB"
},
"neo4j@second":{
"port": 7688,
"host": "localhost",
"username": "neo4j",
"password": "GraspDB"
},
"redisgraph@first":{
"port": 6379,
"host": "localhost",
"username": "redis",
"password": "GraspDB"
},
"redisgraph@second":{
"port": 6380,
"host": "localhost",
"username": "redis",
"password": "GraspDB"
},
"memgraph@first":{
"port": 7689,
"host": "localhost",
"username": "memgraph",
"password": "GraspDB"
},
"memgraph@second":{
"port": 7690,
"host": "localhost",
"username": "memgraph",
"password": "GraspDB"
},
"agensgraph@first": {
"port": 5432,
"host": "localhost",
"username": "postgres",
"password": "GraspDB"
},
"agensgraph@second":{
"port": 5433,
"host": "localhost",
"username": "postgres",
"password": "GraspDB"
}
}
We ran our method for 6 months and reported the detected bugs. During these 6 months, we periodically ran the testing process, each time generating 5000 graph database instances and 100 query pairs for each graph database instance.
java -jar GraspDB.jar --num-tries 5000 --num-queries 100 --database-instance neo4j composite
java -jar GraspDB.jar --num-tries 5000 --num-queries 100 --database-instance memgraph composite
java -jar GraspDB.jar --num-tries 5000 --num-queries 100 --database-instance redisgraph composite
java -jar GraspDB.jar --num-tries 5000 --num-queries 100 --database-instance agensgraph composite
To validate the contribution of the different components of GraspDB to bug detection, we can remove each of them by controlling the parameters in Step3. The other replication steps are the same as those of RQ1.
Run the following command to run the full GraspDB (this corresponds to row 1 of Table 6 in our paper):
java -jar GraspDB.jar --num-tries 5000 --num-queries 100 neo4j composite
Run the following command to remove the graph state oracle (this corresponds to row 2 of Table 6 in our paper):
java -jar GraspDB.jar --num-tries 5000 --num-queries 100 --graph-state-oracle 1 neo4j composite
We can also remove other components by using the relation-removed option (these commands correspond to rows 3, 4 and 5 of Table 6 in our paper):
java -jar GraspDB.jar --num-tries 5000 --num-queries 100 --relation-removed 1 neo4j composite
java -jar GraspDB.jar --num-tries 5000 --num-queries 100 --relation-removed 2 neo4j composite
java -jar GraspDB.jar --num-tries 5000 --num-queries 100 --relation-removed 3 neo4j composite
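The five ablation configurations above can be collected in one place. The following is a minimal sketch, assuming a hypothetical helper named ablation_cmd (it is not part of GraspDB) and assuming that rows 3, 4 and 5 of Table 6 follow the order of the three relation-removed commands above. It only prints each command, copied from the commands listed above, so that the runs can be reviewed or logged before they are started.

```shell
#!/bin/sh
# ablation_cmd is a hypothetical helper, not part of GraspDB.
# It prints (does not run) the command for a given row of Table 6.
# Assumption: rows 3, 4 and 5 map to --relation-removed 1, 2 and 3 in that order.
ablation_cmd() {
    base="java -jar GraspDB.jar --num-tries 5000 --num-queries 100"
    case $1 in
        1) extra="" ;;                          # full GraspDB
        2) extra=" --graph-state-oracle 1" ;;   # without the graph state oracle
        3) extra=" --relation-removed 1" ;;     # without AWC
        4) extra=" --relation-removed 2" ;;     # without MWC
        5) extra=" --relation-removed 3" ;;     # without MRC
        *) echo "unknown row: $1" >&2; return 1 ;;
    esac
    echo "${base}${extra} neo4j composite"
}

for row in 1 2 3 4 5; do
    ablation_cmd "$row"
done
```

Since GraspDB/log.txt has to be removed between rounds, each printed command is best run as a separate round.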
For the comparison with baselines, run GraspDB following the steps for replicating RQ1. Run GDsmith and GraphGenie by following the instructions on their GitHub pages. GDsmith is available at https://github.com/ddaa2000/GDsmith, and GraphGenie at https://github.com/YuanchengJiang/GraphGenie.
The links to the issues of the studied bugs can be found in the corresponding section (Section 5.5) of our preprint paper, where detailed descriptions of the bugs and the developers' responses are available. We also list the issue IDs of all bugs detected by GraspDB below.
The following table shows the number of bugs detected by GraspDB.
Subject | Detected | Writing-related | Fixed | Confirmed but not fixed | Reported but not confirmed | Crash/Error | Logic |
---|---|---|---|---|---|---|---|
Neo4j | 35 | 18 | 30 | 3 | 2 | 29 | 4 |
RedisGraph | 15 | 6 | 6 | 6 | 3 | 8 | 4 |
MemGraph | 17 | 7 | 7 | 6 | 4 | 8 | 5 |
AgensGraph | 10 | 0 | 0 | 0 | 10 | 0 | 0 |
SUM | 77 | 31 | 43 | 15 | 19 | 45 | 13 |
The following table shows the IDs of the 77 issues detected by GraspDB on the four GDBMSs.
Subject | Detected | Issue ID |
---|---|---|
Neo4j | 35 | #13317,#13318,#13140,#13159,#13188,#13204,#13221,#13222,#13302,#13047,#13104,#13085,#13245,#13246,#13113,#13162,#13240,#13299,#13321,#13111,#13173,#13125,#13305,#13146,#13171,#13160,#13226,#13185,#13232,#13259,#13258,#13250-#13253 |
RedisGraph | 15 | #3030,#3205,#3211,#3207,#3032,#3027,#3191,#3193,#3203,#3213,#3206,#3210,#3033,#3113,#3013 |
MemGraph | 17 | #878,#872,#1309,#1328,#1330,#861,#1296,#1297,#1307,#1333,#1331,#1329,#1293,#1294,#1298,#874,#1376 |
AgensGraph | 10 | #628,#629,#631-#638 |