Relationship connectivity partitioning (RCP) is an edge-centric property graph partitioning approach whose objective function minimizes the number of distinct properties that cross partition boundaries. In distributed Cypher/Gremlin query evaluation, this approach can be used to avoid inter-partition joins for a wider set of workloads.
This section outlines the steps and methodologies involved in preprocessing the data.
This file splits the edge CSV files of the original data set into one txt file per relationship.
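
The splitting step could look roughly like the sketch below. It is only an illustration, not the script shipped in this repository: the function name `split_edges_by_relationship`, the header-less column layout (source, relationship, destination) and the `src dst` output format are all assumptions, and relationship names are assumed to be usable as file names.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_edges_by_relationship(csv_path, out_dir, src_col=0, rel_col=1, dst_col=2):
    """Write one txt file per relationship; each line holds 'src dst'.

    The column positions are assumptions about the input layout.
    Returns a dict mapping each relationship to its number of edges.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group (source, destination) pairs by relationship name.
    by_rel = defaultdict(list)
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            by_rel[row[rel_col]].append((row[src_col], row[dst_col]))

    # Emit one file per relationship, e.g. out_dir/knows.txt.
    for rel, pairs in by_rel.items():
        with open(out_dir / f"{rel}.txt", "w", encoding="utf-8") as out:
            for src, dst in pairs:
                out.write(f"{src} {dst}\n")

    return {rel: len(pairs) for rel, pairs in by_rel.items()}
```

For very large edge files, streaming rows directly into open file handles instead of buffering them in memory may be preferable.
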
This file processes the txt files obtained in the previous step into the corresponding set of LCCs. Nodes that are connected to one another by edges are grouped into the same LCC.
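
As a rough illustration of grouping interconnected nodes, the sketch below computes connected components with a union-find structure. It assumes that an LCC corresponds to a connected component of one relationship's edge list; the function name `connected_components` is hypothetical and the repository's own script may work differently.

```python
def connected_components(edges):
    """Group nodes into components: two nodes share a component
    if some path of edges connects them (union-find sketch)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        # First pass: locate the root of x's tree.
        root = x
        while parent[root] != root:
            root = parent[root]
        # Second pass: path compression, point every visited node at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    # Merge the components of the two endpoints of every edge.
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    # Collect nodes by their final root.
    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())
```

Each returned set would then be one candidate LCC for the relationship whose edge list was passed in.
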
This section provides an overview of the steps and methods involved in partitioning.
This file assigns the LCCs obtained in the previous step to different partitions, taking both cost and benefit into account.
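
The exact cost and benefit definitions are not spelled out here, so the following is only a hypothetical greedy sketch of what such a trade-off might look like. It assumes the benefit of a partition is the number of the LCC's nodes already stored there, and the cost is a load penalty proportional to the partition's current size; `assign_lccs`, `balance_weight` and this scoring rule are illustrative assumptions, not the method used in this repository.

```python
def assign_lccs(lccs, k, balance_weight=1.0):
    """Greedily assign each LCC (a collection of node ids) to one of k partitions.

    Hypothetical score: benefit (nodes already in the partition)
    minus cost (balance_weight * current partition size).
    Returns a dict mapping the LCC's index to its partition index.
    """
    node_sets = [set() for _ in range(k)]
    assignment = {}

    # Place large LCCs first so that small ones can fill the gaps.
    order = sorted(range(len(lccs)), key=lambda i: len(lccs[i]), reverse=True)
    for i in order:
        lcc = set(lccs[i])

        def score(p):
            benefit = len(lcc & node_sets[p])          # shared nodes
            cost = balance_weight * len(node_sets[p])  # load penalty
            return benefit - cost

        best = max(range(k), key=score)  # ties go to the lowest index
        assignment[i] = best
        node_sets[best] |= lcc

    return assignment
```

A larger `balance_weight` favours evenly sized partitions, while a smaller one favours co-locating LCCs that share nodes.
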
This file splits the full node table into one sub-node table per partition, based on the set of nodes assigned to each partition in the previous step.
This file splits the full edge table into one sub-edge table per partition, based on the set of LCCs assigned to each partition in the previous step.
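
The two table-splitting steps can be sketched in memory as follows. This is a minimal illustration under assumptions: rows are plain lists, the node id sits in column `id_col`, and each edge row comes with the id of the LCC it belongs to. The names `split_node_table`, `split_edge_table`, `edge_lcc` and `lcc_partition` are hypothetical.

```python
def split_node_table(node_rows, partition_nodes, id_col=0):
    """Return one sub-node table per partition.

    partition_nodes[p] is the set of node ids assigned to partition p;
    a node that appears in several sets is copied into each of them.
    """
    return [
        [row for row in node_rows if row[id_col] in nodes]
        for nodes in partition_nodes
    ]


def split_edge_table(edge_rows, edge_lcc, lcc_partition, k):
    """Return one sub-edge table per partition.

    edge_lcc[i] is the LCC id of edge_rows[i] (assumed to be known),
    and lcc_partition maps an LCC id to its partition index.
    """
    sub_tables = [[] for _ in range(k)]
    for row, lcc_id in zip(edge_rows, edge_lcc):
        sub_tables[lcc_partition[lcc_id]].append(row)
    return sub_tables
```

Because every edge belongs to exactly one LCC in this sketch, each edge row ends up in exactly one sub-edge table, whereas node rows may be replicated.
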
This file converts the sub-node and sub-edge tables obtained in the previous steps into CSV format.
The benchmark queries used in our experimental evaluation are in the queries folder.
If you encounter any problems, please email me at shimin22@hnu.edu.cn.