PandaSQL

This project implements and optimizes a randomized triangle enumeration algorithm using SQL queries.
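As an illustration of triangle enumeration expressed in SQL, here is a minimal sketch of a textbook three-way self-join over an edge table. It is not the project's actual query: the table name `E`, the column names `i` and `j`, and the function `enumerate_triangles` are assumptions made for this example, and it uses Python's built-in SQLite rather than Vertica so that it can run anywhere.

```python
import sqlite3


def enumerate_triangles(edges):
    """Return each triangle of an undirected graph once, as (v1, v2, v3) with v1 < v2 < v3."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical edge table; the real schema used by the scripts may differ.
    conn.execute("CREATE TABLE E (i INTEGER, j INTEGER)")
    # Keep each undirected edge once, oriented so that i < j, and drop self-loops.
    oriented = {(min(u, v), max(u, v)) for u, v in edges if u != v}
    conn.executemany("INSERT INTO E VALUES (?, ?)", sorted(oriented))
    rows = conn.execute(
        """
        SELECT e1.i, e1.j, e2.j
        FROM E e1
        JOIN E e2 ON e1.j = e2.i                   -- path a -> b -> c
        JOIN E e3 ON e3.i = e1.i AND e3.j = e2.j   -- closing edge a -> c
        ORDER BY 1, 2, 3
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    sample_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (2, 1)]
    for triangle in enumerate_triangles(sample_edges):
        print(*triangle)
```

Orienting every edge so that `i < j` is what makes each triangle appear exactly once instead of once per vertex ordering.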

It contains the following elements:

  • CLI version (cli directory), which contains:
    • Vertica_codes: two Python scripts:
      • Standard_TE.py enumerates triangles using the standard algorithm query,
      • Randomized_TE.py enumerates triangles using the optimized randomized algorithm queries.
    • Triplet: color triplet files, one per cluster size (8, 27, 64, …)
  • GUI version (gui directory) with an HTML page interface
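The cluster sizes listed for the Triplet folder (8, 27, 64) are all perfect cubes, which is consistent with assigning one ordered triplet of k colors to each of k³ machines. The sketch below generates such an assignment. The actual layout of the triplet files is not documented here, so the row shape `(machine, color1, color2, color3)`, the 1-based numbering, and the function name `color_triplets` are all assumptions.

```python
from itertools import product


def color_triplets(cluster_size):
    """Return one (machine, color1, color2, color3) row per machine.

    Assumes cluster_size is a perfect cube k**3 and that colors and
    machines are numbered from 1 (both are assumptions of this sketch).
    """
    k = round(cluster_size ** (1 / 3))
    if k ** 3 != cluster_size:
        raise ValueError(f"{cluster_size} is not a perfect cube")
    colors = range(1, k + 1)
    return [
        (machine, c1, c2, c3)
        for machine, (c1, c2, c3) in enumerate(product(colors, repeat=3), start=1)
    ]


if __name__ == "__main__":
    for row in color_triplets(8):
        print(*row)
```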

Build and install

To use a script, define your database connection statement in the script, then run it with the corresponding command line:

  • For the standard algorithm query:
$ python Vertica_codes/Standard_TE.py path_to_your_dataset path_to_output_directory type[directed/undirected]
  • For the randomized algorithm query:
$ python Vertica_codes/Randomized_TE.py path_to_your_dataset triplet/triplet8.txt path_to_output_directory type[directed/undirected]

In the command lines above, replace type[directed/undirected] with either directed or undirected, matching the type of the chosen dataset; do not type the keyword type itself. Here is an example:

$ python Vertica_codes/Standard_TE.py Datasets/Real/WikiTalk.txt Results_TE/ directed
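The positional arguments of the standard script can be pictured with the following sketch, which validates the graph type with argparse. This is only an illustration of the command-line shape shown above: the real script may read its arguments differently, and the argument names `dataset`, `output_dir`, and `graph_type` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring: Standard_TE.py <dataset> <output_dir> <directed|undirected>."""
    parser = argparse.ArgumentParser(
        description="Triangle enumeration, standard algorithm (illustrative sketch)."
    )
    parser.add_argument("dataset", help="path to the edge-list dataset")
    parser.add_argument("output_dir", help="directory where results are written")
    # Restricting choices rejects anything other than the two documented values.
    parser.add_argument("graph_type", choices=["directed", "undirected"])
    return parser


# Parse the example command line from this README instead of sys.argv.
args = build_parser().parse_args(
    ["Datasets/Real/WikiTalk.txt", "Results_TE/", "directed"]
)
print(args.dataset, args.output_dir, args.graph_type)
```

With `choices`, a mistyped value such as type=directed is rejected with a usage message rather than being passed on to the query.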

To use the graphical interface, update the file config.py in the PandaSQL directory with your database connection statement, then use the following command lines to start it:

$ cd path/to/PandaSQL_GUI
$ python server.py

To use PandaSQL, open a browser and go to: 127.0.0.1:5000

How to use

Please refer to this video for a live demonstration.

PandaSQL Demonstration

Results

The CLI version of PandaSQL outputs the results of the standard algorithm in the following format: (vertex1,vertex2,vertex3)

Example of output:

Vertex1 vertex2 vertex3
1 2 5
1 2 8
1 3 7
... ... ...
185 200 305

The CLI version of PandaSQL outputs the results of the randomized algorithm in the following format: (machine,vertex1,vertex2,vertex3)

Example of output:

machine vertex1 vertex2 vertex3
1 1 2 3
1 1 2 5
1 1 3 7
... ... ... ...
8 160 59 365
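Because the randomized output records a machine for each triangle, one simple use of it is to count triangles per machine. The sketch below does that for the rows shown above. It assumes fields are separated by whitespace or commas and that non-numeric lines (a header or elided rows) can be skipped; the function name `triangles_per_machine` is hypothetical.

```python
from collections import Counter


def triangles_per_machine(lines):
    """Count output rows per machine, skipping headers and non-numeric lines."""
    counts = Counter()
    for line in lines:
        # Accept either whitespace- or comma-separated fields (an assumption).
        fields = line.replace(",", " ").split()
        if len(fields) != 4 or not all(f.isdigit() for f in fields):
            continue
        counts[int(fields[0])] += 1
    return counts


sample = [
    "machine vertex1 vertex2 vertex3",
    "1 1 2 3",
    "1 1 2 5",
    "1 1 3 7",
    "... ... ... ...",
    "8 160 59 365",
]
for machine, count in sorted(triangles_per_machine(sample).items()):
    print(machine, count)
```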

Publication

  • Abir Farouzi, Ladjel Bellatreche, Carlos Ordonez, Gopal Pandurangan, Mimoun Malki. A Scalable Randomized Algorithm for Triangle Enumeration on Graphs Based on SQL Queries. DaWaK Conference, 2020.
