A Sampling-based Framework for Hypothesis Testing on Large Attributed Graphs

Python 3.9+

This repository contains the implementation and experimental data for the paper A Sampling-based Framework for Hypothesis Testing on Large Attributed Graphs, accepted to PVLDB 2024.

📄 Paper: Link to paper

The code includes:

  1. the sampling-based hypothesis testing framework
  2. the hypothesis-aware sampler PHASE and its optimized version, Opt-PHASE.

Install and Run

  1. Download the repository
git clone https://github.com/Carrieww/GraphHT.git

All graph data should be stored in the datasets folder. A graph.pkl for the MovieLens dataset is included there for illustration.
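To inspect the bundled graph before running anything, a minimal sketch like the following may help. It assumes only that graph.pkl is a standard Python pickle. The helper name `load_graph` and the example path are hypothetical, and the type of the stored object is not stated in this README.

```python
import pickle
from pathlib import Path


def load_graph(path):
    """Load a pickled graph object from disk (hypothetical helper).

    Unpickling needs the classes of the stored object to be importable,
    so the packages from requirements.txt should be installed first.
    Only unpickle files that come from a trusted source.
    """
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

For example, `graph = load_graph("datasets/graph.pkl")` followed by `print(type(graph))` would show what kind of object the file holds. Adjust the path to wherever graph.pkl sits inside the datasets folder.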

  2. Install required packages
pip install -r requirements.txt
  3. Run the framework

The preprocessed dataset is available on OneDrive: link.

You can either run:

python main.py --sampling_method "PHASE"

or specify the sampling method in run.sh and run

bash run.sh

in the terminal.
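If you would rather launch runs from Python than from the shell, the command above can be wrapped as in the sketch below. The helpers `build_command` and `run_methods` are hypothetical and not part of the repository. The only flag used is `--sampling_method`, as shown above. Which values besides "PHASE" are accepted is not stated here, so check main.py or config.py before passing others.

```python
import subprocess
import sys


def build_command(sampling_method):
    """Build the argv list for one run of main.py (hypothetical helper)."""
    # Uses the current interpreter so the run happens in the same environment.
    return [sys.executable, "main.py", "--sampling_method", sampling_method]


def run_methods(methods):
    """Run main.py once per sampling method, stopping at the first failure."""
    for method in methods:
        # check=True raises CalledProcessError if main.py exits with an error.
        subprocess.run(build_command(method), check=True)
```

Calling `run_methods(["PHASE"])` from the repository root is then equivalent to the single command above.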

To change the dataset, sampling budget, or hypothesis, edit config.py and then run either of the two commands above.

Output files

All output files are stored in the folder /result/one_sample_log_and_results_*. The two output files are

  1. *.txt: a tab-separated table storing the accuracy, time, p-value, and confidence interval results at the specified sampling budgets.
  2. *.log: logger information
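The tab-separated results file can be read back with the standard library, as in this minimal sketch. It assumes the first row of the .txt file is a header row, which is not confirmed here. The helper name `read_results` is hypothetical, and the actual column names are whatever the framework writes.

```python
import csv


def read_results(path):
    """Read a tab-separated results file into a list of dicts (hypothetical helper).

    Assumes the first row holds the column names. All values are returned
    as strings, so convert them to float where needed.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter="\t"))
```

Each returned dict maps a column name from the header row to the value in that row, so the results can be filtered or printed by sampling budget.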

To plot accuracy, time, p-value, and confidence interval against the sampling budget, edit the parameters in makePlot.py and run it. The plots are saved to the same folder as the .txt and .log files.

Citation

If you find this work useful in your research, please cite:

@article{wang2024sampling,
  title={A Sampling-Based Framework for Hypothesis Testing on Large Attributed Graphs},
  author={Wang, Yun and Kosyfaki, Chrysanthi and Amer-Yahia, Sihem and Cheng, Reynold},
  journal={Proceedings of the VLDB Endowment},
  volume={17},
  number={11},
  pages={3192--3200},
  year={2024},
  publisher={VLDB Endowment}
}

For questions or issues, please refer to the paper or contact the authors.
