Benchmark for FastAMI - A Monte Carlo Approach to the Adjustment for Chance in Clustering Comparison Metrics

This repository contains the research code for our paper "FastAMI - A Monte Carlo Approach to the Adjustment for Chance in Clustering Comparison Metrics", which will be presented at AAAI-23 in February 2023. A standalone version of FastAMI for easier use in other projects is available at https://github.com/mad-lab-fau/fastami and can be installed from PyPI via pip install fastami. This repository is for archival purposes only.

This benchmark version compares our implementation with the AMI in scikit-learn, the pairwise AMI [3], and the SMI [1]. It also contains a preprocessed version of the Benchmark Suite for Clustering Algorithms - Version 1 [2].
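
The general idea of adjusting for chance with Monte Carlo sampling can be illustrated with a deliberately naive sketch. It assumes one common AMI definition, (MI - EMI) / (mean(H(U), H(V)) - EMI), and estimates the expected mutual information (EMI) by averaging the MI over random permutations of one labeling. This is not the sampling scheme from the paper and not the API of the fastami package; the function names entropy, mutual_information, and monte_carlo_ami are made up for this illustration.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(u, v):
    """Mutual information (in nats) between two labelings of the same items."""
    n = len(u)
    count_u, count_v = Counter(u), Counter(v)
    joint = Counter(zip(u, v))  # non-zero contingency table entries
    return sum(
        (n_ij / n) * math.log(n * n_ij / (count_u[a] * count_v[b]))
        for (a, b), n_ij in joint.items()
    )


def monte_carlo_ami(u, v, n_samples=2000, seed=0):
    """Naive AMI estimate: the EMI is approximated by permutation sampling.

    Illustrative only; this is NOT the FastAMI algorithm.
    """
    rng = random.Random(seed)
    shuffled = list(v)
    total = 0.0
    for _ in range(n_samples):
        # Shuffling keeps all cluster sizes fixed (permutation model).
        rng.shuffle(shuffled)
        total += mutual_information(u, shuffled)
    emi = total / n_samples  # Monte Carlo estimate of the expected MI
    mi = mutual_information(u, v)
    # Arithmetic-mean normalization, assumed for this sketch.
    denominator = 0.5 * (entropy(u) + entropy(v)) - emi
    if abs(denominator) < 1e-12:
        return 0.0  # degenerate case; convention chosen for this sketch
    return (mi - emi) / denominator
```

With this sketch, two identical labelings score close to 1, and two unrelated labelings score close to 0. Each sample requires a full pass over the data, so the estimator is only meant to convey the principle.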

Setup

To reproduce the results in our paper, you must first install Python 3.10.4 and the required dependencies:

pip install -r requirements.txt

For the direct SMI sampling, we use C code that must be compiled by executing:

fastami/rcont2/build.sh

For the clustered version of the Benchmark Suite for Clustering Algorithms – Version 1 [2], please unpack gagolewski.zip into

/data/gagolewski
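
If no unzip tool is at hand, the archive can also be unpacked with Python's standard library. The following is a minimal sketch; the helper name unpack_archive is hypothetical, and both paths are passed in by the caller because this README does not state where gagolewski.zip is stored or whether the target path is absolute or relative to the repository root.

```python
import zipfile
from pathlib import Path


def unpack_archive(archive_path, target_dir):
    """Extract a zip archive into target_dir, creating the directory if needed.

    Returns the sorted names of the top-level entries in target_dir.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    return sorted(entry.name for entry in target.iterdir())
```

A call could look like unpack_archive("gagolewski.zip", "data/gagolewski"), with both arguments adjusted to the actual locations on your machine.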

Running the Benchmarks

To run the synthetic EMI and SMI benchmarks, execute:

python synthetic_benchmark.py

The benchmarks on real datasets can be executed as follows:

python gagolewski_benchmark.py

and

python snap_benchmark.py

References

[1] S. Romano, J. Bailey, V. Nguyen, and K. Verspoor, “Standardized Mutual Information for Clustering Comparisons: One Step Further in Adjustment for Chance,” in Proceedings of the 31st International Conference on Machine Learning, Jun. 2014, pp. 1143–1151. Accessed: Dec. 08, 2021. [Online]. Available: https://proceedings.mlr.press/v32/romano14.html

[2] M. Gagolewski et al., “Benchmark Suite for Clustering Algorithms – Version 1.” 2020. doi: 10.5281/zenodo.3815066.

[3] D. Lazarenko and T. Bonald, “Pairwise Adjusted Mutual Information,” arXiv:2103.12641 [cs], Mar. 2021, Accessed: Sep. 16, 2021. [Online]. Available: http://arxiv.org/abs/2103.12641
