Skip to content

Cecca/fair-clustering

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Fair k-center clustering

This repository contains several implementations of algorithms for fair k-center clustering. For the latest version visit the github repository. The algorithms are implemented mostly in Python, with performance-sensitive parts implemented in Rust.

Building the software

First, you need to install Rust. Then, you can use micromamba, you can setup the environment using the provided env.yaml file:

micromamba create -f env.yaml
micromamba activate fair-clustering

After that, you can compile the Rust extension used by the software using the following command:

maturing develop --release

Datasets

All datasets used in the experimental evaluation are downloaded and preprocessed when needed. See datasets.py for details.

If you want to download and preprocess all datasets in one go, you can use:

python datasets.py

Datasets are stored in HDF5 files with the following structure:

  • data contains the points of the dataset
  • data-PCA contains the PCA rotation of the dataset (used mainly for visualization)
  • colors is the vector of each point's colors

These data files can be processed using the very convenient h5py library in Python.

Running the experiments

The experiments reported in the paper are run using the code in the script experiments.py. You need a copy of Cplex to run the experiments, so that then you can run

python experiments.py ~/your/path/to/cplex $EXPERIMENT

Where $EXPERIMENT is one of

  • exhaustive
  • mapreduce
  • streaming

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published