Clustering benchmarks

Datasets

This project contains a collection of labeled clustering problems found in the literature. Most of the datasets were artificially created.

The benchmark includes:

Artificial data

2d-10c 2d-20c-no0 2d-3c-no123 2d-4c-no4 2d-4c-no9 2d-4c 2sp2glob 3-spiral 3MC D31 DS577 DS850 R15 aggregation atom banana birch-rg1 birch-rg2 birch-rg3 chainlink cluto-t4.8k cluto-t5.8k cluto-t7.10k cluto-t8.8k complex8 complex9 compound cure-t0-2000n-2D cure-t1-2000n-2D cure-t2-4k curves1 curves2 dartboard1 dartboard2 dense-disk-3000 dense-disk-5000 diamond9 disk-1000n disk-3000n disk-4000n disk-4500n disk-4600n disk-5000n disk-6000n donut1 donut2 donut3 donutcurves ds2c2sc13 ds3c3sc6 ds4c2sc8 elliptical_10_2 elly-2d10c13s engytime flame fourty golfball hepta insect jain long1 long2 long3 longsquare lsun mopsi-finland mopsi-joensuu pathbased rings s-set1 s-set2 s-set3 s-set4 sizes1 sizes2 sizes3 sizes4 sizes5 smile1 smile2 smile3 spherical_4_3 spherical_5_2 spherical_6_2 spiral spiralsquare square1 square2 square3 square4 square5 st900 target tetra triangle1 triangle2 twenty twodiamonds wingnut xclara zelnik1 zelnik2 zelnik3 zelnik4 zelnik5 zelnik6

Experiments

This project contains a set of benchmarks of clustering methods on various datasets. The project depends on the Clueminer project.

To run the benchmarks, compile the project and its dependencies into a single JAR file:

mvn assembly:assembly

Consensus experiment

This experiment performs repeated runs of the same algorithm on a dataset:

./run consensus --dataset "triangle1" --repeat 10

By default, the k-means algorithm is used.
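
Repeated runs are informative for k-means because the result depends on the random initialization, so separate runs can converge to different local optima. The sketch below illustrates this in plain Java. It is a self-contained toy implementation, not the Clueminer API or the code behind `./run consensus`; the class `RepeatedKMeans`, its methods, and the four-point dataset are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical toy class for illustration; it is not part of Clueminer.
class RepeatedKMeans {

    // Squared Euclidean distance between two points of equal dimension.
    static double dist2(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            s += diff * diff;
        }
        return s;
    }

    // Index of the centroid closest to point p (ties go to the lower index).
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (dist2(p, centroids[c]) < dist2(p, centroids[best])) best = c;
        }
        return best;
    }

    // One k-means run (Lloyd's algorithm) with seed-dependent initialization.
    // Returns the final sum of squared errors (SSE).
    static double kMeansSse(double[][] data, int k, long seed, int maxIter) {
        Random rnd = new Random(seed);
        int n = data.length, d = data[0].length;
        // Pick k distinct data points as initial centroids (partial Fisher-Yates shuffle).
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) idx[i] = i;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            int j = c + rnd.nextInt(n - c);
            int tmp = idx[c]; idx[c] = idx[j]; idx[j] = tmp;
            centroids[c] = data[idx[c]].clone();
        }
        int[] assign = new int[n];
        Arrays.fill(assign, -1);
        for (int iter = 0; iter < maxIter; iter++) {
            // Assignment step: attach every point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                int c = nearest(data[i], centroids);
                if (c != assign[i]) { assign[i] = c; changed = true; }
            }
            if (!changed) break; // converged
            // Update step: move each centroid to the mean of its points.
            double[][] sum = new double[k][d];
            int[] count = new int[k];
            for (int i = 0; i < n; i++) {
                count[assign[i]]++;
                for (int j = 0; j < d; j++) sum[assign[i]][j] += data[i][j];
            }
            for (int c = 0; c < k; c++) {
                if (count[c] == 0) continue; // an empty cluster keeps its old centroid
                for (int j = 0; j < d; j++) centroids[c][j] = sum[c][j] / count[c];
            }
        }
        double sse = 0;
        for (double[] p : data) sse += dist2(p, centroids[nearest(p, centroids)]);
        return sse;
    }

    public static void main(String[] args) {
        // Toy data (not one of the benchmark datasets): two pairs of points far apart.
        double[][] data = {{0, 0}, {0, 1}, {10, 0}, {10, 1}};
        int repeats = 10; // mirrors --repeat 10
        Random seeds = new Random(42); // draws an independent seed for each run
        double best = Double.POSITIVE_INFINITY;
        for (int r = 0; r < repeats; r++) {
            double sse = kMeansSse(data, 2, seeds.nextLong(), 100);
            best = Math.min(best, sse);
            System.out.println("run " + r + ": SSE = " + sse);
        }
        System.out.println("best SSE over " + repeats + " runs: " + best);
    }
}
```

On this toy data, a run whose two initial centroids lie on the same side converges to a poor split with SSE 100, while any other initialization finds the natural split with SSE 1. Comparing the outcomes of repeated runs exposes this kind of variation.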

For the available datasets, see the resources folder.
