GitHub - Abercus/kmeansimps: Implementations of k-means algorithms for clustering

Implementations of different k-means algorithms in C.

Usage guide:

Installation

With gcc and Make installed, building the program should be as easy as running make. This creates an executable named "clbin".

Flags

Command-line flags. Each flag and its value must be separated by a space:
-h  - help menu
-f  - input data points file name
-o  - output file name
-ci - centers input file name
-a  - algorithm choice (lloyd, elkan, hamerly, macqueen, hartigan, closest)
-i  - initialization method choice (kpp, forgy, partition, furthest, firstn)
-m  - metric choice (euclidean, manhattan)
-k  - cluster count (the -ci flag takes priority)
-s  - random seed number; if omitted, the current time is used as the seed. Set it to reproduce clustering results
-n  - iteration count (default 100)

Contains the following:

Choosing initial cluster centers:

k-means++
Forgy
Partition (assigns each point to a random cluster, then uses the means of those assignments as centers)
Furthest first
firstn (uses the first k points of the input file as initial cluster centers)
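
The Partition method described above can be sketched roughly as follows. This is a minimal illustration, not the code in initmethods.c: the function name `partition_init`, the row-major array layout, and the handling of empty clusters are all assumptions made for the example.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the repository): random-partition
 * initialization. `points` holds n rows of d doubles (row-major);
 * `centers` receives k rows of d doubles. The caller is expected to
 * seed the generator with srand() beforehand.
 * Returns 0 on success, -1 on allocation failure. */
int partition_init(const double *points, int n, int d, int k,
                   double *centers)
{
    int *counts = calloc((size_t)k, sizeof *counts);
    if (counts == NULL)
        return -1;

    for (int j = 0; j < k * d; j++)
        centers[j] = 0.0;

    /* Assign every point to a random cluster and accumulate sums.
     * rand() % k has a slight modulo bias, which is fine for a sketch. */
    for (int i = 0; i < n; i++) {
        int c = rand() % k;
        counts[c]++;
        for (int j = 0; j < d; j++)
            centers[c * d + j] += points[i * d + j];
    }

    /* Turn the sums into means. A cluster that received no points
     * keeps a center at the origin (an assumption of this sketch). */
    for (int c = 0; c < k; c++) {
        if (counts[c] == 0)
            continue;
        for (int j = 0; j < d; j++)
            centers[c * d + j] /= counts[c];
    }

    free(counts);
    return 0;
}
```

Calling `srand()` with a fixed value before `partition_init` makes the random assignment repeatable, which is the same idea as the -s flag.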

Clustering algorithms:

Lloyd
Elkan
Hamerly
MacQueen
Hartigan-Wong
Closest (just assigns points to closest centers and stops)
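
To make the relationship between "Closest" and Lloyd concrete, here is a rough sketch of the textbook Lloyd iteration: an assignment step (which on its own is what "Closest" is described as doing) alternated with a mean-update step. The function names, the squared Euclidean distance, and the empty-cluster policy are assumptions for illustration and are not taken from lloyd.c.

```c
#include <float.h>
#include <stdlib.h>

/* Squared Euclidean distance between two d-dimensional vectors. */
static double sq_dist(const double *a, const double *b, int d)
{
    double s = 0.0;
    for (int j = 0; j < d; j++) {
        double diff = a[j] - b[j];
        s += diff * diff;
    }
    return s;
}

/* Assignment step: labels[i] becomes the index of the nearest center.
 * Returns how many labels changed. */
static int assign_closest(const double *points, int n, int d,
                          const double *centers, int k, int *labels)
{
    int changed = 0;
    for (int i = 0; i < n; i++) {
        int best = 0;
        double best_dist = DBL_MAX;
        for (int c = 0; c < k; c++) {
            double dist = sq_dist(points + i * d, centers + c * d, d);
            if (dist < best_dist) {
                best_dist = dist;
                best = c;
            }
        }
        if (labels[i] != best) {
            labels[i] = best;
            changed++;
        }
    }
    return changed;
}

/* Update step: each center becomes the mean of its assigned points.
 * An empty cluster keeps its previous center (sketch assumption). */
static int update_centers(const double *points, int n, int d,
                          const int *labels, double *centers, int k)
{
    double *sums = calloc((size_t)k * (size_t)d, sizeof *sums);
    int *counts = calloc((size_t)k, sizeof *counts);
    if (sums == NULL || counts == NULL) {
        free(sums);
        free(counts);
        return -1;
    }
    for (int i = 0; i < n; i++) {
        counts[labels[i]]++;
        for (int j = 0; j < d; j++)
            sums[labels[i] * d + j] += points[i * d + j];
    }
    for (int c = 0; c < k; c++) {
        if (counts[c] == 0)
            continue;
        for (int j = 0; j < d; j++)
            centers[c * d + j] = sums[c * d + j] / counts[c];
    }
    free(sums);
    free(counts);
    return 0;
}

/* Alternates the two steps until no label changes or max_iter is hit.
 * Returns the number of iterations performed, or -1 on failure. */
int lloyd(const double *points, int n, int d, double *centers, int k,
          int *labels, int max_iter)
{
    int iter = 0;
    for (int i = 0; i < n; i++)
        labels[i] = -1; /* no assignment yet */
    while (iter < max_iter) {
        iter++;
        if (assign_closest(points, n, d, centers, k, labels) == 0)
            break;
        if (update_centers(points, n, d, labels, centers, k) != 0)
            return -1;
    }
    return iter;
}
```

Elkan and Hamerly are generally described as accelerations of this same loop that use distance bounds to skip unnecessary distance computations, so they should converge to the same clustering from the same initial centers.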

Supports the following metrics:

Euclidean
Manhattan

It should be rather easy to add more of them.
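
One plausible way to keep metrics easy to extend is to pass them around as function pointers with a common signature. The sketch below is an assumption about how that could look, not a description of metrics.c: the `metric_fn` type, `metric_by_name`, and the Chebyshev metric are all hypothetical and exist only to show what adding a metric might involve.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical common signature for a distance function. */
typedef double (*metric_fn)(const double *a, const double *b, int d);

/* Absolute difference of two doubles. */
static double abs_diff(double x, double y)
{
    return x < y ? y - x : x - y;
}

/* Manhattan (L1) distance: sum of absolute coordinate differences. */
double manhattan(const double *a, const double *b, int d)
{
    double s = 0.0;
    for (int j = 0; j < d; j++)
        s += abs_diff(a[j], b[j]);
    return s;
}

/* Chebyshev (L-infinity) distance: largest absolute coordinate
 * difference. Not part of the project; shown as an example addition. */
double chebyshev(const double *a, const double *b, int d)
{
    double m = 0.0;
    for (int j = 0; j < d; j++) {
        double diff = abs_diff(a[j], b[j]);
        if (diff > m)
            m = diff;
    }
    return m;
}

/* Maps a name, such as the value of a -m flag, to a metric.
 * Returns NULL for an unknown name. */
metric_fn metric_by_name(const char *name)
{
    if (strcmp(name, "manhattan") == 0)
        return manhattan;
    if (strcmp(name, "chebyshev") == 0)
        return chebyshev;
    return NULL;
}
```

With a layout like this, adding a metric would come down to writing one function and adding one line to the lookup. A Euclidean entry would follow the same pattern, taking the square root of the summed squared differences.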

Input data format

The first line consists of two integers: the point count n and the data dimensionality d. It is followed by n rows, each containing d doubles.
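
A reader for this format could look roughly like the sketch below. The function name `read_points` and the row-major return layout are assumptions for the example; the project's own parsing code may differ.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical reader for the input format described above:
 *
 *     3 2
 *     1.0 2.0
 *     3.5 4.5
 *     -1 0
 *
 * Reads the header "n d", then n*d doubles. On success returns a
 * malloc'd row-major array (the caller frees it) and stores n and d.
 * Returns NULL on malformed input or allocation failure. */
double *read_points(FILE *in, int *n_out, int *d_out)
{
    int n, d;
    if (fscanf(in, "%d %d", &n, &d) != 2 || n <= 0 || d <= 0)
        return NULL;

    size_t total = (size_t)n * (size_t)d;
    double *points = malloc(total * sizeof *points);
    if (points == NULL)
        return NULL;

    /* fscanf skips whitespace, so row breaks need no special handling. */
    for (size_t i = 0; i < total; i++) {
        if (fscanf(in, "%lf", &points[i]) != 1) {
            free(points);
            return NULL;
        }
    }

    *n_out = n;
    *d_out = d;
    return points;
}
```

Because whitespace is skipped, this sketch does not verify that each row holds exactly d values, only that n*d values are present in total.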

Output data formats

The program outputs two files:

A file listing, for each point, the number of the cluster it belongs to
A file containing the mean vectors. The first row holds an integer giving the number of means that follow.
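
The two outputs could be written along the lines of the sketch below. The function names, the one-label-per-line layout, and the space-separated coordinates are assumptions, since the exact layout is not specified here beyond the leading count in the means file.

```c
#include <stdio.h>

/* Hypothetical writer for the assignment file: one cluster number
 * per line, in input order. Returns 0 on success, -1 on I/O error. */
int write_labels(FILE *out, const int *labels, int n)
{
    for (int i = 0; i < n; i++) {
        if (fprintf(out, "%d\n", labels[i]) < 0)
            return -1;
    }
    return 0;
}

/* Hypothetical writer for the means file: the first row holds the
 * number of means k, followed by k rows of d space-separated doubles.
 * %.17g is used so the values survive a round trip through text. */
int write_means(FILE *out, const double *centers, int k, int d)
{
    if (fprintf(out, "%d\n", k) < 0)
        return -1;
    for (int c = 0; c < k; c++) {
        for (int j = 0; j < d; j++) {
            char sep = (j + 1 < d) ? ' ' : '\n';
            if (fprintf(out, "%.17g%c", centers[c * d + j], sep) < 0)
                return -1;
        }
    }
    return 0;
}
```

Since the means file does not record the dimensionality, a reader of it would need to take d from the input data or infer it from the row length.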

Repository files

.gitignore
Dokumentatsioon.txt
LICENSE.md
Makefile
README.md
auxfunctions.c
auxfunctions.h
clusteringalgorithms.h
commonmacros.h
elkan.c
elkan.h
hamerly.c
hamerly.h
hartiganwong.c
hartiganwong.h
initmethods.c
initmethods.h
lloyd.c
lloyd.h
macqueen.c
macqueen.h
main.c
metrics.c
metrics.h
