Skip to content

This implementation of the k-means algorithm contains functions for training and outputting the learned cluster centers. It also outputs the total dispersion of clustering, so you can run it multiple times with multiple different K and find the simplest solution that gives the least dispersion.

Notifications You must be signed in to change notification settings

czonios/c-kmeans

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

K-Means in C

An implementation of the k-means algorithm in C.

This implementation of the k-means algorithm contains functions for training and outputting the learned cluster centers. It also outputs the total dispersion of clustering, so you can run it multiple times with multiple different K and find the simplest solution that gives the least dispersion.

Learned centers for the example dataset

Installation

Linux

Clone the repository:

git clone https://github.com/czonios/c-kmeans.git

Change your directory and compile:

cd c-kmeans/src
make

Clean the intermediate objects and debugging derivative files:

make clean

Usage example

The main.c file contains the defined settings of the algorithm.

./main

Generating a test dataset

The scripts directory includes a Python script that generates an example dataset. You can call it by using:

python create_clustering_dataset.py

Debugging

You can readily use GDB debugger by running:

make debug
gdb debug

Release History

  • 0.1-alpha
    • Work in progress
  • 1.0.0 2020-10-09
    • Initial full release

Meta

Christos A. Zonios – @czonios – czonios (at) gmail (dot) com

Distributed under the MIT license. See LICENSE for more information.

https://github.com/czonios/c-kmeans

Contributing

  1. Fork it (https://github.com/czonios/c-kmeans/fork)
  2. Create your feature branch (git checkout -b feature/fooBar)
  3. Commit your changes (git commit -am 'Add some fooBar')
  4. Push to the branch (git push origin feature/fooBar)
  5. Create a new Pull Request

TODO

  • Add command line params (dataset file, K)

About

This implementation of the k-means algorithm contains functions for training and outputting the learned cluster centers. It also outputs the total dispersion of clustering, so you can run it multiple times with multiple different K and find the simplest solution that gives the least dispersion.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published