Code accompanying my blog post on k-means in Python, C++ and CUDA

⤴️ ⤴️ k-means


This repository contains the code referenced in my blog post Exploring k-means in Python, C++ and CUDA, where I implement k-means on a variety of platforms. In the post I show how CUDA implementations of k-means can outperform scikit-learn and scipy by factors of 72 and 90, respectively.

The code is not particularly tidy, but gives an idea of how to implement k-means efficiently on a GPU.


  • python/ contains Python code for k-means using scikit-learn, scipy and a roll-it-yourself implementation.
  • cpp/ contains C++ implementations of k-means, including one using Eigen.
  • cuda/ holds all CUDA implementations.
  • data/ has some toy data with 100 and 100k datapoints in five clusters as well as a script to generate more.
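For orientation, here is a minimal sketch of what a roll-it-yourself k-means (Lloyd's algorithm) and a toy data generator might look like in plain Python. It is not the code from python/ or data/. The function names (`generate_blobs`, `k_means`, `squared_distance`), the 2-D points, the parameter defaults, and the random initialization are all assumptions made for illustration.

```python
import random


def generate_blobs(num_points, num_clusters=5, spread=0.5, seed=0):
    """Sample 2-D points around randomly placed cluster centers (hypothetical helper)."""
    rng = random.Random(seed)
    centers = [(rng.uniform(-10, 10), rng.uniform(-10, 10))
               for _ in range(num_clusters)]
    points = []
    for _ in range(num_points):
        cx, cy = rng.choice(centers)
        points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
    return points


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: alternate assignment and update steps until stable."""
    rng = random.Random(seed)
    # Assumption: initialize the means with k distinct points drawn at random.
    means = rng.sample(points, k)
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest mean.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, means[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # No point changed cluster, so the means are stable.
        assignments = new_assignments
        # Update step: move each mean to the centroid of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # Leave an empty cluster's mean where it is.
                means[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return means, assignments
```

Calling `k_means(generate_blobs(100), k=5)` returns the five means and a cluster index per point. The assignment step is the part that dominates the runtime and is independent for every point, which is what makes the algorithm a natural fit for a GPU.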


Peter Goldsborough + cat ❤️
