Implementation of the k-means algorithm in PyTorch that works for large datasets
Latest commit: dc8928e by ilyaraz, Jan 29, 2019. Update README.md (a remark about `chunk_size`).
| Name | Latest commit message | Commit time |
| --- | --- | --- |
| LICENSE.txt | First commit | Jan 29, 2019 |
| README.md | Update README.md | Jan 29, 2019 |
| kmeans.py | more idiomatic use of .size() | Jan 29, 2019 |

PyTorch implementation of the k-means algorithm

This code works for any dataset, as long as it fits on the GPU. Tested with Python 3 and PyTorch 1.0.0.

For simplicity, the clustering procedure stops only once an iteration no longer changes the clustering. In practice, this criterion may be too strict and should be relaxed.
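One possible way to relax the stopping criterion is sketched below. This is not the code in this repository: `should_stop`, `tol`, and `max_iters` are hypothetical names introduced for illustration. The idea is to stop when only a small fraction of points changed cluster in the last iteration, or when an iteration cap is reached.

```python
def should_stop(num_changed, num_points, iterations_done, tol=1e-3, max_iters=100):
    """Relaxed stopping rule for an iterative clustering loop (illustrative sketch).

    num_changed:     how many points switched cluster in the last iteration.
                     With PyTorch tensors of assignments, this could be computed as
                     (new_assignment != old_assignment).sum().item().
    num_points:      total number of points in the dataset.
    iterations_done: number of iterations completed so far.
    tol:             maximum fraction of changed points that still counts as stable.
    max_iters:       hard cap on the number of iterations.
    """
    if num_points <= 0:
        raise ValueError("num_points must be positive")
    if iterations_done >= max_iters:
        return True  # give up waiting for convergence
    return num_changed / num_points <= tol
```

Setting `tol=0` recovers the strict behavior described above (stop only when nothing changes), with `max_iters` acting purely as a safety net.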

There is a hard-coded magic constant (search the code for `chunk_size`) that should ideally be determined automatically from the amount of free memory on the GPU.
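A rough sketch of how such a value could be derived follows. It rests on assumptions that the repository does not confirm: that `chunk_size` is the number of points processed at once, and that the dominant temporary buffer is a `chunk_size x num_centers` matrix of 4-byte floats. `pick_chunk_size`, `free_gpu_bytes`, and `safety_fraction` are hypothetical names. `torch.cuda.mem_get_info` is available in recent PyTorch releases, not in the 1.0.0 version this code was tested with.

```python
def pick_chunk_size(free_bytes, num_centers, bytes_per_element=4, safety_fraction=0.5):
    """Largest row count such that a (rows x num_centers) buffer fits in the budget.

    Only safety_fraction of the free memory is used, leaving headroom for
    other temporaries and allocator fragmentation (assumed, not measured).
    """
    if num_centers <= 0:
        raise ValueError("num_centers must be positive")
    budget = int(free_bytes * safety_fraction)
    # Always process at least one point per chunk.
    return max(1, budget // (num_centers * bytes_per_element))


def free_gpu_bytes():
    """Free memory on the current CUDA device, in bytes.

    Assumes a PyTorch version that provides torch.cuda.mem_get_info.
    The import is local so that pick_chunk_size can be used without a GPU.
    """
    import torch
    free, _total = torch.cuda.mem_get_info()
    return free
```

On a machine with a GPU, `pick_chunk_size(free_gpu_bytes(), num_centers)` could then replace the hard-coded constant. The 0.5 safety fraction is a guess and would need tuning.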
