Skip to content

ghazian/lwcoreset

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Coreset

Work done in this repository is for my Master's thesis titled "Comparative Lightweight Corset Construction for K-Means Problem." In this repo, the construction of coresets are in the context of the k-means problem. In particular, the techniques of coreset construction such as geometric decomposition and random sampling from the works of [1]. The aim of the study is to present theoretical results and to implement the techniques of building coresets comparative to lightweight coresets which were made by the authors of [2].

Algorithms Implemented

Lightweight Coreset Construction

Scalable k-Means Clustering via Lightweight Coresets [2]

Geometric Decomposition

Farthest Point Algorithm by [3]

Fast Constant Factor Approximation by [4]

References

[1] Munteanu, A., Schwiegelshohn, C. Coresets-Methods and History: A Theoreticians Design Pattern for Approximation and Streaming Algorithms. Künstl Intell 32, 37–53 (2018). https://doi.org/10.1007/s13218-017-0519-3

[2] Olivier Bachem, Mario Lucic, and Andreas Krause. 2018. Scalable k -Means Clustering via Lightweight Coresets. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '18). Association for Computing Machinery, New York, NY, USA, 1119–1127. DOI:https://doi.org/10.1145/3219819.3219973

[3] Gonzalez, Teofilo F.. “Clustering to Minimize the Maximum Intercluster Distance.” Theor. Comput. Sci. 38 (1985): 293-306.

[4] Sariel Har-Peled and Soham Mazumdar. 2004. On coresets for k-means and k-median clustering. In Proceedings of the thirty-sixth annual ACM symposium on Theory of computing (STOC '04). Association for Computing Machinery, New York, NY, USA, 291–300. DOI:https://doi.org/10.1145/1007352.1007400

Releases

No releases published

Packages

No packages published

Languages