Project 1: Global K-means clustering algorithm
Saividhya Saibaba - 1211191602
Rukmani Ganapathy Seetharaman - 1211210075
computeKMeans.m - Computes k centroids using normal k-means algorithm
kmeansClustering.m - Caller program for normal k-means algorithm
globalKmeansClustering.m - Computes k centroids using global k-means algorithm
fastKmeansClustering.m - Computes k centroids using fast-global k-means algorithm
fastKmeansClustering_kd.m - Computes k centroids using fast-global k-means with k-d tree
kd_tree.m - K-D tree implementation
kmeansMain.m - Calls every algorithm for k = 3 to 15 and stores all intermediate values in dat files
kmeansMain_kdbuckets.m - Executes fast global k-means with k-d tree for bucket numbers j = 1 to 15
kmeansMain_normalavg.m - Executes normal k-means algorithm 100 times to compute average performance
findPurity.m - Finds purity of clusters given labels
kmeans_fast_main.m - Caller method to execute fast global k-means with k-d tree for different bucket numbers
kmeans_normal_main.m - Caller method to execute normal k-means algorithm 100 times
mixtureGaussian.m - Generates artificial data set and loads MNIST data set and calls the kmeansMain program
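For readers unfamiliar with the purity measure that findPurity.m is described as computing, the following is a minimal sketch of the usual definition: each cluster is credited with the count of its most frequent true label, and the credits are summed and divided by the number of points. The function name purity_sketch and its argument names are hypothetical and are not taken from findPurity.m, whose actual interface may differ.

```matlab
function p = purity_sketch(assignments, labels)
% PURITY_SKETCH  Illustrative cluster purity (hypothetical helper, not findPurity.m).
%   assignments: N-by-1 vector of cluster indices, one per data point
%   labels:      N-by-1 vector of true class labels, one per data point
clusters = unique(assignments);
correct = 0;
for i = 1:numel(clusters)
    % true labels of the points assigned to this cluster
    members = labels(assignments == clusters(i));
    % map the labels to 1..m so they can be counted with accumarray
    [~, ~, idx] = unique(members);
    % credit the cluster with the size of its largest label group
    correct = correct + max(accumarray(idx(:), 1));
end
p = correct / numel(labels);
end
```

For example, purity_sketch([1;1;1;2;2;2], [0;0;1;1;1;1]) credits 2 points to the first cluster and 3 to the second, giving 5/6.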
- Run mixtureGaussian to execute the program for k = 3 to 15 on all 3 data sets
- Run kmeans_fast_main.m to execute fast global k-means with k-d tree for different bucket values on the same data set, which the previous step stores in the data folder
- Run kmeans_normal_main to execute normal k-means 100 times for the same data set
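The run order above can be sketched as a short MATLAB session. This assumes that the three entry points can be invoked without arguments and that the current folder is the project root so that the data folder resolves correctly; neither assumption is confirmed here, so adjust as needed.

```matlab
% Assumed to be run from the project root, with no arguments required.
mixtureGaussian      % step 1: k = 3 to 15 on all 3 data sets, writes to the data folder
kmeans_fast_main     % step 2: fast global k-means with k-d tree, different bucket values
kmeans_normal_main   % step 3: normal k-means repeated 100 times on the same data set
```

The later steps read the data set stored by the first one, so the first call has to finish before the others are started.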
The following MNIST data set files are taken from http://yann.lecun.com/exdb/mnist/
t10k-images-idx3-ubyte
t10k-labels-idx1-ubyte
train-images-idx3-ubyte
train-labels-idx1-ubyte
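As an illustration of how the idx3 image files can be read, here is a minimal sketch based on the IDX format: a big-endian header with the magic number 2051, the image count, the row count and the column count, followed by one unsigned byte per pixel. The function name read_idx_images_sketch is hypothetical, and this is not necessarily how mixtureGaussian.m loads the data.

```matlab
function images = read_idx_images_sketch(path)
% READ_IDX_IMAGES_SKETCH  Illustrative IDX3 image reader (hypothetical helper).
%   Returns an n-by-(rows*cols) matrix with one image per row.
fid = fopen(path, 'r', 'ieee-be');       % IDX header integers are big-endian
assert(fid ~= -1, 'Could not open %s', path);
magic = fread(fid, 1, 'int32');
assert(magic == 2051, 'Not an IDX3 image file');
n    = fread(fid, 1, 'int32');           % number of images
rows = fread(fid, 1, 'int32');           % image height in pixels
cols = fread(fid, 1, 'int32');           % image width in pixels
pixels = fread(fid, inf, 'uint8=>double');
fclose(fid);
% pixels are stored image by image, so each column holds one image before the transpose
images = reshape(pixels, rows * cols, n)';
end
```

Called on the uncompressed t10k-images-idx3-ubyte file, this yields a 10000-by-784 matrix. The idx1 label files use the magic number 2049 and have only the item count in the header, so they need a separate, shorter reader.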
We have stored all intermediate values in the data folder.
Run testscript/plot_graph.m to view the graphs.