K-medoids Clustering Algorithm

Introduction

K-medoids is a clustering algorithm related to K-means. In contrast to the K-means algorithm, K-medoids chooses datapoints as centers of the clusters. There are eight combinations of Initialization, Assignment and Update algorithms to achieve the best results in the given dataset. Also Clara algorithm approach is implemented.

Initialization

K-medoids++

Concentrate(Park-Jun, paper)

Assignment

PAM assignment(simplest approach)

Assignment by LSH/DBH: Reverse Approach

Update

Update a la Lloyd’s

CLARANS update

Clara Algorithm

Silhouette value is used to see the quality of the clusters.

Support

The K-medoids algorithm supports:

Euclidean Vector Space

Cosine Vector Space

Hamming Metric Space

Distance Matrix Metric Space

See the datatypes file for the format of each metric space.

Compile

Use make command to compile and make clean to delete object files(there is a MakeFile).

Run

./medoids –d <input file> –c <configuration file> -ο <output file> -complete
-d <input file>: The input file name

–c <configuration file>: configuration file name
-ο <output file>: Output file name
-complete: Output has the members of each cluster

Functionality

The input date are clustered in k clusters. The k-medoids clustering works like this. 1)Initialize centroids
2)Assign the rest of the points to the centroids
3)Update the centroids
Go to 2 until no better clusters are found
All initialization, assign and update methods are used and the results of the clusters are writed to the output file as with the silhouette value of each cluster

K-medoids API

void KMedoids<T>::run(InitializationType initial, AssignmentType Assign, UpdateType update, int s = 2)
initial is ={InitializationPP, InitializationConcentrate}
Assign is ={PamAssign, LSHAssign}
update is={Lloyds,Clarans}
void KMedoids<T>::clara(int s = 5)
is the clara algorithm, and s is the number of iterations

More information about the algorithms in the Papers

Files

Format of input file:
Vector space
@metric_space vector
@metric {euclidean, cosine} //default: euclidean
item_id1 x11 x12 ... x1d
item_id2
.
.
item_idN xN1 xN2 ... xNd

Hamming
@metric_space hamming
item_id1 B1
....
item_idN

Distance Matrix
@metric_space matrix
item_id1 x11 x12 ... x1d
item_id2
.
.
item_idQ xQ1 xQ2 ... xQd

Configuration file
number_of_clusters: int // k value
number_of_hash_functions: int //default 4
number_of_hash_tables: int //default 5
clarans_set_fraction: int //default max{0.12*k(N-k),250}, k is the number of clusters and N the number of the datapoints
clarans_iterations: int //default 2

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
Sample Datasets (ergasia 2)		Sample Datasets (ergasia 2)
papers		papers
List.cpp		List.cpp
List.h		List.h
Makefile		Makefile
Node.cpp		Node.cpp
Node.h		Node.h
README.md		README.md
ReadMe.pdf		ReadMe.pdf
centroid.cpp		centroid.cpp
centroid.h		centroid.h
clusterNode.cpp		clusterNode.cpp
clusterNode.h		clusterNode.h
dataTypes.h		dataTypes.h
distancesCalculations.cpp		distancesCalculations.cpp
distancesCalculations.h		distancesCalculations.h
euclideanNode.cpp		euclideanNode.cpp
euclideanNode.h		euclideanNode.h
generalFunctions.cpp		generalFunctions.cpp
generalFunctions.h		generalFunctions.h
hashFunction.cpp		hashFunction.cpp
hashFunction.h		hashFunction.h
hashFunction.h~		hashFunction.h~
hashFunctionCosine.cpp		hashFunctionCosine.cpp
hashFunctionCosine.h		hashFunctionCosine.h
hashFunctionEuclidean.cpp		hashFunctionEuclidean.cpp
hashFunctionEuclidean.h		hashFunctionEuclidean.h
hashFunctionMatrix.cpp		hashFunctionMatrix.cpp
hashFunctionMatrix.h		hashFunctionMatrix.h
hashtable.cpp		hashtable.cpp
hashtable.h		hashtable.h
initialization.cpp		initialization.cpp
initialization.h		initialization.h
kMedoids.cpp		kMedoids.cpp
kMedoids.h		kMedoids.h
lsh.cpp		lsh.cpp
lsh.h		lsh.h
main.cpp		main.cpp
psedoRandomNumbers.cpp		psedoRandomNumbers.cpp
psedoRandomNumbers.h		psedoRandomNumbers.h
readFile.cpp		readFile.cpp
readFile.h		readFile.h

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

K-medoids Clustering Algorithm

Introduction

Initialization

Assignment

Update

Clara Algorithm

Support

The K-medoids algorithm supports:

Compile

Run

Functionality

K-medoids API

More information about the algorithms in the Papers

Files

About

Releases

Packages

Languages

billDrett/K-Medoids-Clustering

Folders and files

Latest commit

History

Repository files navigation

K-medoids Clustering Algorithm

Introduction

Initialization

Assignment

Update

Clara Algorithm

Support

The K-medoids algorithm supports:

Compile

Run

Functionality

K-medoids API

More information about the algorithms in the Papers

Files

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages