The repo consists of three scripts:
analysis.py: basic functions that are used repeatedly in the other scripts;
elbow.py: plots the cost function (elbow method) to find the appropriate number of clusters;
iteration.py: produces table output of the clustering, based on the number of clusters determined by the elbow plot.
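
As a rough illustration of the elbow method mentioned above, the sketch below runs a plain k-means for several values of k on synthetic 2D data and prints the cost (the sum of squared distances to the assigned centroid) for each k. It is not the code from the scripts in this repo: the function names `kmeans`, `make_blobs` and `squared_distance`, the synthetic data, and the choice of 10 random restarts are all assumptions made for the example, and it uses only the Python standard library.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd-style k-means. Returns (centroids, labels, cost)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids (hypothetical choice)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    cost = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, cost


def make_blobs(centers, n_per_center=30, spread=0.5, seed=42):
    """Synthetic 2D data: Gaussian blobs around the given centers."""
    rng = random.Random(seed)
    return [
        (rng.gauss(cx, spread), rng.gauss(cy, spread))
        for cx, cy in centers
        for _ in range(n_per_center)
    ]


points = make_blobs([(0, 0), (5, 5), (0, 5)])

# For each k, keep the lowest cost over a few random restarts.
costs = {
    k: min(kmeans(points, k, seed=s)[2] for s in range(10))
    for k in range(1, 7)
}

for k, cost in costs.items():
    print(f"k={k}  cost={cost:.2f}")
```

Plotting these costs against k would typically show a sharp drop followed by a flattening; the "elbow" where the curve bends is read off as the number of clusters to use.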
The notebooks demonstrate how the clustering algorithm defined in the scripts above can be applied:
example.ipynb and new_data_example.ipynb: show worked examples of how k-means clustering is implemented for 2D datasets;
t_shirt.ipynb: presents a clustering example for a multi-dimensional dataset.