Skip to content

KMeans Unsupervised Algorithm implementation using Python3

Notifications You must be signed in to change notification settings

luisramirez7/kmeans

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

22 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Data Mining - A KMeans implementation from Scratch (no Sci-kit)

Getting Started

  • Ensure you are have Python3 installed with libraries scipy, numpy, and pandas.

Included Files

  • iris.arff: This is the test data file that the kmeans clustering runs on.
  • kmeans.py: This is the file that contains the main script to run the clustering on the iris dataset.
  • parsing.py: This is the python file used to read in the data from the arff file.
  • Assignment-3-Report.pdf: This is the file that contains our overall report and contribution sheet.

Usage

  • To run the clustering, execute the following command: python3 kmeans.py <k_value> <epsilon_value> <number_of_iterations>
    • k_value is the number of clusters the program will produce.
    • epsilon_value is the change in the sum of the distances from the cluster centers.
    • number_of_iterations is the number of iterations the program will go through to find the best clustering.
  • To run finding and plotting the optimal k values of the dataset, uncomment line 184 in kmeans.py

About

KMeans Unsupervised Algorithm implementation using Python3

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages