An implementation of the basic idea of K-Means from scratch.

AboNady/K_Means_From_Scratch

K-Means From Scratch

k-Means clustering is an unsupervised machine learning algorithm that partitions a dataset into k groups based on the similarity of the data points. An unsupervised model works only with independent variables (features); it has no dependent variables (labels). In this project, I break down the basic idea in a simple way to make it easy to understand how it works.


Tech Stack

  • Python: Version 3.10

  • NumPy: Version 1.23.0

  • SciPy: Version 1.9.1

  • Matplotlib: Version 3.5.3

  • Spyder IDE: Version 5.3.2

Details

  • Here I implement an algorithm from scratch to apply clustering to a dataset. There are 4 main steps to know.

  • First, choose some random points to start with as the initial centroids: pick any 3 points from the dataset.

  • Second, find the nearest centroid for every point, then assign the point to that centroid.

  • Then, re-center each centroid by moving it to the mean of the points assigned to it, which minimizes the total squared distance to those points.

  • Finally, repeat the assignment and re-centering steps until the centroids stop moving.

  • For more details, please check the References.
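
The steps listed here can be sketched in a few lines of NumPy. This is a minimal illustration and not necessarily the code in this repository: the function name `k_means` and the parameters `max_iters` and `seed` are assumptions made for the example, and it assumes Euclidean distance and the standard mean-based centroid update.

```python
import numpy as np


def k_means(points, k=3, max_iters=100, seed=0):
    """Cluster `points` (shape: n_samples x n_features) into k groups.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)

    # Step 1: pick k distinct data points as the initial centroids.
    initial_idx = rng.choice(len(points), size=k, replace=False)
    centroids = points[initial_idx].astype(float)

    for _ in range(max_iters):
        # Step 2: Euclidean distance from every point to every centroid,
        # shape (n_samples, k), then assign each point to the nearest one.
        diffs = points[:, None, :] - centroids[None, :, :]
        distances = np.linalg.norm(diffs, axis=2)
        labels = distances.argmin(axis=1)

        # Step 3: move each centroid to the mean of its assigned points.
        # A centroid that received no points keeps its previous position.
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)

        # Step 4: repeat until the centroids stop moving.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return centroids, labels


# Usage on synthetic data: three blobs of 50 two-dimensional points each.
data_rng = np.random.default_rng(42)
data = np.vstack([
    data_rng.normal(loc=(0, 0), scale=0.5, size=(50, 2)),
    data_rng.normal(loc=(5, 5), scale=0.5, size=(50, 2)),
    data_rng.normal(loc=(0, 5), scale=0.5, size=(50, 2)),
])
centroids, labels = k_means(data, k=3)
```

Because the initial centroids are chosen at random, different seeds can lead to different final clusters. Running the sketch with several seeds and comparing the results is a common way to check how stable a clustering is.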


Figures

Final



Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Do not forget to give the project a star! Thanks again!


License

Distributed under the MIT License. See LICENSE.txt for more information.

References

  • This is an important video

Contacts