Merge pull request #6 from vojtechpavlu/feature/readme
`README.md`
Showing 2 changed files with 75 additions and 0 deletions.
@@ -0,0 +1,62 @@
# K-Means

K-Means is a popular clustering algorithm widely used for data analysis in
many fields. Because it is centroid-based, each given instance is simply
assigned to the cluster whose centroid is nearest.

To find more information about this algorithm, have a look at the
[Wikipedia page](https://en.wikipedia.org/wiki/K-means_clustering).

This implementation is not meant to be used in a production environment,
because it relies on plain Python only, without NumPy or any other library
optimized for scientific computing. The purpose of this project is
educational; it is not intended for mission-critical applications.
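To illustrate the idea in plain Python, here is a minimal sketch of the usual assign-and-update loop. It is not this project's implementation: the function name `k_means`, the tuple-based points and the choice of the first `k` points as initial centroids are assumptions made only for this example.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def k_means(points, k, max_iterations=100):
    """Cluster `points` (tuples of floats) into `k` groups; an illustrative sketch."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # Simplification (assumption): the first k points serve as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for centroid, cluster in zip(centroids, clusters):
            if cluster:
                mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if its cluster ended up empty.
                new_centroids.append(centroid)

        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
```

With this sample data, the three points near the origin end up in one cluster and the three points near `(8.5, 9.0)` in the other. A real implementation would usually pick the initial centroids randomly, which this sketch avoids to keep the result reproducible.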

## Requirements

The only dependency of this project is the `pytest` library, which is used
for testing. If you do not plan to run the tests, you can skip it. Otherwise,
clone this repository to your device (into a virtual environment, if you
want) and run:

```bash
python -m pip install --upgrade pip
python -m pip install pytest
```

On some systems, for example on Linux, these commands might not work.
If that is the case, try the following instead:

```bash
python3 -m pip install --upgrade pip
python3 -m pip install pytest
```

## Project structure

This project is organized into a simple multi-directory structure, as seen
below:

- `.github` - contains just a simple workflow to run all the tests using
  GitHub Actions

- `src` - contains the actual source code, i.e. the implementation

- `test` - contains the unit tests that validate some of the features

## Notable objects

The most important classes are:

- `Point` and `Centroid` (both defined in the `./src/datapoint.py` module),

- the abstract class `Metric`, used to calculate the distance between two
  points in a multidimensional space (`./src/metric.py`),

- the class `KMeans`, whose instances provide the actual model
  (defined in the `./src/k_means.py` module).
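To make the role of a metric more concrete, the sketch below shows how an abstract distance class and three concrete distances could look. The class names `Metric`, `Euclidean`, `Taxicab` and `Hamming` are the ones exported by the package, but the method name `distance`, the helper `_check` and the use of plain tuples as points are assumptions for illustration and may differ from the actual code in `src`.

```python
from abc import ABC, abstractmethod
from math import sqrt


class Metric(ABC):
    """Hypothetical interface: a distance between two points of equal dimension."""

    @abstractmethod
    def distance(self, a, b):
        """Return the distance between points `a` and `b`."""

    @staticmethod
    def _check(a, b):
        # Both points must live in the same space.
        if len(a) != len(b):
            raise ValueError("points must have the same number of dimensions")


class Euclidean(Metric):
    def distance(self, a, b):
        self._check(a, b)
        # Square root of the summed squared coordinate differences.
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class Taxicab(Metric):
    def distance(self, a, b):
        self._check(a, b)
        # Sum of the absolute coordinate differences.
        return sum(abs(x - y) for x, y in zip(a, b))


class Hamming(Metric):
    def distance(self, a, b):
        self._check(a, b)
        # Number of coordinates in which the two points differ.
        return sum(1 for x, y in zip(a, b) if x != y)


a, b = (0, 0, 1), (3, 4, 1)
euclidean = Euclidean().distance(a, b)  # 5.0
taxicab = Taxicab().distance(a, b)      # 7
hamming = Hamming().distance(a, b)      # 2
```

Keeping the distance behind a common interface means a clustering model can be switched from one metric to another without changing its own code.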
@@ -0,0 +1,13 @@
from typing import TYPE_CHECKING

# Problems with circular imports
if TYPE_CHECKING:
    from src.metric import Metric
from datapoint import (Point,
                       Example,
                       Centroid,
                       InconsistentDimensionalityError,
                       NormalizationError)

from metric import Metric, Euclidean, Taxicab, Hamming
from k_means import KMeans, KMeansError