Implement K-Medoids clustering algorithm #13488 #13644
Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.
algorithms-keeper commands and options
algorithms-keeper actions can be triggered by commenting on this PR:
@algorithms-keeper review
to trigger the checks for only added pull request files
@algorithms-keeper review-all
to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.
NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.
from matplotlib import pyplot as plt
from sklearn.metrics import pairwise_distances


def get_initial_medoids(data, k, seed=None):
As there is no test file in this pull request nor any test function or class in the file machine_learning/k_medoids.py, please provide doctest for the function get_initial_medoids
Please provide return type hint for the function: get_initial_medoids. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: data
Please provide descriptive name for the parameter: k
Please provide type hint for the parameter: k
Please provide type hint for the parameter: seed
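One way the requests above could be addressed is sketched below. The diff shows only the last two lines of the function, so the sampling step is an assumption (drawing row indices without replacement via `np.random.default_rng`), and `num_clusters` is a hypothetical replacement name for `k`.

```python
from typing import Optional

import numpy as np


def get_initial_medoids(
    data: np.ndarray, num_clusters: int, seed: Optional[int] = None
) -> np.ndarray:
    """
    Pick num_clusters distinct rows of data as the starting medoids.

    >>> points = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [6.0, 6.0]])
    >>> medoids = get_initial_medoids(points, num_clusters=2, seed=0)
    >>> medoids.shape
    (2, 2)
    >>> all(any((row == point).all() for point in points) for row in medoids)
    True
    """
    rng = np.random.default_rng(seed)
    # Assumption: sample row indices without replacement so that no data
    # point is chosen as a medoid twice.
    indices = rng.choice(data.shape[0], size=num_clusters, replace=False)
    medoids = data[indices, :]
    return medoids
```

The doctest checks only the shape and that every medoid is a row of the input, so it does not depend on which rows a particular seed happens to select.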
    medoids = data[indices, :]
    return medoids


def assign_clusters(data, medoids):
As there is no test file in this pull request nor any test function or class in the file machine_learning/k_medoids.py, please provide doctest for the function assign_clusters
Please provide return type hint for the function: assign_clusters. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: data
Please provide type hint for the parameter: medoids
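A possible shape for a type-hinted, doctested version is sketched below. The distance computation is not visible in the diff; this sketch assumes Euclidean distance and computes it with NumPy broadcasting instead of `sklearn.metrics.pairwise_distances`, purely so that the example needs only NumPy.

```python
import numpy as np


def assign_clusters(data: np.ndarray, medoids: np.ndarray) -> np.ndarray:
    """
    Return, for each row of data, the index of its nearest medoid.

    >>> points = np.array([[0.0, 0.0], [1.0, 0.0], [9.0, 9.0]])
    >>> centers = np.array([[0.0, 0.0], [9.0, 9.0]])
    >>> assign_clusters(points, centers).tolist()
    [0, 0, 1]
    """
    # Assumption: Euclidean distance. The differences array has shape
    # (n_points, n_medoids, n_features).
    differences = data[:, np.newaxis, :] - medoids[np.newaxis, :, :]
    distances = np.linalg.norm(differences, axis=2)
    cluster_assignment = np.argmin(distances, axis=1)
    return cluster_assignment
```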
    cluster_assignment = np.argmin(distances, axis=1)
    return cluster_assignment


def revise_medoids(data, k, cluster_assignment):
As there is no test file in this pull request nor any test function or class in the file machine_learning/k_medoids.py, please provide doctest for the function revise_medoids
Please provide return type hint for the function: revise_medoids. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: data
Please provide descriptive name for the parameter: k
Please provide type hint for the parameter: k
Please provide type hint for the parameter: cluster_assignment
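The sketch below shows one way these points might be covered. Only the `append` and `return` lines are visible in the diff, so the loop body is an assumption: each cluster's new medoid is taken to be the member with the smallest total Euclidean distance to the other members. The name `num_clusters` is a hypothetical replacement for `k`.

```python
import numpy as np


def revise_medoids(
    data: np.ndarray, num_clusters: int, cluster_assignment: np.ndarray
) -> np.ndarray:
    """
    Choose, per cluster, the member closest in total to all other members.

    >>> points = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [15.0]])
    >>> labels = np.array([0, 0, 0, 1, 1, 1])
    >>> revise_medoids(points, 2, labels).tolist()
    [[1.0], [11.0]]
    """
    new_medoids = []
    for cluster_index in range(num_clusters):
        members = data[cluster_assignment == cluster_index]
        # Assumption: every cluster has at least one member; an empty
        # cluster would make np.argmin raise a ValueError.
        differences = members[:, np.newaxis, :] - members[np.newaxis, :, :]
        distances = np.linalg.norm(differences, axis=2)
        medoid_index = np.argmin(distances.sum(axis=1))
        new_medoids.append(members[medoid_index])
    return np.array(new_medoids)
```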
        new_medoids.append(members[medoid_index])
    return np.array(new_medoids)


def compute_heterogeneity(data, k, medoids, cluster_assignment):
As there is no test file in this pull request nor any test function or class in the file machine_learning/k_medoids.py, please provide doctest for the function compute_heterogeneity
Please provide return type hint for the function: compute_heterogeneity. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: data
Please provide descriptive name for the parameter: k
Please provide type hint for the parameter: k
Please provide type hint for the parameter: medoids
Please provide type hint for the parameter: cluster_assignment
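A hedged sketch of an annotated version follows. The visible line `heterogeneity += np.sum(distances**2)` suggests a sum of squared member-to-medoid distances; the rest of the body, the Euclidean metric, and the name `num_clusters` are assumptions made for illustration.

```python
import numpy as np


def compute_heterogeneity(
    data: np.ndarray,
    num_clusters: int,
    medoids: np.ndarray,
    cluster_assignment: np.ndarray,
) -> float:
    """
    Sum the squared distances from each point to its cluster's medoid.

    >>> points = np.array([[0.0], [1.0], [2.0], [10.0], [12.0]])
    >>> centers = np.array([[1.0], [10.0]])
    >>> labels = np.array([0, 0, 0, 1, 1])
    >>> compute_heterogeneity(points, 2, centers, labels)
    6.0
    """
    heterogeneity = 0.0
    for cluster_index in range(num_clusters):
        members = data[cluster_assignment == cluster_index]
        # Assumption: Euclidean distance from each member to the medoid.
        distances = np.linalg.norm(members - medoids[cluster_index], axis=1)
        heterogeneity += float(np.sum(distances**2))
    return heterogeneity
```

In the doctest, cluster 0 contributes 1 + 0 + 1 and cluster 1 contributes 0 + 4, which gives 6.0.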
        heterogeneity += np.sum(distances**2)
    return heterogeneity


def kmedoids(data, k, initial_medoids, maxiter=100, verbose=False):
As there is no test file in this pull request nor any test function or class in the file machine_learning/k_medoids.py, please provide doctest for the function kmedoids
Please provide return type hint for the function: kmedoids. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: data
Please provide descriptive name for the parameter: k
Please provide type hint for the parameter: k
Please provide type hint for the parameter: initial_medoids
Please provide type hint for the parameter: maxiter
Please provide type hint for the parameter: verbose
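For illustration, a self-contained sketch of an annotated `kmedoids` is given below. The diff shows only the signature and the final `return`, so the loop is an assumption: it alternates a nearest-medoid assignment step with a per-cluster medoid update and stops once the medoids no longer change. The helper steps are inlined so the block runs on its own, Euclidean distance is assumed, and `num_clusters` is a hypothetical replacement name for `k`.

```python
import numpy as np


def kmedoids(
    data: np.ndarray,
    num_clusters: int,
    initial_medoids: np.ndarray,
    maxiter: int = 100,
    verbose: bool = False,
) -> tuple[np.ndarray, np.ndarray]:
    """
    Alternate assignment and medoid update until the medoids stop changing.

    >>> points = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
    >>> start = np.array([[0.0], [12.0]])
    >>> final_medoids, labels = kmedoids(points, 2, start)
    >>> final_medoids.tolist()
    [[1.0], [11.0]]
    >>> labels.tolist()
    [0, 0, 0, 1, 1, 1]
    """
    medoids = initial_medoids.copy()
    cluster_assignment = np.zeros(data.shape[0], dtype=int)
    for iteration in range(maxiter):
        # Assignment step: attach every point to its nearest medoid.
        to_medoids = np.linalg.norm(
            data[:, np.newaxis, :] - medoids[np.newaxis, :, :], axis=2
        )
        cluster_assignment = np.argmin(to_medoids, axis=1)

        # Update step: per cluster, pick the member with the smallest
        # total distance to the other members.
        new_medoids = medoids.copy()
        for cluster_index in range(num_clusters):
            members = data[cluster_assignment == cluster_index]
            if len(members) == 0:
                continue  # Assumption: an empty cluster keeps its old medoid.
            within = np.linalg.norm(
                members[:, np.newaxis, :] - members[np.newaxis, :, :], axis=2
            )
            new_medoids[cluster_index] = members[np.argmin(within.sum(axis=1))]

        if verbose:
            print(f"Iteration {iteration}: medoids = {new_medoids.tolist()}")
        if np.array_equal(new_medoids, medoids):
            break  # Converged: the medoids did not move.
        medoids = new_medoids
    return medoids, cluster_assignment
```

In the doctest the first pass moves the medoids from 0 and 12 to 1 and 11, and the second pass leaves them unchanged, so the loop stops after two iterations.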
    return medoids, cluster_assignment


# Optional plotting
def plot_clusters(data, medoids, cluster_assignment):
As there is no test file in this pull request nor any test function or class in the file machine_learning/k_medoids.py, please provide doctest for the function plot_clusters
Please provide return type hint for the function: plot_clusters. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: data
Please provide type hint for the parameter: medoids
Please provide type hint for the parameter: cluster_assignment
c0c86ba to c0543f7
Describe your change:
Implemented the K-Medoids clustering algorithm in Python.
This algorithm is similar to K-Means but uses actual data points as cluster centers (medoids),
making it more robust to noise and outliers.
Includes usage example, optional plotting, and doctest.
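The robustness claim can be illustrated with a small one-dimensional example. This is a sketch using only the standard library, not code from the pull request; the `medoid` helper and the sample values are made up for illustration.

```python
from statistics import mean

# Four nearby values plus one outlier (100.0).
values = [1.0, 2.0, 3.0, 4.0, 100.0]


def medoid(points: list[float]) -> float:
    """Return the point with the smallest total distance to all points."""
    return min(
        points,
        key=lambda candidate: sum(abs(candidate - other) for other in points),
    )


centroid = mean(values)  # K-Means style center: pulled toward the outlier
center_point = medoid(values)  # K-Medoids style center: an actual data point

print(centroid)  # 22.0
print(center_point)  # 3.0
```

The mean lands at 22.0, far from every non-outlier value, while the medoid stays at 3.0 inside the main group.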