
Clustering: Add a step_kmeans() #77

Closed
mdancho84 opened this issue Mar 17, 2021 · 3 comments
Labels
feature a feature request or enhancement

Comments

@mdancho84

Love embed. It would be super awesome if there was a step_kmeans() or step_cluster() that added cluster assignments to a data frame.

Why?

Cluster assignments are super important for segmentation. K-Means and similar algorithms (e.g. K-modes) can help us to identify customer groups.

Embed

Embed is a good spot for this. step_umap() already covers a related unsupervised technique, and I often use it in combination with K-Means.

Let me know what you think.

Thanks, Matt

@juliasilge juliasilge added the feature a feature request or enhancement label Mar 17, 2021
@topepo
Member

topepo commented Mar 17, 2021

This was previously discussed in tidymodels/recipes#399; I wasn't sold on what the poster wanted to return and they added a step function to their own package.

You might also want to take a look at tidymodels/planning#12. For non-preprocessing needs, I think that @kbodwin's thoughts are spot-on.

It would be good as long as the output is a factor variable that denotes the cluster that the sample belongs to. For new data, we can use a nearest-centroid (or medoid) approach to assign new samples, but this depends on the clustering method.
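As a rough illustration of the nearest-centroid idea for new data, here is a minimal, language-neutral sketch in plain Python. It is not the recipes or embed API: the function name `assign_clusters`, the `cluster_1`-style labels, and the example centroids are all hypothetical, and it assumes the centroids were already estimated on the training set and that Euclidean distance is appropriate for the clustering method used.

```python
import math


def assign_clusters(rows, centroids, labels=None):
    """Label each row with the name of its nearest centroid.

    Hypothetical helper: `centroids` are assumed to come from a clustering
    fit on training data; distance is Euclidean (an assumption that suits
    k-means but not every clustering method).
    """
    if labels is None:
        # Factor-like level names, one per centroid.
        labels = [f"cluster_{i + 1}" for i in range(len(centroids))]
    assigned = []
    for row in rows:
        distances = [math.dist(row, c) for c in centroids]
        # On ties, the first (lowest-index) centroid wins.
        nearest = distances.index(min(distances))
        assigned.append(labels[nearest])
    return assigned


# Made-up centroids and new samples, for illustration only.
centroids = [(0.0, 0.0), (10.0, 10.0)]
new_samples = [(1.0, 2.0), (9.0, 8.5), (4.0, 4.0)]
print(assign_clusters(new_samples, centroids))
```

The returned labels play the role of the factor levels: the set of possible values is fixed at training time, so new samples can only ever land in a cluster that already exists.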

I'd support this but don't have the time to do it; you'd have to start a PR.

@topepo
Member

topepo commented Apr 16, 2021

I'll close but please add a PR if this is important.

@topepo topepo closed this as completed Apr 16, 2021
@github-actions

github-actions bot commented May 1, 2021

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators May 1, 2021

3 participants