Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor to be current with kneed v0.4.1 #935

Merged
merged 5 commits into from Aug 6, 2019
Merged

Refactor to be current with kneed v0.4.1 #935

merged 5 commits into from Aug 6, 2019

Conversation

arvkevi
Copy link
Contributor

@arvkevi arvkevi commented Jul 27, 2019

This PR fixes #931

I have made the following changes:

  1. Refactored utils/kneed.py to be current with kneed v0.4.1.

TODOs and questions

Questions for the @DistrictDataLabs/team-oz-maintainers:

  • The issue raised by @Kautumn06 is still unresolved. I show a quick experiment with different random_state values. Should a separate issue be created for this?

CHECKLIST

  • Is the commit message formatted correctly?
  • Included a sample plot to visually illustrate your changes?
  • Do all of your functions and methods have docstrings?
  • Have you added/updated unit tests where appropriate?
  • Have you updated the baseline images if necessary?
  • Have you run the unit tests using pytest?
  • Is your code style correct (are you using PEP8, pyflakes)?
  • Have you documented your new feature/functionality in the docs?

@lwgray
Copy link
Contributor

lwgray commented Jul 29, 2019

@arvkevi Thank you so much for this PR we really appreciate your contributions

Copy link
Member

@rebeccabilbro rebeccabilbro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much @arvkevi for helping to keep our port of kneed up-to-date and for resolving #931! I'll get this merged in now!

Just to loop in @Kautumn06 — here's what the new elbow looks like for a synthetic dataset:

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

from yellowbrick.cluster import KElbowVisualizer


X, y = make_blobs(n_samples=1000, n_features=12, centers=8, random_state=12)

model = KMeans()
visualizer = KElbowVisualizer(model, k=(4,12))

visualizer.fit(X)
visualizer.poof()

image

And for our new NFL dataset:
image

@rebeccabilbro rebeccabilbro merged commit 653dd6a into DistrictDataLabs:develop Aug 6, 2019
@Kautumn06
Copy link
Contributor

This looks great @arvkevi! I really appreciate your help to get this fixed, and thank you also for taking the time to create and work through the checklist!

@arvkevi
Copy link
Contributor Author

arvkevi commented Aug 6, 2019

Thanks, looking forward to the release 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorporate kneed refactor in utils/kneed.py
4 participants