Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

better description of the hierarchical clustering parameter #9171

Merged
merged 4 commits into from
Aug 23, 2018

Conversation

raamana
Copy link
Contributor

@raamana raamana commented Aug 21, 2018

Clarifies that t could be an integer specifying the max number of clusters under maxclust* criteria

@rgommers rgommers added scipy.cluster Documentation Issues related to the SciPy documentation. Also check https://github.com/scipy/scipy.org labels Aug 21, 2018
@raamana
Copy link
Contributor Author

raamana commented Aug 22, 2018

the reasons pypy3 failed with PR seems to be for reasons completely unrelated to my changes - any idea whats going on?

@@ -2471,8 +2471,11 @@ def fcluster(Z, t, criterion='inconsistent', depth=2, R=None, monocrit=None):
Z : ndarray
The hierarchical clustering encoded with the matrix returned
by the `linkage` function.
t : float
The threshold to apply when forming flat clusters.
t : float or int

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could really be any numeric that can be safely cast to int and float , correct? A boolean would work here as well. I like your new comment below, but am unsure if changing to "float or int" is an improvement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its more of a semantic hint to further reinforce the point that it is referring to number of clusters, an integer.. Making it explicit, speaking in python terms!. Without that hint, people may supply 2.0, thinking they are expected a float.

in fact, it might even be better to rename to t to t_or_num_clust

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in other places referring to kmeans, scipy already uses k_or_guess

Copy link
Member

@rgommers rgommers Aug 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the generic term to use for float or int is scalar

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take your point @raamana, I think scalar is the right way to go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure - what about the renaming the variable to t_or_num_clust?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you cannot rename variables in the signature, that breaks backwards compatibility

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. revised the docs now.

@rgommers rgommers merged commit 04f6392 into scipy:master Aug 23, 2018
@rgommers
Copy link
Member

Merged, thanks @raamana, @jeffyancey

@rgommers rgommers added this to the 1.2.0 milestone Aug 23, 2018
@raamana
Copy link
Contributor Author

raamana commented Aug 23, 2018

Yay! thanks. So glad to be able to add a few characters into the mighty scipy codebase :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation Issues related to the SciPy documentation. Also check https://github.com/scipy/scipy.org scipy.cluster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants