Feature request: phik correlation metrics #319
Comments
You could contribute to scipy upstream and then everyone would be able to use these metrics, not only in UMAP. Under the hood of UMAP mostly |
@sleighsoft thanks for the reply. Actually, scipy already has a Spearman's rho implementation, but since it's not defined as a scipy distance, it's not accessible from scikit-learn's Even if it were available as a distance, it would only be useful for small datasets, since UMAP uses pairwise scikit-learn distances only if the dataset is small (n<4096) and kNN approximation is not forced by the user (see this and that). In all other cases, it falls back to UMAP's own So, what I'm proposing is to implement spearman (and maybe phi_k too, if anyone is willing to) in |
Oh I see! Thanks for clarifying the issue. Btw, I do not find pearson in the |
@gokceneraslan I looked at this again and I believe the best place for this would actually be the pynndescent project: https://github.com/lmcinnes/pynndescent/blob/master/pynndescent/distances.py It has the same number of metrics as UMAP, and I assume it will be the default for UMAP. |
Added a PR. |
It'd be great to add other correlations as distances, such as Spearman's rho or the new correlation coefficient called phi_k. I am aware that it's possible to use any distance metric using either `metric='precomputed'` or implementing a custom function, but it'd be more convenient to pass `metric='spearmanrho'` or `'phik'` directly.
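For readers who want to try the `metric='precomputed'` workaround mentioned above, here is a minimal sketch of how a Spearman-based distance matrix might be built. It uses only the standard library so it runs anywhere. The helper names (`average_ranks`, `pearson`, `spearman_distance_matrix`) are hypothetical and not part of UMAP, scipy, or pynndescent, and defining the distance as `1 - rho` is an assumption made by analogy with the usual correlation distance.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: treat constant rows as uncorrelated
    return cov / sqrt(vx * vy)


def spearman_distance_matrix(rows):
    """Symmetric matrix of 1 - rho, where rho is Pearson on the ranks."""
    n = len(rows)
    ranked = [average_ranks(r) for r in rows]  # rank each row once
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - pearson(ranked[i], ranked[j])
            dist[i][j] = dist[j][i] = d
    return dist


rows = [[1, 2, 3, 4], [10, 20, 25, 100], [4, 3, 2, 1]]
D = spearman_distance_matrix(rows)
```

The resulting matrix could then be converted to a NumPy array and passed to `umap.UMAP(metric='precomputed').fit_transform(...)`. Because the full matrix is quadratic in the number of samples, this is only practical for small datasets, which is part of the motivation for a built-in metric.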