Added handling for sparse precomputed distance matrices #425

Merged on May 12, 2020 (2 commits)
Conversation

Rocketknight1
Contributor

This is a PR for issue #424. I've confirmed that it works for my use case, but I'm not fully confident I haven't broken something else! Please double-check what I wrote before merging.

Also, I wasn't sure what datatypes to use for self._knn_indices and self._knn_dists, so I just used np.int and np.float. If you'd prefer int32/uint32 and float32, let me know!
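
A minimal sketch of the idea, not the code from this PR: for a sparse precomputed distance matrix, the k nearest neighbors of each row can be taken from that row's stored entries. The helper name `knn_from_sparse_distances` and the `-1` / `inf` fill values are assumptions made for illustration. On the dtype question, note that the `np.int` and `np.float` aliases were deprecated in NumPy 1.20 and removed in NumPy 1.24, so the sketch uses explicit `np.int64` and `np.float64`.

```python
import numpy as np
import scipy.sparse


def knn_from_sparse_distances(distances, n_neighbors):
    """Hypothetical helper: nearest neighbors taken from stored entries only."""
    distances = scipy.sparse.csr_matrix(distances)
    n_samples = distances.shape[0]

    # Explicit fixed-width dtypes rather than the removed np.int / np.float.
    knn_indices = np.full((n_samples, n_neighbors), -1, dtype=np.int64)
    knn_dists = np.full((n_samples, n_neighbors), np.inf, dtype=np.float64)

    for row_id in range(n_samples):
        # Slice out the stored column indices and distances for this row.
        start, end = distances.indptr[row_id], distances.indptr[row_id + 1]
        row_indices = distances.indices[start:end]
        row_data = distances.data[start:end]

        if len(row_data) < n_neighbors:
            raise ValueError(
                f"Row {row_id} stores {len(row_data)} distances, "
                f"fewer than n_neighbors={n_neighbors}"
            )

        # Keep the n_neighbors smallest stored distances, closest first.
        order = np.argsort(row_data, kind="stable")[:n_neighbors]
        knn_indices[row_id] = row_indices[order]
        knn_dists[row_id] = row_data[order]

    return knn_indices, knn_dists
```

Entries that are not stored in the sparse matrix are treated here as "unknown" rather than as zero distance, which is why a row with too few stored entries raises an error instead of being padded.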

@pep8speaks

pep8speaks commented May 11, 2020

Hello @Rocketknight1! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-05-12 17:37:08 UTC

@Rocketknight1
Contributor Author

Ah, sorry, my local line length is a bit longer than PEP8 wants. I'll fix that too when submitting any other changes later.

@lmcinnes
Owner

I think the real answer is that we currently prefer to use black to handle all code formatting. If you install a recent version of black and run it on the files you touched, that should resolve the formatting issues, including the travis-ci failure (which is a check for black formatting).

umap/umap_.py Outdated
Comment on lines 1706 to 1707
row_data = row_data[row_indices != row_id]
row_indices = row_indices[row_indices != row_id]
Owner

Usually the knn includes the point itself, and distances of zero are handled specially. I suspect this doesn't break anything, but I believe it is not necessary either (unless samples potentially have non-zero distances to themselves).
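
To illustrate the point about self-neighbors, here is a small sketch on a toy dense distance matrix (illustrative data, not code from this PR). When the diagonal is zero, sorting each row by distance puts the sample itself first, so the first neighbor column is the point's own index with distance zero.

```python
import numpy as np

# Toy symmetric distance matrix with zeros on the diagonal (made-up values).
distances = np.array(
    [
        [0.0, 1.0, 4.0],
        [1.0, 0.0, 2.0],
        [4.0, 2.0, 0.0],
    ]
)
n_neighbors = 2

# Sort each row by distance and keep the closest n_neighbors columns.
knn_indices = np.argsort(distances, axis=1, kind="stable")[:, :n_neighbors]
knn_dists = np.take_along_axis(distances, knn_indices, axis=1)

# The first neighbor of every sample is the sample itself, at distance 0.
assert np.array_equal(knn_indices[:, 0], np.arange(distances.shape[0]))
assert np.all(knn_dists[:, 0] == 0.0)
```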

Contributor Author

I removed these and replaced them with a check that all the entries on the diagonal are either missing or zero.
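
A check of this kind could look roughly like the sketch below. The function name and error message are assumptions for illustration and may differ from what was merged. It relies on the fact that `diagonal()` on a SciPy sparse matrix returns 0 for entries that are not stored, so missing entries and explicit zeros are both accepted.

```python
import numpy as np
import scipy.sparse


def check_zero_diagonal(distances):
    """Hypothetical validation: diagonal entries must be missing or zero."""
    distances = scipy.sparse.csr_matrix(distances)
    # Unstored diagonal entries come back as 0, so only genuinely
    # non-zero self-distances trigger the error.
    if np.any(distances.diagonal() != 0):
        raise ValueError("Non-zero distances from samples to themselves")


# Diagonal partly stored as an explicit zero, partly missing: accepted.
ok = scipy.sparse.csr_matrix(
    (np.array([0.0, 1.5, 1.5]), (np.array([0, 0, 1]), np.array([0, 1, 0]))),
    shape=(2, 2),
)
check_zero_diagonal(ok)
```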

Contributor Author

Do you want me to explicitly insert zero values on the diagonal, to ensure consistency?

@lmcinnes
Owner

Thanks for this -- it looks good, and is definitely appreciated.

@lmcinnes lmcinnes merged commit 2a706c8 into lmcinnes:master May 12, 2020