New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed randomness in tl.diffmap - compute_eigen (v0 argument) #1858
Conversation
Added the random_state arguments to: - tools/_diffmap.py _diffmap (tl.diffmap function) - tools/_dpt.py _diffmap - neighbors/__init__.py compute_eigen random_state is used to set the random state for generation of the v0 argument for scipy.sparse.linalg.eigsh Otherwise each subsequent function call generates different results.
Codecov Report
@@ Coverage Diff @@
## master #1858 +/- ##
=======================================
Coverage 71.20% 71.21%
=======================================
Files 92 92
Lines 11185 11188 +3
=======================================
+ Hits 7964 7967 +3
Misses 3221 3221
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
I think the way we're supposed to be using random state internally (due to recent numpy updates) looks more like:
random_state = check_random_state(random_state)
v0 = random_state.standard_normal(matrix.shape[0])
However, I recall there being some weirdness about scipy arpack and random state where it was difficult to ever get a truly deterministic result (see #1749, scipy/scipy#10374).
Could add a test to make sure this works? Something checking that case which is currently non-deterministic becomes deterministic after your patch.
Sure, will do! Should I combine this PR with the fix mentioned in issue #313 or would you prefer a separate PR? |
Thanks! Separate PR for that, please! |
Done! I've changed the random vector generation to use the code you suggested and added a test to test_embedding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some minor comments (one formatting, possibly one unnecessary line), but looks great overall.
Could you add a bug-fix release note to docs/release-notes/1.8.0.rst
?
Co-authored-by: Isaac Virshup <ivirshup@gmail.com>
No problem, all done! |
Thanks for the PR! |
Currently tl.diffmap generates slightly different results on consequent runs. This is because the v0 argument is not provided in the scipy.sparse.linalg.eigsh call (inside the compute_eigen function).
To fix this I set the random_state inside the compute_eigen function and generate a random vector to pass into scipy.sparse.linalg.eigsh as v0.
I tried to follow how random_state is implemented in other scanpy functions, I hope this is consistent.
To allow user control of the random_state I added random_state arguments to: