Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallelize Sampling of LDT #744

Merged
merged 31 commits into from May 20, 2021
Merged

Conversation

kareef928
Copy link
Contributor

Reference Issues/PRs

Fixes #520

What does this implement/fix? Briefly explain your changes.

Parallelizes the sampling in the latent distribution test

Any other comments?

Notebook showing runtime comparison: https://nbviewer.jupyter.org/github/kareef928/graspologic-personal/blob/main/notebooks/ldt-parallel-sim.ipynb

@netlify
Copy link

netlify bot commented Apr 1, 2021

Deploy Preview for graspologic failed.

Built with commit bcc4466

https://app.netlify.com/sites/graspologic/deploys/60a5729eba61610007da2e30

@kareef928 kareef928 mentioned this pull request Apr 25, 2021
@asaadeldin11 asaadeldin11 self-assigned this Apr 27, 2021
@asaadeldin11
Copy link
Contributor

@kareef928 if there's randomness associated with the parallelization, add a random_state parameter for reproducibility. See how GraphMatch deals with this for reference. If you have any questions let me know.

@kareef928
Copy link
Contributor Author

@asaadeldin11 something like this right? (I tested the _sample_modified_ase method and got reproducibility from it)

Copy link
Contributor

@asaadeldin11 asaadeldin11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job Kareef. Make these small changes and you'll be good to go.

graspologic/inference/latent_distribution_test.py Outdated Show resolved Hide resolved
@bdpedigo
Copy link
Collaborator

bdpedigo commented May 6, 2021

@asaadeldin11 not sure about the n_jobs change - i do want to standardize (see #755, and https://github.com/microsoft/graspologic/pull/710/files#r580257172) but would have to do for both, so this would be an API-breaking change.

@bdpedigo
Copy link
Collaborator

bdpedigo commented May 6, 2021

just talked to @dwaynepryce, if we do want it to be n_jobs for both then we just have to bump the major version in setup.cfg

@asaadeldin11
Copy link
Contributor

asaadeldin11 commented May 6, 2021

just talked to @dwaynepryce, if we do want it to be n_jobs for both then we just have to bump the major version in setup.cfg

I don't have a preference, we can also just change the parameter name in GraphMatch to workers if that's easier?

@daxpryce
Copy link
Contributor

daxpryce commented May 6, 2021

yep!

@bdpedigo
Copy link
Collaborator

bdpedigo commented May 6, 2021

n_jobs is my preference overall if using joblib just because that is actually what joblib calls it and we can steal their docs

@daxpryce
Copy link
Contributor

daxpryce commented May 6, 2021

conversely I'd prefer workers because I can parallelize leiden and I'm probably going to use threads or workers, not n_jobs

@daxpryce
Copy link
Contributor

daxpryce commented May 6, 2021

So really if I have to choose right now, my preferential order is:

  1. threads
  2. workers
  3. n_jobs

@bdpedigo
Copy link
Collaborator

bdpedigo commented May 6, 2021

in that case, for the purposes of this PR, let's leave as workers, not change anything for LPT, and we can bring this discussion up (one of many) in #755

@bdpedigo
Copy link
Collaborator

@kareef928 for some reason there are conflicts here

@bdpedigo bdpedigo added this to Needs Triage in PRs via automation May 17, 2021
@bdpedigo bdpedigo moved this from Needs Triage to Ready for Review in PRs May 17, 2021
@bdpedigo bdpedigo moved this from Ready for Review to Ready For Merge in PRs May 17, 2021
@bdpedigo bdpedigo merged commit 23d147b into graspologic-org:dev May 20, 2021
PRs automation moved this from Ready For Merge to Done May 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
PRs
Done
Development

Successfully merging this pull request may close these issues.

Consider parallelizing sampling inside of the LatentDistributionTest
4 participants