
Consider parallelizing sampling inside of the LatentDistributionTest #520

Closed
alyakin314 opened this issue Oct 7, 2020 · 5 comments · Fixed by #744
Labels: enhancement (New feature or request), low priority

Comments

@alyakin314 (Contributor)

Expected Behavior

The LDT that uses size correction is a little slower than the uncorrected one. This is expected since it has an extra step, but it may be possible to make it faster. In particular, sampling from multivariate Gaussians (which is probably done via rejection sampling or some kind of transformation, and is likely not very fast) could be sped up by parallelizing here: https://github.com/microsoft/graspologic/blob/bbbc68a24ca9e2097575e7f92f809916ea3eeb46/graspologic/inference/latent_distribution_test.py#L346-L350
However, there are two caveats:

  1. The results of each iteration of this loop are random, and parallelization + randomness should be treated with caution. If using joblib, consider this resource: https://bdpedigo.github.io/posts/2020/02/demo-parallel/ :)
  2. Each separate iteration by itself might be quite fast, so it is not clear whether parallelizing, whether via joblib or multiprocessing, will actually speed up or slow down the overall test. Thus it is imperative to run an experiment verifying that parallelizing yields better performance than not parallelizing. It is also important to ensure that this holds both when running only a single LDT and when running a simulation that requires repeated use of the LDT (because joblib has interesting performance differences depending on whether you call it once or many times).

Of course, this would reuse the existing workers kwarg of the LDT.
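A minimal sketch of what the parallelized loop could look like, addressing caveat 1 by spawning one child seed per iteration from a single `SeedSequence` (the pattern NumPy recommends for parallel RNG). `_sample_statistic` and `bootstrap` are hypothetical stand-ins for the loop body in `latent_distribution_test.py`, not the actual graspologic code:

```python
import numpy as np
from joblib import Parallel, delayed

def _sample_statistic(seed, mean, cov, n):
    # Each iteration gets its own Generator, so draws are independent and
    # reproducible regardless of which worker runs them.
    rng = np.random.default_rng(seed)
    sample = rng.multivariate_normal(mean, cov, size=n)
    return sample.mean(axis=0)  # placeholder for the real test statistic

def bootstrap(mean, cov, n, n_bootstraps, workers=-1, seed=0):
    # One child seed per iteration from a single SeedSequence; Parallel
    # returns results in input order, so the output is deterministic.
    seeds = np.random.SeedSequence(seed).spawn(n_bootstraps)
    return Parallel(n_jobs=workers)(
        delayed(_sample_statistic)(s, mean, cov, n) for s in seeds
    )
```

With this seeding scheme, two runs with the same `seed` give identical results no matter how the iterations are scheduled across workers.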

Actual Behavior

Sampling is not parallelized. :(

@alyakin314 alyakin314 added low priority enhancement New feature or request labels Oct 7, 2020
@bdpedigo bdpedigo added this to Code improvements in Neuro Data Design Oct 14, 2020
@bdpedigo bdpedigo removed this from Code improvements in Neuro Data Design Oct 14, 2020
@bdpedigo bdpedigo added this to Code improvements in Neuro Data Design Oct 16, 2020
@kareef928 (Contributor)

I'd be interested in this issue

@kareef928 (Contributor) commented Nov 11, 2020

DoD:

  • Write LDT using Joblib parallel

  • Run a simulation to compare the speeds of the original LDT and the parallelized LDT across several settings (different n_verts, single vs. multiple iterations of LDT)

1st sim (time vs n_verts for a single iteration of LDT):

  1. n_verts = [10, 100, 1000, 10000, 100000]
  2. Generate two ER graphs - 1st with n_vert vertices and second with 2*n_vert vertices, and p=0.5 for both
  3. Obtain times for running LDT unparallelized (workers=1) and parallelized (workers=-1) on the two graphs
  4. Repeat steps 2-3 for 30 trials per n_vert in n_verts
  5. Plot stripplots of times vs n_verts (of graph 1) for LDT unparallelized and parallelized

Sim 2 - single vs multiple iterations of LDT

  1. n_iters = [1, 5, 25, 50, 100]
  2. Generate two ER graphs - 1st with 250 vertices and second with 500 vertices, and p=0.5 for both
  3. Obtain times for running LDT unparallelized (workers=1) and parallelized (workers=-1) on the two graphs for n_iters iterations
  4. Repeat steps 2-3 for 15 trials per n_iter in n_iters
  5. Plot stripplots of times vs n_iters for LDT unparallelized and parallelized
  • If LDT parallelized is consistently faster than original LDT, then push the parallelized version of LDT
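The trial structure of the two simulations above could be sketched as the harness below. The LDT call is abstracted as `run_ldt`, a hypothetical stand-in for graspologic's `LatentDistributionTest`, so this shows only the timing and trial loop, not the real test:

```python
import time
import numpy as np

def time_trials(run_ldt, n_verts_list, n_trials, workers):
    """Return {n_verts: [elapsed seconds per trial]} for a given worker count."""
    results = {}
    for n in n_verts_list:
        times = []
        for _ in range(n_trials):
            t0 = time.perf_counter()
            run_ldt(n, workers=workers)  # e.g. LDT on ER(n, 0.5) vs ER(2n, 0.5)
            times.append(time.perf_counter() - t0)
        results[n] = times
    return results

# Dummy stand-in so the harness runs without graspologic installed.
def fake_ldt(n, workers=1):
    rng = np.random.default_rng(0)
    rng.multivariate_normal(np.zeros(2), np.eye(2), size=n).mean(axis=0)
```

Running `time_trials` once with `workers=1` and once with `workers=-1` over the same `n_verts_list` gives the paired timings to stripplot.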

@kareef928 (Contributor)

2. It is also important to ensure that this holds both when running only a single LDT and when running a simulation that requires repeated use of the LDT

@alyakin314 could you clarify what you mean by this?

@alyakin314 (Contributor, Author)

  2. It is also important to ensure that this holds both when running only a single LDT and when running a simulation that requires repeated use of the LDT

@alyakin314 could you clarify what you mean by this?

Yes, so what I meant is: because of how hyppo parallelizes things via joblib, if it takes X time to run the test once, running the test 1000 times will take way less than 1000X, because the first iteration (maybe the first several? not sure...) is the slowest due to joblib setting some things up. I would assume @sampan501 would know more.

And what I was saying - it'd be nice to know that the test becomes faster regardless of whether you were to run it a single time or 1000. Does that clarify?

@sampan501 commented Apr 1, 2021

I think this tutorial may help: https://joblib.readthedocs.io/en/latest/parallel.html.

Though I'm not really sure of the runtime comparison between serial Python and a single thread through joblib (I would expect the difference to be negligible), the performance improvements really show when running over multiple workers. You should see real improvement if this line of code is the bottleneck (which I don't think it is).
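The distinction above can be seen in a minimal example: `Parallel(n_jobs=1)` runs in the calling process, effectively a plain loop with negligible overhead, while the first call with multiple workers pays the cost of starting the worker pool, which later calls can amortize. A small sketch:

```python
from joblib import Parallel, delayed

def square(x):
    return x * x

# n_jobs=1 executes sequentially in the calling process, like a plain loop.
sequential = Parallel(n_jobs=1)(delayed(square)(i) for i in range(5))

# n_jobs=2 dispatches to worker processes; the first such call pays the
# pool start-up cost that subsequent calls can reuse.
parallel = Parallel(n_jobs=2)(delayed(square)(i) for i in range(5))
```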
