Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up generation of pixel cluster masks overlay #546

Merged
merged 7 commits into from
May 19, 2022

Conversation

alex-l-kong
Copy link
Contributor

@alex-l-kong alex-l-kong commented May 16, 2022

What is the purpose of this PR?

Closes #535. generate_pixel_cluster_masks in data_utils.py runs very slowly and needs to be optimized.

How did you implement your changes

The main issue is in the way the mask for each FOV is currently indexed. Currently, each FOV image is accessed in its original 2D format. Due to how 2D arrays are stored in memory, this leads to several cache misses that adds a significant bottleneck.

By flattening the array using .ravel and converting the coordinates so we can do 1D indexing, significant time is saved as 1D arrays are stored contiguously in memory. We get about a 3x speedup per FOV.

Remaining issues

If this implementation is still too slow, we should examine a different option. Unfortunately, the per-FOV iteration is a bottleneck we can't eliminate.

@alex-l-kong
Copy link
Contributor Author

@cliu72 would you mind testing this PR too for speed purposes? It may not be completely accurate yet, but I'd like to get a sense of the speed performance on a different machine.

@alex-l-kong alex-l-kong self-assigned this May 17, 2022
@cliu72
Copy link
Contributor

cliu72 commented May 17, 2022

For 12 1024x1024 FOVs:
Old version: 6.987326205000045
This version: 2.889587150999887

@ngreenwald
Copy link
Member

Is that on the cluster or your laptop?

@cliu72
Copy link
Contributor

cliu72 commented May 17, 2022

Google cloud, but baby instance (4 CPU, 32 GB memory)

@ngreenwald
Copy link
Member

Hmm. Currently in my docker, it's taking closer to 30 seconds for each 2048x2048 FOV. Could be some non-linear scaling with FOV size, or just laptop vs real computer issues.

@ngreenwald
Copy link
Member

ngreenwald commented May 17, 2022

I found the problem: the example notebook creates masks for all fovs, not just the pixel_fovs. I thought it was taking forever on the 8 I was analyzing, but it was using all 30 in the directory
image

@alex-l-kong
Copy link
Contributor Author

alex-l-kong commented May 17, 2022

I found the problem: the example notebook creates masks for all fovs, not just the pixel_fovs. I thought it was taking forever on the 8 I was analyzing, but it was using all 30 in the directory image

Ok, I think we should this function to create masks for just the provided FOVs, not all FOVs (initially wanted to support the user being able to visualize a different set of FOVs without needing re-running this cell again). In any case, FOV indexing is faster now too.

@cliu72 @ngreenwald did you run into any correctness issues with the pixel masks generated? Indexing looked fine on my end but just wanted to verify before requesting a review.

@cliu72
Copy link
Contributor

cliu72 commented May 18, 2022

Didn't run into problems in my tests

@alex-l-kong
Copy link
Contributor Author

@ngreenwald from benchmarking, we get a far more significant speedup using 1-D indexing.

Here are the times for each fov using tuple-based indexing on a 2-D array:

Screen Shot 2022-05-18 at 3 15 06 PM

These are the equivalent times for 1-D indexing into a flattened array, then reshaping back into a 2-D array:

Screen Shot 2022-05-18 at 3 15 43 PM

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@alex-l-kong alex-l-kong merged commit 1b78fc1 into master May 19, 2022
@alex-l-kong alex-l-kong deleted the pixel_mask_boost branch May 19, 2022 17:23
@srivarra srivarra added the enhancement New feature or request label Jun 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

generate_pixel_cluster_mask is really slow
4 participants