Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiprocessing for frame encoding in Segmentation constructor #245

Merged
merged 8 commits into from
Sep 15, 2023

Conversation

CPBridge
Copy link
Collaborator

@CPBridge CPBridge commented Jul 27, 2023

Add the ability to use concurrency to encode frames when creating segmentations.

When using encapsulated transfer syntaxes, the process of encoding frames takes a significant amount of the total time for creation of a Segmentation object. For large multiframe segmentations such as those of WSIs, this can take several minutes. Parallelising the encoding of frames is an obvious place for significant performance gains.

In this PR, I add the ability to use multiprocessing for frame encoding using the Python standard library's concurrent.futures module. The user specifies a workers parameter governing the number of worker processes to spawn (with the default remaining no worker threads). Alternatively they may "bring their own" worker pool by providing an instance of any concurrent.futures.Executor. This has two potential benefits. First it allows a worker pool to be re-used in programs where a large number of segmentations needs to be created to save to overhead of setting up the worker pools. Secondly, it gives the user more control over the multiprocessing context if they want it. For example, in principle, they could pass a ThreadPoolExecutor to use multi-threading rather that multiprocessing.

I also made a few other tweaks to the segmentation array processing for further efficiency gains. Most notably, I adjusted the _get_segment_pixel_array method to return the array for a single frame and a single segment (as opposed to a all frames and a single segment). This makes the looping logic a bit simpler and it turns out that this is slightly faster, presumably because it avoid the need to allocate memory for large arrays.

@CPBridge CPBridge requested a review from hackermd July 27, 2023 01:20
@CPBridge CPBridge self-assigned this Jul 27, 2023
@CPBridge CPBridge added the enhancement New feature or request label Jul 27, 2023
Copy link
Collaborator

@hackermd hackermd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! I really like that parallel encoding of frames using multiple workers. It makes me wonder whether we should use multiple workers for decoding of frames as well.

segment_array[plane_index].flatten()
)
# Concatenate the 1D array for encoding at the end
full_frames_list.append(segment_array.flatten())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether a separate variable is needed or whether frames (compressed or not) could just be stored in a list called frames? How the list of frames then needs to be handled (encapsulation or native encoding) is determined by is_enaps.

The variable for the futures could be called frames_futures.

In my opinion, that would make it easier to follow.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it this way primarily to appease mypy, but I guess it probably hurts readability. Implemented 5c5cbb6 as you suggest

@CPBridge CPBridge requested a review from hackermd August 7, 2023 23:53
@CPBridge
Copy link
Collaborator Author

CPBridge commented Aug 7, 2023

Yes I have definitely thought about decoding with multiple workers too, one for the to-do list.

@CPBridge CPBridge merged commit 9289950 into v0.22.0dev Sep 15, 2023
@CPBridge CPBridge deleted the seg_multithreading branch June 27, 2024 22:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants