Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow performance reading SEG with 98 segments #202

Closed
pieper opened this issue Oct 21, 2022 · 3 comments
Closed

Slow performance reading SEG with 98 segments #202

pieper opened this issue Oct 21, 2022 · 3 comments

Comments

@pieper
Copy link
Member

pieper commented Oct 21, 2022

I have this notebook testing the time to run the seg example on a file with ~100 segments. It takes over 8 minutes to load this file and print the simple metadata.

Is this expected? Can we expect any optimization as the library matures? Working with the same data in a research format such as nrrd is orders of magnitude faster.

@hackermd @fedorov

@pieper pieper changed the title Slow performance reading SEG with >100 segments Slow performance reading SEG with 98 segments Oct 21, 2022
@pieper
Copy link
Member Author

pieper commented Oct 21, 2022

@CPBridge
Copy link
Collaborator

Hi @pieper

The iter_segments was an early solution for accessing segmentation pixels, but has largely been replaced by a set of methods on the Segmentation class itself. Unfortunately this has not yet made it into the User Guide yet, though I do have a branch on it. It is fairly well documented in the API docs for the segmentation class however. Iterating through segments is not efficient nor convenient in many situations. Would something like this be more suited to your use case? This runs considerably faster for me (though there may still be room for performance gains):

import pydicom
import highdicom as hd


seg = hd.seg.segread("/tmp/ABD_LYMPH_008_SEG.dcm")
print('read file')

source_uids = seg.get_source_image_uids()
source_sops = [uids[2] for uids in source_uids]

pix = seg.get_pixels_by_source_instance(
    source_sops,
    ignore_spatial_locations=True,
    #segment_numbers=[1,2,3]  # you could choose specific segments if you like
)

print(pix.shape)

There are also various methods to identify which segments you need ahead of calling this method (if this is useful). See the get_segment_numbers method

Not saying we couldn't make iter_segments more efficient, but wondering whether there may now be a better fit for what you are actually trying to achieve.

@pieper
Copy link
Member Author

pieper commented Oct 21, 2022

Thanks for the insights @CPBridge! For context, we are looking at how to optimize the import of SEG in Slicer. We have been using dcmqi, which works very well, but takes about 4 minutes for this dataset and since we wish to support many more segments in the future we wanted to explore highdicom's functionality.

There would be some Slicer programming required, so before digging into that I wanted to see how fast highdicom might be for this use case.

I updated my colab notebook with both approaches and some timings. When I try highdicom on native Slicer your new approach is down to about 20 seconds to get all 98 segments, so it should be a big improvement over what we are currently doing.

Will come back to you as we get more practical experience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants