Slow performance reading SEG with 98 segments #202

pieper · 2022-10-21T18:30:02Z

I have this notebook testing the time to run the seg example on a file with ~100 segments. It takes over 8 minutes to load this file and print the simple metadata.

Is this expected? Can we expect any optimization as the library matures? Working with the same data in a research format such as nrrd is orders of magnitude faster.

@hackermd @fedorov

pieper · 2022-10-21T18:30:53Z

Here is the notebook:

https://colab.research.google.com/drive/1ZLqJwDIO1XKnnjOzClkSq8RIawm3sp9M?usp=sharing

CPBridge · 2022-10-21T20:25:32Z

Hi @pieper

The iter_segments was an early solution for accessing segmentation pixels, but has largely been replaced by a set of methods on the Segmentation class itself. Unfortunately this has not yet made it into the User Guide yet, though I do have a branch on it. It is fairly well documented in the API docs for the segmentation class however. Iterating through segments is not efficient nor convenient in many situations. Would something like this be more suited to your use case? This runs considerably faster for me (though there may still be room for performance gains):

import pydicom
import highdicom as hd


seg = hd.seg.segread("/tmp/ABD_LYMPH_008_SEG.dcm")
print('read file')

source_uids = seg.get_source_image_uids()
source_sops = [uids[2] for uids in source_uids]

pix = seg.get_pixels_by_source_instance(
    source_sops,
    ignore_spatial_locations=True,
    #segment_numbers=[1,2,3]  # you could choose specific segments if you like
)

print(pix.shape)

There are also various methods to identify which segments you need ahead of calling this method (if this is useful). See the get_segment_numbers method

Not saying we couldn't make iter_segments more efficient, but wondering whether there may now be a better fit for what you are actually trying to achieve.

pieper · 2022-10-21T21:27:52Z

Thanks for the insights @CPBridge! For context, we are looking at how to optimize the import of SEG in Slicer. We have been using dcmqi, which works very well, but takes about 4 minutes for this dataset and since we wish to support many more segments in the future we wanted to explore highdicom's functionality.

There would be some Slicer programming required, so before digging into that I wanted to see how fast highdicom might be for this use case.

I updated my colab notebook with both approaches and some timings. When I try highdicom on native Slicer your new approach is down to about 20 seconds to get all 98 segments, so it should be a big improvement over what we are currently doing.

Will come back to you as we get more practical experience.

pieper changed the title ~~Slow performance reading SEG with >100 segments~~ Slow performance reading SEG with 98 segments Oct 21, 2022

pieper closed this as completed Oct 21, 2022

CPBridge mentioned this issue Nov 12, 2022

Seeking ideas for performance improvements with large segmentation files pydicom/pydicom#1728

Closed

CPBridge mentioned this issue Dec 5, 2022

Seg Array Optimization #208

Merged

hackermd mentioned this issue Feb 18, 2023

Parser performance ImagingDataCommons/libdicom#40

Closed

CPBridge mentioned this issue Mar 7, 2023

conversion of DICOM Segmentation object to nifti file #218

Open

CPBridge mentioned this issue May 16, 2023

Improve Segmentation Creation Efficiency #227

Merged

CPBridge mentioned this issue May 28, 2023

Proposal: Defining and Prototyping "Labelmap" Segmentations in DICOM Format NA-MIC/ProjectWeek#643

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slow performance reading SEG with 98 segments #202

Slow performance reading SEG with 98 segments #202

pieper commented Oct 21, 2022

pieper commented Oct 21, 2022

CPBridge commented Oct 21, 2022

pieper commented Oct 21, 2022

Slow performance reading SEG with 98 segments #202

Slow performance reading SEG with 98 segments #202

Comments

pieper commented Oct 21, 2022

pieper commented Oct 21, 2022

CPBridge commented Oct 21, 2022

pieper commented Oct 21, 2022