Slow performance reading SEG with 98 segments #202
Here is the notebook: https://colab.research.google.com/drive/1ZLqJwDIO1XKnnjOzClkSq8RIawm3sp9M?usp=sharing
Hi @pieper, try the following:

```python
import pydicom
import highdicom as hd

seg = hd.seg.segread("/tmp/ABD_LYMPH_008_SEG.dcm")
print('read file')

# Each entry is a (StudyInstanceUID, SeriesInstanceUID, SOPInstanceUID) triple
source_uids = seg.get_source_image_uids()
source_sops = [uids[2] for uids in source_uids]

pix = seg.get_pixels_by_source_instance(
    source_sops,
    ignore_spatial_locations=True,
    # segment_numbers=[1, 2, 3],  # you could choose specific segments if you like
)
print(pix.shape)
```

There are also various methods to identify which segments you need ahead of calling this method (if this is useful); see the `get_segment_numbers` method. Not saying we couldn't make `iter_segments` more efficient, but wondering whether there may now be a better fit for what you are actually trying to achieve.
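For illustration, the indexing step above can be shown standalone: it pulls the SOP Instance UID (index 2) out of each triple returned by `get_source_image_uids()`. The UID values below are made up for the sketch:

```python
# Hypothetical UID triples, shaped like the output of get_source_image_uids():
# (StudyInstanceUID, SeriesInstanceUID, SOPInstanceUID)
uid_triples = [
    ("1.2.3", "1.2.3.4", "1.2.3.4.100"),
    ("1.2.3", "1.2.3.4", "1.2.3.4.101"),
    ("1.2.3", "1.2.3.4", "1.2.3.4.102"),
]

# Keep only the SOP Instance UID of each source image
source_sops = [uids[2] for uids in uid_triples]
print(source_sops)  # → ['1.2.3.4.100', '1.2.3.4.101', '1.2.3.4.102']
```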
Thanks for the insights @CPBridge! For context, we are looking at how to optimize the import of SEG in Slicer. We have been using dcmqi, which works very well, but it takes about 4 minutes for this dataset, and since we wish to support many more segments in the future we wanted to explore highdicom's functionality. There would be some Slicer programming required, so before digging into that I wanted to see how fast highdicom might be for this use case. I updated my Colab notebook with both approaches and some timings. When I try your new highdicom approach in native Slicer, it is down to about 20 seconds to get all 98 segments, so it should be a big improvement over what we are currently doing. I will come back to you as we get more practical experience.
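A minimal timing harness for comparing the two approaches side by side (the `timed` helper is our own sketch, not part of highdicom, Slicer, or dcmqi; a stand-in workload replaces the actual calls here):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in workload; in the notebook this would wrap segread() /
# get_pixels_by_source_instance() on one side and the dcmqi call on the other.
result, elapsed = timed(sum, range(1_000_000))
print(f"{elapsed:.3f} s")
```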
I have this notebook testing the time to run the seg example on a file with ~100 segments. It takes over 8 minutes to load this file and print the simple metadata.
Is this expected? Can we expect any optimization as the library matures? Working with the same data in a research format such as NRRD is orders of magnitude faster.
@hackermd @fedorov