Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document NumPyDataLoader for reading .npy and .npz files #6733

Closed
wants to merge 1 commit into from

Conversation

pieper
pieper previously requested changes Dec 12, 2022
Copy link
Member

@pieper pieper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks really handy.

Modules/Scripted/NumpyDataLoader/NumpyDataLoader.py Outdated Show resolved Hide resolved
Modules/Scripted/NumpyDataLoader/NumpyDataLoader.py Outdated Show resolved Hide resolved
Modules/Scripted/NumpyDataLoader/NumpyDataLoader.py Outdated Show resolved Hide resolved
Modules/Scripted/NumpyDataLoader/NumpyDataLoader.py Outdated Show resolved Hide resolved
Copy link
Member

@jcfr jcfr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution 🙏

After addressing a few nitpick and adding a test, this should be good for integration.

Additional, the following documentation should be updated.

See https://github.com/Slicer/Slicer/blob/main/Docs/user_guide/data_loading_and_saving.md#images

Modules/Scripted/NumpyDataLoader/NumpyDataLoader.py Outdated Show resolved Hide resolved
Modules/Scripted/NumpyDataLoader/NumpyDataLoader.py Outdated Show resolved Hide resolved
Modules/Scripted/NumpyDataLoader/NumpyDataLoader.py Outdated Show resolved Hide resolved
Modules/Scripted/NumpyDataLoader/NumpyDataLoader.py Outdated Show resolved Hide resolved
Modules/Scripted/NumpyDataLoader/NumpyDataLoader.py Outdated Show resolved Hide resolved
Modules/Scripted/NumpyDataLoader/NumpyDataLoader.py Outdated Show resolved Hide resolved
@dzenanz dzenanz changed the title Add NumpyDataLoader for reading .npy/.npz files Add NumpyDataLoader for reading .npy files Dec 13, 2022
@dzenanz dzenanz force-pushed the npyLoader branch 2 times, most recently from 402e429 to 9e45525 Compare December 13, 2022 23:02
@dzenanz dzenanz marked this pull request as ready for review December 13, 2022 23:10
@dzenanz dzenanz requested review from pieper and jcfr December 13, 2022 23:10
@dzenanz
Copy link
Member Author

dzenanz commented Dec 13, 2022

Do you know how to auto-select the just loaded file in user interface? That does not happen for .npy files, like it happens for .png and .nrrd files.

@pieper
Copy link
Member

pieper commented Dec 13, 2022

how to auto-select the just loaded file in user interface?

Probably easiest to use this utility to show the images in the background:
https://github.com/Slicer/Slicer/blob/main/Base/Python/slicer/util.py#L514

@lassoan
Copy link
Contributor

lassoan commented Dec 14, 2022

how to auto-select the just loaded file in user interface?

Probably easiest to use this utility to show the images in the background: https://github.com/Slicer/Slicer/blob/main/Base/Python/slicer/util.py#L514

setSliceViewerLayers would work, but in image readars we usually use applogic.PropagateVolumeSelection, so for consistency it may be better to use it here, too:

appLogic = slicer.app.applicationLogic()
selNode = appLogic.GetSelectionNode()
selNode.SetActiveVolumeID(volumeNode.GetID())
appLogic.PropagateVolumeSelection()

@lassoan
Copy link
Contributor

lassoan commented Dec 14, 2022

I started to add some more comments, but then I rather made the changes myself:

  • add support for both .npy and .npz format
  • add support for arrays up to 5D (3D spatial + component + time sequence)
  • report loading errors to the user

@dzenanz
Copy link
Member Author

dzenanz commented Dec 14, 2022

Thanks for the improvements @lassoan. You should have added yourself as a module contributor.

In PyTorch, the usual conventions for dimensions are: NCHW for 4D tensors/arrays and NCDHW for 5D. I assume that channel-last convention was taken from ITK, or maybe VTK? How many people rely on this? As this is a new addition, we could only use PyTorch's convention in this reader, if we don't want to change this convention in the rest of Slicer.

@lassoan
Copy link
Contributor

lassoan commented Dec 14, 2022

In PyTorch, the usual conventions for dimensions are: NCHW for 4D tensors/arrays and NCDHW for 5D. I assume that channel-last convention was taken from ITK, or maybe VTK? How many people rely on this? As this is a new addition, we could only use PyTorch's convention in this reader, if we don't want to change this convention in the rest of Slicer.

Yes, this is the ITK/VTK memory layout. It is easy enough to reorder axes before writing to file, so documenting the convention is probably enough (as it is already described in the updated User guide page). If it turns out to be an issue then we could add axis order as loading option (and an options widget for choosing this order in the "Add data" window).

@dzenanz
Copy link
Member Author

dzenanz commented Dec 14, 2022

My principal motivation for this PR is desire to easily and conveniently visualize dumps of NCHW 4D tensors. Having an option to load them as "channel last/fastest" would be nice, but I really want PyTorch's convention to be the default.

@dzenanz
Copy link
Member Author

dzenanz commented Dec 14, 2022

Even when reordering the axis for channel to be last, with the updated reader, I get an all-black image of "4 components". Visualizing something (vector norm, component average, or even just the first component) would be more useful than an all-black image.

@dzenanz
Copy link
Member Author

dzenanz commented Dec 14, 2022

Also, we could employ heuristic: the dimensions with largest size could be considered to be spatial. So (10, 4, 368, 640) would be interpreted as 4D image with 4 timepoints and 3D size of 640x368x10 (ijk). Prefer timepoints over components unless size is 3, which can be easily visualized as RGB.

@dzenanz
Copy link
Member Author

dzenanz commented Dec 14, 2022

Of course, all of these complications stem from .npy not having suitable metadata.

@lassoan
Copy link
Contributor

lassoan commented Dec 14, 2022

Having we could have auto, NCDHW, NKJIC axis options. Auto could implement the heuristics you described.

My only concern is that if sooner or later users will report that the loader randomly breaks. We experienced this with ITK, as ITK vector image reader has such heuristic, too. We were shocked when after being annoyed by seemingly random load failures for a long time we realized that loading fails for all data sets that have 3 or 4 time points simply because it is interpreted as channel data instead of time sequence.

Not much better, but maybe a bit more predictable heuristic could be to decide based on (composite) file extension. For example, something.torch.npy could make NCDHW order the default.

Why don't you save the result using pynrrd? You could use nrrd.write and specify the axis kinds like this:

nrrd.write("test.nrrd", a, {'kinds': ['RGB-color', 'domain', 'domain']})

How do you load the image into Slicer? Drag-and-drop is pretty tedious and it also shows the loaded image by default (there is no way to update an existing image). You could solve both tedious drag-and-drop and setting axes by using slicerio:

np.save("path/to/myfile.npy", someTensor),
nodeID = slicerio.server.file_load("path/to/myfile.npy", "NumpyArrayFile", {'axes': 'NCDHW'})

slicerio.server.file_load loads a data set in Slicer (if Slicer is not running then it launches it). If you load the node from a standard file format then you can reload it by simply calling slicerio.server.node_reload(id=nodeID).

@pieper
Copy link
Member

pieper commented Dec 14, 2022

+1 for being more explicit either in the naming convention or in giving the user the ability to specify the load pattern. I agree that heuristics can be painful.

@pieper pieper closed this Dec 14, 2022
@jcfr jcfr reopened this Dec 14, 2022
@jcfr
Copy link
Member

jcfr commented Dec 14, 2022

Assuming the pull-request was unintentionally closed, I am re-opening it so that we further improve and reach consensus.

@pieper
Copy link
Member

pieper commented Dec 14, 2022

Assuming the pull-request was unintentionally closed, I am re-opening it so that we further improve and reach consensus.

Yes - sorry, wrong button!

@dzenanz
Copy link
Member Author

dzenanz commented Dec 15, 2022

@lassoan It would be good if you could take over this PR, or start a new PR. Implementing auto, TCDHW, TKJIC (T standing for timepoint) axis options would be great. I think you are more familiar with Slicer's internals and UI, so adding this would be a lot easier for you than me.

@lassoan
Copy link
Contributor

lassoan commented Jul 12, 2023

As it could take some time for this reader to get more mature, it would make sense to put it in the Sandbox extension, where it could be updates more easily and more frequently.

@jcfr
Copy link
Member

jcfr commented Jul 12, 2023

This would be consistent with the existing modules available in SlicerSandbox:

Then, this PR would only update the following files:

Base/Python/slicer/util.py
Docs/developer_guide/script_repository/sequences.md
Docs/user_guide/data_loading_and_saving.md

@dzenanz
Copy link
Member Author

dzenanz commented Jul 12, 2023

put it in the Sandbox extension

Agreed. @lassoan or @jcfr, could one of you do it?

@jcfr
Copy link
Member

jcfr commented Jul 12, 2023

I will take care of it.

This commit NumpyDataLoader for reading .npy and .npz files

The file may contain an array of dimension between 1 to 5 (axes: time, K, J, I, component).
User is responsible for setting the correct IJK to RAS matrix.

As suggested in forum discussion
https://discourse.slicer.org/t/import-numpy-file-in-as-image-volume/3653/5

Based on:
https://github.com/pieper/SlicerDMRI/blob/9565f89d16cd72618cd87f1c9046542450e4937b/Modules/Scripted/NIfTIFile/NIfTIFile.py
https://github.com/Slicer/Slicer/blob/9390be6bf76e048e6fd384d8bfd607eba2857b5a/Applications/SlicerApp/Testing/Python/SlicerScriptedFileReaderWriterTest.py

Co-authored-by: Andras Lasso <lasso@queensu.ca>
@jcfr
Copy link
Member

jcfr commented Aug 1, 2023

This pull request should be integrated after the following one have been reviewed and integrated:

@jcfr jcfr changed the title Add NumpyDataLoader for reading .npy files Add NumPyDataLoader for reading .npy files Aug 1, 2023
@jcfr jcfr changed the title Add NumPyDataLoader for reading .npy files Document NumPyDataLoader for reading .npy and .npz files Aug 1, 2023
@jcfr jcfr added the Type: Documentation Issues regarding documentation label Nov 11, 2023
@dzenanz
Copy link
Member Author

dzenanz commented May 30, 2024

Should this PR be merged or closed?

@lassoan
Copy link
Contributor

lassoan commented May 30, 2024

The reader is avaialable in the Sandbox extension (PerkLab/SlicerSandbox#19), so indeed we can close this pull request.

@lassoan lassoan closed this May 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Documentation Issues regarding documentation
Development

Successfully merging this pull request may close these issues.

None yet

4 participants