Add some deterministic transforms #353
Conversation
Hi Fernando, I like the deterministic transform idea; it makes things clearer and more explicit. Just to be sure you also have it in mind (sorry if I repeat myself here...): about reproducibility, it is important to be able to rely on a saved text file (or another format). I think it should be made explicit in the example script. For instance, instead of

```python
transformed = transform(colin)
h = transformed.history[0]
same_transform = transformed.get_composed_history()
```

I prefer the first version you proposed there (#299 (comment)):

```python
transform_name, arguments = transformed.history[0]
same_transform = getattr(tio, transform_name)(**arguments)  # TODO: nice wrapper for this
```

`arguments` being, I guess, a dict that can be saved to and read from disk if needed. |
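A minimal sketch of the save-to-disk round trip described above, assuming the history entries are `(name, kwargs)` pairs as in the snippet (the file name is arbitrary and this is not the PR's final API):

```python
# Hypothetical illustration: persist one history entry as JSON and rebuild
# the transform in an independent Python session.
import json

import torchio as tio

transform_name, arguments = transformed.history[0]
with open('history.json', 'w') as f:
    json.dump([transform_name, arguments], f)

# Later, in a fresh process:
with open('history.json') as f:
    transform_name, arguments = json.load(f)
# Note: JSON turns tuples into lists, which most transforms accept
same_transform = getattr(tio, transform_name)(**arguments)
```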
Hi, @romainVala. First I want to say I really appreciate your feedback on this. It's very helpful to know about how you use these features. I also want to apologise for undoing some of the work by @GReguig. His PR was very helpful and, although I'm removing some of the code, I'm building on some of his ideas and other code.
It is indeed going to be quite an important change. I'll bump the version to
Thanks, I think so too. I think this separates nicely the random sampling and the functional operations.
I'll try to post some examples and explanations here.
As I said, this type of feedback is very useful. I'll take this into account and bring back some of the
I have turned the random transforms into deterministic ones parametrized by the sampled values. For example:

```python
In [1]: import torchio as tio
   ...: colin = tio.datasets.Colin27()
   ...: tr = tio.RandomAffine()
   ...: c2 = tr(colin)

In [3]: c2.applied_transforms
Out[3]:
[('Affine',
  {'scales': (1.0360139608383179, 1.0092111825942993, 0.9428247213363647),
   'degrees': (5.846307277679443, -7.355504989624023, -6.5344929695129395),
   'translation': (0.0, 0.0, 0.0),
   'center': 'image',
   'default_pad_value': 'otsu',
   'image_interpolation': 'linear'})]

In [4]: c2.history
Out[4]: [<torchio.transforms.augmentation.spatial.random_affine.Affine at 0x7fcb37d149d0>]

In [5]: c2.get_composed_history()
Out[5]: <torchio.transforms.augmentation.composition.Compose at 0x7fcb37873730>
```

Looking forward to your thoughts! |
Now that I think of it: is it really worth the effort to generate a human-readable, e.g. JSON or YAML, version of the transform parameters? Can't the history just be serialized and saved using Python's `pickle`? |
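For instance, a sketch with the standard library (assuming the `Compose` returned by `get_composed_history()` is picklable):

```python
import pickle

# Serialize the composed history of the transformed subject c2 (see above)
with open('/tmp/history.pkl', 'wb') as f:
    pickle.dump(c2.get_composed_history(), f)

# Reload it in an independent session; it can then be reapplied directly
with open('/tmp/history.pkl', 'rb') as f:
    same_transform = pickle.load(f)
```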
No worries about the changes. I do not have the "Pythonic" expertise you do, and this is why I appreciate working on this project, where you make the effort to produce a very good open library (documentation, tests, etc.). My concern was just about the final functionality (reproducing a transform in an independent Python execution). So now we have

```python
c2 = tr(colin)
same_transform = c2.get_composed_history()
```

which needs the transformed subject Python object to be "alive". I propose to also make sure we can get the same transform with the attribute strategy, something like this:

```python
parameters = c2.applied_transforms
transform_name = ...
same_transform = getattr(tio, transform_name)(**parameters)
```

Maybe it is already just fine, because looking at the new code (for instance RandomAffine.apply_transform) it is already the case, since RandomAffine builds an Affine with the parameters as input. If the plan is to do it for all transforms (also the preprocessing ones), I think I am 100% happy... |
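Filling in the ellipsis above with a hedged sketch (assuming `applied_transforms` is a list of `(name, kwargs)` pairs, as in the IPython session earlier in the thread):

```python
import torchio as tio

# Rebuild each applied transform by name and compose them in order
reproduced = tio.Compose([
    getattr(tio, name)(**kwargs)
    for name, kwargs in c2.applied_transforms
])
same = reproduced(colin)
```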
Yes, this is the plan. Will take some work, but at least it will make you 100% happy 😄 Could you please let me know how you're saving the history? I suppose at the moment it gets lost when the data loader collates the samples to create a batch... |
JSON and co. may not be the right choice. |
OK, this is good. Remember you can do this to save your transform:

```python
In [3]: import torch

In [4]: torch.save(c2.get_composed_history(), '/tmp/c2_history.pth')

In [5]: c2_history = torch.load('/tmp/c2_history.pth')

In [6]: c2_history
Out[6]:
Compose(
    <torchio.transforms.augmentation.spatial.random_affine.Affine object at 0x7f4b1a7ce1c0>
)

In [7]: same = c2_history(colin)
```
|
Why should it be lost? Actually it is currently working fine:

```python
# loader is the PyTorch DataLoader (where data within subjects are concatenated in a batch)
for i, subject in enumerate(loader, 1):
    ...
```

We get the same concatenation for the history if I do ..., and the same works identically with the queue (nice, no?). |
Hmm. Aren't you using a custom collate function? |
Oops, you are right, I forgot this point. But yes, Fabien modified the PyTorch collate function so that it also includes the history. |
Using `collate_fn=lambda x: x` works, but you then lose the nice concatenation done on the tensors, so we prefer to modify the original collate function. The modification was to add:

```python
elif isinstance(elem, container_abcs.Mapping):
    # The only change to the original function is here:
    # if elem has the attribute 'history', then a key 'history' is added to the
    # batch, whose value is the list of the histories of the elements of the batch
    dictionary = {key: history_collate([d[key] for d in batch]) for key in elem}
    if hasattr(elem, 'history'):
        dictionary.update({
            'history': [d.history for d in batch]
        })
    return dictionary
```

If someone is interested, it is here: https://github.com/romainVala/torchQC/blob/7e509098e433aad2e5d32bcb40cb502d54f9e5df/segmentation/collate_functions.py#L13 |
Great, you are doing it fast! I wonder if one should handle all calls to random the same way as you do it for RandomNoise (i.e., adding a seed as a transform parameter). |
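For context, an illustration (not from the thread): seeding torch's global RNG already reproduces the sampled parameters, which is what a per-transform seed argument would make more granular. This sketch assumes all the randomness goes through torch's global generator:

```python
import torch
import torchio as tio

colin = tio.datasets.Colin27()
torch.manual_seed(42)
a = tio.RandomNoise()(colin)
torch.manual_seed(42)
b = tio.RandomNoise()(colin)
# Same seed, same sampled mean/std, same noise tensor
assert torch.equal(a.t1.data, b.t1.data)
```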
I might have to slow down soon...
The number of bias field coefficients with default parameters is 20.
I think the point of having these new transforms is that you know exactly what's going on, given certain parameters. Also, the typical usage won't be to call them directly, I'd say. And if one does, they should have the responsibility of correctly formatting the input arguments.
|
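To illustrate that last point with the Affine entry recorded earlier in the thread (parameter values rounded here for brevity): calling a deterministic transform directly means formatting the arguments yourself.

```python
import torchio as tio

# Explicitly parametrized Affine, mirroring the applied_transforms entry above
affine = tio.Affine(
    scales=(1.036, 1.009, 0.943),
    degrees=(5.85, -7.36, -6.53),
    translation=(0.0, 0.0, 0.0),
    center='image',
    default_pad_value='otsu',
    image_interpolation='linear',
)
transformed = affine(tio.datasets.Colin27())
```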
Thanks! I wonder if there's a way to get something like this without copying the original code. I tried putting |
This seems to work:

```python
import torchio as tio
from torch.utils.data._utils.collate import default_collate

def history_collate(batch):
    elem = batch[0]
    if isinstance(elem, tio.Subject):
        dictionary = {key: default_collate([d[key] for d in batch]) for key in elem}
        if hasattr(elem, 'applied_transforms'):
            dictionary.update({
                'applied_transforms': [d.applied_transforms for d in batch]
            })
    else:  # fallback for non-Subject elements (added for completeness)
        dictionary = default_collate(batch)
    return dictionary
```
|
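A usage sketch wiring the function above into a standard DataLoader (assuming a SubjectsDataset built from a list of subjects):

```python
import torchio as tio
from torch.utils.data import DataLoader

subjects = 2 * [tio.datasets.Colin27()]
dataset = tio.SubjectsDataset(subjects, transform=tio.RandomAffine())
loader = DataLoader(dataset, batch_size=2, collate_fn=history_collate)

for batch in loader:
    # One list of (name, kwargs) entries per subject in the batch
    print(batch['applied_transforms'])
```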
Hi Fernando, a small comment about the ... More generally, I wonder which transforms cannot be inverted. So: the transforms I am sure cannot be inverted, and the transforms that could be inverted (although of course you will lose a lot of information):
- preprocessing.intensity (not sure; does the strikethrough of a word mean it will not be implemented?)
|
What for? I have to think about it (and I hope we will find some new ones). For now, the use I see is quite anecdotal: to estimate the information loss (or the difference) induced by applying a transform and its inverse. |
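As a concrete (hypothetical) version of that experiment, a round trip through a resampling:

```python
import torchio as tio

colin = tio.datasets.Colin27()
round_trip = tio.Resample(1)(tio.Resample(2)(colin))  # 1 mm -> 2 mm -> 1 mm
# Shapes can differ by a voxel after resampling, so match them before comparing
restored = tio.CropOrPad(colin.t1.spatial_shape)(round_trip)
diff = (restored.t1.data.float() - colin.t1.data.float()).abs().mean()
print(f'Mean absolute round-trip difference: {diff.item():.2f}')
```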
Many thanks for reviewing, Romain!
The method is inherited from Transform:

```python
def is_invertible(self):
    return hasattr(self, 'invert_transform')
```
Agree.
Yeah I need to give these a thought. I suppose it would mean adding an additional kwarg to the transforms, such as "original affine" or so. I'm not too concerned about the reorientation, but inverting a resampling would be useful after e.g. a RandomDownsample, which was my original idea when I created the transform (of course, Benjamin Billot confirmed this with his paper!).
These two call other transforms that will be added to the history, so no extra work needs to be done.
I'm not sure it's worth it to work on inverting these. I think it's fine to have some transforms that are invertible and some that are not. Unless we can come up with a good use case to invert them. I think that inverting e.g. RandomMotion is going to be painful and doesn't feel like it would help much.
My feeling about this is a mix of my two comments above: 1) would need to change the design and 2) not sure it's worth the effort. |
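As a simple concrete case (an illustration, not the PR's code): Pad and Crop are natural inverses, which is presumably why their inverse methods are straightforward to add (see the commit list further down).

```python
import torchio as tio

colin = tio.datasets.Colin27()
padded = tio.Pad(8)(colin)      # pad 8 voxels on each border
restored = tio.Crop(8)(padded)  # cropping the same margin undoes the padding
assert restored.spatial_shape == colin.spatial_shape
```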
I'm not sure I understand. If only the label needs to be inverted (I assume you mean the result of a segmentation inference), the other images in the subject could be deleted, no? |
OK, I agree. The "what for" question was just a general one, to look for arguments; again, what I expected you would answer:
Learning with synthetic data, as proposed by Billot, opens so many new perspectives that I have the feeling it is worth the effort (as you point out for resampling); we just do not see it today... About the change in design: yes, but not that much, and since you are changing a lot here, I thought it might be the right time. But OK, let's be pragmatic, we can address it later. About inverting motion: I hope I'll find time to test it, but I think it may help for learning a motion-correction method (at least my motion implementation makes it easy to invert, so I can already test it...). |
Maybe I'm failing to understand a use case, probably because I'm strongly biased towards 1) brain, 2) MRI, 3) segmentation. But as you say, let's be pragmatic. I'd rather push for an "incomplete" feature than invest a lot of time making the number of invertible transforms in the library as large as possible. We can always add the functionality later on! This PR will provide the infrastructure to do that easily. |
Codecov Report
```diff
@@            Coverage Diff             @@
##           master     #353      +/-   ##
==========================================
+ Coverage   93.29%   93.57%   +0.28%
==========================================
  Files         113      114       +1
  Lines        5441     5744     +303
==========================================
+ Hits         5076     5375     +299
- Misses        365      369       +4
```
Continue to review full report at Codecov.
|
Commits:
- Add Noise transform
- Add only deterministic transforms to history
- Remove compose_from_history
- Fix old import
- Reverse order of transforms when inverting
- Refactor Transform and use str for interpolation
- Add BiasField transform
- Add custom collate function
- Move get_arguments to Transform
- Add Blur transform
- Update Crop and Pad
- Add undo transform methods
- Fix reversed transforms list
- Add inverse methods for Crop and Pad
- Add some typing hints
- Use minimum for default pad value
- Update Resample
- Update ToCanonical
- Add repr() for deterministic transforms
- Update RescaleIntensity
- Update normalization transforms
- Add deterministic Gamma
- Remove seed kwarg from docstrings
- Rename undo methods
- Add deterministic Swap
- Use Sequence for input typing hint
- Add deterministic Ghosting
- Fix some tests
- Fix affine tests
- Fix undo transform
- Raise error if input transform is not callable
- Multiple changes
- Allow no args in Resample
- Add deterministic LabelsToImage
- Fix some errors
- Parse free form deformation
- Remove test
- Move method to function
- Fix tests
- Skip reproducibility tests
- Add deterministic Motion
- Add deterministic Spike
- Replace RandomDownsample with RandomAnisotropy
- Edit reproducibility test
- Add test for max_displacement is zero
- Remove old test
- Add invertibility test
- Fix all tests
- Refactor docs
- Add automatic plots in documentation
- Remove :py from docs
- Improve documentation
- Improve 2D support for RandomAnisotropy
- Improve coverage of CLI interface
- Add some tests
- Apply anisotropy to scalars only
- Use import torchio as tio in some docstrings
- Fix docstring
Congratulations! |
Haha thanks Romain! And as usual, thanks a lot for your feedback! I hope this PR is useful for you. |
Description
This is a refactoring of the reproducibility system of the library, plus adding new features to invert transforms easily.
Linked issues
Resolves #191.
Resolves #142.
Resolves #208.
Resolves #299.
Resolves #336.
Resolves #355.
Further issues to take into account
- Saving transforms history
- RandomNoise (add seed to parameters?)
- Add a history to the Image class, and how to handle that without code duplication?
- inverse() method
- Compose and OneOf separately

Progress
- New transforms: Downsample
- To update: CropOrPad
- Write inverse() method
- New usage
Checklist
- I have read the CONTRIBUTING docs and have a developer setup (especially important are pre-commit and pytest)
- Tests pass locally with pytest
- Documentation builds with make html inside the docs/ folder