Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve random transforms reproducibility #226

Closed
wants to merge 1 commit into from

Conversation

fepegar
Copy link
Owner

@fepegar fepegar commented Jul 11, 2020

Resolves #191, #208.

I will also add docs on reproducibility, and maybe a notebook.

@codecov
Copy link

codecov bot commented Jul 11, 2020

Codecov Report

Merging #226 into master will increase coverage by 0.11%.
The diff coverage is 99.27%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #226      +/-   ##
==========================================
+ Coverage   91.29%   91.41%   +0.11%     
==========================================
  Files          88       89       +1     
  Lines        3700     3740      +40     
==========================================
+ Hits         3378     3419      +41     
+ Misses        322      321       -1     
Impacted Files Coverage Δ
torchio/cli.py 35.13% <ø> (+1.80%) ⬆️
torchio/data/queue.py 96.05% <ø> (ø)
tests/utils.py 98.30% <83.33%> (-1.70%) ⬇️
...ms/augmentation/test_random_elastic_deformation.py 100.00% <100.00%> (ø)
tests/transforms/preprocessing/test_crop_pad.py 100.00% <100.00%> (ø)
tests/transforms/preprocessing/test_resample.py 100.00% <100.00%> (ø)
tests/transforms/test_reproducibility.py 100.00% <100.00%> (ø)
torchio/data/subject.py 95.18% <100.00%> (+2.03%) ⬆️
...sforms/augmentation/intensity/random_bias_field.py 100.00% <100.00%> (ø)
...o/transforms/augmentation/intensity/random_blur.py 93.54% <100.00%> (-0.57%) ⬇️
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ac86f80...169c235. Read the comment docs.

@fepegar fepegar force-pushed the 208-transforms-reproducibility branch from f3529c8 to 515420b Compare July 13, 2020 10:21
@romainVala
Copy link
Contributor

romainVala commented Jul 15, 2020

Hi
Nice PR, as most of the change is to remove code from the transform, since now save and get history is done in the subject class

I am still not sure, that to do not have the possibility to add 'normal parameter" in the history, (as it was before) is a good idea,

Let consider random Elastic where you have to specify the num_control_point, having only the seed saved in the history, will allow to reproduce the same transform, if and only if you put the same num_control_point
Why not saving it to the history ?

Other drawback of this changes, is for simple transform without random choise (like rescale or normalization), I would like to add the chosen parameter in the history too,

but with those changes, you can now only log seed ... non ?

@fepegar
Copy link
Owner Author

fepegar commented Jul 21, 2020

The problem with this implementation is that intensity transforms use different parameters for the different images in the Subject. Either we use the same for all, or we use a more complex transforms history...

Let consider random Elastic where you have to specify the num_control_point, having only the seed saved in the history, will allow to reproduce the same transform, if and only if you put the same num_control_point
Why not saving it to the history ?

Because the user has control over what parameters they passed to the transform.

@romainVala
Copy link
Contributor

romainVala commented Jul 21, 2020

Because the user has control over what parameters they passed to the transform.

yes but it makes the code more difficult (with hard coded parameter valuer), I really thing it will be more efficient to read it directly from the history (given the fact it is so simple to add !)

applying transform and reading parameters, should be dissociate

I can think at an example where you do not know : let say you want to compare the influence of num_control_point, and you define OneOf transform with 2 randomElastic call each of them with different num_control_point

@fepegar fepegar mentioned this pull request Jul 21, 2020
@romainVala
Copy link
Contributor

sorry to insist, but I really like to convince you on this point

Once you performed multiple experiment, with different transform, it is convenient to just have a look to save history in order to know what has been applied
I loose this explicit reading if we now store only the seed, but at least I can make a function that will convert the stored seed in real parameters, but for the completeness I also need the "fixed parameters"

It is also a way to check that every thing was done as planed (which is important too)

The objective is to be able to recreate a given transformed volume (because the model is very bad for this volume) why should I need to remember which num_control_point I set (or which percentiles I set for RescaleIntensity) if those parameters can be stored in the history I have an autonomous mechanism to reproduce the given transform without prior knowledge
so I can write a generic function to reproduce any given transform (from its saved history)

don't you think it is much better so ?

If there are some drawback to store all parameters, let's make it optional, but I think it is a great feature, to make this full parameter recording possible

@romainVala
Copy link
Contributor

The problem with this implementation is that intensity transforms use different parameters for the different images in the Subject. Either we use the same for all, or we use a more complex transforms history...

given what has been discuss here #238 we will need a more complex transforms history ...
may be instead of having a list containing
[ <transform_name> , transform_dict_param]
we will have
[ <transform_name> , [<image_key>, transform_dict_param] ]
(and if the same transform is apply to all we will take the previous one)

Add seed to dict only if transformed Subject

Add method to generate seed

Add reproducibility tests

Update tests

Fix docstring

Update CLI tool

Add get_transform function

Use JSON to store transforms history

Update transforms

Rename function to get transform class
@fepegar fepegar force-pushed the 208-transforms-reproducibility branch from 515420b to 169c235 Compare July 21, 2020 15:20
@fepegar fepegar closed this Nov 16, 2020
@fepegar fepegar deleted the 208-transforms-reproducibility branch October 26, 2021 17:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Seeding in Transform, make the same transform all the time
2 participants