Improve random transforms reproducibility #226

fepegar · 2020-07-11T14:44:31Z

Resolves #191, #208.

I will also add docs on reproducibility, and maybe a notebook.

codecov · 2020-07-11T15:07:18Z

Codecov Report

Merging #226 into master will increase coverage by 0.11%.
The diff coverage is 99.27%.

@@            Coverage Diff             @@
##           master     #226      +/-   ##
==========================================
+ Coverage   91.29%   91.41%   +0.11%     
==========================================
  Files          88       89       +1     
  Lines        3700     3740      +40     
==========================================
+ Hits         3378     3419      +41     
+ Misses        322      321       -1

Impacted Files	Coverage Δ
torchio/cli.py	`35.13% <ø> (+1.80%)`	⬆️
torchio/data/queue.py	`96.05% <ø> (ø)`
tests/utils.py	`98.30% <83.33%> (-1.70%)`	⬇️
...ms/augmentation/test_random_elastic_deformation.py	`100.00% <100.00%> (ø)`
tests/transforms/preprocessing/test_crop_pad.py	`100.00% <100.00%> (ø)`
tests/transforms/preprocessing/test_resample.py	`100.00% <100.00%> (ø)`
tests/transforms/test_reproducibility.py	`100.00% <100.00%> (ø)`
torchio/data/subject.py	`95.18% <100.00%> (+2.03%)`	⬆️
...sforms/augmentation/intensity/random_bias_field.py	`100.00% <100.00%> (ø)`
...o/transforms/augmentation/intensity/random_blur.py	`93.54% <100.00%> (-0.57%)`	⬇️
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ac86f80...169c235. Read the comment docs.

romainVala · 2020-07-15T13:34:45Z

Hi
Nice PR, as most of the change is to remove code from the transform, since now save and get history is done in the subject class

I am still not sure, that to do not have the possibility to add 'normal parameter" in the history, (as it was before) is a good idea,

Let consider random Elastic where you have to specify the num_control_point, having only the seed saved in the history, will allow to reproduce the same transform, if and only if you put the same num_control_point
Why not saving it to the history ?

Other drawback of this changes, is for simple transform without random choise (like rescale or normalization), I would like to add the chosen parameter in the history too,

but with those changes, you can now only log seed ... non ?

fepegar · 2020-07-21T11:24:36Z

The problem with this implementation is that intensity transforms use different parameters for the different images in the Subject. Either we use the same for all, or we use a more complex transforms history...

Let consider random Elastic where you have to specify the num_control_point, having only the seed saved in the history, will allow to reproduce the same transform, if and only if you put the same num_control_point
Why not saving it to the history ?

Because the user has control over what parameters they passed to the transform.

romainVala · 2020-07-21T11:50:13Z

Because the user has control over what parameters they passed to the transform.

yes but it makes the code more difficult (with hard coded parameter valuer), I really thing it will be more efficient to read it directly from the history (given the fact it is so simple to add !)

applying transform and reading parameters, should be dissociate

I can think at an example where you do not know : let say you want to compare the influence of num_control_point, and you define OneOf transform with 2 randomElastic call each of them with different num_control_point

romainVala · 2020-07-21T13:02:57Z

sorry to insist, but I really like to convince you on this point

Once you performed multiple experiment, with different transform, it is convenient to just have a look to save history in order to know what has been applied
I loose this explicit reading if we now store only the seed, but at least I can make a function that will convert the stored seed in real parameters, but for the completeness I also need the "fixed parameters"

It is also a way to check that every thing was done as planed (which is important too)

The objective is to be able to recreate a given transformed volume (because the model is very bad for this volume) why should I need to remember which num_control_point I set (or which percentiles I set for RescaleIntensity) if those parameters can be stored in the history I have an autonomous mechanism to reproduce the given transform without prior knowledge
so I can write a generic function to reproduce any given transform (from its saved history)

don't you think it is much better so ?

If there are some drawback to store all parameters, let's make it optional, but I think it is a great feature, to make this full parameter recording possible

romainVala · 2020-07-21T13:13:32Z

The problem with this implementation is that intensity transforms use different parameters for the different images in the Subject. Either we use the same for all, or we use a more complex transforms history...

given what has been discuss here #238 we will need a more complex transforms history ...
may be instead of having a list containing
[ <transform_name> , transform_dict_param]
we will have
[ <transform_name> , [<image_key>, transform_dict_param] ]
(and if the same transform is apply to all we will take the previous one)

Add seed to dict only if transformed Subject Add method to generate seed Add reproducibility tests Update tests Fix docstring Update CLI tool Add get_transform function Use JSON to store transforms history Update transforms Rename function to get transform class

fepegar mentioned this pull request Jul 12, 2020

Reproduce a given transform #208

Closed

fepegar force-pushed the 208-transforms-reproducibility branch from f3529c8 to 515420b Compare July 13, 2020 10:21

fepegar mentioned this pull request Jul 21, 2020

Rotation transform #229

Closed

fepegar force-pushed the 208-transforms-reproducibility branch from 515420b to 169c235 Compare July 21, 2020 15:20

fepegar closed this Nov 16, 2020

fepegar deleted the 208-transforms-reproducibility branch October 26, 2021 17:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve random transforms reproducibility #226

Improve random transforms reproducibility #226

fepegar commented Jul 11, 2020 •

edited

Loading

codecov bot commented Jul 11, 2020 •

edited

Loading

romainVala commented Jul 15, 2020 •

edited

Loading

fepegar commented Jul 21, 2020

romainVala commented Jul 21, 2020 •

edited

Loading

romainVala commented Jul 21, 2020

romainVala commented Jul 21, 2020

Improve random transforms reproducibility #226

Improve random transforms reproducibility #226

Conversation

fepegar commented Jul 11, 2020 • edited Loading

codecov bot commented Jul 11, 2020 • edited Loading

Codecov Report

romainVala commented Jul 15, 2020 • edited Loading

fepegar commented Jul 21, 2020

romainVala commented Jul 21, 2020 • edited Loading

romainVala commented Jul 21, 2020

romainVala commented Jul 21, 2020

fepegar commented Jul 11, 2020 •

edited

Loading

codecov bot commented Jul 11, 2020 •

edited

Loading

romainVala commented Jul 15, 2020 •

edited

Loading

romainVala commented Jul 21, 2020 •

edited

Loading