Refactor image samplers #175
It is because in a segmentation task we want to use the label as the target and also use it for the sampling strategy, so this is the same volume, but with two uses ...
Sure, I understand! But that's a very specific case in which the label happens to be the sampling map. If the user defines the map some other way and it's not a label, what should the type be? That's why I'm wondering.
* Add weighted sampler
* Add some docs for weighted sampler
* Rename probability map argument
* Add weighted sampler to docs
* Add features to samplers
* Add abstract method get_probability_map()
* Move tests
* Add features, tests and docs for samplers
* Use crop transform to extract patches
* Add comment to bounds transform
* Add type hint for samplers
* Fix TypeError for Python <3.8
c9ecd76 to d5a917d
Hi @fepegar! Great job with this PR! I was just wondering about the best way to do the following: imagine you have a segmentation task with 2 classes (tumor and non-tumor, for example) and you want to have the same number of patches in the background and in the object of interest. I think such a scenario would make sense, what do you think?
To implement such a scenario with your implementation you would have to compute the number of points in your volume, the number of points in your object of interest and set the value of all points in the object of interest to be the ratio and in the background 1 - ratio. And you would have to precompute that before giving the sample to the Sampler. Could it be a good idea to modify the LabelSampler to work this way or to create a new Sampler? The proportion of the different classes could also be a parameter of the Sampler.
Thanks for your feedback! I've been thinking of adding something similar to NiftyNet's "balanced" sampler. At the moment, the user can achieve that behavior by precomputing a sampling map. But of course, we want to be friendly, so I think we can create a new sampler class. The class would take e.g. a dictionary like

label_probability = {
    0: 0.5,  # background
    1: 0.5,  # tumor
}

and the actual probability map can be quickly computed from it. To sample patches from the tumor only, it would be

label_probability = {
    0: 0,  # background
    1: 1,  # tumor
}

I've implemented something like this in the past, so hopefully I can reuse some code if needed.
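As a rough illustration of how a voxel-wise probability map could be derived from such a dictionary, here is a minimal NumPy sketch. The function name `get_probability_map` is borrowed from the commit list, but its signature and body here are assumptions for illustration, not the code merged in this PR. Each label's probability is spread evenly over the voxels carrying that label, which is the precomputation described in the comment above.

```python
import numpy as np


def get_probability_map(label_map, label_probability):
    """Build a voxel-wise sampling map from per-label probabilities.

    Hypothetical helper: each label's probability mass is spread evenly
    over the voxels that carry that label.
    """
    probability_map = np.zeros(label_map.shape, dtype=np.float64)
    for label, probability in label_probability.items():
        mask = label_map == label
        num_voxels = mask.sum()
        if num_voxels > 0:
            probability_map[mask] = probability / num_voxels
    total = probability_map.sum()
    if total == 0:
        raise ValueError('No voxels can be sampled with these probabilities')
    # Renormalize in case a label with nonzero probability is absent
    return probability_map / total


label_map = np.zeros((4, 4, 4), dtype=np.int64)
label_map[1:3, 1:3, 1:3] = 1  # 8 tumor voxels, 56 background voxels
label_probability = {0: 0.5, 1: 0.5}
probability_map = get_probability_map(label_map, label_probability)
```

With this map, a patch center falls inside the tumor half of the time even though the tumor occupies only 8 of the 64 voxels.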
Yes that would be perfect!
Done. Maybe there are more efficient ways to do this, but it seems to work.
It seems good, well done @fepegar!
Thanks for the feedback, @GFabien. I'll add some more tests and merge soon.
IterableDataset seems to have been removed as of fepegar#175, therefore torch>=1.1 could be sufficient. This is in the interest of the many people still relying on CUDA 9. Solves fepegar#195
* Remove unused attribute
* Use slicing and clone() to crop images in samplers

  Using sitk.Crop is slow because sitk.GetImageFromArray is slow. Using copy.deepcopy(patch) is slow because it copies the whole array, not just the patch. The solution is to use slicing for cropping and patch.clone() for copying.

  Related to #175, #183, 89acf63. Fixes #212.
* Set proper affine when cropping patches
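The idea in that commit message can be sketched as follows. This is a NumPy stand-in, not the actual commit: NumPy's `copy()` plays the role that `clone()` plays for PyTorch tensors, and the helper name `crop_patch` is hypothetical. The affine update uses the usual convention that a cropped image's origin is the world position of its first voxel, which is an assumption about what "proper affine" means here.

```python
import numpy as np


def crop_patch(array, affine, index_ini, patch_size):
    """Crop a patch by slicing; return an independent copy and its affine."""
    i0, j0, k0 = index_ini
    si, sj, sk = patch_size
    # Slicing creates a view; copy() duplicates only the patch voxels
    patch = array[i0:i0 + si, j0:j0 + sj, k0:k0 + sk].copy()
    # The patch origin is the world position of its first voxel
    new_affine = affine.copy()
    new_affine[:3, 3] = (affine @ np.array([i0, j0, k0, 1.0]))[:3]
    return patch, new_affine


volume = np.arange(10 * 10 * 10, dtype=np.float32).reshape(10, 10, 10)
affine = np.diag([2.0, 2.0, 2.0, 1.0])  # 2 mm isotropic spacing, origin at 0
patch, patch_affine = crop_patch(volume, affine, (1, 2, 3), (4, 4, 4))
```

Only the 64 voxels of the patch are copied, so the cost does not grow with the size of the full volume.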
The weighted sampler uses a probability map included in the subject sample and computes the possible patch locations once, so it's as efficient as the ImageSampler (which will disappear) and much more efficient than the current LabelSampler.

Weighted sampler
Add a generic sampler that uses a probability map to extract patches.
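To make the idea concrete, here is a minimal NumPy sketch of drawing a patch location from a probability map. The function name `sample_patch_start` and the choice to treat each map voxel as a candidate patch center are assumptions for illustration, not necessarily how the sampler in this PR works. Centers too close to the border are excluded so that the whole patch always fits inside the volume.

```python
import numpy as np


def sample_patch_start(probability_map, patch_size, rng):
    """Return the corner index of a patch whose center is drawn from the map."""
    patch_size = np.asarray(patch_size)
    shape = np.asarray(probability_map.shape)
    half = patch_size // 2
    # Only keep centers for which the whole patch fits inside the volume
    valid = np.zeros_like(probability_map, dtype=np.float64)
    slices = tuple(
        slice(h, s - p + h + 1) for h, s, p in zip(half, shape, patch_size)
    )
    valid[slices] = probability_map[slices]
    total = valid.sum()
    if total == 0:
        raise ValueError('The probability map has no valid patch centers')
    # Draw one flattened voxel index with probability proportional to the map
    flat_index = rng.choice(valid.size, p=valid.ravel() / total)
    center = np.array(np.unravel_index(flat_index, valid.shape))
    return center - half


rng = np.random.default_rng(42)
probability_map = np.zeros((8, 8, 8))
probability_map[2:6, 2:6, 2:6] = 1
start = sample_patch_start(probability_map, (4, 4, 4), rng)
```

The normalized map could be computed once per subject and reused for every patch, which is where the efficiency mentioned above would come from.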
Use Crop to extract patches

At the moment there is legacy code in WeightedSampler that could be easily replaced by a call to Crop. The only difficulty is figuring out the arguments to Crop, but that should be doable in 5 minutes with pen and paper.

Move sample argument to __call__
This is important so that it can be used with the queue. The current implementation of Queue takes the sampler class as an argument (not an instance of it), so no arguments can be passed directly to the sampler (e.g. the probability map name). The class is initialized inside Queue so that it can be passed the sample from which the patches will be extracted.

An alternative is passing an instance of the sampler and moving the sample/subject argument to __call__. It would look a bit like this:

Another option, maybe simpler, is just writing a setter for the sample/subject. Then, sampler.set_subject(subject) can be called in the queue before looping to extract the patches.

Uniform sampler (current ImageSampler)

The UniformSampler could be a thin wrapper of WeightedSampler, where the probability map is always None (so a map of ones will be used).

Refactor LabelSampler
Similarly, this sampler could just be a wrapper of the weighted sampler. Some fancier options could be added, for example sampling each class with equal probability. Or that could be a new sampler, BalancedSampler. This is nice because it follows NiftyNet syntax, which new users might be used to.

Handle how transforms are applied to probability maps
If an image is somehow marked as a sampling/probability map, e.g. using the currently unused type at torchio/torchio/torchio.py (line 12 in 4133e10), then intensity transforms shouldn't be applied to it. I guess using that type (or anything else) would make this work as intended. Using torchio.LABEL would work, but it doesn't make a lot of sense.
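Returning to the proposal of passing a sampler instance to the queue and moving the sample argument to `__call__`, a minimal sketch could look like the following. It is a toy 1D version using only the standard library. The constructor arguments, the dictionary keys `'image'` and `'sampling_map'`, and the `iterate_patches` stand-in for the queue are all hypothetical and do not reflect the actual torchio API.

```python
import random


class WeightedSampler:
    """Hypothetical sampler configured once and called on each sample."""

    def __init__(self, patch_size, probability_map_name=None):
        self.patch_size = patch_size
        self.probability_map_name = probability_map_name

    def get_weights(self, sample):
        if self.probability_map_name is None:
            # No map given: every position is equally likely (a map of ones)
            return [1.0] * len(sample['image'])
        return sample[self.probability_map_name]

    def __call__(self, sample, num_patches):
        weights = self.get_weights(sample)
        half = self.patch_size // 2
        # Candidate patch starts for which the whole patch fits in the image
        starts = list(range(len(weights) - self.patch_size + 1))
        # Each candidate is weighted by the map value at the patch center
        center_weights = [weights[start + half] for start in starts]
        chosen = random.choices(starts, weights=center_weights, k=num_patches)
        for start in chosen:
            yield sample['image'][start:start + self.patch_size]


class UniformSampler(WeightedSampler):
    """Thin wrapper: the probability map is always None."""

    def __init__(self, patch_size):
        super().__init__(patch_size, probability_map_name=None)


def iterate_patches(subjects, sampler, patches_per_subject):
    """Toy stand-in for the queue: it receives a sampler instance."""
    for subject in subjects:
        yield from sampler(subject, patches_per_subject)


subject = {
    'image': list(range(10)),
    'sampling_map': [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
}
sampler = WeightedSampler(patch_size=3, probability_map_name='sampling_map')
patches = list(iterate_patches([subject], sampler, patches_per_subject=4))
uniform_patches = list(iterate_patches([subject], UniformSampler(3), 2))
```

Because the sampler is instantiated by the user, options such as the probability map name are set in the constructor, and the queue only has to call the instance with each subject.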