add worst_tiles sampling #77
Conversation
The idea looks very good!
I left a couple of comments on how I think we could improve it.
@@ -429,6 +431,7 @@ def worst_points(
    forests, forests_per_stage,
    sample_fraction_per_stage,
    accumulate_samples=True,
    sampling_kwargs={},
Passing a dict as a default argument is generally discouraged (see https://stackoverflow.com/questions/26320899/why-is-the-empty-dictionary-a-dangerous-default-value-in-python). Can you use the **sampling_kwargs pattern instead?
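A minimal sketch of the issue being raised, with hypothetical function names (not the actual PR code): a mutable default dict is created once at definition time and shared across calls, while the **kwargs pattern gives every call a fresh dict.

```python
# discouraged: the default dict is created once at definition time and
# shared across all calls that rely on the default
def worst_points_bad(sampling_kwargs={}):
    sampling_kwargs.setdefault("calls", 0)
    sampling_kwargs["calls"] += 1
    return sampling_kwargs["calls"]

# preferred: collect extra keyword arguments with **, so every call
# gets a fresh dict to forward to the sampling strategy
def worst_points_good(**sampling_kwargs):
    sampling_kwargs.setdefault("calls", 0)
    sampling_kwargs["calls"] += 1
    return sampling_kwargs["calls"]
```

Calling the first function twice leaks state between calls, which is the bug the linked Stack Overflow answer describes.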
I thought in this case I would have to adapt the arguments of all the other sampling strategies as well, but thinking about it again right now, that's not true. So sure, I'll fix it.
# reshape diff to image shape and apply convolution with 1-kernel
diff_img = torch.Tensor(diff.reshape(img_shape)[None, None])
kernel = torch.Tensor(np.ones(tiles_shape)[None, None])
diff_img_smooth = torch.nn.functional.conv2d(diff_img, weight=kernel) if len(img_shape)==2 \
I think using torch for the convolution here is a bit overkill. Two points:
- Do we actually want to apply a uniform kernel here, or do we rather want to apply smoothing with a large sigma? (For this you can use vigra.filters.gaussianSmoothing.)
- If we do want the uniform kernel, you can use the convolve functionality from scipy (scipy.ndimage.convolve); speed doesn't matter here and it's easier to read because you don't need to convert to torch tensors.
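For illustration, a hedged sketch of the scipy replacement being suggested, using toy stand-ins for the PR's `diff`, `img_shape`, and `tiles_shape` (the actual shapes are made up here):

```python
import numpy as np
from scipy.ndimage import convolve

# toy stand-ins for the variables in the PR
img_shape = (8, 8)
tiles_shape = (3, 3)
diff = np.zeros(img_shape)
diff[4, 4] = 1.0

# same effect as the torch conv2d with an all-ones kernel: each output
# pixel is the sum of the diff over a tile-sized neighborhood
kernel = np.ones(tiles_shape)
diff_img_smooth = convolve(diff, kernel, mode="constant")
```

No reshaping to `[None, None]` batch/channel axes and no tensor conversions are needed, which is the readability point made above.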
Sure, I'll replace torch. As for which is better, I honestly don't know. What do you think? Maybe it's possible to allow the smoothing function to be given as an input. Or do you think that's a bit overkill?
What do you think? Maybe it's possible to allow the smoothing function to be given as an input. Or do you think that's a bit overkill?
Maybe we can keep it a bit simpler and have a single parameter that is either a scalar (then we interpret it as sigma and use gaussian smoothing) or a tuple (then we interpret it as the tile shape and apply the uniform kernel).
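A sketch of how that single-parameter dispatch could look. The helper name is hypothetical, and scipy's `gaussian_filter` stands in for the `vigra.filters.gaussianSmoothing` suggested earlier:

```python
import numpy as np
from scipy.ndimage import convolve, gaussian_filter

def smooth(diff_img, smoothing):
    # hypothetical helper: a scalar is interpreted as a gaussian sigma,
    # a tuple as the tile shape for a uniform (all-ones) kernel
    if np.isscalar(smoothing):
        # gaussian_filter used here in place of vigra.filters.gaussianSmoothing
        return gaussian_filter(diff_img, sigma=smoothing)
    return convolve(diff_img, np.ones(smoothing), mode="constant")
```

Callers would then pass e.g. `smooth(diff_img, 5.0)` for gaussian smoothing or `smooth(diff_img, (25, 25))` for the uniform tile kernel.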
indices_per_class = [np.where(labels == class_id)[0] for class_id in range(nc)]

# get maxima of the label-prediction diff
max_centers = np.argsort(diff_img_smooth)[::-1]
Wouldn't we rather use a function that finds local maxima here? I think that would make more sense, because otherwise we can end up oversampling the global maximum.
You can use https://scikit-image.org/docs/dev/api/skimage.feature.html#skimage.feature.peak_local_max or vigra.analysis.localMaxima for this.
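To illustrate the difference from sorting globally: a minimal local-maxima finder, sketched here with scipy's `maximum_filter` as a stand-in for `skimage.feature.peak_local_max` (whose `min_distance` parameter works in the same spirit):

```python
import numpy as np
from scipy.ndimage import maximum_filter

def local_maxima(img, min_distance=1, threshold=0.0):
    # a pixel is kept if it equals the maximum of its
    # (2 * min_distance + 1)-sized neighborhood and exceeds the threshold
    size = 2 * min_distance + 1
    neighborhood_max = maximum_filter(img, size=size, mode="constant")
    return np.argwhere((img == neighborhood_max) & (img > threshold))
```

Unlike `np.argsort(diff_img_smooth)[::-1]`, whose top entries all cluster around the single global maximum, this returns one center per distinct peak.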
Yeah, you're totally right. I could even add the min_distance
parameter. Then I could make sure that there is no overlap between the tiles at all. Maybe then I can make the entire thing more efficient. I'll have a look into that :)
Oh, and just disregard the CI failures, this looks like a conda issue; hopefully it's transient and things will work again tomorrow.
I adapted all your suggestions and additionally made the code faster and easier to read.
forests, forests_per_stage,
sample_fraction_per_stage,
img_shape,
tiles_shape=[51, 51],
Do you think such a big tile shape makes sense as a default?
To be honest, I don't have a really good metric for that. I just tried 7, 25, and 51, and the DiceLoss was smallest for 51. But you already mentioned that that's not a good proxy, so I guess we can use something smaller. What would you suggest?
With 51, usually 2-3 maxima were sampled :)
with a patch size between 256 and 512
With 51 usually 2-3 maxima were sampled :)
But did it actually then sample points from all of them after class balancing?
Also, now that I think about the class balancing more here: would it maybe make sense to compute the maxima for the classes independently? I think we might run into cases where we only sample one class otherwise.
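A hedged sketch of what computing the maxima per class could look like (the helper name and the `label_img`/`n_classes` arguments are hypothetical, and the local-maxima step again uses scipy's `maximum_filter` rather than the actual PR code):

```python
import numpy as np
from scipy.ndimage import maximum_filter

def maxima_per_class(diff_img_smooth, label_img, n_classes, min_distance=1):
    # mask the smoothed diff to one class at a time, then find the
    # local maxima of each masked image separately, so that no single
    # class can dominate the sampled centers
    size = 2 * min_distance + 1
    centers = []
    for class_id in range(n_classes):
        class_diff = np.where(label_img == class_id, diff_img_smooth, 0.0)
        neighborhood_max = maximum_filter(class_diff, size=size, mode="constant")
        centers.append(np.argwhere((class_diff == neighborhood_max) & (class_diff > 0)))
    return centers
```

Each class then contributes its own list of candidate tile centers, which is the balancing property asked about above.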
With 51 usually 2-3 maxima were sampled :)
But did it actually then sample points from all of them after class balancing?
At least one class did sample from all of them, but you are right, for the others that's not guaranteed.
Also, now that I think about the class balancing more here: would it maybe make sense to compute the maxima for the classes independently? I think we might run into cases where we only sample one class otherwise.
True, this could happen. I'll try to adapt it :)
But then we lose a bit of the "a human would label locally" idea, because locality is then only guaranteed within a class but not across classes. But I think this still makes sense.
Yes, looks much better now.
Yes, let's keep the current version; it reads much easier and we don't really care about speed here. What I am not quite sure about yet:
Finding the maxima is now done per class.
Thanks, I think the changes make sense now, and the per-class maxima should ensure a more balanced sampling of classes; and 25 sounds like a good default.
The current implementation looks quite complicated. Unfortunately, the idea is not compatible with your _score_based_points(). But maybe there's still some room to improve my implementation...