add worst_tiles sampling #77

Merged: 3 commits merged into constantinpape:more-s2d on Jul 21, 2022

Conversation

JonasHell (Contributor):

The current implementation looks quite complicated. Unfortunately, the idea is not compatible with your _score_based_points(). But maybe there is still some room to improve my implementation...

constantinpape (Owner) left a comment:

The idea looks very good!
I left a couple of comments on how I think we could improve it.

@@ -429,6 +431,7 @@ def worst_points(
forests, forests_per_stage,
sample_fraction_per_stage,
accumulate_samples=True,
sampling_kwargs={},
constantinpape (Owner):

Passing a dict as a default argument is generally discouraged (see https://stackoverflow.com/questions/26320899/why-is-the-empty-dictionary-a-dangerous-default-value-in-python). Can you use the **sampling_kwargs pattern instead?
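For illustration, a minimal sketch of the difference between the two patterns (the function names, bodies, and the tiles_shape key are made up; only the signature mirrors the diff above):

```python
# Mutable default (discouraged): the same dict instance is shared across calls,
# so any mutation leaks into subsequent calls.
def worst_points_old(forests, forests_per_stage, sample_fraction_per_stage,
                     accumulate_samples=True, sampling_kwargs={}):
    sampling_kwargs.setdefault("tiles_shape", (25, 25))  # mutates the shared default!
    return sampling_kwargs

# The **sampling_kwargs pattern: extra keyword arguments are collected into a
# fresh dict on every call and can be forwarded to the chosen sampler.
def worst_points_new(forests, forests_per_stage, sample_fraction_per_stage,
                     accumulate_samples=True, **sampling_kwargs):
    sampling_kwargs.setdefault("tiles_shape", (25, 25))  # local to this call
    return sampling_kwargs
```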

JonasHell (Contributor, Author):

I thought in this case I would have to adapt the arguments of all the other sampling strategies as well, but thinking about it again, that's not true. So sure, I'll fix it.

# reshape diff to image shape and apply convolution with 1-kernel
diff_img = torch.Tensor(diff.reshape(img_shape)[None, None])
kernel = torch.Tensor(np.ones(tiles_shape)[None, None])
diff_img_smooth = torch.nn.functional.conv2d(diff_img, weight=kernel) if len(img_shape) == 2 \
    else torch.nn.functional.conv3d(diff_img, weight=kernel)  # continuation truncated in the quoted diff; 3d branch assumed
constantinpape (Owner):

I think using torch for the convolution here is a bit of an overkill. Two points:

  • Do we actually want to apply a uniform kernel here, or do we rather want to apply smoothing with a large sigma? (For this you can use vigra.filters.gaussianSmoothing.)
  • If we do want the uniform kernel, you can use the convolve functionality from scipy: scipy.ndimage.convolve. Speed doesn't matter here, and it's easier to read because you don't need to convert to torch tensors.
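For illustration, both options sketched with scipy alone (the variable names follow the diff under discussion; the shapes and sigma are placeholders, and vigra.filters.gaussianSmoothing would work for the gaussian variant as well):

```python
import numpy as np
from scipy.ndimage import convolve, gaussian_filter

# placeholder inputs standing in for the values from the PR
img_shape, tiles_shape = (128, 128), (25, 25)
diff = np.random.rand(int(np.prod(img_shape)))

diff_img = diff.reshape(img_shape)  # works for 2d and 3d alike, no torch needed

# option 1: gaussian smoothing with a large sigma
smoothed = gaussian_filter(diff_img, sigma=10.0)

# option 2: uniform kernel of the tile shape (normalization only rescales
# the values; it does not change where the maxima are)
kernel = np.ones(tiles_shape) / np.prod(tiles_shape)
smoothed = convolve(diff_img, kernel)
```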

JonasHell (Contributor, Author):

Sure, I'll replace torch. As for which option is better, I honestly don't know. What do you think? Maybe it's possible to allow the smoothing function to be given as an input. Or do you think that's a bit overkill?

constantinpape (Owner):

> What do you think? Maybe it's possible to allow the smoothing function to be given as an input. Or do you think that's a bit overkill?

Maybe we can keep it a bit simpler and have a single parameter that is either a scalar (then we interpret it as sigma and use gaussian smoothing) or a tuple (then we interpret it as the tile shape and apply the uniform kernel).
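A minimal sketch of that dispatch, assuming a scipy-based implementation (the helper name smooth_diff is made up for illustration):

```python
from collections.abc import Sequence

import numpy as np
from scipy.ndimage import convolve, gaussian_filter

def smooth_diff(diff_img, smoothing):
    # tuple/list -> interpret as tile shape and apply a uniform kernel
    if isinstance(smoothing, Sequence):
        kernel = np.ones(smoothing) / np.prod(smoothing)
        return convolve(diff_img, kernel)
    # scalar -> interpret as sigma and apply gaussian smoothing
    return gaussian_filter(diff_img, sigma=smoothing)
```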

indices_per_class = [np.where(labels == class_id)[0] for class_id in range(nc)]

# get maxima of the label-prediction diff
max_centers = np.argsort(diff_img_smooth)[::-1]
constantinpape (Owner):

Wouldn't we rather use a function that finds local maxima here? I think that would make more sense, because otherwise we can end up oversampling the global maximum.
There's https://scikit-image.org/docs/dev/api/skimage.feature.html#skimage.feature.peak_local_max or vigra.analysis.localMaxima you can use for this.

JonasHell (Contributor, Author):

Yeah, you're totally right. I could even add the min_distance parameter; then I could make sure that there is no overlap between the tiles at all. Maybe then I can do the entire thing more efficiently. I'll have a look into that :)
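For reference, a sketch of the local-maxima variant with skimage (the min_distance value here is just a placeholder; to rule out tile overlap it would be derived from the tile shape):

```python
import numpy as np
from skimage.feature import peak_local_max

# dummy smoothed diff map standing in for diff_img_smooth
diff_img_smooth = np.random.rand(128, 128)

# local maxima instead of a global argsort; min_distance enforces a
# minimal spacing between the returned peaks
max_centers = peak_local_max(diff_img_smooth, min_distance=25)
# max_centers: one (row, col) coordinate per detected peak
```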

constantinpape (Owner):

Oh, and just disregard the CI failures; this looks like a conda issue. Hopefully it's transient and things will work again tomorrow.

JonasHell (Contributor, Author):

I adapted all your suggestions and additionally made the code faster and easier to read.
In the current version I removed the early stopping. While trying a few things, it didn't feel like it brought any speed boost; rather the opposite was the case, it felt like the overhead of the if statements made it slower. But maybe this impression is wrong... What do you think: better to keep the current easy-to-read version with 2 separate for loops (where we iterate over all maxima even if we don't need all of them), or add the nested for loop with early stopping back in?

forests, forests_per_stage,
sample_fraction_per_stage,
img_shape,
tiles_shape=[51, 51],
constantinpape (Owner):

Do you think such a big tile shape makes sense as a default?

JonasHell (Contributor, Author):

To be honest, I don't have a really good metric for that. I just tried 7, 25, and 51, and the DiceLoss was smallest for 51. But you already mentioned that that's not a good proxy. So I guess we can use something smaller. What would you suggest?
With 51, usually 2-3 maxima were sampled :)

JonasHell (Contributor, Author):

(That was with a patch size between 256 and 512.)

constantinpape (Owner):

> With 51, usually 2-3 maxima were sampled :)

But did it actually sample points from all of them after class balancing?

constantinpape (Owner):

Also, now that I think more about the class balancing here: would it maybe make sense to compute the maxima for the classes independently? I think we might run into cases where we only sample one class otherwise.

JonasHell (Contributor, Author):

> But did it actually sample points from all of them after class balancing?

At least one class did sample from all of them, but you are right, for the others that's not guaranteed.

JonasHell (Contributor, Author):

> Would it maybe make sense to compute the maxima for the classes independently? I think we might run into cases where we only sample one class otherwise.

True, this could happen. I'll try to adapt it :)
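A rough sketch of the per-class variant (the shapes and names are assumptions for illustration, not the PR's actual code):

```python
import numpy as np
from skimage.feature import peak_local_max

# dummy stand-ins: one (already smoothed) diff map per class
img_shape, n_classes = (128, 128), 3
diff_per_class = np.random.rand(n_classes, *img_shape)

# find the maxima independently per class, so that every class
# contributes its own candidate tile centers
max_centers_per_class = [
    peak_local_max(diff_per_class[class_id], min_distance=25)
    for class_id in range(n_classes)
]
```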

JonasHell (Contributor, Author):

But then we lose a bit of the "a human would label locally" idea, because locality is then only guaranteed within a class, not across classes. But I think this still makes sense.

constantinpape (Owner) commented on Jul 21, 2022:

> I adapted all your suggestions and additionally made the code faster and easier to read.

Yes, it looks much better now.

> better to keep the current easy-to-read version with 2 separate for loops

Yes, let's keep the current version; it reads much more easily, and we don't really care about speed here.

What I am not quite sure about yet: [51, 51] sounds like a fairly large shape to take around the maxima. How did you choose it?
(I am a bit worried that, given the class balancing, we will end up only sampling from a single maximum with this.)

JonasHell (Contributor, Author):

Finding the maxima is now done per class.
I also changed the default tile size to 25. Would you prefer it to be even smaller?

constantinpape (Owner) left a comment:

Thanks, I think the changes make sense now, and the per-class maxima should ensure a more balanced sampling of classes; 25 sounds like a good default.

@constantinpape constantinpape merged commit 280c052 into constantinpape:more-s2d Jul 21, 2022
@JonasHell JonasHell deleted the more-s2d branch July 21, 2022 16:01
@JonasHell JonasHell restored the more-s2d branch July 22, 2022 14:47