siavashk changed the title to "The Network Prediction and Ground Truth Segmentation Should Match in Shape" on Apr 28, 2019.
siavashk changed the title to "Network Predictions and Ground Truth Segmentations Should Match in Shape" on Apr 28, 2019.
I made a mistake: the relevant piece of code is not crop_and_concat, it is actually crop_to_shape. I added an inverse function, expand_to_shape, that pads the prediction so that it aligns with the input.
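The inverse described above amounts to a centered zero-pad. A minimal numpy sketch (the function name follows the comment, but the exact signature and the `border` fill-value parameter are assumptions here, not the repository's code):

```python
import numpy as np

def expand_to_shape(data, shape, border=0):
    """Pad a (batch, height, width, channels) prediction back to `shape`,
    centering it so that it realigns with the unpadded input.

    Sketch of an inverse to crop_to_shape; `border` is the fill value
    used for the padded margin (an assumption for illustration).
    """
    off_h = (shape[1] - data.shape[1]) // 2
    off_w = (shape[2] - data.shape[2]) // 2
    out = np.full(shape, border, dtype=data.dtype)
    out[:, off_h:off_h + data.shape[1], off_w:off_w + data.shape[2]] = data
    return out
```

Because the offsets are computed symmetrically, the padded prediction sits exactly over the region of the input that the network actually predicted.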
I know that this issue has been raised multiple times before. I have gone through the issues, both open and closed, and I see that many people have the same or a related question.
There are two classes of issues related to this:

1. People who directly ask how to get a prediction that matches their input in width and height. For example, see prediction.shape different from input.shape #41, How to get output which is in same size as input image? #138, Question about padding #175, Output dimension #183.
2. People who ask about training with padding='SAME' instead of 'VALID'. They do this because they do not know how to properly align the prediction with the input. For example, see Padding Options #93, Question about padding #175, Error: logits and labels must be broadcastable: logits_size=[1447680,2] labels_size=[0,2] #215.

There are three responses:
1. This is expected because the original paper implemented it this way.
2. Simply pad the input so that the prediction size matches the unpadded input, for example here and here.
3. Resize the prediction to match the input, as mentioned here.
The first response, while correct, is not really helpful, and the second and third are incorrect. Padding the input can change the distribution of pixels in the input image, which can introduce errors into the prediction. Resizing is also wrong because the prediction map is both downsampled and shifted (i.e., spatially translated) with respect to the input, so simply upsampling the prediction map without accounting for the shift results in a misalignment.
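One way to see the alignment problem concretely: with a smaller prediction map, the ground truth has to be cropped to the prediction's footprint rather than the prediction resized. A minimal numpy sketch of such a centered crop (assuming a batch, height, width, channels layout and a symmetric lost border; not the repository's exact code):

```python
import numpy as np

def crop_to_shape(data, shape):
    # Center-crop `data` spatially to `shape`. If the border lost to
    # VALID convolutions is symmetric, this centered crop restores
    # pixel-level alignment between ground truth and prediction.
    off_h = (data.shape[1] - shape[1]) // 2
    off_w = (data.shape[2] - shape[2]) // 2
    return data[:, off_h:off_h + shape[1], off_w:off_w + shape[2]]

# Example: a 572x572 input to the original U-Net yields a 388x388 map,
# so the matching label crop starts at offset (572 - 388) // 2 = 92.
labels = np.zeros((1, 572, 572, 2))
aligned = crop_to_shape(labels, (1, 388, 388, 2))
```

A plain resize ignores this offset entirely, which is exactly the misalignment described above.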
What this repository is missing is a function that is the inverse of crop_and_concat: https://github.com/jakeret/tf_unet/blob/master/tf_unet/layers.py#L50
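For context, the linked crop_and_concat center-crops the larger skip-connection tensor to the upsampled tensor's spatial size before concatenating along the channel axis. A numpy sketch of that behavior (the repository implements this with TensorFlow ops; this is only an illustration of the semantics):

```python
import numpy as np

def crop_and_concat(x1, x2):
    # Center-crop x1 (the larger skip-connection tensor) to x2's spatial
    # size, then concatenate along the channel axis.
    off_h = (x1.shape[1] - x2.shape[1]) // 2
    off_w = (x1.shape[2] - x2.shape[2]) // 2
    x1_crop = x1[:, off_h:off_h + x2.shape[1], off_w:off_w + x2.shape[2], :]
    return np.concatenate([x1_crop, x2], axis=3)
```

An inverse of this crop would pad the smaller tensor back out to the larger one's spatial extent, which is what the expand_to_shape function mentioned in the comment above provides.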
I am going to write this because I need it for my own research.