Suggestion: Define X,Y grid so that they include -1 and 1 #15

simonhessner · 2019-06-22T21:07:33Z

I have read the paper and was wondering if there is a fix for the problem stated on page 8:

> Analysis of misclassified examples revealed that DSNT was less accurate for predicting edge case joints that lie very close to the image boundary, which is expected due to how the layer works.

The reason seems to be that the X and Y grids are defined to lie in the range (-1,1) by the formulas on page 4. Is there a specific reason for this, or would the DSNT also work with grids in the range [-1,1]?

A formula to define such a grid would be

-1 + (2*(i-1)) / (w-1)

For a heatmap that has the width 5, the grid would have these values in the columns:

i=1 => -1
i=2 => -1 + 2/4 = -0.5
i=3 => -1 + 4/4 = 0
i=4 => -1 + 6/4 = 0.5
i=5 => -1 + 8/4 = 1

So the grid would look like
-1 | -0.5 | 0 | 0.5 | 1

instead of
-0.8 | -0.4 | 0 | 0.4 | 0.8

So my question is if there is a reason to use the second grid instead of the first one? From what I see this should also work. If there is interest in this change, I could try to implement it.

The advantage would be that the system will be able to regress coordinates on the border and not just very close to the border (depending on the heatmap dimensions)
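A minimal sketch of the two grids discussed above, in plain Python. The closed-interval formula is the one proposed in this post; the open-interval formula `(2*i - (w+1)) / w` is inferred from the values quoted above (-0.8 | -0.4 | 0 | 0.4 | 0.8) and is an assumption about how the current grid is built. The function names are hypothetical.

```python
def open_grid(w):
    """Grid of pixel centres in (-1, 1) for a row of w pixels (i = 1..w)."""
    return [(2 * i - (w + 1)) / w for i in range(1, w + 1)]


def closed_grid(w):
    """Grid that includes both endpoints of [-1, 1] (i = 1..w, w > 1)."""
    return [-1 + 2 * (i - 1) / (w - 1) for i in range(1, w + 1)]


# Compare the two schemes for a heatmap of width 5.
print(open_grid(5))
print(closed_grid(5))
```

In the first grid, -1 and 1 are the outer edges of the first and last pixels; in the second, they are the centres of those pixels.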

anibali · 2019-06-23T05:40:13Z

Yes, I am aware of this distinction. In fact, my first implementation took the approach that you are proposing. In reality I don't believe there to be much of a practical difference between the two approaches and I happened to like the property of -1=far left, +1=far right when I created this particular library. There is currently an effort to move much of the functionality from this library into Kornia (kornia/kornia#167), which will once again use the scheme that you propose here. So no need to implement it yourself.

anibali · 2019-06-23T05:40:53Z

> The advantage would be that the system will be able to regress coordinates on the border and not just very close to the border (depending on the heatmap dimensions)

I don't think that this is true. It should be the same either way.

simonhessner · 2019-06-23T19:58:14Z

May I ask why you decided to use (-1,1) rather than [-1,1]? With the current implementation it is not possible to regress for example (x,y)=(1,1) (as the paper says), but what if you have a task where you need to be able to regress coordinates everywhere in the image?

Thanks for the pointer to Kornia! I did not know about that library :)

anibali · 2019-06-23T22:13:01Z

It doesn't make a difference because you should also change how you convert from pixel coordinates to normalised coordinates. So yes, the normalised value corresponding to pixel (0, 0) isn't (-1, -1), but it still maps to (0, 0) so it doesn't matter. You can use the normalized_to_pixel_coordinates and pixel_to_normalized_coordinates functions to help with that.

Example:
You have a 5x5 image like the one you describe in your first post. The model predicts location (-0.8, 0.8). Converting to pixels you get (0, 4). This is the last pixel of the first column, right in the corner. If we used the other representation, the model would predict (-1, 1) for the same location, but the end result would be the same because the conversion formula would be slightly different.

I hope that this clears things up.
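The 5x5 example above can be sketched in plain Python. The conversion for the current scheme, `0.5 * ((n + 1) * size - 1)`, is inferred from the outputs quoted elsewhere in this thread rather than copied from the library. The conversion for the endpoint-inclusive scheme is a hypothetical counterpart, and both function names are made up for illustration.

```python
def open_to_pixel(n, size):
    """Current scheme: -1 and +1 are the outer edges of the image."""
    return 0.5 * ((n + 1) * size - 1)


def closed_to_pixel(n, size):
    """Endpoint-inclusive scheme: -1 and +1 are the first and last pixel centres."""
    return 0.5 * (n + 1) * (size - 1)


size = 5

# Same corner location expressed in each scheme, as (x, y).
corner_open = (-0.8, 0.8)
corner_closed = (-1.0, 1.0)

# Round to the nearest pixel index to absorb floating-point noise.
pixel_open = tuple(round(open_to_pixel(n, size)) for n in corner_open)
pixel_closed = tuple(round(closed_to_pixel(n, size)) for n in corner_closed)

print(pixel_open, pixel_closed)
```

Both tuples name the same pixel, (0, 4), so the choice of grid only changes the normalised value, not the location it refers to.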

simonhessner · 2019-10-14T11:54:06Z

Hi,

I see that DSNT has now been merged into Kornia. Is the version in Kornia "final", and should I switch to it?

For now I have a question about the normalized_to_pixel_coordinates function. The docs say:

> Coordinate tensor, where elements in the last dimension are ordered as (x, y, ..)

I am not sure if I understand this correctly. For example, my tensors are shaped like this:

(BATCH_SIZE, N_LANDMARKS, 2) and the last dimension (2) contains x and y, each in its own "column". So there is no column that contains x1,y1,x2,y2,...,xn,yn. It seems to work, but I want to be sure it is in the correct format to avoid any bad results.

When trying this simple example:

dsntnn.normalized_to_pixel_coordinates(torch.tensor([-1.0, 1.0, 0.0]), (128))

I get:

tensor([ -0.5000, 127.5000, 63.5000])

Shouldn't it be [0.0, 128, 64] instead?

EDIT: Okay, my fault. I forgot that the coordinate range is (-1,1) and not [-1,1]... So my normalized coordinates are outside the valid range, thus the result is also outside the range. This works:

dsntnn.normalized_to_pixel_coordinates(dsntnn.pixel_to_normalized_coordinates(torch.tensor([128.0]), (128)), (128))

tensor([128.])

anibali · 2019-10-15T01:22:32Z

The statement in the docs is simply referring to the order of the coordinates in the last dimension. Let's look at your example:

> (BATCH_SIZE, N_LANDMARKS, 2) and the last dimension (2) contains x and y, each in its own "column".

Here there are two possibilities for ordering the last dimension (i.e. the order of the two "columns"): (x, y) or (y, x). The docs simply say that the first ordering is used. I felt that this was worth pointing out because it is the reverse of the ordering of the image dimensions, which are assumed to be height x width (not width x height).

And yes, the implementation in Kornia is more polished than this one. Just be aware that the normalisation of coordinates is different, as noted earlier.
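To illustrate the ordering, here is a plain-Python stand-in (not the library's implementation) that takes coordinates as (x, y) and the image size as (height, width). The per-axis formula is inferred from the outputs quoted earlier in this thread, and `to_pixel` is a hypothetical helper name.

```python
def to_pixel(coords_xy, size_hw):
    """Convert one normalised (x, y) pair to pixels for a (height, width) image."""
    height, width = size_hw
    x, y = coords_xy
    # x is scaled by the width and y by the height, hence the reversed order.
    return (0.5 * ((x + 1) * width - 1), 0.5 * ((y + 1) * height - 1))


# Nested lists standing in for a (BATCH_SIZE=1, N_LANDMARKS=3, 2) tensor.
landmarks = [[(-1.0, -1.0), (1.0, 1.0), (0.0, 0.0)]]
size = (64, 128)  # height x width

pixels = [[to_pixel(p, size) for p in sample] for sample in landmarks]
print(pixels)
```

With a non-square image the ordering matters: the x values here reproduce the -0.5, 127.5 and 63.5 seen above for a width of 128, while the y values are scaled by the height of 64.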

anibali closed this as completed Oct 15, 2019
