Suggestion: Define X,Y grid so that they include -1 and 1 #15
Comments
Yes, I am aware of this distinction. In fact, my first implementation took the approach that you are proposing. In reality I don't believe there to be much of a practical difference between the two approaches and I happened to like the property of -1=far left, +1=far right when I created this particular library. There is currently an effort to move much of the functionality from this library into Kornia (kornia/kornia#167), which will once again use the scheme that you propose here. So no need to implement it yourself.
I don't think that this is true. It should be the same either way.
May I ask why you decided to use (-1,1) rather than [-1,1]? With the current implementation it is not possible to regress, for example, (x,y)=(1,1) (as the paper says), but what if you have a task where you need to be able to regress coordinates everywhere in the image? Thanks for the hint about Kornia! I did not know about that library :)
It doesn't make a difference because you should also change how you convert from pixel coordinates to normalised coordinates. So yes, the normalised value corresponding to pixel (0, 0) isn't (-1, -1), but it still maps to (0, 0) so it doesn't matter. You can use the Example: I hope that this clears things up.
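As a rough illustration of the point about conversions, here is a minimal sketch of a pixel/normalised conversion pair for the (-1, 1) scheme. The helper names `pixel_to_normalized` and `normalized_to_pixel` and their formulas are assumptions chosen to reproduce the grid values quoted in this thread (-0.8, -0.4, 0, 0.4, 0.8 for a width of 5); they are not copied from the library.

```python
def pixel_to_normalized(p, size):
    """Map a pixel index (0 .. size-1) to the centre of its cell in (-1, 1).

    Hypothetical helper: assumes pixel centres are spaced 2/size apart.
    """
    return (2 * p + 1) / size - 1


def normalized_to_pixel(n, size):
    """Inverse of pixel_to_normalized (also a hypothetical helper)."""
    return ((n + 1) * size - 1) / 2


size = 5
for p in range(size):
    n = pixel_to_normalized(p, size)
    # Each pixel index round-trips through its normalised value.
    print(p, round(n, 3), round(normalized_to_pixel(n, size), 3))
```

Under these assumptions pixel 0 corresponds to -0.8 rather than -1, but converting -0.8 back still gives pixel 0, so every pixel location remains reachable. A normalised value of exactly 1 would land at 4.5, half a pixel beyond the centre of the last pixel.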
Hi, I see that DSNT has now been merged into Kornia. Is the version in Kornia "final", and should I switch to it? For now I have a question about the normalized_to_pixel_coordinates function. The docs say:
I am not sure if I understand this correctly. For example, my tensors are shaped like this: (BATCH_SIZE, N_LANDMARKS, 2) and the last dimension (2) contains x and y, each in its own "column". So there is no column that contains x1,y1,x2,y2,...,xn,yn. It seems to work, but I want to be sure it is in the correct format to avoid any bad results. When trying this simple example:
I get:
Shouldn't it be [0.0, 128, 64] instead? EDIT: Okay, my fault. I forgot that the coordinate range is (-1,1) and not [-1,1]... So my normalized coordinates are outside the valid range, thus the result is also outside the range. This works:
The statement in the docs is simply referring to the order of the coordinates in the last dimension. Let's look at your example:
Here there are two possibilities for ordering the last dimension (i.e. the order of the two "columns"): (x, y) or (y, x). The docs simply say that the first ordering is used. I felt that this was worth pointing out because it is the reverse ordering of the image dimensions, which are assumed to be height x width (not width x height). And yes, the implementation in Kornia is more polished than this one; just be aware that the normalisation of coordinates is different, as noted earlier.
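The ordering point can be sketched in plain Python. The function below is a hypothetical stand-in, not the library's or Kornia's implementation: it assumes coordinates shaped (batch, n_landmarks, 2) with (x, y) in the last dimension, an image size given as height x width, and the (-1, 1) normalisation discussed in this thread.

```python
def normalized_to_pixel_xy(coords, height, width):
    """Convert (x, y) normalised coordinates to pixel coordinates.

    coords: nested list shaped (batch, n_landmarks, 2), last dim is (x, y).
    height, width: image size, given in the usual height x width order.
    Assumes the (-1, 1) pixel-centre convention (an assumption, see lead-in).
    """
    def to_pixel(n, size):
        return ((n + 1) * size - 1) / 2

    # x is scaled by the width and y by the height, so the size order
    # (height, width) is the reverse of the coordinate order (x, y).
    return [
        [[to_pixel(x, width), to_pixel(y, height)] for x, y in landmarks]
        for landmarks in coords
    ]


coords = [[[0.0, 0.0], [-0.5, 0.5]]]  # batch of 1, two landmarks
print(normalized_to_pixel_xy(coords, height=64, width=128))
# prints [[[63.5, 31.5], [31.5, 47.5]]]
```

Swapping the two "columns" by mistake would scale x by the height and y by the width, which goes unnoticed on square images and gives wrong results on non-square ones.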
I have read the paper and was wondering if there is a fix for the problem stated on page 8:
The reason seems to be that the X and Y grid is defined to lie in the range (-1,1) by the formulas on page 4. Is there a specific reason for this or would the DSNT also work when the grids are in the range [-1,1]?
A formula to define such a grid, for column index i = 1, ..., w of a heatmap of width w, would be
-1 + (2*(i-1)) / (w-1)
For a heatmap that has the width 5, the grid would have these values in the columns:
i=1 => -1
i=2 => -1 + 2/4 = -0.5
i=3 => -1 + 4/4 = 0
i=4 => -1 + 6/4 = 0.5
i=5 => -1 + 8/4 = 1
So the grid would look like
-1 | -0.5 | 0 | 0.5 | 1
instead of
-0.8 | -0.4 | 0 | 0.4 | 0.8
So my question is if there is a reason to use the second grid instead of the first one? From what I see this should also work. If there is interest in this change, I could try to implement it.
The advantage would be that the system would be able to regress coordinates exactly on the border, not just close to it (how close depends on the heatmap dimensions).
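The two grids compared above can be reproduced with a short sketch. The function names are hypothetical. The closed-range formula is the one proposed in this issue; the open-range formula, (2i - (w + 1)) / w, is an assumption inferred from the values -0.8 ... 0.8 listed above. `expected_coordinate` illustrates the expectation over a normalised heatmap that DSNT-style regression relies on.

```python
def grid_open(w):
    """Pixel-centre grid in (-1, 1): (2i - (w + 1)) / w for i = 1..w."""
    return [(2 * i - (w + 1)) / w for i in range(1, w + 1)]


def grid_closed(w):
    """Proposed grid in [-1, 1]: -1 + 2(i - 1) / (w - 1) for i = 1..w (w >= 2)."""
    return [-1 + 2 * (i - 1) / (w - 1) for i in range(1, w + 1)]


def expected_coordinate(probs, grid):
    """Expectation of the grid under a normalised 1-D heatmap."""
    return sum(p * x for p, x in zip(probs, grid))


w = 5
print(grid_open(w))    # approximately -0.8, -0.4, 0, 0.4, 0.8
print(grid_closed(w))  # -1, -0.5, 0, 0.5, 1

# All probability mass on the right-most column gives the largest
# coordinate each grid can produce.
one_hot_right = [0.0, 0.0, 0.0, 0.0, 1.0]
print(expected_coordinate(one_hot_right, grid_open(w)))
print(expected_coordinate(one_hot_right, grid_closed(w)))
```

With the open grid the regressed value tops out at the centre of the border pixel, while the closed grid reaches 1. Whether that matters in practice depends on how pixel coordinates are converted to normalised targets, as discussed in the comments.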