Map coordinate fix #206
Thanks a lot for working on these ndimage issues!
Can you comment on why you chose the 0.4999 / 0.5001 numbers? Looks like that should be 0.5 in both cases?
This is a fundamental issue in ndimage that we never explicitly sorted out, and it comes down to two different ways of seeing data. The first is to see pixels as blocks or bins, with the index value indicating either the start or center (a design decision):
The second is to see the pixel values as data points:
The latter case is what is currently implemented, whereby "wrap" mode moves past 3 straight into 0. In the earlier model, however, moving 0.5 past 3 could mean either being at 3.5 or at 4 (i.e. -1), depending on how you choose the center position. We need to make a call on this, and then have consistent behavior all around.
Guess the history and motivation for (2) needs some more explanation. I have always thought of (1) when using ndimage and considered wrap mode broken. Both images and gridded data require (1), not (2).
The question, I guess, is just whether wrap mode should implement:
or
My guess would be the latter?
I'd guess the former. The reason being that the location has meaning in n-D (for images, gridded data). So for 0 1 2 3 the 4 is a neighbor of 0 and 5, not of 3 and 5. If you want the latter behavior, simply reshape to 1-D?
I'm not sure I follow. Actually, I'm now pretty sure that behavior 1 cannot be implemented, because it would mean that 3 and 0 are the same data-point (but they may have distinct values). In your example, the padding according to suggestion 2 above would look like this:
with all data spaced one unit apart.
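For concreteness, the periodic extension described in suggestion 2 matches NumPy's wrap padding (an illustration only, not the ndimage code path):

```python
import numpy as np

data = np.array([0, 1, 2, 3])

# Periodic ("wrap") padding: every value keeps unit spacing, and the
# first and last data points remain distinct points.
padded = np.pad(data, 2, mode="wrap")
print(padded)  # [2 3 0 1 2 3 0 1]
```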
I would vote for the latter. This seems to be what the authors of Ticket #1520 and #796 were expecting. The downside is that we would also have to agree on how to extrapolate values in the space between the last index (n) and the first index of the wrapped array (n+1). Maybe we should bring this up on the scipy-dev list to get a few more voices in on the discussion? I can write up a summary of the two options and present them to the list. As for the 0.4999 / 0.5001 values, they are present because there are some edge cases where you get indices that are out of range only due to floating-point rounding. Ticket #1378 is a manifestation of this problem: math.sin(180.0 * pi / 180.0) is ~1.2e-16, and results in an index of 2 + some_tiny_value which is marked out of range.
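The floating-point residue is easy to reproduce (a minimal illustration of the Ticket #1378 symptom):

```python
import math

# sin(180 degrees) is not exactly zero in floating point, because pi
# itself is rounded; the residue is on the order of 1e-16.
residue = math.sin(180.0 * math.pi / 180.0)
print(residue)  # ~1.2246e-16, not 0.0

# A rotation built from this value can therefore produce an index like
# 2 + some_tiny_value, which a strict bounds check flags as out of range.
print(2.0 + residue > 2.0)
```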
The interpolation is not a problem, since you have two data-points with a single unit spacing between them, just like everywhere else. We should also carefully consider mirror mode, which may currently be implemented as:
instead of
(i.e. two unit spacings between the N and N+1 values).
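The two mirroring conventions correspond to NumPy's 'symmetric' and 'reflect' padding modes, which makes the difference easy to see (an illustration, not the ndimage implementation):

```python
import numpy as np

data = np.array([1, 2, 3, 4])

# Edge value repeated: the reflection axis lies half a unit outside
# the last data point.
print(np.pad(data, 2, mode="symmetric"))  # [2 1 1 2 3 4 4 3]

# Edge value not repeated: the reflection axis lies on the last data
# point itself.
print(np.pad(data, 2, mode="reflect"))    # [3 2 1 2 3 4 3 2]
```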
The out-of-bounds issue is really tricky. Typically, you want the last data point to still have the extent of a pixel, just like the rest of the values, but that's not really feasible. The problem could be worked around by making sure (like zoom does) that you never go outside the (0, N-1) interval -- although we should build in rounding toward the boundary within a bracket of a certain size (1e-10 or so) to handle this problem.
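A minimal sketch of that tolerance bracket, assuming a hypothetical clamp_index helper (the name and the 1e-10 default are illustrative, not ndimage code):

```python
import numpy as np

def clamp_index(idx, n, eps=1e-10):
    """Clamp a fractional index into [0, n-1], treating values within
    eps of a boundary as exactly on the boundary.

    A sketch of the tolerance-bracket idea; 'clamp_index' and 'eps'
    are hypothetical names, not part of scipy.ndimage.
    """
    idx = np.asarray(idx, dtype=float)
    # Snap near-boundary values caused by floating-point residue.
    idx = np.where(np.abs(idx) < eps, 0.0, idx)
    idx = np.where(np.abs(idx - (n - 1)) < eps, float(n - 1), idx)
    # Hard clamp anything still outside the valid interval.
    return np.clip(idx, 0.0, n - 1)

print(clamp_index(2.0 + 1e-12, n=3))  # 2.0 rather than "out of range"
```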
I've been trying to understand the various boundary filling modes and created this document to help keep track of them. Can we agree that this is how we should be doing 'wrap' mode? It is consistent with the short description in the SciPy documentation, http://docs.scipy.org/doc/scipy/reference/tutorial/ndimage.html

Boundary filling modes

There are five modes for filling in data outside the borders of the input data, described below.

Constant mode: points with an index, including fractional indices, less than 0 or greater than len - 1 are filled with a constant value.

Nearest mode: points with an index, including fractional indices, less than 0 or greater than len - 1 take the value of the nearest edge point.

Reflect mode: points with an index less than 0 or greater than len - 1 are reflected with the edge value repeated.

Mirror mode: points with an index less than 0 or greater than len - 1 are reflected about the edge point, without repeating the edge value.

Wrap mode: points with an index less than 0 or greater than len - 1 are wrapped by the period of the data. Currently the first and last point are taken to be the same data point:
With this pull all of the above modes work as described with integer shifts. Fractional indices fail at the boundaries for the reflect and wrap modes because the repeated values are not used for interpolation. I'm looking into this, but if anyone can help I would appreciate it.
The description of the modes looks correct; but the 0 / 3 combination of points cannot be (I assume you were just documenting the current state--but if that is the current state it needs to be fixed). |
Have you considered the behaviour of Matlab's image processing toolbox? Might make sense to be, at least, consistent in that regard. For what it's worth, I agree with Stefan's points about the representation of pixels; it's a really common thing to ignore coordinate systems, but not always the best thing to do. I guess treating things as an evenly spaced regular grid is an implicit assumption here, but it would be great to see that described in code. Also, consider the possibility of more complex boundary extrapolation algorithms; I think it would be nice to allow the user to define their own wacky way of extrapolating. In weather fields we do things like a local mean extrapolation.
@Stefan: after reading Jonathan's description I now see that in your option 1, the 0/3 meant 0 and/or 3 for the same point. Since that indeed doesn't make much sense, I had interpreted your slash as a line break. So never mind my earlier comment.
Considering pixels as bins with the indices at the center of the pixels is pretty standard; IDL and Matplotlib do this, for instance. It has some advantages: nice behavior under reflections with the consequent symmetries, centroids behave nicely, and rounding can be used to find which pixel a point is contained in.
It is, but it also doesn't make much sense in N-dimensions where N > 3. Since both models are fairly common, do you agree that it makes sense in this case to just see the pixels as data points in space? |
Sure, they are data points in space, but where are they located and what do they represent? For pixels and voxels they are the value of an integral over a region, and locating them at the center is in some sense the best estimate of the value at the center. But apart from that, the choice is fairly arbitrary. The extension modes are related to the discrete cosine transforms and the FFT. Reflect mode is DCT-II, mirror mode is DCT-I, and wrap mode is the FFT. For general data points in space that is probably a good way to look at things.
And I'll add that the reflect mode, née pixel centers, is also related to the trapezoidal rule. Generally there are numerical advantages that derive from the symmetry.
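The DCT-I correspondence can be checked numerically: the mirror extension of a signal is an even sequence, so its discrete Fourier transform is purely real (a small demonstration, not ndimage code):

```python
import numpy as np

x = np.array([1.0, 3.0, 2.0, 5.0])

# DCT-I ("mirror") extension: append the reversed interior, giving an
# even sequence of period 2*N - 2 with no repeated edge values.
ext = np.concatenate([x, x[-2:0:-1]])
print(ext)  # [1. 3. 2. 5. 2. 3.]

# Even symmetry means the spectrum is real up to rounding error.
spectrum = np.fft.fft(ext)
print(np.max(np.abs(spectrum.imag)))  # ~0 (rounding error only)
```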
This reverts commit c26ea2c.
I now have a better idea how to fix this issue. map_coordinates is being used to find the location of the shifted point, but not the points surrounding that point, which are needed during the spline interpolation. I have to think a bit longer on what the best method for doing this should be. Until then, please hold off on including this pull request.
Closing this pull request for the moment. I don't have time to work on a proper fix for these issues right now.
I did some more digging, and will try to summarize the two modes of thinking about wrapping at the boundary. Do we perceive the first and last data points to be on the image boundary, i.e. that the last and first value—when wrapping—lie essentially on top of (very close to) one another? Or do we imagine a regular grid of measurement values that is extended further into a larger regular grid, with values depending on the mode of wrapping? The latter is in line with viewing the data as a periodic signal, but in contrast with a view of the data as pixels. It basically comes down to where you define the boundary of your image to be: does the boundary run on top of the outside data points of the image, or does it run outside of those data points by a distance of 0.5? Both concepts are potentially valid, depending on what is desired, although I would argue that the currently implemented version (that the boundary lies on the outside data points) is valid and should remain the default. We can perhaps consider adding a new wrapping mode.
add alias grid-mirror for reflect
add alias grid-mirror for reflect TST: test_splines.py now needs to use 'grid-wrap' rather than 'wrap'
…#12767)

* ENH: add mode NI_EXTEND_WRAP_GRID to map_coordinates
* MAINT: remove duplicated code for the mirror extension case. The next step is to use other modes here in addition to the existing map_coordinates call above
* pass the mode argument to spline_filter1d calls in interpolation.py
* BUG: use wrap and reflect spline boundary conditions appropriately
* use grid-wrap instead of wrap-grid as suggested in gh-206; add alias grid-mirror for reflect; TST: test_splines.py now needs to use 'grid-wrap' rather than 'wrap'
* ENH: modify NI_ZoomShift to support all spline boundary modes as well; the same approach for spline_mode as used in NI_GeometricTransform is repeated
* BUG: for line buffer functions set grid-wrap equivalent to wrap
* TST: add map_coordinates boundary tests for new modes
* TST: add new zoom and shift tests
* TST: add reflect via affine shift test
* DOC: update ndimage.interpolation docstrings
* DOC: add boundary and interpolation grid figures to the ndimage tutorial
* DOC: pep8 in tutorial examples
* DOC: update plot_interp_grid.py tick marks for matplotlib 3.3 compatibility
* DOC: update tutorial on reviewer feedback
* DOC: link to boundary mode illustrations from interpolation docstrings
* DOC: mention grid-mirror and grid-wrap synonyms in ndimage filter docstrings
* DOC: update mode descriptions in the ndimage filter tutorial text
This pull request addresses Trac Tickets #796, #1378 and #1520, which all stem from the map_coordinates function in ni_interpolation.c.
As currently written, this pull causes two tests in test_ndimage to fail (boundary modes and boundary modes 2), which is the reason for the long explanation. In these two tests the 'wrap' mode needs further clarification. Take a simplified version of the boundary mode test:
This generates an index mapping to take from the input for the output of [0.5, 1.5, 2.5, 3.5, 4.5]. With "wrap" mode, determining which elements these correspond to is difficult.
The first element is well defined, halfway between the first and second elements in input, 2 and 4, so 3.
The second and third elements are also well defined, 5 and 7.
The fourth element has an index of 3.5, but the input only has indices up to 3. Currently in scipy this is "wrapped" to an index of 0.5, with a value of 3, but this seems incorrect to me. The fourth element in the input has an index of 3; if the data is wrapped (appended before and after the original data as needed), then the first element would have an index of 4. In this sense an index of 3.5 is halfway between the fourth (last) element and the first element, and might be represented as -0.5. The problem comes when we try to find a value for this element: we need to extrapolate a half point past either the last element or the first element.
The fifth element in the output has an input index of 4.5, which is currently "wrapped" to an index of 1.5 and takes the value of 5. In my mind, if an index of 4 wraps to the first element of the input, then the 4.5 should be between the first and second element of the input, so 3.
So the question becomes should output be:
1). As scipy has it currently so that 3.5 -> 0.5 and 4.5 -> 1.5. output = [3, 5, 7, 3, 5]
2). As in this pull so that 3.5 -> -0.5 and 4.5 -> 0.5. output = [3, 5, 7, ?, 3], and then we must also decide what ? should be.
I'm in support of 2, but would be happy if someone can convince me that 1 is correct.
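To make the two options concrete, here is an order-1 (linear) interpolation sketch; wrap_interp is a hypothetical helper, not scipy code. Under option 2, interpolating straight across the wrap gap would give ? = (8 + 2) / 2 = 5, though that is only one possible choice and the thread leaves '?' open:

```python
import numpy as np

data = np.array([2.0, 4.0, 6.0, 8.0])
indices = np.array([0.5, 1.5, 2.5, 3.5, 4.5])

def wrap_interp(data, idx, period):
    """Linear interpolation with periodic wrapping of the index.

    'period' selects between the two conventions: len(data) - 1 treats
    the first and last points as coincident; len(data) treats the data
    as a full grid of distinct points.
    """
    idx = np.mod(idx, period)
    lo = np.floor(idx).astype(int)
    frac = idx - lo
    hi = np.mod(lo + 1, len(data))
    lo = np.mod(lo, len(data))
    return (1 - frac) * data[lo] + frac * data[hi]

# Option 1 (current scipy behavior): 3.5 -> 0.5 and 4.5 -> 1.5.
print(wrap_interp(data, indices, period=len(data) - 1))  # [3. 5. 7. 3. 5.]

# Option 2 (this pull): 3.5 -> -0.5 and 4.5 -> 0.5, with ? filled by
# interpolating across the wrap gap between 8 and 2.
print(wrap_interp(data, indices, period=len(data)))      # [3. 5. 7. 5. 3.]
```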
As an additional example, please see what happens to the third element vs. all other elements of output when shift is defined as: