Fixed ragged map with zero dim and Larger Iterating signal #2877
Codecov Report
@@                  Coverage Diff                   @@
##           RELEASE_next_minor    #2877      +/-   ##
======================================================
+ Coverage               77.13%   77.24%   +0.11%
======================================================
  Files                     206      206
  Lines                   31868    31883      +15
  Branches                 7161     7165       +4
======================================================
+ Hits                    24582    24629      +47
+ Misses                   5533     5500      -33
- Partials                 1753     1754       +1
Continue to review full report at Codecov.
|
This is an important and non-trivial change; can you elaborate on it?
Can you give an example to illustrate what the issue is? This will help with considering other alternatives. There are recent discussions on this topic: and one important aspect is that we need to make sure that ragged and non-ragged signals behave consistently with map, slicing, etc. with respect to signal and navigation dimensions. |
Sorry I should have given some examples, I submitted this a little quickly last night.
The issue is that ragged signals built using the constructor can possibly have signal_shape = () and navigation_shape = (). This results in an error when you use the map function on a ragged array with one dimension. For example, this block of code would fail (in this case the signal has a navigation dimension of ()):
import hyperspy.api as hs
import numpy as np
x = np.empty((1,), object)
x[0] = np.ones((2,3))
s = hs.signals.BaseSignal(x, ragged=True)
s.map(np.sum)
But this code block will not (in this case the signal has a navigation dimension of 1)
The reason this fails is that the rechunking function tries to rechunk a fundamentally one-dimensional array while hyperspy is telling it that the array is zero-dimensional. I should also note that there are some inconsistencies with numpy, which had some small effects when trying to guess the output:
import hyperspy.api as hs
import numpy as np
x = np.empty((1,), object)
x[0] = np.ones((2,3))
s = hs.signals.BaseSignal(x)
s.ragged=True
s.inav[0]
# returns a ragged array
s.data[0]
# returns the object at index 0
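The difference between s.inav[0] and s.data[0] noted above has a plain numpy analogue, sketched here without hyperspy: integer indexing an object array unwraps the stored element, while slicing keeps the object-array container. This only illustrates the numpy side and makes no claim about how hyperspy implements inav.

```python
import numpy as np

# One-element object array holding a (2, 3) array, as in the example above.
x = np.empty((1,), dtype=object)
x[0] = np.ones((2, 3))

element = x[0]      # integer index: the stored ndarray itself
container = x[0:1]  # slice: still an object array with one entry

print(type(element), element.shape)      # inner float array, shape (2, 3)
print(container.dtype, container.shape)  # object dtype, shape (1,)
```

So "index 0" can mean either the inner array or a length-one ragged container, depending on which of the two behaviours is followed.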
Yeah, so the issue is that this chunk of code doesn't run:
import hyperspy.api as hs
import numpy as np
def add_sum(image, add):
    out = np.sum(image) + np.sum(add)
    return out
x = np.ones((10,20,2,3))
s = hs.signals.Signal2D(x)
s_add = hs.signals.BaseSignal(2 * np.ones((10, 20, 2, 2, 2))).transpose(3)
s_out = s.map(add_sum, inplace=False, add=s_add)  # This doesn't work
s_out = s_add.map(add_sum, inplace=False, add=s)  # This does work
The issue here is that the
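For reference, the result that both map calls are meant to produce can be written as a plain numpy loop over the shared (10, 20) positions. This is a sketch of the intended semantics only, assuming the first two data axes are the navigation axes of both arrays; it is not how hyperspy's map is implemented.

```python
import numpy as np

def add_sum(image, add):
    return np.sum(image) + np.sum(add)

x = np.ones((10, 20, 2, 3))           # data of s: one (2, 3) image per position
add = 2 * np.ones((10, 20, 2, 2, 2))  # data of s_add: one (2, 2, 2) block per position

# Assumption: axes 0 and 1 are the navigation axes of both arrays.
nav_shape = x.shape[:2]
expected = np.empty(nav_shape)
for idx in np.ndindex(*nav_shape):
    expected[idx] = add_sum(x[idx], add[idx])

print(expected.shape)  # (10, 20)
print(expected[0, 0])  # 22.0
```

Each position sums six ones and eight twos, so the order in which the two signals are passed should not change the values.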
Ahh, I must have missed the recent stuff. I think that this is more consistent than before. At the very least, the errors thrown in the above cases are very uninformative and potentially confusing. There is a high likelihood that few people have come across them, but I recently was trying to map a couple of ragged signals and came across both of these edge cases. |
Thanks for the details, now I see where it comes from. I think we need to define what the expected and desired behaviour is.
This looks to me like a bug: these two (slightly different) ways of creating a signal should give the same type of signal. What I would expect here is to have the same navigation shape regardless of whether the signal is ragged or not. In this case, the navigation dimension is zero so I would expect the navigation shape to be
Same bug as mentioned directly (in this comment) above,
It looks to me that |
I guess that makes some sense; as long as the map function works on zero-dimensional signals, I think that it is not terribly important. I am still thinking that the array
print(np.empty((), dtype=object).shape) # ()
print(np.empty((1), dtype=object).shape) # (1,)
print(np.empty((2), dtype=object).shape) # (2,)
It doesn't seem like the second and third examples should have different numbers of dimensions in hyperspy when in numpy they have the same number of dimensions.
I think it should be? Based on my comment above. I would rather it actually return a non ragged signal kind of like how numpy handles it. I think that would fix some of the weird edge cases you are talking about.
I think the issue is bigger than ragged signals as from my example you can see that it happens with non ragged signals. Ragged signals are just the signals where it happens the most because they have no signal dimensions. Anyways, let me know what you think. I haven't really seen much of an error with ragged signals always having a navigation axis = 1 but obviously you have some more experience with this. |
I looked over this a little bit more. It seems like there is still a bug with ragged signals mapped with a larger signal. This test fails, for example, because extra dimensions are added:
def test_iter_kwarg_larger_shape_ragged(self):
    def return_img(image, add):
        return image
    x = np.empty((1,), dtype=object)
    x[0] = np.ones((4, 2))
    s = hs.signals.BaseSignal(x, ragged=True)
    s_add = hs.signals.BaseSignal(2 * np.ones((1, 201, 101))).transpose(2)
    s_out = s.map(return_img, inplace=False, add=s_add, ragged=True)
    np.testing.assert_array_equal(s_out.data[0], x[0])
I'll look this over a little bit more and see if I can find a way to fix this. |
Yes, this is exactly one thing that I have to make more consistent as part of #2842. The rationale is that the ragged dimension is in the signal dimension and this is communicated to the users through the
data = np.empty((2, ), dtype=object)
data[0] = np.arange(3) # data index 0 has shape (3, )
data[1] = np.arange(4) # data index 1 has shape (4, )
print(data.shape) # data shape is (2,)
and the ragged dimension is not visible at all, while it is definitely there! Considering this example, hyperspy is behaving consistently with numpy, as the ragged dimension is not visible. Hyperspy does explicitly mention it in the
Sometimes it is not possible to find a solution that provides perfect consistency between numpy and hyperspy and all hyperspy features, and we have to define a compromise somewhere. One of the concerns that I have tried to address in #2842 is that slicing of ragged signals should behave consistently, as illustrated by the example in #2842 (comment). Then other things come up during the review. I would suggest separating the discussions, because otherwise it will be difficult to follow for anyone (and also for our future selves! ;)):
On these two points, it seems that you reported two different bugs here (inconsistency when creating ragged signal for the first point). |
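To make the numpy example above concrete, here is a minimal sketch of how the hidden ragged dimension can still be recovered from an object array by inspecting its elements. The variable name element_shapes is only illustrative and does not correspond to anything in hyperspy.

```python
import numpy as np

data = np.empty((2,), dtype=object)
data[0] = np.arange(3)
data[1] = np.arange(4)

# The container shape only describes the outer axis.
print(data.shape)  # (2,)

# The ragged shapes have to be read from the elements themselves.
element_shapes = [item.shape for item in data]
print(element_shapes)  # [(3,), (4,)]
```

Because the per-element shapes differ, no single shape value can describe them, which is why the ragged dimension does not show up in data.shape.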
That sounds like a good idea. I kind of imagined this as a small fix, but obviously there is more to it than first meets the eye. I will probably close this PR, move the relevant discussion and refactor the PR into two separate cases. |
@CSSFrancis, I think this PR can be closed now, after #2878 and #2903 have been merged? |
Description of the change
This is kind of a small change to fix some bugs that I have come across with the map function
Progress of the PR
upcoming_changes folder (see upcoming_changes/README.rst),