New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix ragged array #2842
Fix ragged array #2842
Conversation
Codecov Report
@@ Coverage Diff @@
## RELEASE_next_minor #2842 +/- ##
===================================================
Coverage 77.11% 77.11%
===================================================
Files 206 206
Lines 31600 31656 +56
Branches 6918 6934 +16
===================================================
+ Hits 24367 24412 +45
- Misses 5482 5489 +7
- Partials 1751 1755 +4
Continue to review full report at Codecov.
|
I very much agree with this improvement. Better handling of this type of "variable length" data will be useful for things like peak finding, where we typically get very different amounts of peaks per navigation position. |
I spent a bit of time trying to sort out failing pyxem functions caused by the improvements in HyperSpy Minimal working examplesIn the current
|
This is different bug, coming from a special case for dask array when creating lazy signal. This should be fixed in b7502ea. |
pyxem failure are:
details of the failures
From a quick look at the failure, it seems that there are most likely related and shouldn't be difficult to fix. |
@pc494, @CSSFrancis, @din14970, I don't know if you are using much ragged array (for sure, some of you use it much more than me!) and it you find working with ragged array annoying, this is a good moment to have a look at it to improve the situation. Thanks! One thing that I have left on this PR is to add a section on creating ragged signal in the user guide. Other than that it is ready for discussion. |
I think the most common use for ragged signals at the moment is peak finding, I'll try it for that use case. |
Yes, most likely. When tasting the water with pyxem on this, I have seen it been used for the indexation too. |
I agree with the above, by far the most prevalent use of ragged signals in pyxem at the moment is peak finding. I think having:
is a far clearer than what we have at present. I'm not sure I have much more to add, it looks good! |
c63bbb2
to
89ebab5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing the functionality:
Signal1D and Signal2D
import numpy as np
import hyperspy.api as hs
data = np.zeros((2, 10), dtype=object)
s = hs.signals.Signal1D(data, ragged=True)
print(s) # <Signal1D, title: , dimensions: (2|10ragged)>
s = hs.signals.Signal2D(data, ragged=True)
print(s) # <Signal2D, title: , dimensions: (|10, 2ragged)>
Should Signal1D
and Signal2D
allowed to be ragged
?
.map
import numpy as np
import hyperspy.api as hs
def afunction(image):
rand = np.random.randint(2, 50)
return image[0, :rand]
data = np.random.random((4, 6, 100, 100))
s = hs.signals.Signal2D(data)
s1 = s.map(afunction, ragged=True, inplace=False)
print(s1) # <BaseSignal, title: , dimensions: (6, 4|ragged)>
#########################
s_lazy = s.as_lazy()
s_lazy.data = s_lazy.data.rechunk((2, 3, 100, 100))
s1_lazy = s_lazy.map(afunction, ragged=True, inplace=False)
print(s1_lazy) # <LazySignal, title: , dimensions: (6, 4|)>
.map(..., ragged=True)
for non-lazy signal returns a ragged
signal. While a lazy signal does not. Not sure if this should be fixed in this pull request or not, since this will most likely be fixed in #2703 (non-lazy map improvements).
Thanks @magnunor for the review, this should all good now, except the suggestion regarding american versus british english, which I have left as it is. |
Some more points: The error messages when using import numpy as np
import hyperspy.api as hs
data = np.zeros((10, 15), dtype=object)
s = hs.signals.BaseSignal(data, ragged=True)
s.as_signal2D(image_axes=(0, 1)) # RuntimeError: Signal with ragged dimension can't be transposed. This also extends to Plotting of lazy ragged signals gives bad error: import numpy as np
import hyperspy.api as hs
data = np.zeros((10, 15), dtype=object)
s = hs.signals.BaseSignal(data, ragged=True).as_lazy()
s.plot() # IndexError: list index out of range As opposed to non-lazy signals: |
I thought about it and I think that the current situation is better:
Done |
…ragged array or when trying to create ragged Signal1D or Signal2D
777b151
to
9a37d1a
Compare
@magnunor, do you want to do another review? |
I'll try to do another review during the weekend. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some minor comments. After those are addressed, the pull request should be good for merging!
40e8419
to
e8f0523
Compare
LGTM! |
The changes in #2773 break the pyxem test suite. In my opinion, using the use of ragged array is not clear enough and there is an inconsistency in slicing array. This has been been discussed and introduced in #1336, in particular starting from #1336 (comment).
Before this PR
With this PR
Progress of the PR
upcoming_changes
folder (seeupcoming_changes/README.rst
),readthedocs
doc build of this PR (link in github checks)