Ensure correct endian in numba_histogram #2678

jlaehne · 2021-03-16T15:05:37Z

Description of the change

Makes sure that numba_histogram runs for big-endian dtypes such as >u2 or >f4 by converting the data to little endian before execution.

Closes #2668

Progress of the PR

Change implemented,
extend to other numba functions,
add tests,
ready for review.

Minimal example of the bug fix or the new feature

import numpy as np
from hyperspy.misc.array_tools import numba_histogram

# explicit little endian: works
arr = np.arange(100, dtype='<u2')
numba_histogram(arr, 5, (0, 100))

# native byte order: works
arr = np.arange(100, dtype='u2')
numba_histogram(arr, 5, (0, 100))

# explicit big endian: works with this change (see #2668)
arr = np.arange(100, dtype='>u2')
numba_histogram(arr, 5, (0, 100))
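The three calls above could be folded into one regression check. The following is only a sketch under stated assumptions, not the test added in this PR: `histogram_counts` is a hypothetical helper that uses `np.histogram` as a stand-in for `numba_histogram`, so it runs without HyperSpy or numba installed.

```python
import numpy as np


def histogram_counts(data, bins, ranges):
    """Hypothetical stand-in for numba_histogram (same argument order assumed)."""
    # Convert to the platform's native byte order before the computation.
    if not data.dtype.isnative:
        data = data.astype(data.dtype.type)
    counts, _ = np.histogram(data, bins=bins, range=ranges)
    return counts


# Reference result computed from native-endian data.
reference = histogram_counts(np.arange(100).astype("u2"), 5, (0, 100))

# The same values stored with different byte orders must give the same counts.
for dtype in ("<u2", "u2", ">u2", ">f4"):
    arr = np.arange(100).astype(dtype)
    result = histogram_counts(arr, 5, (0, 100))
    np.testing.assert_array_equal(result, reference)
```

In a real test suite the loop would more naturally be a `pytest.mark.parametrize` over the dtype strings, calling the actual `numba_histogram`.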

jlaehne · 2021-03-16T15:10:28Z

The question is whether this affects any other functions that run numba, namely:

_linear_bin_loop in array_tools.py
lowess in lowess_smooth.py
_fast_mean and _fast_std in peakfinders2D.py

They would probably need a similar wrapper.
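One way such a wrapper might look is a small decorator that converts the first array argument to native byte order before the wrapped function runs. This is a hedged sketch, not the code merged in this PR: `ensure_native_endian` and `_fast_mean_stand_in` are hypothetical names, and the stand-in is plain Python rather than a numba-jitted function.

```python
import functools

import numpy as np


def ensure_native_endian(func):
    """Hypothetical decorator: hand the first array argument over in native byte order."""

    @functools.wraps(func)
    def wrapper(data, *args, **kwargs):
        data = np.asarray(data)
        if not data.dtype.isnative:
            # Casting to the scalar type gives a native-endian copy.
            data = data.astype(data.dtype.type)
        return func(data, *args, **kwargs)

    return wrapper


@ensure_native_endian
def _fast_mean_stand_in(data):
    # Placeholder body; in practice this would be a numba-jitted function.
    return data.sum() / data.size


big_endian = np.arange(10).astype(">f4")
mean_value = _fast_mean_stand_in(big_endian)
```

For private functions the same conversion could instead live in the public function that calls them, which avoids wrapping every jitted helper.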

francisco-dlp · 2021-03-16T15:29:39Z

hyperspy/misc/array_tools.py

@@ -392,6 +391,17 @@ def numba_histogram(data, bins, ranges):
    hist : array
        The values of the histogram.
    """
+    if data.dtype.byteorder == '>':


What about the following instead?

    if data.dtype.byteorder == '>':
        data = data.byteswap().newbyteorder()
    return _numba_histogram(data, bins, ranges)

It is the same, but easier to maintain...

Sure, no problem.

Should I add it for the other 4 numba-functions? I think we cannot rule out that someone calls them with the wrong endian. For the private functions I could just add it to the calling functions, only lowess would need a wrapper.

The issue is that numba only supports native endianness, so this approach will not work on all platforms. What about:

    # Numba only supports native dtypes
    # https://github.com/numba/numba/issues/2243
    if not data.dtype.isnative:
        data = data.astype(data.dtype.type)

Yes, that should do the job, indeed.
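The difference between the two checks can be seen in plain NumPy. `dtype.byteorder` reports `'='` for native data, so a comparison against `'>'` only catches big-endian data on a little-endian machine, whereas `dtype.isnative` covers either platform. The sketch below is an illustration under that reading; the variable names are made up for the example.

```python
import sys

import numpy as np

# Pick the byte order that is *not* native on the machine running this sketch.
non_native = ">" if sys.byteorder == "little" else "<"

arr = np.arange(5).astype(non_native + "u2")
assert arr.dtype.byteorder == non_native
assert not arr.dtype.isnative

# Conversion suggested in the review: cast to the scalar type, which is native.
native = arr.astype(arr.dtype.type)
assert native.dtype.isnative
assert native.dtype.byteorder == "="
np.testing.assert_array_equal(native, arr)

# Alternative: swap the bytes in memory and reinterpret with the swapped dtype.
# (ndarray.newbyteorder() was removed in NumPy 2.0, hence the view.)
swapped = arr.byteswap().view(arr.dtype.newbyteorder())
assert swapped.dtype.isnative
np.testing.assert_array_equal(swapped, arr)
```

Both routes preserve the values; the `astype` form is shorter and does not depend on which byte order the input happens to have.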

codecov · 2021-03-16T16:22:21Z

Codecov Report

Merging #2678 (9452c96) into RELEASE_next_patch (c7a662c) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@                  Coverage Diff                   @@
##           RELEASE_next_patch    #2678      +/-   ##
======================================================
+ Coverage               76.62%   76.66%   +0.04%     
======================================================
  Files                     201      201              
  Lines                   29655    29666      +11     
  Branches                 6491     6495       +4     
======================================================
+ Hits                    22722    22743      +21     
+ Misses                   5174     5168       -6     
+ Partials                 1759     1755       -4

Impacted Files	Coverage Δ
hyperspy/misc/array_tools.py	`83.46% <100.00%> (+7.43%)`	⬆️
hyperspy/misc/lowess_smooth.py	`100.00% <100.00%> (ø)`
hyperspy/signal.py	`75.87% <0.00%> (+0.10%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

ericpre

Nice to take the opportunity to increase the coverage, thanks!

jlaehne · 2021-03-17T13:57:34Z

> Nice to take the opportunity to increase the coverage, thanks!

Not much, but at least a few obvious ones.

In the end, I refrained from covering _fast_mean and _fast_std in peakfinders2D.py, because I could not trigger the error using the 'stat' peakfinder routine.

ensure correct endian in numba_histogram

1d2318c

jlaehne added release: next patch status: needs review labels Mar 16, 2021

jlaehne mentioned this pull request Mar 16, 2021

Histogram error with >u2 dtype #2668

Closed

jlaehne added this to the v1.6.2 milestone Mar 16, 2021

francisco-dlp reviewed Mar 16, 2021


francisco-dlp added status: waiting for author and removed status: needs review labels Mar 16, 2021

cover other numba functions

cf6a48a

jlaehne added 2 commits March 16, 2021 21:55

add test

fbe6286

increase coverage of rebin

9452c96

jlaehne added status: needs review and removed status: waiting for author labels Mar 16, 2021

ericpre approved these changes Mar 17, 2021


ericpre merged commit 19a0a7a into hyperspy:RELEASE_next_patch Mar 17, 2021

ericpre added type: bug-fix and removed status: needs review labels Mar 17, 2021

jlaehne deleted the numba_endian branch March 17, 2021 13:57

ericpre removed the release: next patch label Apr 20, 2021
