Bulk statistics very slow for non-contiguous array data #419

w-k-jones · 2024-03-18T09:50:23Z

I've recently noticed that bulk statistics can run very slowly when applied to non-contiguous data. This can happen when slicing dask arrays or broadcasting along the trailing dimension. Calling ravel on these arrays is ~20x slower than on contiguous arrays, and because we do this for each feature, it adds up to a big slowdown. I might look into smarter ways of doing this in future to address the issue.
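To illustrate why ravel gets expensive here, below is a minimal sketch using plain NumPy (not dask). It assumes a small array broadcast along a new trailing dimension; the array names and shapes are made up for illustration. For a contiguous array ravel can return a view, but for a non-contiguous one it has to copy the data, and that copy is then paid once per feature.

```python
import numpy as np

# Hypothetical 2D field, broadcast along a new trailing dimension.
base = np.arange(6.0).reshape(2, 3)
broadcasted = np.broadcast_to(base[..., np.newaxis], (2, 3, 4))

# A contiguous copy of the same values, for comparison.
contiguous = np.ascontiguousarray(broadcasted)

# The broadcast view has a zero stride on its last axis, so it is not C-contiguous.
is_broadcast_contiguous = bool(broadcasted.flags["C_CONTIGUOUS"])
is_copy_contiguous = bool(contiguous.flags["C_CONTIGUOUS"])

# ravel returns a view where it can, and a fresh copy where it cannot.
ravel_of_contiguous_is_view = np.shares_memory(contiguous.ravel(), contiguous)
ravel_of_broadcast_is_view = np.shares_memory(broadcasted.ravel(), broadcasted)

print(is_broadcast_contiguous, is_copy_contiguous)
print(ravel_of_contiguous_is_view, ravel_of_broadcast_is_view)
```

Making the data contiguous once up front (for example with `np.ascontiguousarray`) is one possible way to avoid repeating the copy, at the cost of extra memory.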

Using np.split might be a fast approach, as shown in https://stackoverflow.com/a/43094244
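A rough sketch of how a sort-and-split approach could look, in the spirit of the linked answer. This is not the existing implementation: the function name `grouped_statistic`, its arguments, and the assumption that label 0 marks background are all hypothetical. The idea is to flatten the labels and data once, sort by label, and cut the sorted data at the label boundaries with `np.split`, instead of masking and raveling once per feature.

```python
import numpy as np


def grouped_statistic(labels, data, func=np.mean):
    """Apply func to the data values of each label (hypothetical helper).

    Assumes labels and data have the same shape and that label 0 is background.
    """
    # Flatten once for all features, rather than once per feature.
    flat_labels = np.asarray(labels).ravel()
    flat_data = np.asarray(data).ravel()

    # Drop background points (assumption: label 0 means "no feature").
    valid = flat_labels > 0
    flat_labels = flat_labels[valid]
    flat_data = flat_data[valid]

    # Sort by label so that each feature's values are adjacent.
    order = np.argsort(flat_labels, kind="stable")
    sorted_labels = flat_labels[order]
    sorted_data = flat_data[order]

    # The first index of each unique label marks a group boundary.
    unique_labels, start = np.unique(sorted_labels, return_index=True)
    groups = np.split(sorted_data, start[1:])

    return {int(label): func(group) for label, group in zip(unique_labels, groups)}


labels = np.array([[1, 1, 0], [2, 2, 2]])
data = np.arange(6.0).reshape(2, 3)
result = grouped_statistic(labels, data)
print(result)
```

Whether this is actually faster for real inputs would need benchmarking; the sort costs O(n log n), but it is done once rather than per feature.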


w-k-jones added the enhancement label Mar 18, 2024
