Add Changelog and document reason for changes.
CSSFrancis committed Jan 23, 2024
1 parent 995818c commit 68b72bf
Showing 2 changed files with 6 additions and 6 deletions.
11 changes: 5 additions & 6 deletions rsciio/_hierarchical.py
@@ -263,13 +263,12 @@ def _read_array(group, dataset_key):
         key = "ragged_shapes"
     if key in group:
         ragged_shape = group[key]
-        # if the data is chunked saved array we must first
-        # cast to a numpy array to avoid multiple calls to
-        # _decode_chunk in zarr (or h5py)
+        # Use same chunks as data so that apply_gufunc doesn't rechunk
+        # Reduces the transfer of data between workers which
+        # significantly improves performance for distributed loading
         data = da.from_array(data, chunks=data.chunks)
-        shape = da.from_array(
-            ragged_shape, chunks=data.chunks
-        )  # same chunks as data
+        shape = da.from_array(ragged_shape, chunks=data.chunks)
+
         data = da.apply_gufunc(unflatten_data, "(),()->()", data, shape)
     return data
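
The chunk-matching idea in this hunk can be sketched in isolation. The example below is a minimal illustration, not the rsciio implementation: `unflatten` is a hypothetical stand-in for `unflatten_data` (whose body is not shown in this commit), and the in-memory object arrays are assumptions standing in for the `zarr`/`h5py` datasets. It wraps the flattened data and the per-element shapes with identical chunks, so `da.apply_gufunc` can pair blocks one-to-one and the result keeps the chunking of `data`.

```python
import numpy as np
import dask.array as da


def unflatten(flat, shapes):
    # Hypothetical stand-in for rsciio's unflatten_data: reshape every
    # flattened element back to the shape recorded for it.
    out = np.empty(flat.shape, dtype=object)
    for idx in np.ndindex(flat.shape):
        out[idx] = np.reshape(flat[idx], shapes[idx])
    return out


# Three ragged elements stored flat, plus the original shape of each one.
# These arrays are illustrative; in the commit they come from the file store.
flat = np.empty(3, dtype=object)
shapes = np.empty(3, dtype=object)
for i, shp in enumerate([(2, 2), (1, 3), (3, 1)]):
    flat[i] = np.arange(int(np.prod(shp)))
    shapes[i] = shp

data = da.from_array(flat, chunks=2)
# Reuse the chunking of `data` so both inputs are already block-aligned.
shape = da.from_array(shapes, chunks=data.chunks)

# output_dtypes is passed explicitly so dask does not have to infer it by
# calling the function on dummy input.
result = da.apply_gufunc(unflatten, "(),()->()", data, shape, output_dtypes=object)
out = result.compute()
print(result.chunks, [a.shape for a in out])
```

If `shape` were created with different chunks, dask would first have to bring the two inputs onto a common chunking before applying the function, which is the extra data movement the new comment refers to.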

1 change: 1 addition & 0 deletions upcoming_changes/211.bugfix.rst
@@ -0,0 +1 @@
+Fix saving ragged arrays of vectors from/to a chunked ``hspy`` and ``zspy`` store. Greatly increases the speed of saving and loading ragged arrays from chunked datasets.
