Merged
1 change: 1 addition & 0 deletions changes/3704.misc.md
@@ -0,0 +1 @@
+Remove an expensive `isinstance` check from the bytes codec decoding routine.
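For context on why such a check can be costly: `isinstance` against a runtime-checkable `typing.Protocol` is structural, so Python inspects the object for every protocol member rather than doing a plain class comparison. The sketch below assumes the removed check targeted a protocol of that kind. `HypotheticalNDArrayLike` is an illustrative stand-in with made-up members, not the definition used in zarr.

```python
from typing import Any, Protocol, runtime_checkable

import numpy as np


@runtime_checkable
class HypotheticalNDArrayLike(Protocol):
    """Illustrative stand-in protocol; NOT the real zarr NDArrayLike."""

    @property
    def dtype(self) -> Any: ...

    @property
    def shape(self) -> Any: ...

    def view(self, dtype: Any) -> Any: ...

    def reshape(self, *shape: Any) -> Any: ...


arr = np.arange(4, dtype="uint8")

# Structural check: every protocol member is looked up on the object.
is_nd = isinstance(arr, HypotheticalNDArrayLike)

# Nominal check against a concrete class: an ordinary type comparison.
is_np = isinstance(arr, np.ndarray)

# bytes has none of the protocol members, so the structural check fails.
bytes_is_nd = isinstance(b"\x00\x01", HypotheticalNDArrayLike)
```

In a hot path such as per-chunk decoding, the structural check runs once per chunk, which is presumably where the cost adds up.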
11 changes: 2 additions & 9 deletions src/zarr/codecs/bytes.py
@@ -5,10 +5,8 @@
 from enum import Enum
 from typing import TYPE_CHECKING
 
-import numpy as np
-
 from zarr.abc.codec import ArrayBytesCodec
-from zarr.core.buffer import Buffer, NDArrayLike, NDBuffer
+from zarr.core.buffer import Buffer, NDBuffer
 from zarr.core.common import JSON, parse_enum, parse_named_configuration
 from zarr.core.dtype.common import HasEndianness

@@ -72,20 +70,15 @@ async def _decode_single(
         chunk_bytes: Buffer,
         chunk_spec: ArraySpec,
     ) -> NDBuffer:
-        assert isinstance(chunk_bytes, Buffer)
         # TODO: remove endianness enum in favor of literal union
         endian_str = self.endian.value if self.endian is not None else None
         if isinstance(chunk_spec.dtype, HasEndianness):
             dtype = replace(chunk_spec.dtype, endianness=endian_str).to_native_dtype()  # type: ignore[call-arg]
         else:
             dtype = chunk_spec.dtype.to_native_dtype()
         as_array_like = chunk_bytes.as_array_like()
Contributor

can we just have as_array_like become as_ndarray_like?

Contributor Author

you mean just rename the variable? We have to choose where the surprise is: at the as_array_like call (surprising if it returns an NDArrayLike) or at the from_ndarray_like call (surprising if it accepts an ArrayLike). I don't see much of a difference here, but happy to rename if you feel strongly

Contributor

@dcherian, Feb 13, 2026

(this is why I shouldn't reply when half-asleep at night)

Apologies for the confusion. Can we not have Buffer.as_ndarray_like() that makes the bytes ready for the codec pipeline in the form the pipeline needs, i.e. NDArrayLike? That way it is type safe and performant.

Contributor Author

ah sorry, so a new method that ensures that the contents of the buffer are an ndarray? yeah I think that should be easy to add!

Contributor Author

is it OK if we spin that out into a separate issue?

Contributor

@dcherian, Feb 13, 2026

Sure, but isn't the needed change just modifying as_array_like to call np.asanyarray in the CPU buffer?
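A minimal sketch of the accessor under discussion, assuming a CPU-backed buffer that wraps an object supporting the buffer protocol. `SketchCpuBuffer` and its `as_ndarray_like` method are hypothetical names for illustration; they are not part of zarr's `Buffer` API.

```python
import numpy as np


class SketchCpuBuffer:
    """Hypothetical stand-in for a CPU buffer; not zarr's Buffer class."""

    def __init__(self, data):
        # Any array-like payload, e.g. a memoryview or a NumPy array.
        self._data = data

    def as_array_like(self):
        # Existing-style accessor: hands back whatever was stored.
        return self._data

    def as_ndarray_like(self) -> np.ndarray:
        # Hypothetical accessor: always returns an ndarray.
        # np.asanyarray passes an existing ndarray through without copying.
        return np.asanyarray(self._data)


# A memoryview becomes a uint8 array that can be reinterpreted with .view().
buf = SketchCpuBuffer(memoryview(b"\x01\x00\x02\x00"))
nd = buf.as_ndarray_like()
as_u16 = nd.view("<u2")

# An ndarray payload is returned as the very same object.
src = np.arange(4, dtype="uint8")
same = SketchCpuBuffer(src).as_ndarray_like()
```

One caveat with this approach: `np.asanyarray` on a plain `bytes` object produces a zero-dimensional array of a bytes-string dtype rather than a `uint8` array, so a real implementation would need to handle that case separately (for example with `np.frombuffer`).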

Contributor Author

doesn't this require changing the as_array_like method for gpu buffer too, and/or the method on the abc?

def as_array_like(self) -> ArrayLike:

Contributor

ya, up to you as to when you want to fix it...

Contributor Author

Buffer is public API so I don't think we want to change that as part of a performance fix.

-        if isinstance(as_array_like, NDArrayLike):
-            as_nd_array_like = as_array_like
-        else:
-            as_nd_array_like = np.asanyarray(as_array_like)
         chunk_array = chunk_spec.prototype.nd_buffer.from_ndarray_like(
-            as_nd_array_like.view(dtype=dtype)
+            as_array_like.view(dtype=dtype)  # type: ignore[attr-defined]
         )
 
         # ensure correct chunk shape
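The decoding step in this hunk reinterprets the raw chunk bytes with a dtype whose byte order comes from the codec configuration. The following is a rough NumPy-only illustration of that idea. It does not use zarr's dtype wrappers or `replace(...)`, and the byte values and chunk shape are made up for the example.

```python
import numpy as np

# Four raw bytes standing in for a decoded chunk payload.
raw = np.frombuffer(b"\x00\x01\x00\x02", dtype="uint8")

# Build the target dtype with an explicit byte order, then reinterpret
# the same memory without copying.
big_endian = np.dtype("uint16").newbyteorder(">")
little_endian = np.dtype("uint16").newbyteorder("<")

as_big = raw.view(big_endian)
as_little = raw.view(little_endian)

# Reshape to the expected chunk shape (hypothetical shape for this sketch).
chunk = as_big.reshape((2, 1))
```

The same four bytes yield different values depending on the byte order, which is why the dtype has to be adjusted before the `.view(dtype=dtype)` call rather than after it.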