-
-
Notifications
You must be signed in to change notification settings - Fork 383
perf:remove isinstance check #3704
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf:remove isinstance check #3704
Conversation
|
35% perf improvement seems good |
| dtype = replace(chunk_spec.dtype, endianness=endian_str).to_native_dtype() # type: ignore[call-arg] | ||
| else: | ||
| dtype = chunk_spec.dtype.to_native_dtype() | ||
| as_array_like = chunk_bytes.as_array_like() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just have as_array_like become as_ndarray_like?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean just rename the variable? We have to choose where the surprise is: at the as_array_like call (surprising if it returns an ndarraylike) or from_ndarray_like (surprising if it accepts an arraylike) call. I don't see much of a difference here, but happy to rename if you feel strongly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(this is why I shouldn't reply when half-asleep at night)
Apologies for the confusion. Can we not have Buffer.as_ndarray_like() that makes the bytes ready for the codec pipeline in the form the codec pipeline needs it i.e. NDArrayLike? That way it is type safe and performant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah sorry so a new method, that ensures that the contents of the buffer are an ndarray? yeah I think that should be easy to add!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it OK if we spin that out into a separate issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but isn't the needed change just modifying as_array_like to call np.asanyarray in the CPU buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't this requires changing the as_array_like method for gpu buffer too, and/or the method on the abc?
zarr-python/src/zarr/core/buffer/core.py
Line 234 in e03cfc8
| def as_array_like(self) -> ArrayLike: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ya, up to you as to when you want to fix it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Buffer is public api so I don't think we want to change that as part of a performance fix.
|
Also, this will close #3703 ? |
it will close part of it, but not the underlying issue about our basic buffer / codec design. |
dcherian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but I have a preference for a new as_ndarray_like method that removes this if from the hotpath.
removes an expensive
isinstancecheck insideBytesCodec._decode_single. Isinstance onruntime_checkableprotocols is expensive and this particular check is in a hotspot. Without the check, we are slightly less type-safe, but users who somehow get a non-ndarray into this part of the code will get an immediate a runtime error.