Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce dtype=object for incompatible numpy array conversion #417

Merged
merged 2 commits into from
Jan 13, 2023

Conversation

bnavigator
Copy link
Contributor

@bnavigator bnavigator commented Jan 12, 2023

Numpy 1.24 does not accept ragged sequences without dtype=object anymore: numpy/numpy#22004

Fixes #333

Before (#333 (comment) and #416):

[   51s] ____________________________ test_non_numpy_inputs _____________________________
[   51s] 
[   51s]     def test_non_numpy_inputs():
[   51s]         # numpy will infer a range of different shapes and dtypes for these inputs.
[   51s]         # Make sure that round-tripping through encode preserves this.
[   51s]         data = [
[   51s]             [0, 1],
[   51s]             [[0, 1], [2, 3]],
[   51s]             [[0], [1], [2, 3]],
[   51s]             [[[0, 0]], [[1, 1]], [[2, 3]]],
[   51s]             ["1"],
[   51s]             ["11", "11"],
[   51s]             ["11", "1", "1"],
[   51s]             [{}],
[   51s]             [{"key": "value"}, ["list", "of", "strings"]],
[   51s]         ]
[   51s]         for input_data in data:
[   51s]             for codec in codecs:
[   51s] >               output_data = codec.decode(codec.encode(input_data))
[   51s] 
[   51s] ../../BUILDROOT/python-numcodecs-0.11.0-0.x86_64/usr/lib64/python3.8/site-packages/numcodecs/tests/test_json.py:72: 
[   51s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   51s] 
[   51s] self = JSON(encoding='utf-8', allow_nan=True, check_circular=True, ensure_ascii=True,
[   51s]      indent=None, separators=(',', ':'), skipkeys=False, sort_keys=True,
[   51s]      strict=True)
[   51s] buf = [[0], [1], [2, 3]]
[   51s] 
[   51s]     def encode(self, buf):
[   51s] >       buf = np.asarray(buf)
[   51s] E       ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.
[   51s] 
[   51s] ../../BUILDROOT/python-numcodecs-0.11.0-0.x86_64/usr/lib64/python3.8/site-packages/numcodecs/json.py:57: ValueError
[   51s] ____________________________ test_non_numpy_inputs _____________________________
[   51s] 
[   51s]     def test_non_numpy_inputs():
[   51s]         codec = MsgPack()
[   51s]         # numpy will infer a range of different shapes and dtypes for these inputs.
[   51s]         # Make sure that round-tripping through encode preserves this.
[   51s]         data = [
[   51s]             [0, 1],
[   51s]             [[0, 1], [2, 3]],
[   51s]             [[0], [1], [2, 3]],
[   51s]             [[[0, 0]], [[1, 1]], [[2, 3]]],
[   51s]             ["1"],
[   51s]             ["11", "11"],
[   51s]             ["11", "1", "1"],
[   51s]             [{}],
[   51s]             [{"key": "value"}, ["list", "of", "strings"]],
[   51s]             [b"1"],
[   51s]             [b"11", b"11"],
[   51s]             [b"11", b"1", b"1"],
[   51s]             [{b"key": b"value"}, [b"list", b"of", b"strings"]],
[   51s]         ]
[   51s]         for input_data in data:
[   51s] >           actual = codec.decode(codec.encode(input_data))
[   51s] 
[   51s] ../../BUILDROOT/python-numcodecs-0.11.0-0.x86_64/usr/lib64/python3.8/site-packages/numcodecs/tests/test_msgpacks.py:75: 
[   51s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   51s] 
[   51s] self = MsgPack(raw=False, use_bin_type=True, use_single_float=False)
[   51s] buf = [[0], [1], [2, 3]]
[   51s] 
[   51s]     def encode(self, buf):
[   51s] >       buf = np.asarray(buf)
[   51s] E       ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.
[   51s] 
[   51s] ../../BUILDROOT/python-numcodecs-0.11.0-0.x86_64/usr/lib64/python3.8/site-packages/numcodecs/msgpacks.py:55: ValueError

@codecov
Copy link

codecov bot commented Jan 13, 2023

Codecov Report

Merging #417 (bee8aa7) into main (955019b) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main      #417   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           54        54           
  Lines         2089      2095    +6     
=========================================
+ Hits          2089      2095    +6     
Impacted Files Coverage Δ
numcodecs/json.py 100.00% <100.00%> (ø)
numcodecs/msgpacks.py 100.00% <100.00%> (ø)
numcodecs/tests/test_json.py 100.00% <100.00%> (ø)
numcodecs/tests/test_msgpacks.py 100.00% <100.00%> (ø)

@martindurant
Copy link
Member

+1 from me. Will merge unless someone objects.

@martindurant martindurant merged commit 4f2a2e3 into zarr-developers:main Jan 13, 2023
@martindurant
Copy link
Member

Thanks, @bnavigator

@rabernat rabernat mentioned this pull request Jan 15, 2023
7 tasks
@joshmoore
Copy link
Member

+1 from me. Will merge unless someone objects.

Still catching up after the holidays. Probably my only concern is that with v3 we probably want a more explicit handle on when object is permissible, but that's a lot longer-term than numpy 1.24 which is here now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

flexible codecs cannot handle compound dtypes containing objects
3 participants