Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix normalize_fill_value for structured arrays #1397

Merged
merged 5 commits into from May 4, 2023

Conversation

alanhdu
Copy link
Contributor

@alanhdu alanhdu commented Apr 20, 2023

The FutureWarning identified in #552 is now a TypeError in the latest numpy release. We can fix this bu checking whether isinstance(fill_value, np.void) before checking whether fill_value == 0 (that way, if fill_value is already a structured dtype, then we do not compare it with a scalar).

One symptom of this is that

import numpy as np
import zarr

dtype = np.dtype([("a", np.float64), ("b", np.int64)])
zarr.open_array([], shape=1000, dtype=dtype

will crash with

TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 zarr.open_array({}, shape=1000, dtype=dtype)

File ~/workspace/zarr-python/zarr/creation.py:586, in open_array(store, mode, shape, chunks, dtype, compressor, fill_value, order, synchronizer, filters, cache_metadata, cache_attrs, path, object_codec, chunk_store, storage_options, partial_decompress, write_empty_chunks, zarr_version, dimension_separator, **kwargs)
    584         if contains_group(store, path=path):
    585             raise ContainsGroupError(path)
--> 586         init_array(store, shape=shape, chunks=chunks, dtype=dtype,
    587                    compressor=compressor, fill_value=fill_value,
    588                    order=order, filters=filters, path=path,
    589                    object_codec=object_codec, chunk_store=chunk_store,
    590                    dimension_separator=dimension_separator)
    592 elif mode in ['w-', 'x']:
    593     if contains_group(store, path=path):

File ~/workspace/zarr-python/zarr/storage.py:438, in init_array(store, shape, chunks, dtype, compressor, fill_value, order, overwrite, path, chunk_store, filters, object_codec, dimension_separator, storage_transformers)
    435 if not compressor:
    436     # compatibility with legacy tests using compressor=[]
    437     compressor = None
--> 438 _init_array_metadata(store, shape=shape, chunks=chunks, dtype=dtype,
    439                      compressor=compressor, fill_value=fill_value,
    440                      order=order, overwrite=overwrite, path=path,
    441                      chunk_store=chunk_store, filters=filters,
    442                      object_codec=object_codec,
    443                      dimension_separator=dimension_separator,
    444                      storage_transformers=storage_transformers)

File ~/workspace/zarr-python/zarr/storage.py:515, in _init_array_metadata(store, shape, chunks, dtype, compressor, fill_value, order, overwrite, path, chunk_store, filters, object_codec, dimension_separator, storage_transformers)
    513 chunks = normalize_chunks(chunks, shape, dtype.itemsize)
    514 order = normalize_order(order)
--> 515 fill_value = normalize_fill_value(fill_value, dtype)
    517 # optional array metadata
    518 if dimension_separator is None and store_version == 2:

File ~/workspace/zarr-python/zarr/util.py:298, in normalize_fill_value(fill_value, dtype)
    295 if fill_value is None or dtype.hasobject:
    296     # no fill value
    297     pass
--> 298 elif fill_value == 0:
    299     # this should be compatible across numpy versions for any array type, including
    300     # structured arrays
    301     fill_value = np.zeros((), dtype=dtype)[()]
    303 elif dtype.kind == 'U':
    304     # special case unicode because of encoding issues on Windows if passed through numpy
    305     # https://github.com/alimanfoo/zarr/pull/172#issuecomment-343782713

TypeError: Cannot compare structured or void to non-void arrays.

Closes #552.

@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Apr 20, 2023
@joshmoore
Copy link
Member

joshmoore commented Apr 28, 2023

Thanks for taking a look at this, @alanhdu! I've triggered the workflows.

@codecov
Copy link

codecov bot commented May 2, 2023

Codecov Report

Merging #1397 (2ebbf19) into main (d54f25c) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main     #1397   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           37        37           
  Lines        14852     14857    +5     
=========================================
+ Hits         14852     14857    +5     
Impacted Files Coverage Δ
zarr/tests/test_util.py 100.00% <100.00%> (ø)
zarr/util.py 100.00% <100.00%> (ø)

... and 2 files with indirect coverage changes

@github-actions github-actions bot removed the needs release notes Automatically applied to PRs which haven't added release notes label May 3, 2023
@joshmoore
Copy link
Member

Thanks again, @alanhdu. I've added a changelog entry if you'd like to review. Workflows are running again with an updated main branch. Unless anyone has an objection, I'd say let's get this into 2.15.x.

@joshmoore joshmoore merged commit 25e6036 into zarr-developers:main May 4, 2023
19 checks passed
@alanhdu alanhdu deleted the structarray branch May 4, 2023 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FutureWarning from zarr.open
2 participants