Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix reading dtype from zarr dataset #72

Merged
merged 6 commits into from
May 9, 2023

Conversation

alejoe91
Copy link
Collaborator

@alejoe91 alejoe91 commented May 8, 2023

Hi guys,

I'm experimenting now with writing to Zarr using a custom DataChunkIterator from neuroconv bassed to the ZarrDataIO.

Everything works fine, but upon reading the file, I get that the zarr_type is not found in the attirbutes. However, this can be inferred from the dtype of the zarr.Dataset.
Not sure this is the most elegant solution and I didn't have time to dig in and find a better one, but it works in my case :)

@oruebel
Copy link
Contributor

oruebel commented May 8, 2023

@alejoe91 it sounds like the zarr_dtype attribute is not being set when writing with DataChunkIterator. From a quick look I think the following code in __setup_chunked_dataset__

try:
dset = parent.create_dataset(name, **io_settings)
except Exception as exc:
raise Exception("Could not create dataset %s in %s" % (name, parent.name)) from exc
return dset

is likely the problem and should be updated to set the zarr_dtype attribute as follows:

        type_str = self.__serial_dtype__(io_settings['dtype'])
        try:
            dset = parent.create_dataset(name, **io_settings)
            dset.attrs['zarr_dtype'] = type_str
        except Exception as exc:
            raise Exception("Could not create dataset %s in %s" % (name, parent.name)) from exc
        return dset

The fix you propose helps work around the issue if an invalid file is encountered but doesn't fix the issue of creating bad files.

@oruebel oruebel added category: bug errors in the code or code behavior priority: high impacts proper operation or use of feature important to most users labels May 8, 2023
@oruebel oruebel added this to the Next Release milestone May 8, 2023
@alejoe91
Copy link
Collaborator Author

alejoe91 commented May 8, 2023

Yeah I figured there was probably a better solution! Should I update the PR with the fix you suggested @oruebel ?

src/hdmf_zarr/backend.py Outdated Show resolved Hide resolved
@oruebel
Copy link
Contributor

oruebel commented May 8, 2023

Should I update the PR with the fix you suggested

Yes please. I think we can keep your fix as well as since it fixes potential issues on read, but it would be best to make sure we write the file correctly.

@alejoe91
Copy link
Collaborator Author

alejoe91 commented May 8, 2023

Should I update the PR with the fix you suggested

Yes please. I think we can keep your fix as well as since it fixes potential issues on read, but it would be best to make sure we write the file correctly.

Should be done ;)

@alejoe91
Copy link
Collaborator Author

alejoe91 commented May 9, 2023

@oruebel

fixed flake8. I don't think failing tests are due to this change

@oruebel
Copy link
Contributor

oruebel commented May 9, 2023

I don't think failing tests are due to this change

I'll take a look at fixing tests tomorrow. One issue is that codecov has been pulled from PyPi and it looks like there are some other issues in the gallery tests. I'll work on that to make sure we can get all tests passing.

@codecov-commenter
Copy link

Codecov Report

Patch coverage: 50.00% and project coverage change: -0.08 ⚠️

Comparison is base (708e1a7) 82.02% compared to head (35202aa) 81.94%.

Additional details and impacted files
@@            Coverage Diff             @@
##              dev      #72      +/-   ##
==========================================
- Coverage   82.02%   81.94%   -0.08%     
==========================================
  Files          11       11              
  Lines        2642     2647       +5     
==========================================
+ Hits         2167     2169       +2     
- Misses        475      478       +3     
Impacted Files Coverage Δ
src/hdmf_zarr/backend.py 87.75% <50.00%> (-0.36%) ⬇️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Contributor

@oruebel oruebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

I created #80 to make sure we add roundtrip tests to cover this case and the changes from this PR

@oruebel oruebel merged commit 1607862 into hdmf-dev:dev May 9, 2023
20 of 21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: bug errors in the code or code behavior priority: high impacts proper operation or use of feature important to most users
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants