Anndata writing fails with h5py in Version 3.* and "list of dicts" in "uns" #493

Closed
LustigePerson opened this issue Jan 11, 2021 · 2 comments


@LustigePerson

This may be a rare problem, but it broke with the new h5py 3.*.
It occurs because we import a metadata dict into anndata.uns["metadata"].
I was able to track it down to a "list of dicts" inside that metadata dict.

This code works with h5py in version 2.*:

import anndata
adata = anndata.AnnData()
adata.uns["metadata"] = {"my_list": [{"d1":1}, {"d2":2}]}
adata.write("adata.h5ad")

However, with h5py 3.* it raises a TypeError:

TypeError: Can't implicitly convert non-string objects to strings

Above error raised while writing key 'uns/metadata/my_dict' of <class 'h5py._hl.files.File'> from /.
@ivirshup
Member

So, the way that we use hdf5 files is that dict -> Group, and array -> Dataset. hdf5 doesn't really have a way to have a Dataset of Groups.
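
As a rough illustration of that mapping (a pure-Python sketch, not anndata's actual writer; `describe_layout` is a hypothetical helper invented here), nested dicts become groups and every other value becomes a dataset candidate:

```python
def describe_layout(mapping, prefix=""):
    """Return {path: kind} for a nested dict: dict -> Group, anything else -> Dataset."""
    layout = {}
    for key, value in mapping.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            # A dict maps onto a Group, and its entries are laid out below it.
            layout[path] = "Group"
            layout.update(describe_layout(value, path))
        else:
            # Everything else has to fit into a single Dataset.
            layout[path] = "Dataset"
    return layout


uns = {"metadata": {"my_list": [{"d1": 1}, {"d2": 2}]}}
layout = describe_layout(uns, "/uns")
```

Here `my_list` lands at a Dataset path, but its elements are dicts, which would each need to be a Group. That is the combination with no natural place in the hierarchy.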

In your example where this works (though I'm not sure I'd call this "working"):

import anndata
adata = anndata.AnnData()
orig = [{"d1":1}, {"d2":2}]
adata.uns["metadata"] = {"my_list": orig}
adata.write("adata.h5ad")
result = anndata.read_h5ad("adata.h5ad").uns["metadata"]["my_list"]

assert orig != list(result)
display(type(result[0]))
# str
display(result[0])
# "{'d1': 1}"

The dicts are being encoded as strings, essentially.

If there is a good use case, we could consider working around this to allow lists of dicts. But we'd need a good reason, since this is outside of the "filesystem like hierarchy" that hdf5 and zarr use.
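
In the meantime, one possible user-side workaround is to store the list as a single JSON string and decode it after reading. This is only a sketch under the assumption that the metadata is JSON-serializable; a plain dict stands in for `adata.uns` so that it runs without anndata, and the helper names are made up for illustration.

```python
import json


def encode_metadata(value):
    """Serialize an arbitrary JSON-compatible value to one string."""
    return json.dumps(value)


def decode_metadata(text):
    """Inverse of encode_metadata."""
    return json.loads(text)


orig = [{"d1": 1}, {"d2": 2}]

# Stand-in for adata.uns: the list of dicts is stored as a single string value.
uns = {"metadata": {"my_list": encode_metadata(orig)}}

# After reading the file back, decode the string explicitly.
restored = decode_metadata(uns["metadata"]["my_list"])
```

Unlike the implicit string conversion shown above, this round trip is explicit and returns the original dicts rather than their string representations.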

@LustigePerson
Author

Ok, I see...

The point is that we used this for quite some time without "errors", and now it no longer works.
But I see that it was not working correctly before either; it just failed silently.

Is there a good use case? I don't know. From a user's perspective, I expected to be able to pass a dict or the contents of a JSON file to uns.
So it might be worth at least preventing this kind of list from being assigned to the anndata object in the first place.
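
A check along those lines could look roughly like this. It is a hypothetical helper, not part of anndata, and it operates on a plain nested dict standing in for `uns`:

```python
def find_lists_of_dicts(mapping, prefix=""):
    """Return the paths of list/tuple values that contain at least one dict."""
    offending = []
    for key, value in mapping.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            # Nested dicts are fine in themselves; look inside them.
            offending.extend(find_lists_of_dicts(value, path))
        elif isinstance(value, (list, tuple)) and any(
            isinstance(item, dict) for item in value
        ):
            offending.append(path)
    return offending


uns = {"metadata": {"my_list": [{"d1": 1}, {"d2": 2}], "ok": [1, 2, 3]}}
offending = find_lists_of_dicts(uns, "uns")
```

Raising an error when `offending` is non-empty would turn the silent string conversion into an early, explicit failure.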

But I see that this might be an exotic use case.

Thanks for looking into this so fast.
