Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KeyError when the object is not missing #2389

Open
woutdenolf opened this issue Feb 29, 2024 · 6 comments
Open

KeyError when the object is not missing #2389

woutdenolf opened this issue Feb 29, 2024 · 6 comments

Comments

@woutdenolf
Copy link

h5py currently has the following error mapping

H5E_NOTFOUND:       KeyError,    # Object not found
H5E_CANTOPENOBJ:    KeyError,

We want to distinguish between these two errors

KeyError: "Unable to synchronously open object (object 'end_time' doesn't exist)"
KeyError: 'Unable to synchronously open object (addr overflow, addr = 64392369, size = 96, eoa = 64365353)'

The first one is H5E_NOTFOUND and the second is H5E_CANTOPENOBJ but in python both are raised as KeyError.

In the case of H5E_CANTOPENOBJ we can close the file and try again while in the case of H5E_NOTFOUND we know the item is not there. Would it be possible to map H5E_CANTOPENOBJ to another exception type? Perhaps a custom exception type derived from KeyError to not break the API:

class CannotOpenObj(KeyError):
    pass
@Delengowski
Copy link
Contributor

Delengowski commented Mar 17, 2024

There's a few of these that I explicitly check for at work. I don't make the HDF5, another team does and sometimes they're corrupted in various ways. They get made in C++.

I have them documented but I cannot remember the way we check with h5py but changing from a KeyError would break my check, not that I mind, it can be easily changed. Would be nice if its more explicit. I'll be able to report back this week once I'm in office and have access to the code.

Off the top of my head, one is a simple containment check for a dataset I know should exist and the other is access of a datasets id. I think there may be a third but I forget.

@aragilar
Copy link
Member

I wouldn't necessarily be opposed to having bespoke exceptions which subclass from existing exceptions, but I vaguely recall issues in past where we made changes in the error -> exception logic that we thought were backwards compatible, and turned out not to be in a subtle way and so needed to revert them, so I feel any changes here would need to work out why the same thing wouldn't happen again.

@tacaswell
Copy link
Member

I am in favor of adding a more specialized class CantOpenKeyError(KeyError): ... for H5E_CANTOPENOBJ

My memory of past exceptions issues was that libhdf5 does not consider the error codes part of the stable API so have changed them under us.

@Delengowski
Copy link
Contributor

I am in favor of adding a more specialized class CantOpenKeyError(KeyError): ... for H5E_CANTOPENOBJ

My memory of past exceptions issues was that libhdf5 does not consider the error codes part of the stable API so have changed them under us.

I'm slightly skeptical at this given what bit of the H5E code I've used at work. How would there be an expectation to use H5Ewalk if they can change like that.

@tacaswell
Copy link
Member

This came up in #1099 and https://forum.hdfgroup.org/t/hdf5-error-number-changes-in-1-10-3/5013/2?u=thomas1

However, that was a while ago and in HDFGroup/hdf5#37 @takluyver got them considered part of the stable API and in #1815 we started using them!

I'm glad my memory isn't completely gone, but also very happy I am wrong and they are now part of the API!

@takluyver
Copy link
Member

To be clear, there are two different sets of numbers at play here: HDF5's error numbers (major & minor), and system errnos.

It was the format of exposing system errnos in error strings (errno = 3) that I pushed to be preserved as part of the API. Several of the most familiar errnos have corresponding builtin Python exceptions, e.g. FileNotFoundError for ENOENT, so it's nice to be able to reliably get the errno.

The question here is about HDF5's error numbers. I would guess they're not 100% guaranteed to be stable, especially when adding new layers like the VOL, it's hard to imagine you could ensure exactly the same error behaviour in every corner case. In any case, I think it's OK for us to cautiously make some changes to how we translate HDF5 errors into Python exceptions where there's a clear upside.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants