Catch Errors/Warnings for integrity checks on non-raw datasets #102
Comments
I was looking for an easy way to check if the |
The only way to find out whether any filters have had an impact on a dataset is to check whether right below the lines that throw the (Regarding the integrity checks on hierarchy children, a simple |
Ok, thank you!
Both issues don't seem to be directly related, and since we will improve the implementation of the |
Yes, good plan! |
Integrity checks are only supposed to run on raw datasets, not on datasets with applied filters or on hierarchy datasets. To catch misuse, a 'NotImplementedError' was added for such cases. Fixes #102
Issue Summary
Applying integrity checks to child datasets or to datasets on which filters were applied leads to a `KeyError` or to spurious integrity warnings. Since integrity checks are only supposed to be run on raw datasets anyway, these bugs can be caught by raising `NotImplementedError`. For clarity, see the problem description below.
Solution
As mentioned above, these two issues can be resolved by raising a NotImplementedError when integrity is checked on datasets with filters or on datasets further down the pipeline.
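The proposed guard could look roughly like the sketch below. `Dataset`, its `parent` and `filters` attributes, and `check_integrity` are hypothetical stand-ins for illustration only and are not the library's actual classes or API.

```python
class Dataset:
    """Hypothetical stand-in for a dataset (not the library's class)."""

    def __init__(self, events, parent=None):
        self.events = dict(events)  # feature name -> list of values
        self.parent = parent        # set only for hierarchy children
        self.filters = {}           # user-defined filters, empty if raw


def check_integrity(ds):
    """Return a list of warning strings; refuse non-raw datasets."""
    if ds.parent is not None:
        raise NotImplementedError(
            "Integrity checks are not supported for hierarchy children; "
            "check the raw parent dataset instead.")
    if ds.filters:
        raise NotImplementedError(
            "Integrity checks are not supported for datasets with "
            "filters defined; check the raw dataset instead.")
    cues = []
    # Placeholder check: all feature columns should have equal length.
    lengths = {len(values) for values in ds.events.values()}
    if len(lengths) > 1:
        cues.append("Data: feature lengths inconsistent")
    return cues
```

Failing early with an explicit message replaces the confusing `KeyError` and the misleading warnings with a clear statement that the operation is unsupported.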
Problem Details
Integrity checks on datasets with filters defined:
When filters are defined for a dataset `ds` (they don't have to be applied; defining is enough), running an integrity check on that dataset leads to a `KeyError`.
Example Code:
This is due to differences between the `dataset.config` dict (which was changed when defining the filter) and the `dfn.config_types` dict.
Integrity checks on child-datasets:
When creating a child dataset from a parent dataset, running integrity checks on that child dataset introduces new warnings, despite it being the exact same dataset:
Example Code:
Then `ic_child.check()` contains `<ICue: 'Metadata: fluorescence channel count inconsistent' at ...>`.
This is because, in `__getitem__()` of `child`, the code refers to the parent's `ds._events` for scalar values to save storage (I assume). Therefore, the corresponding feature names don't need to be present in `child._events.keys()`.
These keys (or rather their absence) are also used to generate certain warnings, as is the case for the metadata warning "fluorescence channel count inconsistent", even though the fluorescence count is consistent with the data of the `child` dataset.
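To illustrate the mechanism described for child datasets, here is a minimal sketch under stated assumptions. `Parent`, `Child`, the `mask` argument, and both helper functions are hypothetical and only mimic the behavior described above, namely a child that resolves features through its parent instead of storing them in its own `_events`. The feature name `fl1_max` is illustrative only.

```python
class Parent:
    """Hypothetical raw dataset holding the actual event data."""

    def __init__(self, events):
        self._events = dict(events)

    def __getitem__(self, feat):
        return self._events[feat]


class Child:
    """Hypothetical child that looks features up in its parent."""

    def __init__(self, parent, mask):
        self.parent = parent
        self.mask = mask      # which parent events the child keeps
        self._events = {}     # stays empty: nothing is copied

    def __getitem__(self, feat):
        if feat in self._events:
            return self._events[feat]
        # Fall back to the parent's data and apply the child's mask.
        data = self.parent[feat]
        return [v for v, keep in zip(data, self.mask) if keep]


def has_feature_by_keys(ds, feat):
    """Key-based test; reports False for a child despite valid data."""
    return feat in ds._events


def has_feature_by_access(ds, feat):
    """Access-based test; also works for children."""
    try:
        ds[feat]
    except KeyError:
        return False
    return True
```

A check that relies on `_events` keys concludes that the feature is missing from the child, although the data is reachable through `__getitem__()`, which is how a warning can appear for an otherwise consistent dataset.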