Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validation changes #389

Merged
merged 16 commits into from
May 11, 2020
Merged

Validation changes #389

merged 16 commits into from
May 11, 2020

Conversation

mpsonntag
Copy link
Contributor

@mpsonntag mpsonntag commented Apr 30, 2020

This PR contains a couple of potentially controversial points and should be discussed here before merging and addresses the issues #377, #378 and #379.

Changes to the Validation class enabling Custom Validation Instances sneaked itself in at an earlier commit and should be taken into account when reviewing this PR. Check changes in Validation.__init__() and run_validation and register_custom_handler methods. Sorry for the mixup...

This PR

  • adds a validation_id enum class.
  • adds individual validation_ids to each validation to make them identifyable via an ID rather than by their error message.
  • adds a validation_id field to the ValidationError as well.
  • uses the validation_id when running cardinality validations and now prints only messages when a cardinality is violated. Before a user would be spammed with different validation warnings as well.
  • shortens the general ValidationError.__repr__ to make it more convenient to read.
  • adds a report method to the Validation class to provide a general overview of collected errors and warnings.
  • uses the report method whenever a Document is saved or loaded via the ODMLParser resulting in a warnings.warn message:
UserWarning: The saved Document contains formal issues. Run 'odml.validation.Validation(doc)' to resolve them.
Validation found 0 errors and 3 warnings in 1 Sections and 1 Properties.
  • changes the object_name_readable validation message to Name not assigned to make it clearer to the user that an object name should be set.
  • removes the section_repository_present validation from the default validation list. Since Sections rarely have repositories set, this validation can get a bit spammy when validating a Document.

Fixes:

  • An unrelated change in Section and Property is added: when trying to set the name attribute to None, it now silently sets the name to id instead, since name must not be empty. It would be set to id on load and can cause AttributeError exceptions with some methods if its not set.

Points for discussion before merging:

  • should we keep the standalone Validation class features introduced in an earlier commit, even though we are not using them right now? They might be useful for future standalone validations.
  • should we keep the validation warning message on save and load as it is now or make any changes to it?
  • this validation is only run when using the ODMLParser, odml.load and odml.save, not when using the underlying XMLParser for XML or DictParser for YAML and JSON to keep the validations out of the core parsers. Any different opinions on this point?

The Section repository present validation is no longer part
of the default validation and should be added on demand.
A validation report is now printed as a warnings module
UserWarnings if a saved or loaded Document contains
Validation warnings or errors.
On load the warnings will only be printed when the
Parsers 'show_warnings' attribute is True.
@coveralls
Copy link

coveralls commented Apr 30, 2020

Coverage Status

Coverage increased (+0.08%) to 77.027% when pulling 271d78d on mpsonntag:refactorValid into 266f1c1 on G-Node:master.

test/test_validation.py Show resolved Hide resolved
odml/validation.py Outdated Show resolved Hide resolved
odml/tools/odmlparser.py Outdated Show resolved Hide resolved
odml/tools/odmlparser.py Outdated Show resolved Hide resolved
@achilleas-k
Copy link
Member

Looks great! Thanks.

One small note: I think the reason I brought up 3-digit status codes in our last meeting was to separate the categories by the first digit, so 1xx would signify a required attribute value is missing or empty, 2xx would mean a uniqueness constraint is violated, 4xx is property related, and so on. The idea was that if we somehow end up with more than 10 rules for a category (e.g., 10 rules for properties) the 2-digit system wouldn't suffice.

@achilleas-k
Copy link
Member

That was quick. @jgrewe I'm good to merge.

@achilleas-k achilleas-k merged commit 8e292c8 into G-Node:master May 11, 2020
@mpsonntag mpsonntag deleted the refactorValid branch May 12, 2020 12:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants