provide graded variant validation that includes uncertainty/severity #365
Comments
Original comment by Meng Wang (Bitbucket: icebert, GitHub: icebert): Hi Reece, Based on our last discussion, I think we can add a function to the validator class that provides graded validation. This function could be called graded_validate or get_validate_info or something else (I am not sure which name is better). The original validate function in the validator would keep its current behavior so that the API does not change. Graded validation would return a tuple (res, msg). res is an enum of the error level: VALID, WARNING, or ERROR. msg is the error or warning message; when res is VALID, msg is None. Looking forward to your comments. Meng
Original comment by Reece Hart (Bitbucket: reece, GitHub: reece): Hi Meng- Your general proposal looks right to me. What do you think about the following ideas?
Rationale for an iterator: Evaluating all criteria for every variant could be expensive. It will be common to want to exit on the first ERROR (or WARNING if strict). Using an iterator will enable a structure like this (untested):
Then, in validate(), something like:
Does that make sense?
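The iterator idea above can be sketched roughly as follows. This is a minimal illustration, not the code from the original comment: the names Level, graded_validate, check_nonempty, and check_has_accession are hypothetical, the two checks are toy stand-ins for real validation criteria, and ValueError stands in for whatever exception the library would raise.

```python
from enum import IntEnum
from typing import Callable, Iterator, List, Optional, Tuple


class Level(IntEnum):
    """Hypothetical severity levels, ordered so they can be compared."""
    VALID = 0
    WARNING = 1
    ERROR = 2


Result = Tuple[Level, Optional[str]]
Check = Callable[[str], Result]


def check_nonempty(variant: str) -> Result:
    # Toy criterion: an empty string is definitely an error.
    return (Level.VALID, None) if variant else (Level.ERROR, "empty variant")


def check_has_accession(variant: str) -> Result:
    # Toy criterion: a missing "accession:" prefix is only a warning.
    if ":" in variant:
        return (Level.VALID, None)
    return (Level.WARNING, "no accession prefix")


def graded_validate(variant: str, checks: List[Check]) -> Iterator[Result]:
    """Lazily yield one (level, msg) pair per check, in order."""
    for check in checks:
        yield check(variant)


def validate(variant: str, checks: List[Check], strict: bool = False) -> bool:
    """Raise on the first ERROR (or first WARNING if strict); else return True."""
    threshold = Level.WARNING if strict else Level.ERROR
    for level, msg in graded_validate(variant, checks):
        if level >= threshold:
            # Later checks are never evaluated because the generator is lazy.
            raise ValueError(msg)
    return True
```

Because graded_validate is a generator, validate() stops consuming it at the first result that reaches the threshold, so expensive later checks are skipped.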
Originally reported by: Reece Hart (Bitbucket: reece, GitHub: reece)
Dependents: #366, #315
Currently, validation is essentially binary: variants are either valid or invalid. For intermediate states, such as when validating reference sequence, an HGVSUnsupportedOperationError is raised. This makes it difficult to implement two common use cases simultaneously: 1) Is this variant definitely valid (conversely, possibly invalid)? 2) Is this variant definitely invalid (conversely, possibly valid)?
This issue proposes to create three result states: invalid, uncertain, and valid.
In addition, there would be a notion of two modes: strict and permissive.
In strict mode, invalid and uncertain variants would raise errors; only known-valid variants would pass.
In permissive mode, only invalid variants would raise errors; uncertain and known-valid variants would pass.
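As a rough sketch of how the three states and two modes interact, the rules above reduce to a small decision function. The names Validity and passes are hypothetical and not part of the library.

```python
from enum import Enum


class Validity(Enum):
    """The three proposed result states (hypothetical names)."""
    INVALID = "invalid"
    UNCERTAIN = "uncertain"
    VALID = "valid"


def passes(state: Validity, strict: bool) -> bool:
    """Return True if a variant in this state passes in the given mode."""
    if state is Validity.INVALID:
        return False          # invalid variants fail in both modes
    if state is Validity.UNCERTAIN:
        return not strict     # uncertain variants pass only in permissive mode
    return True               # known-valid variants always pass
```

Only the uncertain state behaves differently between the two modes, which is what lets strict mode answer "definitely valid?" and permissive mode answer "definitely invalid?".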
Here are some examples:
The last two cases provide motivation for this issue. Both of those variants raise HGVSUnsupportedOperationError, which means that they can't be validated. That is, there's no known error, but they're also not definitively valid. Issue #366 proposes to validate variants prior to projection. However, that issue can't be implemented now because validating these example variants would raise exceptions, even though they could be projected to genomic sequence reliably.
The interface could be:
That is, the strict argument in the initializer might be overridden by the method.
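One possible shape for that override, as a sketch only: the class below is hypothetical, its classify callback stands in for the real validation logic, ValueError stands in for the library's exception, and the default of strict=True is an assumption rather than something this issue specifies.

```python
from typing import Callable, Optional


class Validator:
    """Hypothetical validator whose per-call strict flag overrides the default."""

    def __init__(self, classify: Callable[[str], str], strict: bool = True):
        # classify returns "invalid", "uncertain", or "valid" (assumed contract).
        self.classify = classify
        self.strict = strict

    def validate(self, variant: str, strict: Optional[bool] = None) -> bool:
        # None means "use the mode chosen in the initializer".
        effective_strict = self.strict if strict is None else strict
        state = self.classify(variant)
        if state == "invalid" or (state == "uncertain" and effective_strict):
            raise ValueError(f"{variant}: {state}")
        return True
```

Using None as the method default keeps an explicit strict=False distinguishable from "no preference", so a caller can relax or tighten a single call without constructing a second validator.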