Define structure for validation report #14

Open
goodb opened this issue Jul 10, 2019 · 4 comments

@goodb
Contributor

goodb commented Jul 10, 2019

When we apply the shapes to a go_cam model, we need to formalize what the code should return. The ShEx libraries provide a mapping from the RDF nodes in the model to the labels of the shapes in the provided schema. This alone seems insufficient for users. I'm thinking of a response that would require some additional logic and would contain additional elements like:

  • A boolean indicating whether the model as a whole should be called 'valid' according to the schema, similar to the OWL consistency check. This might be refined into subtypes of model-level quality.
  • A human-readable explanation of that validity result.
  • Anything else? I was thinking it would be useful to integrate the shape validation with the OWL validation so the OWL inference report could go in here as well.
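
To make the elements listed above concrete, here is a minimal sketch of what such a report could look like. The class name, field names, and the `gomodel:1` identifier are all assumptions made for illustration; nothing in this thread fixes them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ValidationReport:
    """Hypothetical report structure; all field names are illustrative."""
    # Model-level verdict, analogous to an OWL consistency flag.
    valid: bool
    # Human-readable reasons behind the verdict.
    explanations: List[str] = field(default_factory=list)
    # Raw node -> matched shape labels, as a ShEx library might report it.
    node_shapes: Dict[str, Set[str]] = field(default_factory=dict)
    # Result of the OWL check; None means it was not run.
    owl_consistent: Optional[bool] = None

    def summary(self) -> str:
        verdict = "valid" if self.valid else "invalid"
        if not self.explanations:
            return f"Model is {verdict}."
        return f"Model is {verdict}: " + "; ".join(self.explanations)


report = ValidationReport(
    valid=False,
    explanations=["node gomodel:1 has no bl:category"],
    node_shapes={"gomodel:1": set()},
)
print(report.summary())
```

Keeping the raw shape map next to the verdict would let a client show either the short summary or the per-node detail.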

On computing model-level validity, I'm thinking something like:
For each named individual in the model:

  1. It must have an RDF type and a biolink category (these should probably be added to the root gocamentity shape).
  2. The bl:category annotation should match a predefined shape. e.g. anything tagged bl:category [GoMolecularFunction:] must match the corresponding shape and must not match anything else.
  3. Anything else?
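
The per-individual rules above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `check_model`, its dictionary inputs, the `category_to_shape` lookup, and the `MolecularFunction` shape label are hypothetical, and the shape matches are assumed to have been computed already by a ShEx library.

```python
from typing import Dict, Iterable, List, Set, Tuple


def check_model(
    individuals: Iterable[str],
    rdf_types: Dict[str, Set[str]],
    categories: Dict[str, Set[str]],
    shape_matches: Dict[str, Set[str]],
    category_to_shape: Dict[str, str],
) -> Tuple[bool, List[str]]:
    """Return (valid, problems) for a model; all inputs are assumed shapes of data."""
    problems: List[str] = []
    for node in individuals:
        # Rule 1: every individual needs an RDF type and a category.
        if not rdf_types.get(node):
            problems.append(f"{node}: missing rdf:type")
        node_categories = categories.get(node, set())
        if not node_categories:
            problems.append(f"{node}: missing bl:category")
        # Rule 2: each category must map to a shape the node matches,
        # and the node must not match any other shape.
        matched = shape_matches.get(node, set())
        for category in sorted(node_categories):
            expected = category_to_shape.get(category)
            if expected is None:
                problems.append(f"{node}: no shape defined for {category}")
                continue
            if expected not in matched:
                problems.append(f"{node}: does not match <{expected}>")
            extra = sorted(matched - {expected})
            if extra:
                problems.append(f"{node}: also matches {extra}")
    return (not problems, problems)


ok, problems = check_model(
    individuals=["ind:1", "ind:2"],
    rdf_types={"ind:1": {"GO:0003674"}, "ind:2": {"GO:0003674"}},
    categories={"ind:1": {"GoMolecularFunction"}},
    shape_matches={"ind:1": {"MolecularFunction"}},
    category_to_shape={"GoMolecularFunction": "MolecularFunction"},
)
print(ok, problems)
```

The list of problems doubles as raw material for the human-readable explanation, since each entry names the node and the rule it broke.
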
@balhoff
Member

balhoff commented Jul 10, 2019

> The BL:category annotation should match a predefined shape. e.g. anything tagged bl:category [GoMolecularFunction:] must match the shape and must not match anything else.

How does this interact with "inheritance"/shape intersection? The following definitions imply to me that a node matching the <Complex> shape will have two values for bl:category: GoComplex: and GoMolecularEntity:. Is that a problem for this principle?

<Complex> @<MolecularEntity> {
   bl:category [GoComplex:]  ;
}// rdfs:comment  "a protein complex"

<MolecularEntity>  EXTRA bl:category {
   bl:category [GoMolecularEntity:]  ;
}// rdfs:comment  "a molecular entity (a gene product, chemical, or complex typically)"
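
To illustrate the concern, here is a small sketch of how a strict "match the shape and nothing else" rule would treat such a node, assuming the reading above is right and a complex carries both categories and matches both shapes. The data and the helper name `strict_violations` are hypothetical.

```python
# Assumed validator output for a single protein-complex node.
node_categories = {"GoComplex", "GoMolecularEntity"}
matched_shapes = {"Complex", "MolecularEntity"}
# Hypothetical lookup from category tag to the shape it should match.
category_to_shape = {
    "GoComplex": "Complex",
    "GoMolecularEntity": "MolecularEntity",
}


def strict_violations(categories, matched, cat_to_shape):
    """List (category, extra shapes) pairs that break the strict rule."""
    violations = []
    for category in sorted(categories):
        expected = cat_to_shape[category]
        extra = sorted(matched - {expected})
        if expected not in matched or extra:
            violations.append((category, extra))
    return violations


print(strict_violations(node_categories, matched_shapes, category_to_shape))
# [('GoComplex', ['MolecularEntity']), ('GoMolecularEntity', ['Complex'])]
```

Under these assumptions every complex would be flagged twice, once per category, even though the model itself is well formed.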

@goodb
Contributor Author

goodb commented Jul 10, 2019

I think it is, but it's something we could implement around if we needed to. Basically, do we allow multiple BL categories for individual nodes or not? I feel like we probably do not want to recreate hierarchies with category tags. So here we should either make a subshape of @ if we need to refer to complexes in shapes or just eliminate the shape and use only .
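
As one possible way to "implement around" multiple matches, the validator could keep only the most specific shapes a node matches and compare the category against those. This is a sketch of that idea and not something proposed in the thread; the `parents` table and both function names are hypothetical.

```python
# Hypothetical table of which shapes extend which other shapes.
parents = {"Complex": {"MolecularEntity"}}


def ancestors(shape, parents):
    """Collect every shape reachable by following parent links."""
    seen = set()
    stack = [shape]
    while stack:
        for parent in parents.get(stack.pop(), set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def most_specific(shapes, parents):
    """Drop any matched shape that is an ancestor of another matched shape."""
    inherited = set().union(*(ancestors(s, parents) for s in shapes))
    return shapes - inherited


print(most_specific({"Complex", "MolecularEntity"}, parents))
# {'Complex'}
```

With this filter a node would need only one category tag (here the one for the complex), at the cost of maintaining the parent table alongside the schema.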

@cmungall
Member

I think explanations will be massively important in the long run, but we have some time to defer on this: we can make do with geeky explanations in the short term while the modeling group iterates over some of the basics.

I do think we will need to refer to complexes in the schema, for example to state the expected has-part structure.

@goodb
Contributor Author

goodb commented Jul 11, 2019

@cmungall the key thing is to get the computation of the multi-node, model-level validity in place. Once that is done, the explanations, geeky or otherwise, will fall out easily.


3 participants