Principle #3 IDs - automated validation #1017

Open
beckyjackson opened this issue Aug 9, 2019 · 8 comments
Labels
attn: Editorial WG (Issues pertinent to editorial activities, such as ontology reviews and principles)
attn: Technical WG (Issues pertinent to technical activities, such as maintenance of website, PURLs, and tools)
automated validation of principles (Issues for the editorial WG pertinent to the automating the validation of the Principles.)
principles (Issues related to Foundry principles)

Comments

@beckyjackson
Contributor

beckyjackson commented Aug 9, 2019

FP 3 - URI/Identifier Space

Automated checks:

  1. Core terms must follow NAMESPACE_NUMID format

Mechanism:

Any IRI that starts with http://purl.obolibrary.org/obo/IDSPACE must end with _NUMID. The IDSPACE comes from the registry. Everything that starts with http://purl.obolibrary.org/obo/ must use a valid registry ID.
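
A minimal sketch of this mechanism in Python, assuming the registry's ID spaces are available as a plain set of strings. The names `REGISTRY_IDSPACES` and `check_iri` and the sample values are hypothetical, not part of any existing tool, and the sketch assumes it is only given term IRIs (not ontology file PURLs).

```python
import re

OBO_BASE = "http://purl.obolibrary.org/obo/"

# Hypothetical sample; the real list would be loaded from the OBO registry.
REGISTRY_IDSPACES = {"DOID", "GO", "ARO"}

# Local part of a term PURL: IDSPACE, underscore, numeric ID.
LOCAL_ID = re.compile(r"^[A-Za-z]+_[0-9]+$")


def check_iri(iri):
    """Return None if the IRI passes or is out of scope, else a message."""
    if not iri.startswith(OBO_BASE):
        return None  # not an OBO PURL, so this check does not apply
    local = iri[len(OBO_BASE):]
    idspace = local.split("_", 1)[0]
    if idspace not in REGISTRY_IDSPACES:
        return f"unknown ID space: {idspace}"
    if not LOCAL_ID.match(local):
        return f"not in IDSPACE_NUMID format: {local}"
    return None


for iri in (OBO_BASE + "DOID_0000001", OBO_BASE + "GO_abc"):
    print(iri, "->", check_iri(iri))
```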

There may be external classes that we can't predict. But, if the ontology uses oboInOwl:hasOBONamespace on terms, we can check those entities. The value of that annotation should match NAMESPACE (ignoring case).

If any class annotated with oboInOwl:hasOBONamespace in the core namespace does not follow NAMESPACE_NUMID format (e.g. 'doid' namespace follows DOID_0000001 etc.), throw a warning.
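
The annotation-based variant could look roughly like the following. This is only a sketch: the function name `check_annotated_term` and its inputs are hypothetical, and it assumes the caller has already extracted each class IRI together with its oboInOwl:hasOBONamespace value.

```python
import re


def check_annotated_term(iri, obo_namespace, idspace):
    """Warn if a class annotated with the core namespace is not IDSPACE_NUMID.

    obo_namespace: value of oboInOwl:hasOBONamespace on the class.
    idspace: the ontology's ID space as written in IRIs, e.g. "DOID".
    """
    # The annotation value is compared to the ID space ignoring case.
    if obo_namespace.lower() != idspace.lower():
        return None  # annotated with some other namespace; not checked here
    local = iri.rsplit("/", 1)[-1]
    if re.fullmatch(rf"{re.escape(idspace)}_[0-9]+", local):
        return None
    return f"warning: {iri} is in namespace '{obo_namespace}' but is not {idspace}_NUMID"


print(check_annotated_term(
    "http://purl.obolibrary.org/obo/DOID_0000001", "doid", "DOID"))
print(check_annotated_term(
    "http://purl.obolibrary.org/obo/DOID_abc", "doid", "DOID"))
```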

I know some ontologies use text in the identifiers of properties, so maybe that could be an info message? It is currently an error.

beckyjackson added the attn: Editorial WG and attn: Technical WG labels Aug 9, 2019
beckyjackson self-assigned this Aug 9, 2019
@beckyjackson
Contributor Author

Should ontologies be allowed to annotate external terms with hasOBONamespace in their namespace? For example, ARO has some external terms (e.g. from DOID) with the namespace value 'antibiotic_resistance'.

In cases where the namespace annotation isn't used, we can just check that classes use the numeric format if the IRI contains /obo/NS_ (where NS is the actual namespace).

@beckyjackson
Contributor Author

All IRIs must be unique. If an IRI is duplicated, the annotations will be merged in OWL. Duplicate labels and definitions may be a sign of two different terms with the same IRI. Labels and definitions may be duplicated for other reasons, though.
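
Because annotations on a duplicated IRI are merged, one way to surface candidates is to look for IRIs that carry more than one distinct label or definition. A rough sketch, assuming annotations have already been extracted as (IRI, property, value) tuples; the function name and the sample data are made up for illustration.

```python
from collections import defaultdict


def find_multiply_annotated(annotations):
    """Return {(iri, property): sorted distinct values} where there is more than one value.

    annotations: iterable of (iri, property, value) tuples.
    """
    values = defaultdict(set)
    for iri, prop, value in annotations:
        values[(iri, prop)].add(value)
    return {key: sorted(vals) for key, vals in values.items() if len(vals) > 1}


# Made-up sample data: EX_0000001 has picked up two different labels.
rows = [
    ("obo:EX_0000001", "label", "heart"),
    ("obo:EX_0000001", "label", "kidney"),
    ("obo:EX_0000002", "label", "lung"),
]
print(find_multiply_annotated(rows))
```

Hits are only candidates for manual review, since labels and definitions can be duplicated for other reasons.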

Some ontologies do not use numeric identifiers for everything (e.g. PR).

In the future, we are aiming to have base artefacts for each ontology. The base only contains the terms in that ontology's namespace.

Revised check:
Any IRI that starts with http://purl.obolibrary.org/obo/IDSPACE must end with _NUMID. The IDSPACE comes from the registry. Everything that starts with http://purl.obolibrary.org/obo/ must use a valid registry ID.

@cmungall
Contributor

cmungall commented Aug 19, 2019 via email

@beckyjackson
Contributor Author

On the call on Thursday, we revised this to my comment above so that oboInOwl:hasOBONamespace is no longer involved. I'll update the mechanism in my original post to reflect this.

@beckyjackson
Contributor Author

For historical reasons, annotation properties may use hashes (e.g. subset definitions). Object and data properties should not.

This check will ignore annotation properties and apply only to classes, object properties, and data properties.

The IRI must start with http://purl.obolibrary.org/obo/IDSPACE_.
It is recommended that IRIs end with NUMID. If the identifier following _ is not numeric, we will issue a warning. This requires review to ensure that the identifiers don't include semantics.
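
The revised rule could be sketched as below. Assumptions: the input is limited to the ontology's own terms (for example from a base artefact), entity types arrive as plain strings, and a failed "must" maps to an ERROR level; the function name `check_entity` and the level names are hypothetical.

```python
import re

OBO_BASE = "http://purl.obolibrary.org/obo/"

# Annotation properties are deliberately absent from this set.
CHECKED_TYPES = {"class", "object property", "data property"}


def check_entity(iri, entity_type, idspace):
    """Return (level, message), or None when the entity passes or is skipped."""
    if entity_type not in CHECKED_TYPES:
        return None  # e.g. annotation properties, which may use hashes
    prefix = f"{OBO_BASE}{idspace}_"
    if not iri.startswith(prefix):
        return ("ERROR", f"{iri} does not start with {prefix}")
    local_id = iri[len(prefix):]
    if not re.fullmatch(r"[0-9]+", local_id):
        # Recommended, not required: flag for review of embedded semantics.
        return ("WARN", f"{iri} has a non-numeric identifier '{local_id}'")
    return None


print(check_entity(OBO_BASE + "DOID_0000001", "class", "DOID"))
print(check_entity(OBO_BASE + "NCIT_C12218", "class", "NCIT"))
```

Under these assumptions, an identifier such as NCIT_C12218 would produce a warning rather than an error.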

cmungall added the principles label Nov 22, 2019
cmungall changed the title from "Principle #3 automated validation" to "Principle #3 IDs - automated validation" Nov 22, 2019
wdduncan added the automated validation of principles label Apr 28, 2020
wdduncan moved this from To do to In progress in Automated Validation of Principles Apr 28, 2020
@wdduncan
Member

Update.
The counterexample doesn't really apply. NCIT is not necessarily an OBO Foundry ontology.
There is an OBO version of NCIT that does use the NCIT_ namespace, e.g.:
http://purl.obolibrary.org/obo/NCIT_C12218

AFAIK, this principle is meant to apply to OBO Foundry ontologies.

Question: At present we don't allow numbers in the namespace. I think we should consider it. For example, should OBO:COVID-19_ be allowed?

cc @bpeters42

@ramonawalls
Contributor

This issue is only for automatic validation. Please move other discussion of the principle to issue #954, which I am about to update with the latest text.

@wdduncan
Member

Question: At present we don't allow numbers in the namespace. I think we should consider it. For example, should OBO:COVID-19_ be allowed?

beckyjackson removed their assignment May 25, 2022
Projects
Ontology Reviews
  
Awaiting triage

4 participants