Include Ontology subsets in OMO #80

Open
cmungall opened this issue Nov 30, 2021 · 6 comments
@cmungall
Contributor

cmungall commented Nov 30, 2021

The proposal is to formally include

  • oio:inSubset - annotation property that connects a term to a subset
  • oio:Subset - subset definition

in OMO

Note the way @dosumis and I chose to model subsets is a little odd, see

https://owlcollab.github.io/oboformat/doc/obo-syntax.html#5.0.3

This avoids introducing individuals into the ontology
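
As a rough illustration of that modelling choice, the sketch below stores triples as plain tuples. The subset is declared as an annotation property placed under a common parent, and a term points at it with oio:inSubset, so no individual is needed. The parent name `SubsetProperty`, the `EX_0000001` term and the `ex#example_slim` subset are assumptions made for the example; the linked spec is the authority on the actual mapping.

```python
# Minimal sketch: an ontology as a set of (subject, predicate, object) tuples.
OIO = "http://www.geneontology.org/formats/oboInOwl#"
OBO = "http://purl.obolibrary.org/obo/"

IN_SUBSET = OIO + "inSubset"
SUBSET_PARENT = OIO + "SubsetProperty"   # assumed parent property name
EXAMPLE_SLIM = OBO + "ex#example_slim"   # hypothetical subset
EXAMPLE_TERM = OBO + "EX_0000001"        # hypothetical term

triples = {
    # The subset is an annotation property, not an individual.
    (EXAMPLE_SLIM, "rdf:type", "owl:AnnotationProperty"),
    (EXAMPLE_SLIM, "rdfs:subPropertyOf", SUBSET_PARENT),
    # The term is tagged with the subset via oio:inSubset.
    (EXAMPLE_TERM, IN_SUBSET, EXAMPLE_SLIM),
}


def declared_subsets(triples):
    """Return every subset declared under the common parent property."""
    return sorted(
        s for s, p, o in triples
        if p == "rdfs:subPropertyOf" and o == SUBSET_PARENT
    )


def subsets_of(term, triples):
    """Return the subsets a term is tagged with."""
    return sorted(o for s, p, o in triples if s == term and p == IN_SUBSET)
```

A tool could compare `subsets_of(term, triples)` against `declared_subsets(triples)` to flag terms tagged with an undeclared subset.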

Subsets are particularly important for ontologies including but not limited to

  • GO
  • Uberon
  • Mondo
  • CL
  • ENVO

Subsets serve a variety of purposes

  • visualization of rolled up annotations for a single gene, e.g. https://www.alliancegenome.org/gene/SGD:S000002812#function---go-annotations
  • summarization of a whole genome, e.g. https://go.princeton.edu/cgi-bin/GOTermMapper
  • selection of a smaller set of terms for curation purposes
  • making views of the ontology for specific communities
    • e.g. metagenome slim for GO
    • e.g. organ system subsets for Uberon
    • various subsets in ENVO
  • "anti-slims" - ways to exclude terms for some purposes, e.g. exclude from gene enrichment analysis
    • Uberon has an anti-slim for all ontological upper level terms to exclude from some applications or analyses
  • GO uses subsets of other ontologies for faceted browsing in AmiGO
    • subsets of ECO corresponding to GAF codes
    • taxon filters for core organisms and useful groupings (mammal, vertebrate)
  • metaclasses:
    • PRO uses subsets to describe what level a protein belongs to (TODO: check is this subset or structured comments?)
    • NCBITaxon uses subsets to indicate membership in a taxonomy rank, e.g. species
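
Two of the uses above, rolling annotations up to a slim and excluding an anti-slim, can be sketched together. This is a toy illustration under stated assumptions, not how any particular tool does it: the `EX:` terms, the subset names `example_slim` and `upper_level`, and the gene names are all hypothetical, and only is_a edges are considered.

```python
# Hypothetical toy ontology: child term -> list of is_a parents.
PARENTS = {
    "EX:4": ["EX:2"],
    "EX:5": ["EX:3"],
    "EX:2": ["EX:1"],
    "EX:3": ["EX:1"],
    "EX:1": [],
}

# Hypothetical subset tags (what inSubset would carry).
IN_SUBSET = {
    "EX:1": {"example_slim", "upper_level"},  # upper_level plays the anti-slim
    "EX:2": {"example_slim"},
    "EX:3": {"example_slim"},
}


def ancestors(term):
    """Return the term plus all of its is_a ancestors."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def roll_up(annotations, slim, anti_slim=None):
    """Group genes under the slim terms their annotations roll up to.

    Terms tagged with anti_slim are skipped even if they are in the slim.
    """
    genes_by_term = {}
    for gene, term in annotations:
        for anc in ancestors(term):
            tags = IN_SUBSET.get(anc, set())
            if slim not in tags:
                continue
            if anti_slim is not None and anti_slim in tags:
                continue
            genes_by_term.setdefault(anc, set()).add(gene)
    return {term: sorted(genes) for term, genes in genes_by_term.items()}


annotations = [("geneA", "EX:4"), ("geneB", "EX:5"), ("geneA", "EX:5")]
summary = roll_up(annotations, "example_slim", anti_slim="upper_level")
```

Without the `anti_slim` argument the root term `EX:1` would also appear and would collect every gene, which is the uninformative grouping an anti-slim is meant to suppress.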

We need a standard way of annotating a subset with its purpose. Tools can then use this to determine the correct behavior

Alternatives to using inSubset should be considered:

  • TODO: check what mechanism OBI uses to define obi-core
  • Mondo uses dcterms:conformsTo to partition terms into patterns/metaclasses
  • Some ontologies use biolink:category to assign classes to biolink metaclasses
  • supersets such as go-plus are modeled as distinct ontologies
  • linkml enums allow simple mapping of a set of enumeration values to ontology terms

After these alternatives are considered, subsets are still necessary

We also need to standardize how different tools work with different slims.

[incomplete list, I will update later]

Advanced applications can use metadata associated with subsets intelligently, e.g.

  • automatic creation of binning/"other" classes, see https://metacpan.org/dist/go-perl/view/scripts/map2slim
  • blocking propagation over intermediate slim terms
  • gap filling when performing ontology extraction, e.g. A part-of some B part-of some C, transitive: part-of, extract(A,C) => A part-of some C
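
The gap-filling case can be sketched as follows. This is a minimal illustration under assumptions, not the behavior of an existing tool: edges are plain `(subject, relation, object)` tuples standing in for existential restrictions, and the function name `extract` simply mirrors the notation used in the bullet above.

```python
def extract(keep, edges, transitive):
    """Extract edges between kept terms, bridging dropped terms.

    For a transitive relation, a path through dropped terms is replaced by a
    single edge to the first kept term reached. Other relations are only
    copied when both ends are kept.
    """
    keep = set(keep)
    result = set()
    for rel in {r for _, r, _ in edges}:
        # Successor map restricted to this relation.
        succ = {}
        for s, r, o in edges:
            if r == rel:
                succ.setdefault(s, set()).add(o)
        for start in keep:
            if rel not in transitive:
                for obj in succ.get(start, ()):
                    if obj in keep:
                        result.add((start, rel, obj))
                continue
            # Walk outward, stopping at the first kept term on each path.
            stack = list(succ.get(start, ()))
            seen = set()
            while stack:
                node = stack.pop()
                if node in seen:
                    continue
                seen.add(node)
                if node in keep:
                    result.add((start, rel, node))
                else:
                    stack.extend(succ.get(node, ()))
    return result


edges = [("A", "part-of", "B"), ("B", "part-of", "C")]
filled = extract({"A", "C"}, edges, transitive={"part-of"})
```

Here `filled` contains the single bridged edge from A to C. If part-of were not declared transitive, nothing would be emitted, because bridging over B would no longer be justified.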

I think this issue replaces

@zhengj2007
Contributor

Hi @cmungall, What will oio:Subset be used for? Could you please give more details?

@matentzn
Contributor

Here is an example of a term that has many subsets:

http://www.ontobee.org/ontology/GO?iri=http://purl.obolibrary.org/obo/GO_0038023

The general idea is to group terms together in a way that cannot be done logically: for example, a user group may care about certain terms in an ontology but not others (e.g. the flybase subset). Or in a disease ontology, you could say: this disease belongs to the "Harrison" subset, i.e. it is mentioned in the textbook by Dr. Harrison (while, let's say, its sub- and superclasses are not).

@matentzn
Contributor

From @mellybelly (OBOFoundry/OBOFoundry.github.io#1989)

We are getting closer than ever to having the OBO library ontologies work really well together and as we aim for orthogonality, we also need to aim to be able to generate slices of multiple ontologies for different applications. The creation of a common set of subset tags and registry for such artifacts might help with the proliferation of overlapping ontologies and promote reuse and alignment.

@dosumis
Contributor

dosumis commented Jul 14, 2022

@mellybelly - can you expand a bit more on use cases for a common set of subset tags and how we might maintain them?

@cmungall
Contributor Author

This issue is about putting oio:inSubset and oio:Subset into OMO. Having a shared ontology for the subsets themselves is a good idea but should be a separate issue.

The case for including these two is clear: they are used in dozens of ontologies. Is there any objection to adding them to OMO?

@matentzn
Copy link
Contributor

I don't think there is any big objection now! We can go ahead; see #123 for how to do it.
