Add a metadata tag to indicate whether an ontology adopts terms or puts terms up for adoption #2324

Open
cmungall opened this issue Mar 16, 2023 · 18 comments
Labels
ontology metadata Issues related to ontology metadata vote Issue that is open to voting (by whom?)

Comments

@cmungall
Contributor

IMPORTANT please read the whole issue and linked issues before commenting.

Proposed Terminology

For the purposes of this issue, adopting a class/term means to take ownership of that term, retaining the original ID/URI, which has a different prefix/namespace

Adoption

An example scenario:

  1. OMIT creates a term "tentacle development", with ID OMIT:1234
  2. it later transpires that this term was out of scope for OMIT and in scope for GO
  3. A coordinated series of actions then occur:
    1. OMIT removes OMIT:1234 from its edit file
    2. GO takes ownership of OMIT:1234. The term is moved to the GO edit. It gets a rdfs:isDefinedBy with value go
    3. it is placed in the appropriate place in the GO hierarchy (e.g. under "appendage development")
    4. The term OMIT:1234 is present in both the full GO release and base file release
    5. OMIT imports/mireots OMIT:1234 from GO
    6. the adoption is recorded in https://github.com/OBOFoundry/purl.obolibrary.org/blob/master/config/obo.yml OR the omit.yml
    7. this means the PURL http://purl.obolibrary.org/obo/OMIT_1234 would redirect to the native GO browser, e.g. AmiGO

In this hypothetical scenario, we say GO "adopts" OMIT:1234, and OMIT:1234 was "adopted by" GO (or "donated to" GO)
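
To make the ownership lookup in the adoption scenario concrete, here is a minimal Python sketch. It is not how the OBO PURL system is implemented; the `PREFIX_OWNER` and `ADOPTED_BY` tables and both function names are hypothetical. The only detail taken from the scenario is the PURL shape `http://purl.obolibrary.org/obo/OMIT_1234`.

```python
# Hypothetical data: which ontology owns each prefix by default,
# plus term-level overrides recording adoptions (step 6 of the scenario).
PREFIX_OWNER = {"OMIT": "omit", "GO": "go"}
ADOPTED_BY = {"OMIT:1234": "go"}


def owner_of(curie: str) -> str:
    """Return the ontology that currently maintains the term."""
    if curie in ADOPTED_BY:
        # An adopted term keeps its ID but is owned by the adopter.
        return ADOPTED_BY[curie]
    prefix, _, _local = curie.partition(":")
    return PREFIX_OWNER[prefix]


def purl(curie: str) -> str:
    """Build the OBO PURL for a CURIE; the ID itself never changes."""
    prefix, _, local = curie.partition(":")
    return f"http://purl.obolibrary.org/obo/{prefix}_{local}"
```

With this data, `owner_of("OMIT:1234")` is `"go"` while any other OMIT term still resolves to `"omit"`, which is the extra term-level lookup that adoption adds on top of prefix-based resolution.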

This scenario is exemplified by a handful of RO terms with BFO IDs. See this explanation

If the above scenario is only carried out partially, we say "partial adoption"

An example of partial adoption is the partial adoption of
http://purl.obolibrary.org/obo/CHMO_0000089 by OBI. The term is officially owned by OBI (as far as I know, but I don't know how this is determined). However, the URL resolves to the CHMO ontology, which is a year old, and may potentially not have the latest OBI changes.

(as far as I know, apart from the RO/BFO situation, all adoptions in OBO are currently partial)

Clearly, for this scenario to unfold there must be mutual agreement on the part of both ontologies. In the hypothetical scenario above, we might say GO is open to adopting terms and OMIT is open to donating.

Obsolete and Replace

An alternative scenario:

  1. OMIT creates a term "tentacle development", with ID OMIT:1234
  2. it later transpires that this term was out of scope for OMIT and in scope for GO
  3. A coordinated series of actions then occur:
    1. GO adds a new term "tentacle development" and gives it an ID GO:9876
    2. OMIT obsoletes OMIT:1234 and adds a "term replaced by" annotation pointing to GO:9876

I propose calling this "cross ontology obsolete-and-replace".
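
As an illustration of what this means for data holders, the following Python sketch rewrites a legacy ID by following replaced-by links. The `REPLACED_BY` table and the `migrate` function are hypothetical; in practice the table would be extracted from the "term replaced by" annotations on obsolete terms.

```python
# Hypothetical table extracted from "term replaced by" annotations.
REPLACED_BY = {"OMIT:1234": "GO:9876"}


def migrate(curie: str, replaced_by: dict) -> str:
    """Follow replaced-by links until reaching a term with no replacement."""
    seen = {curie}
    while curie in replaced_by:
        curie = replaced_by[curie]
        if curie in seen:
            # Guard against a malformed table that loops back on itself.
            raise ValueError(f"replacement cycle involving {curie}")
        seen.add(curie)
    return curie


legacy_annotations = ["OMIT:1234", "GO:9876"]
migrated = [migrate(c, REPLACED_BY) for c in legacy_annotations]
```

IDs without a replacement pass through unchanged, so the same function can be applied to a whole dataset.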

A variant of this scenario is when both ontologies independently created terms with the same meaning in their respective namespaces, and one is obsoleted in favor of another. An example of this is when GO obsoleted GO:cell (http://purl.obolibrary.org/obo/GO_0005623) in favor of CL:cell.

Summary of positions

It is clear that different groups within OBO fall within different camps. Some of the discussion is spread across different issues in different repos. Please read before commenting:

Summarizing from the above, proponents of adoption believe:

  • terms are only deprecated if they are meaningless
  • if we do not allow adoption it breaks people's data
  • it is against web architecture

Proponents of obsolete-and-replace believe:

  • adoption confuses users
  • implementing adoption increases complexity of the PURL system
  • automatically migrating IDs can be done trivially e.g. with robot migrate or simple scripts, and in cases where legacy data cannot be migrated the cost is low (e.g. users can easily follow the link)
  • the confusion and complexity costs outweigh the downsides of having some legacy data point to an obsolete URI
  • the need for adoption mostly arises from ontologies that are poorly scoped, and fixing https://obofoundry.org/principles/fp-005-delineated-content.html would remove the majority of cases where this occurs.
  • adoption is completely against the norms of all major database resources

Feel free to add your own perspective below (but read the existing issues first).

I am personally against adoption; but even if I were not, I can state with near certainty for every ontology I contribute significantly to or am funded for, every curator, every other PI, and every non-ontologist user of the ontology would be firmly opposed to and confused by adoption. However, I am happy to take a proposal to these groups or broker a conversation.

Similarly, there will be many in OBO who believe firmly in adoption, and who will want to make mutual adoptions between like-minded parties.

Path forward with OBO metadata

With the assumption that there will be groups in both camps for some time to come, the best thing we can do is allow ontologies to be transparent about their policies in the OBO metadata. I think the simplest way to do this is to add two new boolean fields:

  • willing_to_adopt_external_terms
  • willing_to_donate_external_terms
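
A rough Python sketch of how a tool might read these two fields follows. The record contents are hypothetical, and treating a missing field as "no information" is an assumption of this sketch rather than part of the field definitions.

```python
def transfer_outlook(donor: dict, adopter: dict) -> str:
    """Summarize whether a term transfer by adoption looks feasible."""
    donate = donor.get("willing_to_donate_external_terms")
    adopt = adopter.get("willing_to_adopt_external_terms")
    if donate is False or adopt is False:
        # One explicit "no" is enough to rule adoption out.
        return "not possible"
    if donate is True and adopt is True:
        return "possible"
    # At least one side has not declared a policy.
    return "unknown"


# Hypothetical metadata records; the values are illustrative only.
omit = {"id": "omit", "willing_to_donate_external_terms": True}
go = {"id": "go", "willing_to_adopt_external_terms": False}
```

Here `transfer_outlook(omit, go)` returns `"not possible"`, which is the kind of early signal that would spare the donor wasted effort.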

The advantage of this is that it can save unnecessary work and frustration. We now have a situation in OBO whereby for example (again, hypothetical) the OMIT developers may choose to mint many development terms in the mistaken belief that GO will adopt them, when in fact GO has no such intention. The OMIT developers may then be annoyed and frustrated when GO rejects the request, or when GO mints their own IDs for the equivalent concepts.

Similarly, GO sometimes obsoletes terms when they become out of scope. We do not want people adopting the GO IDs, but we can accommodate putting external ontology IDs in the replaced-by field. It would be good to be transparent about this policy in a standard way to reduce frustration for people who use these IDs.

I propose the fields are interpreted to mean willing to adopt/donate in the future. So for example, in the past RO and OBI have both adopted. Many of the editors of both these ontologies believe this to have been a mistake, so it would still be valid for them to state "false" for both fields if they are willing to keep grandfathered adopted terms (errm, mixed metaphors..), but are no longer accepting new ones.

It would also be useful to coordinate with developers of ontology portals to clearly communicate to users who the owner of a term is, but this issue is about the ontology metadata page.

We may want to allow additional metadata fields to indicate criteria for when adoption is allowed. So for example, perhaps IAO may be willing to adopt terms from a small subset of ontologies but they may not be willing to adopt terms that have been randomly created in poorly scoped and poorly designed ontologies. However, it may be sufficient to communicate this in free text.

It could be argued that adding these boolean metadata fields is overkill, that it should be obvious, or that ontologies should at least coordinate in advance if scoping is unclear or there is anticipated need for future adoption. But this clearly hasn't worked. OBO is a broad group of individuals with a multitude of different perspectives, so I think it best to be clear and transparent.

@jonquet

jonquet commented Mar 16, 2023

(I have not read all in detail) but on this

In this hypothetical scenario, we say GO "adopts" OMIT:1234, and OMIT:1234 was "adopted by" GO (or "donated to" GO)

Be sure to check PROV-O and/or PAV properties that could be relevant to represent this.

@nlharris nlharris added ontology metadata Issues related to ontology metadata vote Issue that is open to voting (by whom?) labels Mar 16, 2023
@hoganwr
Contributor

hoganwr commented Mar 17, 2023 via email

@alanruttenberg
Member

It's shortsighted to legislate this. We're building for the long run, longer than the current developers are going to be involved. This is an unnecessary invitation to exercise power unwisely. If there's a need to adopt or to give up terms, that should be left to the developers of the time. The guiding principles are that we want good management, orthogonal ontologies, and stability of identifiers. During the development of ontologies it's not infrequent that one has to define a term in a domain that isn't developed yet. Or, an ontology may become large enough that it's profitable to delegate management of some branch to another group. One can imagine a variety of scenarios.

I fail to see how this would even be useful under any circumstances.

@matentzn
Contributor

It seems this is one of the practices across OBO that causes significant disagreement. Wouldn't it be good to provide a way to simply name, from the perspective of an ontology, which practice they follow, so that we have some grounds to start a conversation? This is not normative or anything, just capturing the current practice, and if future owners of the ontology change their strategy, that is no problem at all. While I am also against legislating anything, I think capturing the different strategies in the metadata is a useful basis for a future workshop on the subject.

@alanruttenberg
Member

This proposal would end the conversation before it starts, instead of considering the issue on its merits.

@matentzn
Contributor

Getting everyone to document their position publicly would end the conversation? How?

@alanruttenberg
Member

So we license any bad practice as long as it's disclosed? It's bad practice to change IRIs and it's bad practice to duplicate IRIs. These are principles that were there from the beginning of the effort. This sort of thing shouldn't be a matter of opinion.

@matentzn
Contributor

I don't know Alan. It's pretty clearly a trade-off. Bad practice according to you, good practice according to others. There is no universal truth here. I respect your opinion, but please provide an alternative path to further this discussion in a constructive way.

@alanruttenberg
Member

alanruttenberg commented Mar 20, 2023 via email

@matentzn
Contributor

Ok, how do we proceed? Everyone adopting your position?

@wdduncan
Member

In general, I am apprehensive about changing IRIs unless the semantics have been changed.

That being said, do we have any metrics on the scope of the problem?

  • How many ontologies (or ontology groups) are affected?
  • What is the cost of leaving things as is vs. adopting a new policy?

@wdduncan
Member

I want to clarify my comment. It was not meant to be hostile. I apologize to those of you who interpreted it that way.

I just wanted us to pause and try (if that is possible) to assess the level of importance regarding this issue.

  • Is it mission critical?
  • Does having these tags need to be an OBO wide policy? Or is it something that is nice to have?
  • If an ontology doesn't adopt the metadata tags, is there a default interpretation?
  • If the GO community decides that it will only adopt another ontology's terms if it can mint a new IRI, then does the rest of Foundry have to also make its preference known?

@cmungall
Contributor Author

Thanks @wdduncan!

To clarify, my proposal was not to make the field a required field in json schema. Like most of the metadata fields we have, this would be optional. I was not proposing to create any new policies (except insofar as a statement that not having a policy is a statement of policy)

If an ontology doesn't adopt the metadata tags, is there a default interpretation?

An absence of a value would mean that we have no information on whether the ontology is willing to donate or accept terms. Perhaps the information is not curated, perhaps the ontology does not wish to declare this, perhaps they are still discussing it internally, perhaps the maintainers are even unaware that adoption/donating is a thing.

Is it mission critical?

I suppose not (I am not sure how many of our current 48 fields are).

However, I think the entire OBO community would vociferously agree that identifiers, PURLs, and identifier lifecycle are important issues and deserving of the attention of OBO operations.

Unfortunately, we do not all agree what that means in practice when it comes to transferring IDs across ontologies. This can be seen in the same discussions that repeatedly crop up in different issue trackers. In the absence of any one best practice across OBO, we owe it to our users, to developers of existing ontologies, and to developers of new ontologies to be as clear and transparent as we can be.

@ddooley
Contributor

ddooley commented Mar 21, 2023

A circumspect side note that this problem exists because of a technical decision to bake into the IRI directly the ontology it originates from e.g. obo:FOODON_12345670, rather than at the community level (e.g. obo:OBOLIB_000012345670) where a community considers a term a part of its language, and makes consensual decisions about which ontology team curates it over time. Some day a curation platform will provide a purl resolver that returns not just the semantics of a term, but also the current ontology curation team that manages it; this just requires a trustworthy centralized dispenser of new ids on request by curators, like DOI or ISBN do. (A future fantasy is that OBO could be transitioned to this via an overlay.)
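
To illustrate the idea, here is a small Python sketch of a central dispenser that mints prefix-neutral IDs and tracks the curating team separately from the ID. The `TermRegistry` class, its methods, and the sequential numbering are all hypothetical; only the `OBOLIB_` ID shape with twelve digits is taken from the example above.

```python
import itertools


class TermRegistry:
    """Hypothetical central dispenser: IDs carry no ontology prefix."""

    def __init__(self) -> None:
        self._numbers = itertools.count(1)
        self._curator = {}  # term ID -> current curating team

    def mint(self, curator: str) -> str:
        """Issue a new ID and record who curates it."""
        term_id = f"OBOLIB_{next(self._numbers):012d}"
        self._curator[term_id] = curator
        return term_id

    def transfer(self, term_id: str, new_curator: str) -> None:
        """Change the curating team; the ID itself is untouched."""
        if term_id not in self._curator:
            raise KeyError(term_id)
        self._curator[term_id] = new_curator

    def curator_of(self, term_id: str) -> str:
        return self._curator[term_id]
```

Under this scheme, moving a term between teams is a single `transfer` call and no identifier changes.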

@alanruttenberg
Member

@ddooley I've been thinking the same thing. When we started up I had colleagues who suggested following how handles work. There's less emotional attachment to a number than a meaningful string. What this discussion has reinforced is that if you give an identifier even a smidgen of semantics, that bit will be a magnet for making unnecessary changes. If I had to do it over again, knowing what I do now, I would do as you suggest.

I'm not keen on the idea of an overlay, but there's no technical reason that we couldn't switch to the scheme you suggest starting today. It's not like it's rocket science to represent and use an association between identifier and manager.

@gouttegd
Contributor

do we have any metrics on the scope of the problem?

I second that question.

And relatedly:

There is no new information that has entered the picture that is even close to worth considering compared to the damage this practice causes.

I’ve got a glimpse first-hand of the problems caused by adoption (more than enough for my taste), but I have not experienced (yet!) any issue caused by obsolete-and-replace. Any pointers where I could learn more about said damages? Discussions about past cases or things like that?

The only case I know about is the one mentioned in the ticket (GO:0005623 -> CL:0000000), and I am not aware that it has caused huge issues (but admittedly, this happened before my time, so maybe I just don’t know where the bodies are buried!).

@wdduncan
Member

Following up on the proposal (in the example) to:

GO takes ownership of OMIT:1234. The term is moved to the GO edit. It gets a rdfs:isDefinedBy with value GO

Perhaps we should automatically do this for every ontology class as part of the release process?

@zhengj2007
Contributor

Regarding the adoption approach, I think we don't have an SOP yet, and it might not be very simple.

Using APOLLO_SV_00000033 'counting' as an example. Current situation:

  1. APOLLO_SV_00000033 'counting' is in IAO, which is supposed to maintain the term.
  2. http://purl.obolibrary.org/obo/APOLLO_SV_00000033 redirects to APOLLO_SV rather than IAO.
  3. APOLLO_SV_00000033 'counting' is in APOLLO_SV as well, but is not imported from IAO.
  4. OBIB has the APOLLO_SV_00000033 'counting' that is imported from APOLLO_SV rather than IAO.

For adoption, I think we need to

  1. Indicate that APOLLO_SV_00000033 'counting' is an adopted term in IAO. It seems we have reached agreement to use rdfs:isDefinedBy.
  2. http://purl.obolibrary.org/obo/APOLLO_SV_00000033 should redirect to IAO.
  3. APOLLO_SV should import the term from IAO, since APOLLO_SV gave the term to IAO. Otherwise, if IAO modifies APOLLO_SV_00000033 'counting', APOLLO_SV won't update the term accordingly.
  4. OBIB should import the term from IAO.
    However, how to notify all ontologies importing APOLLO_SV_00000033 that IAO adopted the APOLLO_SV_00000033 'counting' and to import the term from IAO?
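
One way to approach the notification question is an automated check that flags importers still pulling an adopted term from its old source. The Python sketch below is illustrative only: the `CURRENT_OWNER` and `IMPORT_SOURCES` tables are hypothetical stand-ins for data that would have to be collected from the ontologies themselves, filled in here with the OBIB case described above.

```python
# Hypothetical tables: who maintains each adopted term now, and where each
# importing ontology currently takes the term from.
CURRENT_OWNER = {"APOLLO_SV_00000033": "IAO"}
IMPORT_SOURCES = {"OBIB": {"APOLLO_SV_00000033": "APOLLO_SV"}}


def stale_imports(current_owner: dict, import_sources: dict) -> list:
    """List (importer, term, source, owner) where source is not the owner."""
    stale = []
    for importer, terms in sorted(import_sources.items()):
        for term, source in sorted(terms.items()):
            owner = current_owner.get(term)
            # Terms with no recorded adoption are skipped.
            if owner is not None and source != owner:
                stale.append((importer, term, source, owner))
    return stale
```

Each tuple in the result identifies an ontology whose maintainers would need to be told to switch their import to the adopter.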

Maintaining an adopted term is an issue unless the term won't change after adoption. That is why I prefer migration when the term is not as widely used as, say, 'part of'/'has part'.


10 participants