id_prefixes: allow for open vs closed and additional metadata #194

Open
hsolbrig opened this issue Apr 30, 2021 · 15 comments
Assignees
Labels
prefixes Related to prefixes
Milestone

Comments

@hsolbrig
Contributor

Right now, any id that instantiates a class should be a CURIE whose prefix is in that class's set of id_prefixes. This needs to be incorporated into the id loader.

@hsolbrig
Contributor Author

See disease class in biolink model

@hsolbrig
Contributor Author

We need to decide whether id_prefixes should be interpreted as a "should" or a "must". Note: Tie this in to the issue about "strict" ...

@cmungall
Member

See microbiomedata/nmdc-metadata#308 for use case

@hsolbrig
Contributor Author

Decision: id_prefixes can include "..." as its last element, which marks the list as informative rather than checked.

@cmungall
Member

Proposal discussed on linkml call:

function:
  id_prefixes:
    - KEGG.ORTHOLOG
    - GO
    - NOTGO
  id_local_part_syntaxes:
    KEGG.ORTHOLOG: '^K\d+$'
    GO: '^\d{7}$'
    NOTGO: '^F\d+'
process:
  id_prefixes:
    - NOTGO
  id_local_part_syntaxes:
    NOTGO: '^P\d+'
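
To make the intended check concrete, here is a minimal Python sketch of how a validator might apply the proposal above. It is not a LinkML API: `CLASS_RULES` and `check_id` are hypothetical names, and the dictionary simply mirrors the YAML by hand. It also assumes that a prefix with no entry in id_local_part_syntaxes has no constraint on its local part.

```python
import re

# Hypothetical in-memory copy of the proposal above; not loaded by any real LinkML call.
CLASS_RULES = {
    "function": {
        "id_prefixes": ["KEGG.ORTHOLOG", "GO", "NOTGO"],
        "id_local_part_syntaxes": {
            "KEGG.ORTHOLOG": r"^K\d+$",
            "GO": r"^\d{7}$",
            "NOTGO": r"^F\d+",
        },
    },
    "process": {
        "id_prefixes": ["NOTGO"],
        "id_local_part_syntaxes": {"NOTGO": r"^P\d+"},
    },
}


def check_id(class_name, curie):
    """Return a list of problems found when `curie` is used as an id of `class_name`."""
    prefix, sep, local = curie.partition(":")
    if not sep:
        return [f"{curie!r} is not a CURIE"]
    rules = CLASS_RULES[class_name]
    if prefix not in rules["id_prefixes"]:
        return [f"prefix {prefix!r} is not listed for class {class_name!r}"]
    # Assumption: no pattern for a prefix means the local part is unconstrained.
    pattern = rules["id_local_part_syntaxes"].get(prefix)
    if pattern is not None and not re.search(pattern, local):
        return [f"local part {local!r} does not match {pattern!r}"]
    return []
```

With these rules the same prefix is judged per class: `NOTGO:F12` passes for `function` but fails for `process`, which expects `NOTGO:P...`.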

@wdduncan
Contributor

In the above example, the id_prefixes operate at the class level. So, I suppose that two different classes could have the same id_prefixes list but different id_local_part_syntaxes.

Do you want to allow for the id_local_part_syntaxes to also be declared at the level of the schema?

@cmungall
Member

@wdduncan good point. The idea was that different classes may have different rules. E.g. ZFIN has different local id syntaxes for genes, genotypes, etc. But in many cases the same format is used across different types.

Note for these cases we can leverage bioregistry.io

@cmungall
Member

It seems we have walked back on the idea of using ...s to indicate open id prefixes.

Do we have an alternative proposal?

Note that id_prefixes is now a list, so we can't embed metadata within it; it would have to be a sibling key, e.g.

classes:
  Foo:
    id_prefixes:
      - GO
      - MESH
      - NCIT
    id_prefixes_meta:
      closed: true

We could also set this at the level of the whole schema, so that all id_prefixes lists are open or closed by default.

@cmungall
Member

cmungall commented Jun 25, 2021

Another piece of information is whether the id_prefixes should be inherited or not.

see biolink/biolink-model#789

@sierra-moxon
Member

I would also be interested to hear others' thoughts on whether the id_prefixes field should be allowed on an abstract class.

@sierra-moxon
Member

And can we logically assume that a parent class's id-space is the sum of its children's id-spaces?

@cmungall
Member

Discussed on today's call:

  • Semantics of open vs closed: if the list is open, it is treated as an ISO SHOULD, i.e. an unlisted prefix generates a warning but the data is still valid
  • We prefer having ..., but we would have to change the range of id_prefixes to be a string

default semantics of id_prefixes and inheritance. Proposal:

  • add inherited: true to id_prefixes, such that id prefixes are inherited by subclasses
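
As a rough illustration of the inheritance proposal, the sketch below walks the is_a chain and collects the prefixes of every ancestor that marks them as inherited. Everything here is an assumption for illustration: the `CLASSES` table, the flag name `id_prefixes_inherited` (standing in for the proposed `inherited: true`), the example classes and prefixes, and the choice to list a class's own prefixes first.

```python
# Hypothetical class table. `is_a` and `id_prefixes` mirror LinkML keys;
# `id_prefixes_inherited` is a made-up stand-in for the proposed `inherited: true`.
CLASSES = {
    "NamedThing": {"id_prefixes": ["EX"], "id_prefixes_inherited": True},
    "Disease": {
        "is_a": "NamedThing",
        "id_prefixes": ["MONDO", "DOID"],
        "id_prefixes_inherited": True,
    },
    "RareDisease": {"is_a": "Disease", "id_prefixes": ["ORPHANET"]},
}


def effective_id_prefixes(name):
    """Own prefixes first, then those of each ancestor that marks them as inherited."""
    result = list(CLASSES[name].get("id_prefixes", []))
    parent = CLASSES[name].get("is_a")
    while parent is not None:
        cls = CLASSES[parent]
        if cls.get("id_prefixes_inherited", False):
            for prefix in cls.get("id_prefixes", []):
                if prefix not in result:
                    result.append(prefix)
        parent = cls.get("is_a")
    return result
```

Under these assumptions, `effective_id_prefixes("RareDisease")` yields the class's own prefix followed by those of `Disease` and `NamedThing`; an ancestor without the flag would contribute nothing.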

@hsolbrig
Contributor Author

hsolbrig commented Jul 5, 2021

See closed issue linkml/linkml-model#28 for further discussion

@cmungall changed the title from "Implement id_prefixes" to "id_prefixes: allow for open vs closed and additional metadata" on Sep 9, 2021
@nlharris added the prefixes (Related to prefixes) label on Oct 13, 2022
@cmungall
Member

cmungall commented Dec 2, 2022

alternate proposal from @hsolbrig:

Use ... generically for all multivalued slots to interpret the list as being open. This is potentially useful for:

  • open lists of slots on a class
  • open lists of permissible values on an enum [todo - link to issue]

@cmungall
Member

cmungall commented Apr 14, 2023

Notes from today's call: we are rejecting the ... option

We discussed 3 options

option 1, allows for more metadata:

classes:
  Foo:
    id_prefixes:
      - GO
      - MESH
      - NCIT
    id_prefixes_meta:
      closed: true

option 2: direct

classes:
  Foo:
    id_prefixes:
      - GO
      - MESH
      - NCIT
    id_prefixes_closed: true

option 3:

classes:
  Foo:
    id_prefixes:
      - GO
      - MESH
      - NCIT
    reified_slot_properties:
      id_prefixes:
        closed: true

We decided on 2.

Note that open is the default (but validators may choose to emit warnings if a prefix is not in the list).
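
A minimal sketch of how a validator could honor this decision, assuming the option 2 key name `id_prefixes_closed`. The function `check_prefix` and its return shape are hypothetical, not an existing LinkML function; emitting a warning for open lists is the optional behavior mentioned above.

```python
def check_prefix(curie, id_prefixes, id_prefixes_closed=False):
    """Return (is_valid, message). Open lists only warn; closed lists reject."""
    prefix = curie.split(":", 1)[0]
    if prefix in id_prefixes:
        return True, None
    if id_prefixes_closed:
        return False, f"error: prefix {prefix!r} is not in the closed list {id_prefixes}"
    # Open is the default: the id stays valid, but a validator may choose to warn.
    return True, f"warning: prefix {prefix!r} is not in {id_prefixes}"
```

For example, `check_prefix("HP:0000118", ["GO", "MESH", "NCIT"])` is still valid and carries a warning, while the same call with `id_prefixes_closed=True` is invalid.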

We also discussed use cases around profiling - models intended to be reused in many contexts like biolink would leave these open but profiles could close them

@cmungall modified the milestones: 2021-12-01, 1.7 Release on Sep 14, 2023
5 participants