Replace CURIE inference mechanism with curies.Converter.from_extended_prefix_map #363

Closed
matentzn opened this issue Apr 4, 2023 · 3 comments · Fixed by #431

@matentzn
Collaborator

matentzn commented Apr 4, 2023

Right now, we use the bioregistry bimap and older bioregistry tooling to convert CURIE strings to URIs and back.

@cthoyt has now introduced the concept of "extended prefix maps", which are much more powerful than the bimap for guessing and standardising prefixes.

Here is a snippet:

import curies

#: OBO context's extended prefix map
url = "https://github.com/biopragmatics/bioregistry/raw/main/exports/contexts/obo.epm.json"
converter = curies.Converter.from_extended_prefix_map(url)

>>> converter.expand("GO:1234567")
'http://purl.obolibrary.org/obo/GO_1234567'

>>> converter.compress("http://purl.obolibrary.org/obo/GO_1234567")
'GO:1234567'
>>> converter.compress("http://amigo.geneontology.org/amigo/term/GO:1234567")
'GO:1234567'

>>> converter.standardize_prefix("gomf")
'GO'
>>> converter.standardize_curie("gomf:1234567")
'GO:1234567'
>>> converter.standardize_uri("http://amigo.geneontology.org/amigo/term/GO:1234567")
'http://purl.obolibrary.org/obo/GO_1234567'
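
For context, an extended prefix map is a list of records, each holding a canonical prefix and URI prefix plus synonyms for both. The sketch below shows, in plain Python without `curies`, the kind of synonym lookup this structure makes possible. The record is hand-written to match the snippet above rather than copied from `obo.epm.json`, and `build_synonym_index` and `standardize_prefix` are hypothetical helpers, not the library's implementation.

```python
# Hand-written illustrative record in the extended prefix map layout
# (keys: prefix, uri_prefix, prefix_synonyms, uri_prefix_synonyms).
EPM = [
    {
        "prefix": "GO",
        "uri_prefix": "http://purl.obolibrary.org/obo/GO_",
        "prefix_synonyms": ["gomf"],
        "uri_prefix_synonyms": ["http://amigo.geneontology.org/amigo/term/GO:"],
    },
]


def build_synonym_index(records):
    """Map every canonical prefix and prefix synonym to its canonical prefix."""
    index = {}
    for record in records:
        index[record["prefix"]] = record["prefix"]
        for synonym in record.get("prefix_synonyms", []):
            index[synonym] = record["prefix"]
    return index


def standardize_prefix(prefix, index):
    """Return the canonical prefix, or None if the prefix is unknown."""
    return index.get(prefix)


index = build_synonym_index(EPM)
print(standardize_prefix("gomf", index))
```

A plain bimap only has the one canonical prefix and URI prefix per entry, so it cannot resolve `gomf` in this way.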

We should replace the old way of doing this with the new one.

For now, I still want to mirror the prefix map as part of the package rather than relying on bioregistry to do the hosting/versioning.

@matentzn
Collaborator Author

This is a mid-priority refactoring that can be done as a subsequent PR to #396

@matentzn
Collaborator Author

As per discussion in #396, we can replace everything, including the built-in prefix map, with curies.

However, I would like a test in the testing framework which ensures that, whenever the EPM is updated, the built-in prefixes are exactly what we expect them to be (that is, what is in the SSSOM context).
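
Such a regression check could be sketched roughly as follows. This is a plain-Python illustration only: `canonical_prefix_map` and `diff_prefix_maps` are hypothetical helper names, and the sample data is made up rather than taken from the SSSOM context or the real EPM.

```python
def canonical_prefix_map(records):
    """Reduce extended prefix map records to {prefix: uri_prefix}."""
    return {record["prefix"]: record["uri_prefix"] for record in records}


def diff_prefix_maps(expected, actual):
    """Return (missing, unexpected, changed) prefixes, each sorted."""
    missing = sorted(set(expected) - set(actual))
    unexpected = sorted(set(actual) - set(expected))
    changed = sorted(
        prefix
        for prefix in set(expected) & set(actual)
        if expected[prefix] != actual[prefix]
    )
    return missing, unexpected, changed


# Made-up sample data; a real test would load the packaged EPM and
# the expected built-in prefixes instead.
expected = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
epm_records = [
    {"prefix": "owl", "uri_prefix": "http://www.w3.org/2002/07/owl#"},
    {"prefix": "skos", "uri_prefix": "http://www.w3.org/2004/02/skos/core#"},
]

result = diff_prefix_maps(expected, canonical_prefix_map(epm_records))
assert result == ([], [], []), f"built-in prefixes drifted: {result}"
```

Reporting the three kinds of difference separately makes it easier to see whether an EPM update dropped a prefix, added one, or changed a URI prefix.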

hrshdhgd added a commit that referenced this issue Jul 27, 2023
Part of #363 

This PR does the following:

1. Adds a minimum version of `curies` that has the strict compress and
expand functions
2. Rewrites the SPARQL utils and RDF utils to use `curies` functionality
3. Updates custom `curie_from_uri` to use `curies` (will make a
follow-up PR that replaces this completely)

---------

Co-authored-by: Harshad Hegde <hegdehb@gmail.com>
Co-authored-by: Nico Matentzoglu <nicolas.matentzoglu@gmail.com>
@cthoyt
Member

cthoyt commented Sep 27, 2023

@matentzn this is effectively done besides #431

cthoyt added a commit that referenced this issue Oct 2, 2023
Closes #363 (final nail in the coffin)

This provides an alternative to
#429 that makes more
explicit the chaining operations done on the metadata and prefix maps.
This is also a good chance to carefully document the way that this is
handled, since I might not have captured it accurately. As it stands, the
priority order for combining prefix maps is:

1. Internal prefix map inside the document
2. Prefix map passed through this function inside the ``meta``
3. Prefix map passed through this function to ``prefix_map``
4. Default prefix map (handled with ensure_converter)
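
The precedence rule in this list can be illustrated with plain dictionaries. This is only a sketch of the ordering, not the code from the PR: `resolve_prefix_map` and its parameter names are hypothetical, the example URIs are placeholders, and synonym handling is ignored entirely.

```python
from collections import ChainMap


def resolve_prefix_map(internal, meta, passed, default):
    """Combine prefix maps so that earlier arguments win on conflicts.

    Order follows the list above: document-internal map, then the map
    inside ``meta``, then the explicit ``prefix_map``, then the default.
    """
    # ChainMap looks keys up in its maps from left to right, so the
    # first map that defines a prefix determines its URI prefix.
    return dict(ChainMap(internal, meta, passed, default))


combined = resolve_prefix_map(
    internal={"ex": "https://example.org/internal/"},
    meta={"ex": "https://example.org/meta/", "a": "https://example.org/a/"},
    passed={"a": "https://example.org/other-a/", "b": "https://example.org/b/"},
    default={"b": "https://example.org/default-b/", "c": "https://example.org/c/"},
)
print(combined["ex"])
```

Here `ex` resolves to the document-internal value, `a` to the one from `meta`, `b` to the explicitly passed map, and `c` falls through to the default.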