Add upper- and lowercase prefix synonyms #969

Open
wants to merge 16 commits into
base: main


cthoyt
Member

@cthoyt cthoyt commented Oct 30, 2023

Closes #935

This PR automatically adds both the upper- and lowercase variants of all prefix synonyms for each record. This makes it much simpler to create comprehensive extended prefix maps (EPMs), instead of having to rely on programmatic logic for matching.
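As a rough illustration of the idea (not the code in this PR), the expansion could look something like the sketch below. The function name `add_case_variants` and the plain-string inputs are hypothetical stand-ins; the actual change operates on Bioregistry records.

```python
from __future__ import annotations


def add_case_variants(prefix: str, synonyms: list[str]) -> list[str]:
    """Extend prefix synonyms with upper- and lowercase variants.

    Hypothetical helper for illustration only. The canonical prefix
    itself is excluded from the returned synonyms.
    """
    variants: set[str] = set()
    for value in [prefix, *synonyms]:
        # Keep the original spelling and add both casings.
        variants.update({value, value.lower(), value.upper()})
    # The canonical prefix should not be listed as its own synonym.
    variants.discard(prefix)
    return sorted(variants)


expanded = add_case_variants("chebi", ["ChEBI"])
```

Here `expanded` contains `CHEBI` and `ChEBI`, so a prefix map built from it can match either spelling without extra case-handling logic at lookup time.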

Depends on

@codecov

codecov bot commented Oct 30, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (488c954) 40.57% compared to head (6fdc65f) 40.90%.
Report is 16 commits behind head on main.

❗ Current head 6fdc65f differs from pull request most recent head c650c44. Consider uploading reports for the commit c650c44 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #969      +/-   ##
==========================================
+ Coverage   40.57%   40.90%   +0.32%     
==========================================
  Files         148      138      -10     
  Lines        8244     7916     -328     
  Branches     1910     1849      -61     
==========================================
- Hits         3345     3238     -107     
+ Misses       4690     4475     -215     
+ Partials      209      203       -6     


cthoyt added a commit that referenced this pull request Nov 2, 2023
This PR gets rid of code that focuses on lists of `curies.Record`
objects and instead works directly with `curies.Converter` objects.

Along the way, this also identified issues with the data integrity on
MIRIAM, N2T, and Prefix Commons with respect to the TAIR resources
(`tair.gene` and `tair.protein`) which all used non-specific,
overlapping URLs. Therefore, these needed to get cleaned out before
being imported.

Why do this? If we work directly with converters, we can make use of the
CURIE prefix reconciliation tooling to more cleanly refactor the
Bioregistry to Converter pipeline (which is causing issues when adding
prefix casing variants in a related PR #969)
@cthoyt
Member Author

cthoyt commented Apr 22, 2024

This currently fails in the OBO context. There, the OBO synonyms are prioritized, which gives the Geographical Entity Ontology the `GEO` prefix. However, the `geo` prefix for the Gene Expression Omnibus is not deprioritized, since this logic is case-sensitive.
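A simplified sketch of why a case-sensitive check misses this collision is shown below. It is not the Bioregistry's actual prioritization logic; the function `find_conflicts` and the dictionary layout are assumptions made for illustration.

```python
from __future__ import annotations


def find_conflicts(records: list[dict], *, case_sensitive: bool) -> dict[str, list[str]]:
    """Group records by canonical prefix and return prefixes claimed more than once.

    Hypothetical helper: with case_sensitive=True, ``GEO`` and ``geo`` are
    treated as different prefixes, so no conflict is reported.
    """
    claims: dict[str, list[str]] = {}
    for record in records:
        key = record["prefix"] if case_sensitive else record["prefix"].casefold()
        claims.setdefault(key, []).append(record["name"])
    return {key: names for key, names in claims.items() if len(names) > 1}


# Prefixes as they would be assigned in the OBO context described above.
records = [
    {"name": "Geographical Entity Ontology", "prefix": "GEO"},
    {"name": "Gene Expression Omnibus", "prefix": "geo"},
]

missed = find_conflicts(records, case_sensitive=True)
found = find_conflicts(records, case_sensitive=False)
```

With the case-sensitive comparison, `missed` is empty, which mirrors the situation in the comment: once upper- and lowercase variants are added automatically, both records end up claiming `GEO` and `geo`. Comparing case-folded prefixes, as in `found`, surfaces the clash so one of the two can be deprioritized.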

Development

Successfully merging this pull request may close these issues.

Improve EPM export
1 participant