-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add QC to ensure that if we provide evidence for a subset, the mapping must be exact #7689
Conversation
@matentzn I have not reviewed this since it the QC failed |
I assigned this to you because the QC needs to be fixed by a curator! It fails because of the test.. |
@matentzn I am not sure I am understanding the query correctly. What I think it is checking is to make sure that for the If that is what is happening, then why is this line an error: |
My best guess: This has already been fixed by some other PR? Else I also dont understand it. |
The OBO snippet I posted was from |
Here is another error that I think does make sense to report as an error: |
After the update of this branch with the latest |
Here are the remaining 15 errors and the relevant
|
Thanks! I would suggest we continue this after:
Some of the examples you found sound like real bugs in the query, but I cant pinpoint them right now. |
That plan sounds good to me! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was another run of the mondo-ingest pipeline Tuesday (11-Jun) and some subset updates to Mondo (albeit it does not look like updates for ordo) so I updated this branch with master again and now the QC checks pass.
However, there are these remaining questions we did not have time to discuss at the Tech call on 5-Jun:
- Why is this SPARQL query only for “ordo_disease” and not the other two subsets related to Orphanet?
- Should obsolete terms be in an “ordo_disease” subset and also have an xref to Orphanet? See Sabrina’s comments:
"If a term is obsolete in Mondo, it doesn't make sense (to me) that it is in a rare disease subset (it would be like saying "this term does not exist anymore, but it is in a subset")." Updating the Orphanet rare disease subset #7681 (comment) - Are there any situations where a MONDO term would have an xref to Orphanet, but then that Orphanet ID not be a source for an Orphanet subset? Is this an issue with the SPARQL query?
See MONDO:0014017 that has an xref to "Orphanet:106 {source="OMIM:615032"}", but Orphanet:106 is not a source for the Orphanet subsets?
See OBO snippet below (from latest master
on 12-Jun, 9:40am PT):
id: MONDO:0014017
subset: ordo_disorder {source="Orphanet:642675"}
xref: Orphanet:106 {source="OMIM:615032"}
xref: Orphanet:642675 {xref="MONDO:equivalentTo"}
No particular reason other than that this was an important use case - ideally we add all other subsets to this qc check as well. Maybe just remove the
IMO: we should have a really, really good reason for any ORDO class in the
Hmmmmm. Yeah I guess that is possible. For example when there are two Orphanet mappings (proxy merge) and only one of them is in the ordo_disorder subset. Good question! |
?entity oboInOwl:hasDbXref ?xref . | ||
VALUES ?mondo_source { | ||
"MONDO:obsoleteEquivalent" | ||
"MONDO:equivalentTo" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, from @sabrinatoro we should leave this as MONDO:equivalentTo
until the obsolete Mondo terms that are in an ordo subset (issue 7693) are reviewed.
After those are reviewed, this MONDO:obsoleteEquivalent
should be added back into the query (TODO: Add ticket for this work and link as step to #7693).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think the second line should have been "After those are reviewed, this MONDO:obsoleteEquivalent
could be removed from the query" since it's only needed for now while those obsolete terms are being reviewed.
Co-authored-by: Nico Matentzoglu <nicolas.matentzoglu@gmail.com>
Chatted with Sabrina and both |
This now fails due to 1 proxy merge:
What's the best way to handle this? |
Removed xref and subset on obsolete term and added subset to correct/active Mondo term. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good.
I am approving and merging
…g must be exact (#7689) * Create qc-ordo-subset-exact-mapping.sparql * Update src/sparql/qc/mondo/qc-ordo-subset-exact-mapping.sparql * Update src/sparql/qc/mondo/qc-ordo-subset-exact-mapping.sparql Co-authored-by: Nico Matentzoglu <nicolas.matentzoglu@gmail.com> * add back MONDO:obsoleteEquivalent * change annotation to source * fix proxy merge/qc issue * add ordo subset to correct term --------- Co-authored-by: Trish Whetzel <trish@tislab.org> Co-authored-by: Trish Whetzel <plwhetzel@gmail.com>
This QC check was created as a follow up to #7681
It ensures that, if a subset is declared for a term in ORDO the evidence for it (and ORDO code) must correspond to an exact mapping as well. So:
If
There must also be an exact mapping to
Orphanet:123
.