KG2.8.0 edges with a qualified_object_direction, but no qualified_predicate #250

Closed
amykglen opened this issue Dec 14, 2022 · 5 comments

@amykglen
Member

amykglen commented Dec 14, 2022

I noticed that this edge in KG2.8.0c (which comes from the KG2.8.0pre edge with ID UMLS:C0043552---SEMMEDDB:stimulates---None---None---increased---UMLS:C0006809---SEMMEDDB:) has a qualified_object_direction, but no qualified_predicate:

{
   "predicate":"biolink:regulates",
   "knowledge_source":[
      "infores:semmeddb"
   ],
   "publications_info":"{'PMID:33316364': {'publication date': '2020 Dec 11', 'sentence': 'Both, EOOK and camphor inhibited all articular parameters induced by zymosan.', 'subject score': 1000, 'object score': 1000}}",
   "kg2_ids":[
      "UMLS:C0043552---SEMMEDDB:stimulates---None---None---increased---UMLS:C0006809---SEMMEDDB:"
   ],
   "subject":"MESH:D015054",
   "qualified_object_direction":"increased",
   "id":"27104526",
   "object":"CHEMBL.COMPOUND:CHEMBL1097205",
   "publications":[
      "PMID:33316364"
   ]
}

I don't think this is valid, at least according to the predicate transformations Sierra provided. Those transformations seem to suggest this edge should have a qualified_predicate of "causes" and an object_aspect of "activity_or_abundance".
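To make the condition concrete, here is a minimal Python sketch of the check described above, assuming an edge is available as a plain dict with the same property names as the JSON shown. The function name `has_dangling_direction` is hypothetical and not part of the KG2 codebase.

```python
def has_dangling_direction(edge: dict) -> bool:
    """Return True when an edge sets qualified_object_direction
    but carries no qualified_predicate."""
    return (
        edge.get("qualified_object_direction") is not None
        and edge.get("qualified_predicate") is None
    )


# Trimmed copy of the edge reported above (only the relevant properties).
edge = {
    "predicate": "biolink:regulates",
    "subject": "MESH:D015054",
    "object": "CHEMBL.COMPOUND:CHEMBL1097205",
    "qualified_object_direction": "increased",
}

print(has_dangling_direction(edge))
```

An edge with neither qualifier, or with both, is not flagged by this check.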

It looks like there are 2.4 million such edges in KG2.8.0c (edges that have a qualified_object_direction but no qualified_predicate), based on this Neo4j query:

match (n)-[e]->(m) where e.qualified_predicate is null and e.qualified_object_direction is not null return count(distinct e)

Based on the KG2pre ID of the specific edge reported above, it seems the qualified_predicate was already missing in KG2pre rather than being lost during the KG2c build process, which is why I'm writing up this issue in this repo.

amykglen added the bug label Dec 14, 2022
@amykglen
Member Author

amykglen commented Dec 14, 2022

it looks like all of the 2.4 million affected edges have the predicate 'regulates':

match (n)-[e]->(m) where e.qualified_predicate is null and e.qualified_object_direction is not null return distinct e.predicate, count(distinct e)
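For anyone working from an edge dump rather than a live Neo4j instance, the grouping in the query above can be approximated in memory. This is only a sketch: it assumes the edges are already loaded as a list of dicts, the helper name `count_dangling_by_predicate` is hypothetical, and the sample edges are invented for illustration.

```python
from collections import Counter


def count_dangling_by_predicate(edges):
    """Count edges that have a qualified_object_direction but no
    qualified_predicate, grouped by their predicate."""
    counts = Counter()
    for edge in edges:
        if (
            edge.get("qualified_predicate") is None
            and edge.get("qualified_object_direction") is not None
        ):
            counts[edge.get("predicate")] += 1
    return counts


# Invented sample data, not taken from KG2.8.0c.
edges = [
    {"predicate": "biolink:regulates", "qualified_object_direction": "increased"},
    {"predicate": "biolink:regulates", "qualified_object_direction": "decreased"},
    {
        "predicate": "biolink:affects",
        "qualified_predicate": "biolink:causes",
        "qualified_object_direction": "increased",
    },
    {"predicate": "biolink:treats"},
]

print(count_dangling_by_predicate(edges))
```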

I wonder if it would be worth adding a temporary patch in the KG2c build code that fixes this (by simply giving all such edges a qualified_predicate of "causes" and an object_aspect of "activity_or_abundance")... otherwise there appear to be only 15k 'qualified' edges in KG2.8.0 that could be returned for queries that use qualifiers.

perhaps @sundareswarpullela would be interested in writing such a patch?
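The temporary patch proposed above might look roughly like the following. This is a sketch under stated assumptions, not the actual KG2c build code: edges are treated as plain dicts, `patch_edge` is a hypothetical name, and the property name `object_aspect` is taken from the wording of this comment (the real KG2c property name may differ).

```python
def patch_edge(edge: dict) -> dict:
    """Return a copy of the edge with the missing qualifiers filled in,
    or the edge unchanged if it is not affected."""
    affected = (
        edge.get("qualified_object_direction") is not None
        and edge.get("qualified_predicate") is None
    )
    if not affected:
        return edge
    patched = dict(edge)  # shallow copy so the input is left untouched
    patched["qualified_predicate"] = "causes"
    # Assumed property name, following the wording of the comment above.
    patched["object_aspect"] = "activity_or_abundance"
    return patched


edge = {
    "predicate": "biolink:regulates",
    "qualified_object_direction": "increased",
}
patched = patch_edge(edge)
print(patched["qualified_predicate"], patched["object_aspect"])
```

Returning a copy keeps the original edge available for comparison, which may be useful when checking how many edges the patch touched.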

@sundareswarpullela
Collaborator

sundareswarpullela commented Dec 15, 2022

Sure! I can pick this up @amykglen

@acevedol
Collaborator

Thank you for the alert, Amy! I'll make the correction in KG2pre

@amykglen
Member Author

great, thanks @sundareswarpullela! I created an issue in the RTX repo for the patch so you can link your commits to it: RTXteam/RTX#1942

@acevedol
Collaborator

A lot of entries in predicate-remap.yaml were missing qualified_predicate, so I added those. Hopefully this solves the issue. I will verify in the next build.
