Update KG to comply with the emerging Translator KG spec #34

Closed
saramsey opened this issue Mar 26, 2018 · 37 comments
Comments

@saramsey
Member

Here is a proposed minimal integration:
https://docs.google.com/spreadsheets/d/1zXitcR1QjHyh6WocukgshSR7IoAVg7MJQG-HNh96Jec/edit#gid=3366698

Here is a proposed maximal integration:
https://docs.google.com/spreadsheets/d/1zXitcR1QjHyh6WocukgshSR7IoAVg7MJQG-HNh96Jec/edit#gid=421374962

@saramsey saramsey self-assigned this Mar 26, 2018
@saramsey
Member Author

DO THIS WORK IN A BRANCH

@dkoslicki
Member

It would also be helpful to have URLs/URIs/PURLs as node properties. E.g., our node "OMIM:603903" would have a property (called URL/URI/whatever) with the value "http://omim.org/entry/603903".
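A minimal sketch of how such a property could be derived from a CURIE. The `URL_TEMPLATES` table and the `curie_to_url` helper are hypothetical names used for illustration; the OMIM pattern comes from the example above, while the "UniProt" prefix spelling is an assumption about how the KG names that namespace.

```python
# Hypothetical prefix-to-URL templates (illustrative, not the KG's actual table).
URL_TEMPLATES = {
    "OMIM": "http://omim.org/entry/{}",
    "UniProt": "https://www.uniprot.org/uniprot/{}",  # prefix spelling is assumed
}


def curie_to_url(curie):
    """Return a URL for a CURIE such as 'OMIM:603903', or None if it cannot be mapped."""
    prefix, sep, local_id = curie.partition(":")
    template = URL_TEMPLATES.get(prefix)
    if not sep or not local_id or template is None:
        return None
    return template.format(local_id)


print(curie_to_url("OMIM:603903"))
```

Returning `None` for an unknown prefix leaves it to the caller to decide whether a node without a URL is acceptable.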

@dkoslicki
Member

See #19

@saramsey
Member Author

This has been requested by April 16

@saramsey
Member Author

Planning to do this work in branch "newkg".

@saramsey
Member Author

I am working on issue #34 (updating our KG to comply with the spec from Matthew Brush et al.). FYI, Neo4j doesn’t seem to allow a relationship type to have a space in it, and most of the predicates in the new spec have spaces in them. So I am mapping spaces to underscores, which are more Neo4j-friendly.

If I am way off base in “interpreting” the spec in this way, please let me know.
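The space-to-underscore mapping described above can be sketched as follows. The function name `predicate_to_relationship_type` is hypothetical, and collapsing runs of whitespace into a single underscore is an assumption about how odd spacing should be handled.

```python
import re


def predicate_to_relationship_type(predicate):
    """Replace each run of whitespace in a predicate label with one underscore."""
    return re.sub(r"\s+", "_", predicate.strip())


print(predicate_to_relationship_type("participates in"))
```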

@saramsey
Member Author

saramsey commented Apr 1, 2018

I wrote:

Eric & David,

What is the actual specification for the properties of the nodes in the KG? (I have the relationships under control, I think).

I need a list of name-value pairs, like this:

name = CURIE identifier of the bioentity
iri = IRI of the bioentity
description = human-readable description

Thanks,
Steve

Response from @edeutsch:

I don’t know that this has been hammered out. Maybe we can set the standard. In my proposed API example and definition I have this:

{ "@id": "https://www.uniprot.org/uniprot/P00738",
  "@type": "Protein",
  "name": "Haptoglobin",
  "symbol": "HP",
  "accession": "P00738",
  "description": "Haptoglobin captures, and combines with free plasma hemoglobin...",
  "node_attributes": [
    { "@type": "",
      "name": "",
      "value": "",
      "url": ""
    } ] }

I think we should align these if possible.

In the KG merging document (one of them anyway) is this:

node properties:
- id
- name
- description
- identifiers:
  - e.g. DOID, HGNC, Orphanet
- metadata (JSON string)
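To make the property list above concrete, here is a rough sketch of it as a Python dataclass. The class name `NodeProperties`, the field defaults, and the placeholder `id` value are assumptions for illustration; the name, symbol, and accession come from the Haptoglobin example earlier in this comment. Neo4j property values are limited to primitives and arrays of primitives, so storing `metadata` as a JSON string is one way to carry nested data.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class NodeProperties:
    id: str                                          # identifier of the bioentity
    name: str
    description: str = ""                            # treated as optional here
    identifiers: list = field(default_factory=list)  # cross-references (e.g. DOID, HGNC, Orphanet)
    metadata: str = "{}"                             # nested data serialized as a JSON string


node = NodeProperties(
    id="UniProt:P00738",  # placeholder CURIE form, assumed for this sketch
    name="Haptoglobin",
    metadata=json.dumps({"symbol": "HP", "accession": "P00738"}),
)
print(asdict(node))
```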

Maybe a good discussion topic for Monday is to decide how we should set the standard.

What would really help the discussion is some real-life examples for various node types (i.e., robust examples with copious properties, not bare-bones ones).

Eric

saramsey added 17 commits that referenced this issue Apr 1, 2018
saramsey added 2 commits that referenced this issue Apr 2, 2018
@dkoslicki
Member

@saramsey Did you see the question regarding s/participates_in/participates_in/g? participates_in wasn't in the old KG, so I don't know if you meant a different substitution here...

dkoslicki added 2 commits that referenced this issue Apr 10, 2018
@saramsey
Member Author

@dkoslicki the predicate "participates_in" is in the KG as of the completion of issue #13

see this version of BioNetExpander.py, lines 82 and 240: https://github.com/RTXteam/RTX/blob/master/code/reasoningtool/BioNetExpander.py

@saramsey
Member Author

Not sure if the pre-issue-34 but post-issue-13 KG ever got pushed to rtx.ncats.io, or even rtxdev. But it did exist on rtxsteve, as shown in this screen capture from an email I sent to you on 3/26:
[Image: screen shot 2018-04-10 at 12.56.00 pm]

@saramsey
Member Author

@dkoslicki Does that clarify things?

@dkoslicki
Member

@saramsey Yes, that makes sense. My code doesn't yet leverage GO (and it wasn't updated in rtx.ncats.io), so that's why I wasn't seeing it. So I think I'm good now.

@saramsey
Member Author

s/is_parent_of/subset_of/g (with direction of edge reversed)
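Because the direction flips, this rename is more than a text substitution. A minimal sketch, assuming edges are held as `(source, predicate, target)` tuples (a hypothetical representation, with placeholder node IDs):

```python
def convert_parent_edge(source, predicate, target):
    """Turn (parent, is_parent_of, child) into (child, subset_of, parent); pass other edges through."""
    if predicate == "is_parent_of":
        return (target, "subset_of", source)
    return (source, predicate, target)


# Placeholder node IDs, for illustration only.
print(convert_parent_edge("PARENT:1", "is_parent_of", "CHILD:1"))
```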

@dkoslicki
Member

@saramsey Got it! I'm about to do the refactoring, so let me know if anything about the following seems amiss. I'll wait to hit the button until I get confirmation (hard to go back after I do it):

s/\<uniprot_protein\>/protein/g
s/\<pharos_drug\>/chemical_substance/g
s/\<ncbigene_microrna\>/microRNA/g
s/\<reactome_pathway\>/pathway/g
s/\<anatont_anatomy\>/anatomical_entity/g
s/\<geneont_bioprocess\>/biological_process/g
s/\<phenont_phenotype\>/phenotypic_feature/g
s/\<omim_disease\>/disease/g
s/\<disont_disease\>/disease/g
s/\<disease_affects\>/affects/g
s/\<is_member_of\>/participates_in/g
s/\<gene_assoc_with\>/associated_with_condition/g
s/\<phenotype_assoc_with\>/has_phenotype/g
s/\<interacts_with\>/directly_interacts_with/g
s/\<controls_expression_of\>/regulates/g
s/\<is_expressed_in\>/expressed_in/g
s/\<targets\>/directly_interacts_with/g
s/\<controls_state_change_of\>/regulates/g
s/\<participates_in\>/participates_in/g
s/\<is_parent_of\>/subset_of/g
s|http://neo4j:precisionmedicine@rtx.ncats.io:7473/db/data|http://neo4j:precisionmedicine@rtxdev.saramsey.org:7674/db/data|g
s|bolt://rtx.ncats.io:7687|bolt://rtxdev.saramsey.org:7887|g

In particular, that's the right KG and bolt protocol, correct?
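For reference, the whole-word substitutions above can also be applied in a single pass, which avoids one rename feeding into another. This is only a sketch: `RENAMES` holds a subset of the list, and `apply_renames` is a hypothetical helper. Python's `\b` treats `_` as a word character, so it behaves like sed's `\<` and `\>` for these identifiers. The edge-direction reversal that goes with `is_parent_of` is not something a text substitution can do and would need separate handling.

```python
import re

# Subset of the renames listed above (old name -> new name).
RENAMES = {
    "uniprot_protein": "protein",
    "pharos_drug": "chemical_substance",
    "is_member_of": "participates_in",
    "interacts_with": "directly_interacts_with",
    "targets": "directly_interacts_with",
    "is_parent_of": "subset_of",
}

# One alternation with word boundaries, so only whole identifiers match.
PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def apply_renames(text):
    """Replace every whole-word occurrence of an old name with its new name."""
    return PATTERN.sub(lambda match: RENAMES[match.group(1)], text)


sample = "MATCH (a:uniprot_protein)-[:interacts_with]->(b:uniprot_protein)"
print(apply_renames(sample))
```

Since each match is replaced once and the output is not rescanned, an already-renamed `directly_interacts_with` is left untouched.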

@dkoslicki
Member

@saramsey @edeutsch From my perspective, the newkg branch can be merged into master (as my QuestionAnswering code is running without error). Shall we merge and close this issue soon?

@saramsey
Member Author

saramsey commented Apr 14, 2018 via email

saramsey added a commit that referenced this issue Apr 24, 2018
@saramsey saramsey reopened this Apr 25, 2018
@saramsey
Member Author

Team Orange has changed relationship properties to snake_case; updating the code now.

saramsey added a commit that referenced this issue Apr 25, 2018
@saramsey
Member Author

Since the description field is optional, marking this closed.

see KG here:
http://rtxdev.saramsey.org:7674

saramsey added a commit that referenced this issue May 12, 2018
saramsey added 3 commits that referenced this issue May 23, 2018
@saramsey
Member Author

Latest changes to KG schema, from Matt Brush at the May hackathon in DC:

associated with condition => gene associated with condition
causes or contributes to => contributes to
directly interacts with => physically interacts with
directly interacts with => regulates
enables => capable of
is capable of => capable of
enables (anatomy to cellular component) => has part
participates in => involved in (GO)
participates in => participates in (reactome)
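A rough sketch of the list above as a lookup table. Several mappings depend on context, so the key is `(old predicate, context)`; reading the parenthetical notes ("GO", "reactome", "anatomy to cellular component") as that context is an interpretation, and `SCHEMA_UPDATES`, `AMBIGUOUS`, and `update_predicate` are hypothetical names. The list gives two targets for "directly interacts with" without saying how to choose, so the sketch refuses to guess.

```python
# (old predicate, context) -> new predicate; context None means no qualifier was given.
SCHEMA_UPDATES = {
    ("associated with condition", None): "gene associated with condition",
    ("causes or contributes to", None): "contributes to",
    ("enables", None): "capable of",
    ("is capable of", None): "capable of",
    ("enables", "anatomy to cellular component"): "has part",
    ("participates in", "GO"): "involved in",
    ("participates in", "reactome"): "participates in",
}

# Two targets are listed for this predicate, with no stated rule for picking one.
AMBIGUOUS = {"directly interacts with": ("physically interacts with", "regulates")}


def update_predicate(old, context=None):
    """Return the updated predicate, or the old one unchanged if the list does not cover it."""
    if old in AMBIGUOUS:
        raise ValueError(f"{old!r} maps to one of {AMBIGUOUS[old]}; choose manually")
    if (old, context) in SCHEMA_UPDATES:
        return SCHEMA_UPDATES[(old, context)]
    return SCHEMA_UPDATES.get((old, None), old)


print(update_predicate("participates in", "GO"))
```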

saramsey added a commit that referenced this issue May 27, 2018