KG V2.0.0 - Finalizing Knowledge Representation #17
Comments
Also: protein-complex, RO_0002436 |
Good catch, I will add to the original issue, thanks! |
@ignaciot and @bill-baumgartner - the updated KR is shown below (click on image to enlarge). The main data types are ontologies (yellow), open data sources (purple), and experimental data (blue), and the representation has been verified by Adrianne as well. You will notice that I have added the Cell Ontology and BRENDA in addition to experimental data in order to satisfy a component of my comps: creating a KG that actually includes the central dogma. GTEx is a great source to start with since it includes many disease types, has the results of both microarray and RNA-seq (for several types of samples), and includes connections to phenotypes. Anywho, happy to talk about this more this afternoon! |
This is fantastic! Yes, let’s chat this afternoon. |
UPDATES: Incorporating feedback from @bill-baumgartner and @LEHunter resulted in the updated KR shown below. Some important questions that I would like help answering:
|
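Yup, that is the reason I chose molecularly interacts with (binding). Looking at the definition of that other relation (An interaction that holds between two genetic entities (genes, alleles) through some genetic interaction (e.g. epistasis)), I don't think that is appropriate for protein-protein interactions. It's probably fine for gene-gene interactions. Maybe we could generate two graphs, one with the inverse relations where applicable and another without, to assess how much it could mess with random walk-based algorithms? I do suspect it may affect the node2vec results. The rest looks fine to me (BTW, I like the new diagram!). |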
OK, great. I also agree about the protein-protein interactions. Maybe we do this:
Yep, that's what Bill and I were thinking too :D.
Great, thanks for your feedback! |
Thanks, great! I think |
And for chemical-gene is probably better to leave it as the more generic |
I suggest getting Mike to weigh in on these. I would like to be consistent with what he is doing with CRAFT for relations.
On Nov 26, 2019, at 3:26 PM, Tiffany J. Callahan wrote:
Yup, that is the reason I chose molecularly interacts with (binding). Looking at the definition of that other relation (An interaction that holds between two genetic entities (genes, alleles) through some genetic interaction (e.g. epistasis)) I don't think that is appropriate for protein-protein interactions. It's probably fine for gene-gene interactions.
OK, great. I also agree about the protein-protein interactions. Maybe we do this:
* gene-gene genetically interacts with<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FRO_0002435>
* protein-protein interacts with<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FRO_0002434>
* chemical-gene interacts with<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FRO_0002434>
* protein-cofactor/catalyst molecularly interacts with<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FRO_0002436>
* chemical-complex molecularly interacts with<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FRO_0002436>
* protein-complex molecularly interacts with<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FRO_0002436>
* complex-complex molecularly interacts with<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FRO_0002436>
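The relation assignments in the list above could be kept in one lookup table so that every parser uses the same RO term per edge type. This is only a minimal sketch: the RO identifiers and labels come from the list, while the dictionary keys, `OBO_PURL`, and `relation_iri` are hypothetical names chosen for illustration and are not taken from the project code.

```python
# Hypothetical lookup table: keys and helper names are illustrative only.
# RO identifiers and labels are copied from the proposal in this thread.
OBO_PURL = "http://purl.obolibrary.org/obo/"

EDGE_RELATIONS = {
    "gene-gene": ("RO_0002435", "genetically interacts with"),
    "protein-protein": ("RO_0002434", "interacts with"),
    "chemical-gene": ("RO_0002434", "interacts with"),
    "protein-cofactor": ("RO_0002436", "molecularly interacts with"),
    "protein-catalyst": ("RO_0002436", "molecularly interacts with"),
    "chemical-complex": ("RO_0002436", "molecularly interacts with"),
    "protein-complex": ("RO_0002436", "molecularly interacts with"),
    "complex-complex": ("RO_0002436", "molecularly interacts with"),
}


def relation_iri(edge_type: str) -> str:
    """Return the full OBO PURL of the relation proposed for an edge type."""
    ro_id, _label = EDGE_RELATIONS[edge_type]
    return OBO_PURL + ro_id


if __name__ == "__main__":
    for edge_type, (ro_id, label) in EDGE_RELATIONS.items():
        print(f"{edge_type}: {label} ({relation_iri(edge_type)})")
```

If Mike recommends different relations for consistency with CRAFT, only the table would need to change.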
Maybe we could generate two graphs, one with the inverse relations where applicable and another without, to assess how much it could mess with random walk-based algorithms? I do suspect it may affect the node2vec results.
Yep, that's what Bill and I were thinking too :D.
The rest looks fine to me (BTW, I like the new diagram!).
Great, thanks for your feedback!
|
On Nov 26, 2019, at 12:38 PM, Tiffany J. Callahan wrote:
1. @LEHunter - @bill-baumgartner and I were talking about the possibility of adding the inverse properties. What are your thoughts on this? Do you think it would have negative ramifications for things like random walk?
Inverse relations are a good idea. You can try random walk with and without them to see if it’s having any impact. Maybe also approaches like prohibiting the walk from traversing an edge twice.
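A rough sketch of that comparison, under stated assumptions: the graph is a plain list of (subject, relation, object) triples, inverse edges are added from a relation-to-inverse map, and the walk can optionally refuse to cross the same node pair twice. The function names, the toy triples, and the relation labels are all hypothetical; this is not node2vec, only a uniform random walk for illustration.

```python
import random
from collections import defaultdict


def build_adjacency(triples, inverse_of=None):
    """Build an adjacency list from (subject, relation, object) triples.

    If `inverse_of` maps a relation to its inverse, the reverse edge is added too.
    """
    adj = defaultdict(list)
    for s, r, o in triples:
        adj[s].append((r, o))
        if inverse_of and r in inverse_of:
            adj[o].append((inverse_of[r], s))
    return adj


def random_walk(adj, start, length, rng, no_repeat_edges=True):
    """Uniform random walk of at most `length` steps.

    With `no_repeat_edges`, an edge and its inverse count as the same edge
    (keyed by the unordered node pair), so the walk cannot bounce straight back.
    """
    walk, used, node = [start], set(), start
    for _ in range(length):
        options = []
        for _rel, nxt in adj.get(node, []):
            key = frozenset((node, nxt))
            if no_repeat_edges and key in used:
                continue
            options.append((nxt, key))
        if not options:
            break  # dead end: no unused outgoing edge
        node, key = rng.choice(options)
        used.add(key)
        walk.append(node)
    return walk


if __name__ == "__main__":
    # Toy data with made-up identifiers and relation labels.
    triples = [("geneA", "has_gene_product", "protA"),
               ("protA", "interacts_with", "protB")]
    inverse_of = {"has_gene_product": "gene_product_of"}
    rng = random.Random(0)
    for name, adj in [("without inverses", build_adjacency(triples)),
                      ("with inverses", build_adjacency(triples, inverse_of))]:
        print(name, random_walk(adj, "protA", 5, rng))
```

Running the same set of start nodes over both adjacency lists, with and without `no_repeat_edges`, gives a first impression of how much the inverse edges change the walks before testing with node2vec itself.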
|
OK, I will reach out to Mike. Thanks @LEHunter! |
@LEHunter - I'd like to substitute the following
|
Seems reasonable to me, but please do check with Mike. It’s important to be consistent with CRAFT, and Mike has thought a lot about this stuff.
L
On Nov 26, 2019, at 5:53 PM, Tiffany J. Callahan wrote:
@LEHunter - I'd like to substitute the following BFO terms for the proposed RO terms, do you agree?:
* BFO:realizes<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000055> to RO:realized in response to<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FRO_0009501>
* EXAMPLE: Biological Process realized in a pathway
* BFO:has component<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000055> to RO:has function<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FRO_0000085>
* EXAMPLE: Pathway has function molecular function
* BFO:has part<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000051> to RO:has component<https://www.ebi.ac.uk/ols/ontologies/ro/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FRO_0002180>
* EXAMPLE: Pathway has component cellular location
|
Sounds good, I will follow-up with him. Thanks! |
UPDATE: Mike has been emailed to ask about the BFO-RO and interaction triples. In the meantime, I am going to move forward with the representation shown below. Will also create a Bada-version 😉, once I hear back from him. NOTE. For space reasons, I am not showing all edges with labels, but am suggesting there are inverse edges via the inclusion of a dotted line. |
Turns out the relation I was thinking of is in the Sequence Ontology and not the RO: https://www.ebi.ac.uk/ols/ontologies/so/properties?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2Fso%23variant_of |
Thanks for letting me know! In that case, unless you are opposed, I will keep |
@ignaciot - would you mind telling me how you found these?
|
These came from Uniprot, and they are sourced from the IntAct database monthly (so it would be a good idea to keep those from STRING as well).
All of those came from Reactome, where they specified the participants (from Uniprot or ChEBI) to pathways, complexes and reactions. |
Thanks! Update on protein-protein interactions. It looks like we will only need STRING as they already cover the data in IntAct (see screenshot below) 🎉
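The coverage claim above was checked from a screenshot; a scripted version of the same check might look like the sketch below. It assumes both sources have already been parsed into protein pairs that share one identifier scheme, which is an assumption of this sketch and not something stated in the thread. The function names and the toy identifiers are hypothetical.

```python
def undirected_pairs(pairs):
    """Normalize interaction pairs so that (A, B) and (B, A) compare equal."""
    return {frozenset(pair) for pair in pairs}


def coverage(reference, candidate):
    """Fraction of interactions in `reference` that also occur in `candidate`."""
    ref = undirected_pairs(reference)
    cand = undirected_pairs(candidate)
    if not ref:
        return 1.0  # nothing to cover
    return len(ref & cand) / len(ref)


if __name__ == "__main__":
    # Toy pairs with made-up identifiers, standing in for the two sources.
    intact_like = [("P1", "P2"), ("P2", "P3")]
    string_like = [("P2", "P1"), ("P3", "P2"), ("P4", "P5")]
    print(f"covered: {coverage(intact_like, string_like):.0%}")
```

A result below 100% would list the interactions that dropping the second source would lose.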
Perfect! I will update the file parsers. |
Awesome! One less thing to worry about, then. |
Yup! That makes sense. |
@ignaciot - can you confirm that you agree with how I am building the
FILE DATA
File Columns:
Example File Output:
BUILDING EDGES
Protein-Complex:
Example from file (from table above):
Complex-Complex:
Example from file (from table above):
|
These all look correct! |
Good news: the draft of the edge data sources and the documentation of those sources are complete. The edge counts will be updated, and the files listed on the release page will be added, as the KG is built. @ignaciot - would you mind taking a gander at the following pages and letting me know if anything seems incorrect?
What's Left before KG Build:
Once I confirm a few last details with @bill-baumgartner tomorrow (who was super helpful today, thanks Bill!), I will begin the build! |
This looks AWESOME!! I'm glad those Uniprot/Reactome triples turned out to be useful, can't wait to play with the built graph. I went through all of the above and didn't see any errors. Thanks for adding those inverse relations, too! Happy to help check the items above that need testing. Maybe we could think of a set of unit tests to write for each build, too (can wait until the next subrelease). |
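One candidate per-build test, sketched under assumptions: the built graph can be read as (subject, relation, object) triples, and a relation-to-inverse map is available. The check reports every edge whose inverse is missing. `missing_inverses`, `check_build`, and the relation labels are hypothetical names for illustration only.

```python
def missing_inverses(triples, inverse_of):
    """Return the inverse triples that should exist in the graph but do not."""
    present = set(triples)
    missing = []
    for s, r, o in triples:
        if r in inverse_of:
            expected = (o, inverse_of[r], s)
            if expected not in present:
                missing.append(expected)
    return missing


def check_build(triples, inverse_of):
    """Raise AssertionError if any edge with a known inverse lacks that inverse."""
    missing = missing_inverses(triples, inverse_of)
    assert not missing, f"{len(missing)} inverse edge(s) missing, e.g. {missing[0]}"


if __name__ == "__main__":
    # Toy graph with made-up node names.
    inverse_of = {"has_part": "part_of", "part_of": "has_part"}
    graph = [("complexA", "has_part", "protA"),
             ("protA", "part_of", "complexA")]
    check_build(graph, inverse_of)
    print("inverse-edge check passed")
```

Checks like this can run under any test runner once continuous integration is set up.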
Thanks so much for your help @ignaciot! I think writing tests is a great idea. I also think it would be great if we added continuous integration. Perhaps we can chat more about this on Monday? |
@bill-baumgartner and @ignaciot - the human version of the PRO is finally done! 🎉 Things to keep in mind:
If you'd like to use it, you can download the closed and unclosed versions here:
@LEHunter and @bill-baumgartner - should we offer this version and/or the script used to create it to the ProConsortium?
📢 Now that the core ontologies are good to go, I will begin building the KG. Updates to follow! |
Awesome!! This is really cool. I take it the |
Thanks! 😄 After reading a bunch of articles, I chose to leave both types in for now as this is the way to create the most "correct"/authentic version of the human PRO. There is a way to keep 1 single component with removing each type. The |
Oh, absolutely, I don't think this is a reason to halt building the next KG version! I'm excited to try this out once it's built! |
Awesome! I’ll let you know when it’s ready for you! |
Extending Knowledge Representation for current KG
Current Release: v2.0.0
Description
Adding the following entities/data sources to the current KG build:
TODO 📋 💻 📝
@callahantiff Due Dates: