Need functions for determining term-to-term relatedness #206

Open
selewis opened this issue Aug 3, 2018 · 10 comments
@selewis
Contributor

selewis commented Aug 3, 2018

Annotations vary considerably in precision, but for conciseness and to determine coverage, we need to be able to answer basic graph traversal questions. For example, given two terms, is one of them a subclass of the other? Or, what is the closest common parent term of two terms? Right now this functionality is missing, and we're either relying on work-arounds or completely held up.

@selewis selewis changed the title Need a function for determining term-to-term relatedness Need functions for determining term-to-term relatedness Aug 3, 2018
@lpalbou
Contributor

lpalbou commented Aug 3, 2018 via email

@selewis
Contributor Author

selewis commented Aug 3, 2018

Quite nice, but it doesn't quite fit the bill yet. You would still have to traverse the graph in this JSON structure to answer the simple t/f question of 'is A a subclass of B' or conversely 'is B a subclass of A'. It would also be useful to have 'what is the closest parental term shared by A and B'. The point is to bury all of the repetitive traversal stuff down inside the server code.

Be great to have this in BioLink
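
To make the two requested questions concrete, here is a minimal sketch of the traversal a server-side helper would have to do. It is not ontobio or BioLink code: the toy `PARENTS` map, the term names, and the function names are all made up for illustration, and it treats every edge as a generic "parent" relation.

```python
from collections import deque

# Toy child -> parents map. Term names are invented for illustration only.
PARENTS = {
    "neuron_differentiation": ["cell_differentiation", "neurogenesis"],
    "glial_differentiation": ["cell_differentiation", "gliogenesis"],
    "neurogenesis": ["nervous_system_development"],
    "gliogenesis": ["nervous_system_development"],
    "cell_differentiation": ["developmental_process"],
    "nervous_system_development": ["developmental_process"],
    "developmental_process": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following parent edges."""
    seen = set()
    queue = deque(parents.get(term, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents.get(current, []))
    return seen


def is_subclass(a, b, parents=PARENTS):
    """True if `b` is a (strict) ancestor of `a`; the question is directional."""
    return b in ancestors(a, parents)


def closest_common_parents(a, b, parents=PARENTS):
    """Shared ancestors of `a` and `b` that have no shared descendant below them."""
    shared = ancestors(a, parents) & ancestors(b, parents)
    # A shared ancestor is redundant if it sits above another shared ancestor.
    redundant = set()
    for term in shared:
        redundant |= ancestors(term, parents)
    return shared - redundant
```

Because the ontology is a DAG rather than a tree, `closest_common_parents` can return more than one term: for the two differentiation terms in the toy map it yields both `cell_differentiation` and `nervous_system_development`.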

@lpalbou
Contributor

lpalbou commented Aug 3, 2018

Correct, this query is general-purpose, but I should be able to create the two specific queries you mentioned by next week.

@selewis
Contributor Author

selewis commented Aug 3, 2018

Is this just is_a relations?

Also need to know if a term is flagged as 'do_not_manually_annotate' or 'do_not_annotate'

@selewis
Contributor Author

selewis commented Aug 3, 2018 via email

@deepakunni3
Member

@selewis Yes, @lpalbou and I had a quick chat.

We can add a couple of routes to biolink-api that give a more direct answer, as opposed to the JSON graph normally returned.

@lpalbou
Contributor

lpalbou commented Aug 20, 2018

@selewis sorry, I am a bit late on this but I have deployed a route this morning to answer your first question:

http://api.geneontology.cloud/association/subclass/{goid1}/{goid2}
=> returns true if and only if goid1 is_a or part_of goid2 (the question is directional)

I have also deployed a sharedclass route:
http://api.geneontology.cloud/association/sharedclass/{goid1}/{goid2}
=> returns the terms (derived from is_a and part_of) that the two terms share

For the closest common parent of two terms, do you want parents from both is_a and part_of relations? Note that this query could return several parents (example)

I am waiting for a PR on ontobio (biolink/ontobio#217), but if this looks good to you, I'll do a second PR to deploy these routes on BioLink. Following BioLink syntax, they will be mapped respectively to the following (@cmungall, your opinion?):

  • /association/between/{goid1}/{goid2}/subclass
  • /association/between/{goid1}/{goid2}/sharedclass

Notes:

  • the /association/between/ route description will need some modification as it was only described for gene and disease associations
  • also, instead of adding /subclass or /sharedclass as path parameters, we could pass them as string parameters to keep the path clean; let me know your preference
  • behind the scenes, I am calling golr with queries such as:
    http://golr-aux.geneontology.io/solr/select?fq=document_category:%22ontology_class%22&q=*:*&fq=id:%22GO:0030182%22&fl=isa_partof_closure,isa_partof_closure_label&wt=json
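
As a rough sketch of how answers like these can be derived from that closure query: the snippet below rebuilds the query URL with the standard library and computes both results from the `isa_partof_closure` field. The function names are hypothetical and this is not the deployed route code. The response shape assumed here is the standard Solr JSON layout (`response.docs`), and the sample payloads are hand-written stand-ins, not real golr output.

```python
from urllib.parse import urlencode

GOLR_SELECT = "http://golr-aux.geneontology.io/solr/select"


def closure_query_url(goid):
    """Build the closure query quoted above for one GO id."""
    params = [
        ("fq", 'document_category:"ontology_class"'),
        ("q", "*:*"),
        ("fq", f'id:"{goid}"'),
        ("fl", "isa_partof_closure,isa_partof_closure_label"),
        ("wt", "json"),
    ]
    return GOLR_SELECT + "?" + urlencode(params)


def closure_from_response(payload):
    """Extract the closure ids, assuming a Solr-style {'response': {'docs': [...]}}."""
    docs = payload.get("response", {}).get("docs", [])
    return set(docs[0].get("isa_partof_closure", [])) if docs else set()


def is_subclass(closure_of_goid1, goid2):
    """goid1 is_a/part_of goid2 exactly when goid2 appears in goid1's closure."""
    return goid2 in closure_of_goid1


def shared_classes(closure1, closure2):
    """Terms that both closures have in common."""
    return closure1 & closure2


# Hand-written sample payloads (illustrative only, not fetched from golr).
sample_a = {"response": {"docs": [
    {"isa_partof_closure": ["GO:0030182", "GO:0030154", "GO:0008150"]}]}}
sample_b = {"response": {"docs": [
    {"isa_partof_closure": ["GO:0030154", "GO:0008150"]}]}}

closure_a = closure_from_response(sample_a)
closure_b = closure_from_response(sample_b)
```

With these samples, `is_subclass(closure_a, "GO:0030154")` is true while `is_subclass(closure_b, "GO:0030182")` is false, which reflects the directional nature of the question. `urlencode` percent-encodes a few more characters than the hand-written URL does, but the decoded parameters are the same.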

@selewis
Contributor Author

selewis commented Aug 20, 2018

Be nice if the first one would provide a way to indicate which relationships to follow. Like Deepak (I think) did for the slimmer code.

For the second, yes return all of them. If possible it would be useful to know the route taken to get there for each of the two children.

@lpalbou
Contributor

lpalbou commented Aug 20, 2018

@selewis I also saw your question about 'do_not_manually_annotate' or 'do_not_annotate' tags. There is no specific route for this question only, but you can see if those tags are present in the subsets section of this general go-term query: https://api.geneontology.cloud/go/GO_0036288

(will be available on BioLink once the PRs are merged)
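
A client-side check for those tags could look roughly like the sketch below. The exact JSON shape of the go-term response is an assumption here: the code assumes the parsed record exposes a `subsets` list of strings, and the helper name and the sample record are hypothetical.

```python
# The two flags asked about in this thread.
FLAGS = ("do_not_manually_annotate", "do_not_annotate")


def annotation_flags(term_record):
    """Report which flags appear in the (assumed) `subsets` list of a term record.

    Matching on the suffix tolerates prefixed names or full URIs, and keeps
    the two flags distinct from each other.
    """
    subsets = term_record.get("subsets", [])
    return {flag: any(s.endswith(flag) for s in subsets) for flag in FLAGS}


# Hand-written sample record, not a real API response.
sample_term = {"subsets": ["goslim_generic", "gocheck_do_not_manually_annotate"]}
flags = annotation_flags(sample_term)
```

For the sample record, only the `do_not_manually_annotate` flag comes back set; a record with no `subsets` key reports both flags as absent.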

@lpalbou
Contributor

lpalbou commented Aug 21, 2018

@selewis I have updated the API to be more consistent with BioLink syntax and to determine whether two terms are related through any of the is_a, part_of or regulates relationships:
