Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added GO aspect convenience methods for #203 #281

Merged
merged 2 commits into from
Jan 24, 2019
Merged

Conversation

dustine32
Copy link
Collaborator

Let me know if these are in the wrong place (ontol.py) or they're just, plain wrong.

Copy link
Contributor

@cmungall cmungall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put these in a separate static/non-OO convenience module called something like go_utils

I'd also rename get_ancestors to be something more explicit like get_isa_part_of_ancestors

The logic for any is_X method should be to follow is_a only. Remember there are part-ofs between MF and BP

@deepakunni3
Copy link
Member

@cmungall Would get_isa_partof_closure be a better name for that method? It seems to do exactly that.
We could also add get_isa_closure along with get_isa_partof_closure

ontobio/ontol.py Outdated
"""
### BFO:0000050 = part of
### I assume "subClassOf" = is a?
all_ancestors = self.ancestors(go_term)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dustine32

If you call self.ancestors(go_term, reflexive=True) then you won't have to do all_ancestors.append(go_term) as the term itself is included in the list of ancestors returned by self.ancestors().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @deepakunni3 ! I'll try that option out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it worked!

ontobio/ontol.py Outdated
"""
Returns the ancestors from the is_a, part_of relation traversed GO subontology of go_term's ancestors.

subontology() primarily used here for speed when specifying relations to traverse.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a note for discussion - why does calling subontology() improve speed?
It looks like subontology() is called every time this method is called. Not sure if there is some caching happening somewhere.

Ideally, the following would achieve the same result:

# for isa_closure
self.ancestors(go_term, relations=['subClassOf'], reflexive=True)

# for isa_partof_closure
self.ancestors(go_term, relations=['subClassOf', 'BFO:0000050'], reflexive=True)

but while testing, I see that the proposed method is more quick at getting ancestors

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I figured this out a long time ago so it might've changed (or cache is messing me up), but specifying relations without using subontology first will calculate the filtered graph using the whole GO ontology rather than just the smaller subset of "all relations" ancestors of a node.

Like, specifiying relations requires looking at edge properties (in ontol.get_filtered_graph()) whereas not just cares that an edge is there. Does this make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @dustine32
Yes, I see the difference and where the bottleneck is 👍

@dustine32
Copy link
Collaborator Author

Cool, thanks for the feedback! I'll reorganize this into a file under utils/ and make those changes.

I do remember running into the MF-part_of-BP issue for this a while ago, causing MF terms to be erroneously identified as BPs. But I got around it by just asking if it was an MF first, though is_biological_process would still be left saying an MF's a BP. I'll try limiting this to just use the 'is a' relation.

@dustine32
Copy link
Collaborator Author

I think these changes are ready to go. I now have this in util/go_utils.py and I'm only traversing the graph through is_a relations. @cmungall @deepakunni3 Let me know if I missed anything. Thanks!

Copy link
Contributor

@cmungall cmungall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

Note if you call go_aspect a lot, it could be made more efficient as it does 3 ancestor queries, but we can worry about optimization later

@deepakunni3
Copy link
Member

deepakunni3 commented Jan 24, 2019

@dustine32 Looks good 👍
Thanks for adding these. I do see get_isa_closure() and get_isa_partof_closure() methods being used a lot in the coming days.

Feel free to merge this PR

@dustine32
Copy link
Collaborator Author

Awesome thanks!

@dustine32 dustine32 merged commit a268706 into master Jan 24, 2019
@dustine32 dustine32 deleted the dustine32-issue-203 branch January 24, 2019 01:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants