complexes (without active sites) can enable functions again #34

Closed
goodb opened this issue Dec 19, 2018 · 19 comments
@goodb
Contributor

goodb commented Dec 19, 2018

When no active site is annotated, and the complex (or protein set) catalyzes a MF, then add the complex as the enabler of that function. The members of the complex (part_ofs) should get contributes_to relations to the MF, though these may be inferable via geneontology/minerva#110
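
As a rough illustration of the rule proposed above (not the converter's actual code), here is a minimal Python sketch. The function name, the identifiers, and the (subject, predicate, object) triple layout are all hypothetical; only the relation names `enables`, `contributes_to` and `part_of` come from this issue. The case where an active site is annotated is outside the scope of this issue, so the sketch emits only the `part_of` statements for it.

```python
from typing import List, Optional, Tuple

# Hypothetical (subject, predicate, object) representation of model statements.
Triple = Tuple[str, str, str]


def enabler_triples(
    complex_id: str,
    members: List[str],
    function_id: str,
    active_site: Optional[str] = None,
) -> List[Triple]:
    """Sketch of the proposed rule for a complex that catalyzes a function."""
    # Every member is recorded as part_of the complex.
    triples: List[Triple] = [(m, "part_of", complex_id) for m in members]
    if active_site is None:
        # No annotated active site: the complex itself enables the function...
        triples.append((complex_id, "enables", function_id))
        # ...and each member gets a contributes_to relation to that function.
        triples.extend((m, "contributes_to", function_id) for m in members)
    # The annotated-active-site case is not covered by this issue, so nothing
    # beyond the part_of statements is emitted for it here.
    return triples
```

If geneontology/minerva#110 makes the `contributes_to` relations inferable, the second step could be dropped and left to the reasoner.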

@goodb goodb self-assigned this Dec 19, 2018
@goodb
Contributor Author

goodb commented Jan 7, 2019

Noting that this is going to increase the need to address this issue in Noctua geneontology/noctua#581

@goodb
Contributor Author

goodb commented Jan 19, 2019

Assuming we keep complexes in models and allow them to enable reactions as the issue here says, what should we do with complexes (or protein sets) that are listed specifically as inputs or outputs? My take would be to take them into the models the same way as they are for the enablers. The only harm they do is cloud the display and that is a UI problem, not a modeling problem. Thoughts @ukemi ?

@goodb
Contributor Author

goodb commented Jan 21, 2019

I am holding a push of changes resulting from the Geneva meeting until this question is answered: can complexes be inputs and/or outputs of activities in go-cams, and if not, what do we do with Reactome? Once again, my take is that a clean, automatic transfer of the knowledge in Reactome keeps this information somehow, and has_output and has_input seem the natural fit. The fact that it breaks the display is a separate issue. We need input from the biology project owners to resolve this. @thomaspd

@ukemi

ukemi commented Jan 22, 2019

In some cases, we have named complexes in GO that reflect complexes in Reactome. Until a decision is made about how we are going to handle complexes in GO, it is difficult to determine what to do with the Reactome complexes. We will discuss tomorrow, but it would be nice if we could leverage the logical definitions that are in the ontology to assign Reactome individuals to GO classes.

@goodb
Contributor Author

goodb commented Jan 22, 2019

I'm experimenting with a pattern that uses an OWL Intersection class to represent the complex in the go-cam based on its members. This has the advantage of allowing the complexes to fold into the Reaction/Function nodes in the current Noctua view, keeps access to the relevant protein members, and, if logical definitions of complexes based on parts ever make it into the GO, should support automatic classification. I'll send out examples to see what you folks think.

@ukemi

ukemi commented Jan 22, 2019

Complexes should come up in tomorrow's discussion. IMO the complexes capable of some MF are more valuable. Any idea why I can't get to the Reactome glycolysis model? Was going over it with Peter this morning and it wasn't available. Fortunately I made a spreadsheet.

@goodb
Contributor Author

goodb commented Jan 22, 2019 via email

@goodb
Contributor Author

goodb commented Jan 22, 2019

[screenshot: 2019-01-22 at 3:28:42 pm]

Here is an example of a complex with an annotated active site enabling a function. The 'intersections[]' are OWL class expressions that describe complexes based on their parts. A distinct individual node for a part of the complex, linked via has_part, only appears when that part is an active unit enabling a function.

@ukemi

ukemi commented Feb 6, 2019

The above view does clean up the graph quite a bit. But I'm still confused about the exact data model. It seems to me that the protein-containing complex and intersection[12] are equivalent, correct?

Another point in this model, why is intersection[12] the output of the kinase reaction and the complex that catalyzes it?

@ukemi

ukemi commented Feb 6, 2019

Looking at the pathway in Reactome, it looks like the TLR4 complex activates TAK1 by phosphorylating it. Then Phospho-TAK1 activates MAP2K by phosphorylation and activates NFKB.

@goodb
Contributor Author

goodb commented Feb 6, 2019

(Looking at this model http://noctua-dev.berkeleybop.org/editor/graph/gomodel:284362402 )

> It seems to me that the protein-containing complex and intersection[12] are equivalent, correct?

Sort of. The OWL Individual representing that node in the model has 2 assigned RDF:types, one 'protein-containing complex' and one OWL:Intersection. If you expand the model out (View:Evidence Folded), you can inspect the node (green button) and see the elements of the intersection. e.g.
[screenshot: 2019-02-06 at 1:33:46 pm]

Actually, looking at it now, I think it might be better to merge these into one expression instead of 2, e.g. in OWL it would look like:
'protein-containing complex' and ('has part' some O00206) and ('has part' some O43318) and ('has part' some P08571) and ('has part' some Q15750) and ('has part' some Q86XR7) and ('has part' some Q8IUC6) and ('has part' some Q8N5C8) and ('has part' some Q9NYJ8) and ('has part' some Q9Y4K3) and ('has part' some Q9Y6Y9) and ('has part' some CHEBI_16412)

Not sure how that will render in Noctua, but can test.
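
For illustration only, a merged expression of this shape can be assembled from a list of member identifiers. The helper below is a hypothetical sketch, not part of the converter or of Noctua; it only builds the Manchester-syntax string shown above and does no reasoning.

```python
from typing import Iterable


def complex_expression(
    members: Iterable[str],
    base: str = "'protein-containing complex'",
) -> str:
    """Build one intersection expression: base class AND a has-part
    restriction per member, in the order the members are given."""
    parts = [base] + [f"('has part' some {m})" for m in members]
    return " and ".join(parts)


# First two members of the example complex above.
print(complex_expression(["O00206", "O43318"]))
# prints: 'protein-containing complex' and ('has part' some O00206) and ('has part' some O43318)
```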

> Another point in this model, why is intersection[12] the output of the kinase reaction and the complex that catalyzes it?

I think viewing in expanded mode might be clarifying again here, because the folding hides the names of the complexes and stops you from inspecting their parts. One intersection[12] might not be the same thing as another intersection[12]. In the reaction we've been looking at, the input and enabler is 'activated TLR4:TICAM1:K63pUb-TRAF6:free K63pUb:TAK1complex' and the output (also an intersection[12]) is 'pT-MAP3K7:TAB1:TAB2,3:K63pUb-TRAF6:TICAM1:TRAM:TLR4:LY96:LPS:CD14:'

Here is an expanded view:
[screenshot: 2019-02-06 at 1:55:42 pm]

@goodb
Contributor Author

goodb commented Feb 14, 2019

Closing this until someone has a better idea for how to represent this ;)

@ukemi

ukemi commented Mar 1, 2019

After discussion at NYU, we decided that although this strategy works, this is the type of biology that a GO curator will want to model in a de novo model, and they will not create an intersection set. Therefore it makes most sense for us to create has_part relationships between the complex and the members of the complex, and then to make both the complex and the active subunit of the complex enable the molecular function.

See: http://noctua.geneontology.org/editor/graph/gomodel:5c4605cc00001601

Note that this complex is interesting because it also contains two UNION sets, one that contains TAB1,2,3 and the other contains UBA52, UBB, UBC and RPS27A. These would replace the single proteins that we put in the model.

@ukemi

ukemi commented Mar 1, 2019

In the model in #34 (comment), why are there two output complexes? We only see one, unless you are also representing the downstream dissociation.

@goodb
Contributor Author

goodb commented Mar 1, 2019

@ukemi are you sure you want the complex to enable the function? In Geneva the idea was that the complex contributes_to the function when a specific active site is known to enable it.

This formulation points out an assumption for go-cam overall that we should make explicit somewhere. When we create a structure like a complex instance with has_part relations or a function with inputs and outputs, it sounds like we are expecting these to be interpreted logically as intersections. The instance nodes in these models actually represent classes in the philosophical sense and these classes are defined as having all of the attributes that are attached to the instance. I still prefer the intersection formulation as that specifically represents what I think we are doing. However, if we started down that road, the pattern should probably be applied to all elements of the models, not just complexes.

When you say "They will not create an intersection set", it's clear they won't now, as it's impossible, and it's also pretty clear that most people won't want to work that close to the OWL metal. However, that doesn't mean that the logic isn't correct or that we could not make it happen under the hood if we wanted to.

@goodb goodb reopened this Mar 1, 2019
@goodb
Contributor Author

goodb commented Mar 1, 2019

@ukemi there is only one output complex in #34 (comment). The UI is showing 2 types (1 protein complex, 1 intersection) for the same thing. The newer release has these grouped into one construct.

@goodb
Contributor Author

goodb commented Mar 1, 2019

@ukemi and @kltm if we go back to the complex being represented via has_part relations, the view will be nearly completely obscured for most Reactome-generated models, unless geneontology/noctua#581 is addressed.

@ukemi

ukemi commented Mar 1, 2019

There was another representation that we came up with this morning that we thought was intuitive and worked. I think we should leave the visualization aspect as a separate issue, because what we want is to get the underlying data correct with respect to how a biologist would view it, and for this project that is consistent with how it is represented in Reactome. Suppose we remove the assertion that the complex enables the MF in cases where we know the active subunit. The model will then have a complex that has_part all of the subunits, and a single part (the active subunit) that enables the molecular function. If we could then create a property chain that represents part_of-o-has_part-o-enables->contributes_to (I know we have to break it up), it would allow us to represent all the continuants intuitively. If we don't know the active subunit, then the complex will has_part all of its members and the complex will enable the MF. We will also have the property chain part_of-o-enables->contributes_to. We think that this allows us to represent any biology at the available level of knowledge, and after discussion with @deustp01 we think it accurately represents what is in Reactome.
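
To make the two property chains concrete, here is a small Python sketch over plain (subject, predicate, object) triples. It is an illustration under stated assumptions, not how Minerva or an OWL reasoner implements chains: the function name and triple layout are hypothetical, and part_of is treated as the inverse of has_part.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def infer_contributes_to(triples: Set[Triple]) -> Set[Triple]:
    """Apply both chains from the comment above and return the inferred
    contributes_to triples:
      part_of o has_part o enables -> contributes_to  (active subunit known)
      part_of o enables            -> contributes_to  (complex is the enabler)
    """
    # (whole, part) pairs; assumption: part_of is the inverse of has_part.
    has_part = {(s, o) for s, p, o in triples if p == "has_part"}
    has_part |= {(o, s) for s, p, o in triples if p == "part_of"}
    enables = {(s, o) for s, p, o in triples if p == "enables"}

    inferred: Set[Triple] = set()
    for whole, member in has_part:  # member part_of whole
        for enabler, function in enables:
            # Chain 2: the whole complex enables the function.
            if enabler == whole:
                inferred.add((member, "contributes_to", function))
            # Chain 1: some part of the same whole enables the function.
            # Applied literally, this also covers the active subunit itself.
            if (whole, enabler) in has_part:
                inferred.add((member, "contributes_to", function))
    return inferred
```

With hypothetical identifiers, a model containing `("C", "has_part", "A")`, `("C", "has_part", "B")` and `("A", "enables", "F")` yields contributes_to triples for both A and B via the first chain. Replacing the last triple with `("C", "enables", "F")` yields the same two triples via the second chain.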

@goodb
Contributor Author

goodb commented Mar 1, 2019

> I think we should leave the visualization aspect as a separate issue because what we want is to get the underlying data correct

agreed. I was just raising the issue to alert everyone of immediate consequences.

@goodb goodb closed this as completed Mar 5, 2019