Convert Entity_involved_in_regulation_MF to MF_regulates_MF #39

goodb · 2018-12-21T17:40:49Z

Reactome often connects physical entities (complexes, compounds, etc.) to reactions using regulates relationships. E.g. see https://reactome.org/PathwayBrowser/#/R-HSA-201681&SEL=R-HSA-201685&PATH=R-HSA-162582,R-HSA-195721 . Currently these come through the first pass of go-camification as 'Entity involved_in_regulation_of MF' triples (seen in grey). There is currently a rule that converts these into 'MF regulates MF' relationships when the upstream MF that produces the physical entity as an output is present in the pathway. See e.g.: "Beta-catenin is released from the destruction complex" is positively regulated by "Phosphorylation of LRP5/6 cytoplasmic domain by CSNKI". (See also #18.)

But sometimes the upstream function is either unknown or not in the current pathway. In these cases, create a simple 'binding' function node in its place and assert that this function regulates the targeted downstream function. Then remove the original 'Entity involved_in_regulation_of MF' relationship.
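A minimal sketch of the rule requested above, written in Python purely for illustration and not taken from the project's code. The triple layout, the predicate names, the `produced_by` lookup, the `binding[...]` node naming, and the `has_input` link from the placeholder node to the entity are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical predicate names; the real relation identifiers are not given in this thread.
ENTITY_TO_MF_PREDICATE = {
    "involved_in_regulation_of": "regulates",
    "involved_in_positive_regulation_of": "positively_regulates",
    "involved_in_negative_regulation_of": "negatively_regulates",
}


def convert_entity_regulation(
    triples: List[Triple], produced_by: Dict[str, str]
) -> List[Triple]:
    """Rewrite 'entity involved_in_regulation_of MF' as 'MF regulates MF'.

    produced_by maps an entity to the upstream MF (in this model) that has the
    entity as an output. Entities without an entry get a placeholder binding node.
    """
    converted: List[Triple] = []
    for subj, pred, obj in triples:
        if pred not in ENTITY_TO_MF_PREDICATE:
            converted.append((subj, pred, obj))  # unrelated triple, keep as-is
            continue
        mf_pred = ENTITY_TO_MF_PREDICATE[pred]
        upstream_mf = produced_by.get(subj)
        if upstream_mf is None:
            # Upstream function unknown or outside the pathway: stand in a binding node.
            upstream_mf = f"binding[{subj}]"
            converted.append((upstream_mf, "rdf:type", "binding"))
            # Assumption: keep the entity as the input of the binding node.
            converted.append((upstream_mf, "has_input", subj))
        # The original entity-to-function triple is dropped, not copied.
        converted.append((upstream_mf, mf_pred, obj))
    return converted
```

With `produced_by = {"SOST:LRP5/6": "SOST binds LRP5/6"}`, a grey link from 'SOST:LRP5/6' would become an MF-to-MF link from 'SOST binds LRP5/6', while an entity with no known upstream reaction would be routed through a placeholder binding node.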

goodb · 2018-12-21T17:41:46Z

See also: http://wiki.geneontology.org/index.php/Noctua#Activities_mediated_by_small_molecule_concentration

deustp01 · 2019-01-02T22:26:44Z

For metabolism (and perhaps also for signaling cascades), the rule (if I understand Ben's first comment correctly) that the proper regulator of an MF is another MF seems wrong. The standard biochemistry textbook description of a positive or negative regulation event is that small molecule X positively or negatively regulates the enzyme that mediates the conversion of small molecule Y to small molecule Z. And while the text often notes that the major source of X is some other named reaction / MF, it also typically points out that the regulatory effect is the same regardless of the source of X in any given instance and indeed there are cases where the sources of X change quite drastically according to the cell type where the regulated event is happening or the physiological state of the cell.

Also, that binding event looks uninformative. In addition to the concerns about whether binding per se is in scope for GO annotation and GO-CAM model construction, binding is implied by regulation (just as it is for catalysis: molecules can only do these things by direct contact with one another, no action at a distance here), so the explicit binding event adds no information.

One caveat - I'm confident of this argument in cases (the vast majority for metabolism) where the regulator molecule X is different from the inputs and outputs Y and Z. In cases where one of Y and Z is also the regulator, I'd need to think some more.
@ukemi ?

ukemi · 2019-01-03T12:56:12Z

In the traditional GO representation of regulation, something that happens regulates another thing that happens. That is why we have process and function terms that describe regulation. In the textbook cases where some molecule X regulates the enzymatic activity of an enzyme E, it is often described as X regulates E. In the more explicit GO representation we are interested in the action taken by X or E that causes the change in the activity of E. In the classic textbook case this would be described as something like 'allosteric inhibition'. In a gene-centric view of the world, the allosteric inhibition can be described as a binding event. So instead of saying that X binds E and that event negatively regulates the activity of E, we say that E binds X and that event negatively regulates the activity of E.

Note that the most important thing is that we focus on what the molecules are doing. This gets around the caveat because, even if X is an input or output, it is the action of X with respect to E that achieves the regulation: when X is the 'regulatory molecule', it is doing something with respect to E to achieve the regulation, and that activity is different from any non-regulatory activity that it can also do.

This means that the binding is not meaningless in these cases. If E binding X negatively regulates E converting Y to Z, then that binding is meaningful. I think what you want is the ability to deduce that X is the 'regulatory molecule' that has an effect on E. I think this can be done with clever querying. Show me all of the binding inputs for any kind of binding that regulates the activity of E.
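The query suggested above ("show me all of the binding inputs for any kind of binding that regulates the activity of E") could look roughly like the following sketch over an in-memory list of triples. The triple layout and predicate names are hypothetical, and only nodes typed exactly as 'binding' are matched; a real query would presumably rely on the reasoner to include binding subclasses.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical predicate names used only for this illustration.
REGULATES = {"regulates", "positively_regulates", "negatively_regulates"}


def regulatory_molecules(triples: Iterable[Triple], enzyme_activity: str) -> Set[str]:
    """Return the inputs of every binding node that regulates enzyme_activity."""
    triples = list(triples)
    binding_nodes = {s for s, p, o in triples if p == "rdf:type" and o == "binding"}
    regulating_bindings = {
        s
        for s, p, o in triples
        if p in REGULATES and o == enzyme_activity and s in binding_nodes
    }
    return {o for s, p, o in triples if p == "has_input" and s in regulating_bindings}
```

Under these assumptions, X is recovered as the 'regulatory molecule' without ever asserting a direct X-regulates-E edge.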

ukemi · 2019-01-03T13:23:34Z

tagging @huaiyumi

deustp01 · 2019-01-03T14:20:55Z

So I guess the way this would work is that the association of X regulator molecule with P catalytic protein or complex would be tagged with positive/negative regulation of the molecular function that P uses to mediate the conversion of input Y to output Z? That could all be scripted, though some human checking (and probably some re-annotation of Reactome reactions to get rid of inconsistencies that fool the script) would be needed. Phrased that way, it doesn't even sound so weird.

ukemi · 2019-01-03T15:36:59Z

Yes. That would be consistent with the view that we originally had for regulation. AND in the case of direct physical interaction, the association of X with P would be described as an MF that is executed by P, since we are the gene ontology. I think this will work, but we need to identify and examine some specific examples from Reactome. I think glycolysis is a good start and I think it is in pretty good shape. At this point, I think the most important thing to do moving forward is to continue what we started in Geneva and look at specific pathways.

ukemi · 2019-01-03T15:41:56Z

Finish glycolysis
Finish MAPK (interesting because there is more than one instance in Reactome)
Look at Wnt signaling
Find other pathways where the pathway and its regulation are well understood and are non-controversial, preferably finding examples of different types of regulatory mechanisms.

Also tagging @vanaukenk and @thomaspd

goodb · 2019-01-08T22:24:45Z

@deustp01 @ukemi @thomaspd many of the 'entity involved_in_regulation_of MF' relations are the result of how pathways are split up in the go-camification process (but not really split up in Reactome as it is one single data structure). Have a look at the reaction 'WNT binds to FZD and LRP5/6' for example.

We get a lot of the grey entity-regulates links on that reaction. These would already be replaced with MF-regulates-MF relations using existing rules if the reactions from the pathway "Negative regulation of TCF-dependent signaling by WNT ligand antagonists" were present in this model (which was created from the pathway "Disassembly of the destruction complex and recruitment of AXIN to the membrane"). The negative regulation pathway contains the binding reactions that generate the complexes that do the regulating. For example, the reaction 'SOST binds LRP5/6' generates 'SOST:LRP5/6', which negatively regulates 'WNT binds to FZD and LRP5/6'. So in the GO-CAM we would end up with 'SOST binds LRP5/6' directly negatively regulates 'WNT binds to FZD and LRP5/6'.

I think the main issue to resolve for this reaction and many others is really model boundaries. If you look at that picture, it's clear that though we have different named pathways, they are highly intertwingled. Replacing the grey entity-mf links with fake-mf->mf links won't really solve the problem.

I think in Geneva we came to the conclusion that we need to be able to have edges linking nodes in different models together to solve problems like these. This is captured in the Noctua issue geneontology/noctua#592.

I need feedback from @cmungall @kltm @balhoff about whether that kind of technical (linking across models) change is possible in the near term. If not, we can try to figure out another solution - likely meaning either large models or models with a lot of redundancy.

thoughts??

goodb · 2019-01-18T04:09:43Z

@ukemi I want to make sure some discussion of model boundaries gets into the Wednesday meetings as it will impact everything else we want to do - as it does here.

ukemi · 2019-01-18T13:37:40Z

It will be a major discussion point based on my analysis of glycolysis.

ukemi · 2019-01-23T20:28:47Z

D-fructose 6-phosphate + ATP => D-fructose 1,6-bisphosphate + ADP

ukemi · 2019-01-23T21:06:19Z

I think the boundary in the WNT model above can be delimited. The ((protein binding enabled by LRP5/6) has_input DKK and has_input KRM1/2) negatively regulates the ((protein binding enabled by WNT) has_input FZD and has_input LRP5/6). I wouldn't include this in the Wnt signaling model, but would make it its own model that is a negative regulation of Wnt signaling (asserted as R-HSA-3772470.2).

ukemi · 2019-01-23T21:12:45Z

I would do the same thing for R-HSA-170822.3 with respect to glycolysis and the new pathways that @deustp01 split out. The issue here is that in the case of the WNT above and R-HSA-170822.3, these are defined processes that are regulatory, whereas in the case of ATP binding to PFK, R-HSA-70467.4, this is a single MF that is inhibiting an MF that is part of a pathway and thereby regulating the pathway. In the latter cases, I would be willing to say that the binding not only negatively regulates the function, but also is part of a generic process (allosteric regulation of glycolysis) that negatively regulates glycolysis.

goodb · 2019-01-23T22:13:13Z

@ukemi the negative regulation pathways are split out now as they are distinct pathways in Reactome. The relations in question currently reside in both models and that is the key tension to figure out.

Apart from that, I can capture the case where there appears to be no defined regulatory process by searching upstream from the event in question.

goodb · 2019-02-14T20:13:36Z

Noting all of the above discussion related to the impact of cross-pathway relationships, I'm going to go ahead and make the conversion as requested again here: https://docs.google.com/presentation/d/1_UAQN09WPCA5win5mbMs1ORMALNwiRwMBgZDPuyJEW8/edit#slide=id.g4ec4a6d029_0_28 , e.g. generate statements along the lines of "ATP-binding enabled by (PFKP, PFKL, P08237) negatively regulates 6-phosphofructokinase activity, Part_of negative regulation of glycolysis".

Pending some way of handling the cross-model problem in Noctua, it seems appropriate to build the best individual GO-CAMs we can for each pathway. Since people really don't like entity-involved_in-function relations, this pattern should be an improvement.
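As a rough illustration of the statement pattern quoted above, the sketch below emits the corresponding triples for one ligand. The function name, the blank-node style identifiers, and the predicate spellings are hypothetical and are not the converter's actual output format.

```python
from typing import List, Sequence, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def regulatory_binding_statements(
    ligand: str,
    enzymes: Sequence[str],
    target_mf: str,
    process: str,
    negative: bool = True,
) -> List[Triple]:
    """Build '<ligand> binding enabled by <enzymes> regulates <target_mf>,
    part_of <regulation of process>' as a list of triples."""
    direction = "negative" if negative else "positive"
    predicate = "negatively_regulates" if negative else "positively_regulates"
    binding_id = f"_:{ligand}_binding"  # hypothetical node id
    bp_id = f"_:{direction}_regulation_of_{process}"  # the extra BP node

    statements: List[Triple] = [(binding_id, "rdf:type", f"{ligand} binding")]
    statements += [(binding_id, "enabled_by", enzyme) for enzyme in enzymes]
    statements.append((binding_id, predicate, target_mf))
    statements.append((binding_id, "part_of", bp_id))
    statements.append((bp_id, "rdf:type", f"{direction} regulation of {process}"))
    return statements


example = regulatory_binding_statements(
    "ATP", ["PFKP", "PFKL", "P08237"], "6-phosphofructokinase activity", "glycolysis"
)
```

Note that every regulated function handled this way gains its own regulation-of-process node, one per call in this sketch.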

ukemi · 2019-02-14T20:15:47Z

I think this is most consistent with respect to the way @deustp01 modeled it in Reactome.


goodb · 2019-02-15T19:20:53Z

@ukemi here is what it looks like now. As I said on the slide, I don't really like the extra bp nodes this generates. They feel redundant and I'm concerned that the regulates relationship linking them to the main process could sometimes be incorrect. (And, if you thought extra edges from other pathways obscured the view, you should have seen this before I cleaned it up by hand..) Let me know what you think. (Note how the reasoner is really helping here.)

Before, this reaction node looked like this (chebis now resolving properly; chebi_16761 = ADP, chebi_15422 = ATP as some points of reference):

ukemi · 2019-02-15T19:55:41Z

I think this representation is correct, and in the case of Reactome I think the regulation link back to the process is always true (but not always in GO-CAM models). However, the process that is being inferred is '+/- reg of glycolysis through F6P' and not '+/- reg of canonical glycolysis.' Peter and I have recently discussed whether we should map the parent process to 'glycolysis through G6P' because of this side blip: 'PGM2L1:Mg2+ phosphorylates G6P to G1,6BP', which branches from the 'canonical' pathway. That blip, when executed, takes phosphoglycerate kinase activity out of the pathway (1,3-bisphospho-D-glycerate + ADP <=> 3-phospho-D-glycerate + ADP. Peter, there is a typo in the reaction description; shouldn't it be 1,3-bisphospho-D-glycerate + ADP <=> 3-phospho-D-glycerate + ATP?), but if we take that out then the pathway shouldn't classify as 'glycolytic process', because it is a necessary activity:

glycolytic process='carbohydrate catabolic process'
and ('has part' some 'phosphoglycerate kinase activity')
and ('has part' some 'phosphoglycerate mutase activity')
and ('has part' some 'phosphopyruvate hydratase activity')
and ('has part' some 'glyceraldehyde-3-phosphate dehydrogenase (NAD(P)+) (phosphorylating) activity')
and ('has participant' some 'NAD(P)(+)')
and ('has participant' some 'ADP(3-)')
and ('ends with' some 'pyruvate kinase activity')
and ('has output' some 'NAD(P)H')
and ('has output' some pyruvate)
and ('has output' some 'ATP(4-)')
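To make the point about necessary activities concrete, here is a small sketch that treats only the 'has part' clauses of the definition above as a checklist. It is a deliberate simplification and not how the reasoner works: it compares labels by exact string match, ignores subclass relationships, and ignores the participant, output, and 'ends with' clauses. The function name is hypothetical.

```python
from typing import Iterable, Set

# The four 'has part' activities copied from the definition above.
REQUIRED_PARTS: Set[str] = {
    "phosphoglycerate kinase activity",
    "phosphoglycerate mutase activity",
    "phosphopyruvate hydratase activity",
    "glyceraldehyde-3-phosphate dehydrogenase (NAD(P)+) (phosphorylating) activity",
}


def missing_required_parts(model_activities: Iterable[str]) -> Set[str]:
    """Return the required activities that a model does not contain.

    A non-empty result means the model cannot satisfy the 'has part'
    clauses, so it should not classify as 'glycolytic process'.
    """
    return REQUIRED_PARTS - set(model_activities)


# A pathway variant with phosphoglycerate kinase activity taken out.
without_pgk = REQUIRED_PARTS - {"phosphoglycerate kinase activity"}
print(missing_required_parts(without_pgk))
```

Because labels are matched exactly, a model that only contains the more specific '(NAD+)' dehydrogenase activity would be reported as missing the '(NAD(P)+)' one here, which is exactly the kind of case where real subclass reasoning is needed.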

Is it because in this 'superpathway' representation we are also including the branch with the phosphoglycerate kinase activity? So this necessary part is provided?

It seems that we are missing 'glucose-6-phosphate isomerase activity', which if included would have classified this as '+/- reg of glycolysis through G6P', and maybe also missing 'glyceraldehyde-3-phosphate dehydrogenase (NAD+) (phosphorylating) activity', which if included would have made this a '+/- reg of canonical glycolysis'.

But the cool thing is that what is indicated above makes sense to me as a biologist; it would just be cooler if the reasoning inferred more specificity. What's there isn't wrong: regulating canonical glycolysis is a regulation of glycolysis through F6P. @deustp01 should have a look. I think he will agree this is correct.

ukemi · 2019-02-15T19:59:45Z

But there is still the issue of it being messy to view.

ukemi · 2019-02-15T20:08:39Z

And you're right. The reasoner is taking the uninformative binding functions and turning them into the informative kinase activator/inhibitor activities, which reflects the allosteric regulation of the enzyme activity by binding the metabolites.

ukemi · 2019-02-15T20:22:24Z

> They feel redundant and I'm concerned that the regulates relationship linking them to the main process could sometimes be incorrect. (And, if you thought extra edges from other pathways obscured the view, you should have seen this before I cleaned it up by hand..)

I suspect that the relationships to the regulatory processes are redundant. This gets to the sticky issue of process instances. Is the +/- regulation of glycolysis one big process to which all of these contribute, or is it lots of different processes, each a different instance of regulation? From the viewpoint of the cell, I think that it is one process. All of these input functions work together to control glycolysis.

I don't think the relationship to the regulatory processes is incorrect. Do you remember the drawing on the board at the hackathon when we were discussing the invalidity of regulates-o-part_of->regulates? If an MF1 regulates an MF2 that is part of processA, and MF1 is not part of processA, then MF1 is involved in 'regulation of processA'. If an MF1 regulates an MF2 that is part of processA, and MF1 is also part of processA, then MF1 does not regulate processA.
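The two cases above can be written down as a small check. This is only a sketch of the stated rule over a flat list of triples, with hypothetical predicate names and no reasoning over part_of chains or subprocesses.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical predicate names used only for this illustration.
REGULATES = {"regulates", "positively_regulates", "negatively_regulates"}


def involved_in_regulation_of(mf1: str, process: str, triples: Iterable[Triple]) -> bool:
    """True if mf1 regulates some MF that is part of process and mf1 itself
    is not part of that process; False otherwise."""
    triples = list(triples)
    parts = {s for s, p, o in triples if p == "part_of" and o == process}
    if mf1 in parts:
        # A function inside the process regulating another step of the same
        # process does not count as regulating the process.
        return False
    return any(s == mf1 and p in REGULATES and o in parts for s, p, o in triples)
```

For example, an ATP-binding function that negatively regulates a 6-phosphofructokinase activity that is part of glycolysis, and that is not itself part of glycolysis, would return True for glycolysis.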

goodb self-assigned this Dec 21, 2018

goodb pushed a commit that referenced this issue Feb 14, 2019

most of the way to finishing #39

2d3284b

functioning but need to see about node identities for folding.

goodb closed this as completed in e5f9967 Feb 15, 2019

goodb mentioned this issue Feb 15, 2019

Add active unit information from Reactome #31

Closed

3 tasks

goodb mentioned this issue Feb 27, 2019

Open Questions Feb 2019 #53

Closed

goodb mentioned this issue Mar 20, 2019

Implement rules for non-protein entities regulating molecular functions #56

Closed

goodb mentioned this issue Feb 25, 2020

define and implement model for transport and dissociation processes #75

Closed
