Convert Entity_involved_in_regulation_MF to MF_regulates_MF #39

Closed
goodb opened this issue Dec 21, 2018 · 21 comments

@goodb
Contributor

goodb commented Dec 21, 2018

Reactome often connects physical entities (complexes, compounds, etc.) to reactions using regulates relationships; e.g. see https://reactome.org/PathwayBrowser/#/R-HSA-201681&SEL=R-HSA-201685&PATH=R-HSA-162582,R-HSA-195721 . Currently these come through the first pass of go-camification as 'Entity involved_in_regulation_of MF' triples (seen in grey). An existing rule converts these into 'MF regulates MF' relationships when the upstream MF that produces the physical entity as an output is present in the pathway. For example: "Beta-catenin is released from the destruction complex" is positively regulated by 'Phosphorylation of LRP5/6 cytoplasmic domain by CSNKI'. (See also #18.)

But sometimes the upstream function is either unknown or not in the current pathway. In these cases, create a simple 'binding' function node in its place and assert that this function regulates the targeted downstream function. Then remove the 'Entity involved_in_regulation_of MF' relationship.
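
A minimal sketch of the rule requested above, assuming a toy representation in which each reaction (MF) lists its output entities and each grey link is an `(entity, relation, target MF)` tuple. The names `Reaction`, `convert_entity_regulation` and the `binding:` prefix are hypothetical and are not taken from the converter's code.

```python
from dataclasses import dataclass, field


@dataclass
class Reaction:
    """Toy stand-in for a molecular function (MF) node in one pathway model."""
    name: str
    outputs: set = field(default_factory=set)


def convert_entity_regulation(reactions, entity_links):
    """Rewrite (entity, relation, target MF) links as MF-regulates-MF edges.

    reactions: Reaction objects present in the current model.
    entity_links: (entity, relation, target_mf) tuples, where relation is
        e.g. "positively_regulates" or "negatively_regulates".
    Returns (edges, new_binding_nodes).
    """
    # Index which reactions in this model produce each entity as an output.
    producers = {}
    for rxn in reactions:
        for entity in rxn.outputs:
            producers.setdefault(entity, []).append(rxn.name)

    edges, new_nodes = [], []
    for entity, relation, target in entity_links:
        upstream = producers.get(entity)
        if upstream:
            # Existing rule: the MF that outputs the entity regulates the target.
            for source in upstream:
                edges.append((source, relation, target))
        else:
            # Requested fallback: no upstream MF in this model, so add a
            # placeholder binding function that stands in for it.
            node = f"binding:{entity}"
            new_nodes.append(node)
            edges.append((node, relation, target))
    return edges, new_nodes


# Hypothetical example: entity X is produced by MF_A, entity Y has no producer.
model = [Reaction("MF_A", {"X"})]
links = [("X", "positively_regulates", "MF_B"),
         ("Y", "negatively_regulates", "MF_B")]
print(convert_entity_regulation(model, links))
```

In this sketch the original entity links are simply not carried over to the output, which corresponds to removing the entity-to-function relationship.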

@goodb goodb self-assigned this Dec 21, 2018
@goodb
Contributor Author

goodb commented Dec 21, 2018

@deustp01
Collaborator

deustp01 commented Jan 2, 2019

For metabolism (and perhaps also for signaling cascades), the rule (if I understand Ben's first comment correctly) that the proper regulator of an MF is another MF seems wrong. The standard biochemistry textbook description of a positive or negative regulation event is that small molecule X positively or negatively regulates the enzyme that mediates the conversion of small molecule Y to small molecule Z. And while the text often notes that the major source of X is some other named reaction / MF, it also typically points out that the regulatory effect is the same regardless of the source of X in any given instance and indeed there are cases where the sources of X change quite drastically according to the cell type where the regulated event is happening or the physiological state of the cell.

Also, that binding event looks uninformative. In addition to the concerns about whether binding per se is in scope for GO annotation and GO-CAM model construction, binding is implied by regulation (just as it is for catalysis: molecules can only do these things by direct contact with one another, no action at a distance here), so the explicit binding event adds no information.

One caveat - I'm confident of this argument in cases (the vast majority for metabolism) where the regulator molecule X is different from the inputs and outputs Y and Z. In cases where one of Y and Z is also the regulator, I'd need to think some more.
@ukemi ?

@ukemi

ukemi commented Jan 3, 2019

In the traditional GO representation of regulation, something that happens regulates another thing that happens. That is why we have process and function terms that describe regulation. In the textbook cases where some molecule X regulates the enzymatic activity of an enzyme E, it is often described as X regulates E. In the more explicit GO representation we are interested in the action taken by X or E that causes the change in the activity of E. In the classic textbook case this would be described as something like 'allosteric inhibition'. In a gene-centric view of the world, the allosteric inhibition can be described as a binding event. So instead of saying that X binds E and that event negatively regulates the activity of E, we say that E binds X and that event negatively regulates the activity of E.

Note that the most important thing is that we focus on what the molecules are doing. This gets around the caveat: even if X is an input or output, it is the action of X with respect to E that achieves the regulation. When X is the 'regulatory molecule', it is doing something with respect to E to achieve the regulation, and that activity is different from any non-regulatory activity that it can also do.

This means that the binding is not meaningless in these cases. If E binding X negatively regulates E converting Y to Z, then that binding is meaningful. I think what you want is the ability to deduce that X is the 'regulatory molecule' that has an effect on E. I think this can be done with clever querying. Show me all of the binding inputs for any kind of binding that regulates the activity of E.
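
As a rough illustration of that kind of query, here is a sketch over a toy list of `(subject, predicate, object)` triples. The predicate names and the shortcut of recognising a binding function by its type label ending in "binding" are assumptions for illustration only; a real query would run against the GO-CAM RDF and use the ontology's class hierarchy.

```python
REGULATES = {"regulates", "positively_regulates", "negatively_regulates"}


def regulatory_inputs(triples, target):
    """Return inputs of every binding function that regulates `target`.

    triples: iterable of (subject, predicate, object) tuples.
    target: identifier of the regulated activity node (the activity of E).
    """
    # Nodes typed as some kind of binding (label match is a simplification).
    binding_nodes = {s for s, p, o in triples
                     if p == "type" and o.endswith("binding")}
    # Binding nodes that regulate the target activity in any direction.
    regulators = {s for s, p, o in triples
                  if s in binding_nodes and p in REGULATES and o == target}
    # The inputs of those bindings are the candidate regulatory molecules.
    return sorted(o for s, p, o in triples
                  if s in regulators and p == "has_input")


# Hypothetical toy model: E binds X, and that binding inhibits E's activity.
toy = [
    ("b1", "type", "protein binding"),
    ("b1", "has_input", "X"),
    ("b1", "negatively_regulates", "activity_of_E"),
    ("b2", "type", "protein binding"),
    ("b2", "has_input", "W"),
    ("b2", "positively_regulates", "some_other_activity"),
]
print(regulatory_inputs(toy, "activity_of_E"))  # ['X']
```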

@ukemi

ukemi commented Jan 3, 2019

tagging @huaiyumi

@deustp01
Collaborator

deustp01 commented Jan 3, 2019

So I guess the way this would work is that the association of X regulator molecule with P catalytic protein or complex would be tagged with positive/negative regulation of the molecular function that P uses to mediate the conversion of input Y to output Z? That could all be scripted, though some human checking (and probably some re-annotation of Reactome reactions to get rid of inconsistencies that fool the script) would be needed. Phrased that way, it doesn't even sound so weird.

@ukemi

ukemi commented Jan 3, 2019

Yes. That would be consistent with the view that we originally had for regulation. AND in the case of direct physical interaction, the association of X with P would be described as an MF that is executed by P, since we are the gene ontology. I think this will work, but we need to identify and examine some specific examples from Reactome. I think glycolysis is a good start and I think it is in pretty good shape. At this point, I think the most important thing to do moving forward is to continue what we started in Geneva and look at specific pathways.

@ukemi

ukemi commented Jan 3, 2019

  1. Finish glycolysis
  2. Finish MAPK (interesting because there is more than one instance in Reactome)
  3. Look at Wnt signaling
  4. Find other pathways where the pathway and its regulation are well understood and are non-controversial, preferably finding examples of different types of regulatory mechanisms.

Also tagging @vanaukenk and @thomaspd

@goodb
Contributor Author

goodb commented Jan 8, 2019

@deustp01 @ukemi @thomaspd many of the 'entity involved_in_regulation_of MF' relations are the result of how pathways are split up in the go-camification process (but not really split up in Reactome as it is one single data structure). Have a look at the reaction 'WNT binds to FZD and LRP5/6' for example.
[Screenshot, 2019-01-08: model view of the reaction 'WNT binds to FZD and LRP5/6']

We get a lot of the grey entity-regulates links on that reaction. These would already be replaced with MF-regulates-MF relations using existing rules if the reactions from the pathway "Negative regulation of TCF-dependent signaling by WNT ligand antagonists" were present in this model (which was created from the pathway "Disassembly of the destruction complex and recruitment of AXIN to the membrane"). The negative regulation pathway contains the binding reactions that generate the complexes that do the regulating. For example, the reaction 'SOST binds LRP5/6' generates 'SOST:LRP5/6', which negatively regulates 'WNT binds to FZD and LRP5/6'. So in the GO-CAM we would end up with 'SOST binds LRP5/6' directly negatively regulates 'WNT binds to FZD and LRP5/6'.

I think the main issue to resolve for this reaction and many others is really model boundaries. If you look at that picture, it's clear that, though we have different named pathways, they are highly intertwingled. Replacing the grey entity-mf links with fake-mf->mf links won't really solve the problem.

I think in Geneva we came to the conclusion that we need to be able to have edges linking nodes in different models together to solve problems like these. This is captured in the Noctua issue geneontology/noctua#592

I need feedback from @cmungall @kltm @balhoff about whether that kind of technical (linking across models) change is possible in the near term. If not, we can try to figure out another solution - likely meaning either large models or models with a lot of redundancy.

thoughts??

@goodb
Contributor Author

goodb commented Jan 18, 2019

@ukemi I want to make sure some discussion of model boundaries gets into the Wednesday meetings as it will impact everything else we want to do - as it does here.

@ukemi

ukemi commented Jan 18, 2019

It will be a major discussion point based on my analysis of glycolysis.

@ukemi

ukemi commented Jan 23, 2019

D-fructose 6-phosphate + ATP => D-fructose 1,6-bisphosphate + ADP
 

@ukemi

ukemi commented Jan 23, 2019

I think the boundary in the WNT model above can be delimited. The ((protein binding enabled by LRP5/6) has_input DKK and has_input KRM1/2) negatively regulates the ((protein binding enabled by WNT) has_input FZD and has_input LRP5/6). I wouldn't include this in the Wnt signaling model, but would make it its own model that is a negative regulation of Wnt signaling (asserted as R-HSA-3772470.2).

@ukemi

ukemi commented Jan 23, 2019

I would do the same thing for R-HSA-170822.3 with respect to glycolysis and the new pathways that @deustp01 split out. The issue here is that in the case of the WNT above and R-HSA-170822.3, these are defined processes that are regulatory, whereas in the case of ATP binding to PFK, R-HSA-70467.4, this is a single MF that is inhibiting an MF that is part of a pathway, and thereby regulating the pathway. In the latter cases, I would be willing to say that the binding not only negatively regulates the function, but also is part of a generic process (allosteric regulation of glycolysis) that negatively regulates glycolysis.

@goodb
Contributor Author

goodb commented Jan 23, 2019

@ukemi the negative regulation pathways are split out now as they are distinct pathways in Reactome. The relations in question currently reside in both models and that is the key tension to figure out.

Apart from that, I can capture the case where there appears to be no defined regulatory process by searching upstream from the event in question.

@goodb
Contributor Author

goodb commented Feb 14, 2019

Noting all of the above discussion about the impact of cross-pathway relationships, I'm going to go ahead and make the conversion as requested again here: https://docs.google.com/presentation/d/1_UAQN09WPCA5win5mbMs1ORMALNwiRwMBgZDPuyJEW8/edit#slide=id.g4ec4a6d029_0_28 , i.e. generate statements along the lines of "ATP-binding enabled by (PFKP, PFKL, P08237) negatively regulates 6-phosphofructokinase activity, Part_of negative regulation of glycolysis"

Pending some way of handling the cross-model problem in Noctua, it seems appropriate to build the best individual GO-CAMs we can for each pathway. Since people really don't like entity-involved_in-function relations, this pattern should be an improvement.
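
A rough sketch of what generating that pattern could look like, using plain tuples as edges. The function name `build_regulation_pattern` and the predicate spellings are hypothetical, and the `has_input` edge is an assumption added to keep the regulator entity attached to the binding node; none of this is the converter's actual code.

```python
RELATION = {
    "negative": "negatively_regulates",
    "positive": "positively_regulates",
}


def build_regulation_pattern(regulator, enablers, target_mf, process,
                             direction="negative"):
    """Build edges for '<regulator> binding' regulating a target function.

    regulator: entity that was linked by the grey entity-regulates edge.
    enablers: gene products that enable the binding function.
    target_mf: the regulated molecular function.
    process: the pathway that target_mf is part of.
    direction: "negative" or "positive".
    """
    if direction not in RELATION:
        raise ValueError(f"unknown direction: {direction}")
    binding = f"{regulator} binding"
    edges = [(binding, "enabled_by", gene) for gene in enablers]
    # Assumption: record the regulator as the input of the binding function.
    edges.append((binding, "has_input", regulator))
    edges.append((binding, RELATION[direction], target_mf))
    edges.append((binding, "part_of", f"{direction} regulation of {process}"))
    return edges


# The example statement from the comment above.
for edge in build_regulation_pattern(
        "ATP", ["PFKP", "PFKL", "P08237"],
        "6-phosphofructokinase activity", "glycolysis"):
    print(edge)
```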

@ukemi

ukemi commented Feb 14, 2019

I think this is most consistent with respect to the way @deustp01 modeled it in Reactome.

goodb pushed a commit that referenced this issue Feb 14, 2019
functioning but need to see about node identities for folding.
@goodb
Contributor Author

goodb commented Feb 15, 2019

@ukemi here is what it looks like now. As I said on the slide, I don't really like the extra BP nodes this generates. They feel redundant and I'm concerned that the regulates relationship linking them to the main process could sometimes be incorrect. (And, if you thought extra edges from other pathways obscured the view, you should have seen this before I cleaned it up by hand.) Let me know what you think. (Note how the reasoner is really helping here.)

[Screenshot, 2019-02-15: the reaction node after conversion]

Before, this reaction node looked like this (ChEBI IDs are now resolving properly; CHEBI:16761 = ADP and CHEBI:15422 = ATP as points of reference):
[Screenshot, 2019-02-15: the reaction node before conversion]

@ukemi

ukemi commented Feb 15, 2019

I think this representation is correct and, in the case of Reactome, I think the regulation link back to the process is always true (but not always in GO-CAM models). However, the process that is being inferred is '+/- reg of glycolysis through F6P' and not '+/- reg of canonical glycolysis.' Peter and I have recently discussed whether we should map the parent process to 'glycolysis through G6P' because of this side blip: PGM2L1:Mg2+ phosphorylates G6P to G1,6BP, which branches from the 'canonical' pathway. That blip, when executed, takes phosphoglycerate kinase activity out of the pathway (1,3-bisphospho-D-glycerate + ADP <=> 3-phospho-D-glycerate + ADP. Peter, there is a typo in the reaction description: shouldn't it be 1,3-bisphospho-D-glycerate + ADP <=> 3-phospho-D-glycerate + ATP?), but if we take that out then the pathway shouldn't classify as 'glycolytic process', because it is a necessary activity:

'glycolytic process' = 'carbohydrate catabolic process'
and ('has part' some 'phosphoglycerate kinase activity')
and ('has part' some 'phosphoglycerate mutase activity')
and ('has part' some 'phosphopyruvate hydratase activity')
and ('has part' some 'glyceraldehyde-3-phosphate dehydrogenase (NAD(P)+) (phosphorylating) activity')
and ('has participant' some 'NAD(P)(+)')
and ('has participant' some 'ADP(3-)')
and ('ends with' some 'pyruvate kinase activity')
and ('has output' some 'NAD(P)H')
and ('has output' some pyruvate)
and ('has output' some 'ATP(4-)')

Is it because in this 'superpathway' representation we are also including the branch with the phosphoglycerate kinase activity? So this necessary part is provided?

It seems that we are either missing: glucose-6-phosphate isomerase activity which if included would have classified this as '+/- reg of glycolysis through G6P' and maybe missing 'glyceraldehyde-3-phosphate dehydrogenase (NAD+) (phosphorylating) activity' which if included would have made this a '+/- reg of canonical glycolysis'

But the cool thing is that what is indicated above makes sense to me as a biologist; it would just be cooler if the reasoning inferred more specificity. What's there isn't wrong: regulating canonical glycolysis is a regulation of glycolysis through F6P. @deustp01 should have a look. I think he will agree this is correct.
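
To make the 'necessary activity' point concrete, here is a deliberately simplified sketch that checks only the four 'has part' clauses of the 'glycolytic process' definition quoted above. It matches labels exactly and ignores subclass reasoning, participants, outputs and 'ends with', all of which a real OWL reasoner would take into account; the function name `missing_parts` and the example model are hypothetical.

```python
# The four 'has part' requirements from the definition quoted above.
REQUIRED_PARTS = frozenset({
    "phosphoglycerate kinase activity",
    "phosphoglycerate mutase activity",
    "phosphopyruvate hydratase activity",
    "glyceraldehyde-3-phosphate dehydrogenase (NAD(P)+) (phosphorylating) activity",
})


def missing_parts(model_activities, required=REQUIRED_PARTS):
    """Return the required activities that the model does not contain."""
    return sorted(set(required) - set(model_activities))


# Hypothetical model that lacks phosphoglycerate kinase activity.
model = REQUIRED_PARTS - {"phosphoglycerate kinase activity"}
print(missing_parts(model))  # ['phosphoglycerate kinase activity']
```

An empty result is necessary for the classification in this simplified view, but it is not sufficient, since the remaining clauses of the definition are not checked.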

@ukemi

ukemi commented Feb 15, 2019

But there is still the issue of it being messy to view.

@ukemi

ukemi commented Feb 15, 2019

And you're right. The reasoner is taking the uninformative binding functions and turning them into the informative kinase activator/inhibitor activities, which reflects the allosteric regulation of the enzyme activity by binding the metabolites.

@ukemi

ukemi commented Feb 15, 2019

> They feel redundant and I'm concerned that the regulates relationship linking them to the main process could sometimes be incorrect. (And, if you thought extra edges from other pathways obscured the view, you should have seen this before I cleaned it up by hand.)

I suspect that the relationships to the regulatory processes are redundant. This gets to the sticky issue of process instances. Is the +/- regulation of glycolysis one big process to which all of these contribute, or is it lots of different processes, each a different instance of regulation? From the viewpoint of the cell, I think that it is one process. All of these input functions work together to control glycolysis.

I don't think the relationship to the regulatory processes is incorrect. Do you remember the drawing on the board at the hackathon when we were discussing the invalidity of regulates-o-part_of->regulates? If an MF1 regulates an MF2 that is part of processA, and MF1 is not part of processA, then MF1 is involved in 'regulation of processA'. If an MF1 regulates an MF2 that is part of processA, and MF1 is also part of processA, then MF1 does not regulate processA.
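
The two cases can be written down as a small check over toy `(subject, predicate, object)` triples. This is only a sketch of the rule as stated here, with hypothetical predicate names and a hypothetical function name, not the reasoner's actual behaviour.

```python
REGULATES = {"regulates", "positively_regulates", "negatively_regulates"}


def regulated_processes(mf1, triples):
    """Return the processes that mf1 is inferred to regulate.

    mf1 is inferred to regulate process A when it regulates some MF2 that is
    part_of A and mf1 itself is NOT part_of A.
    """
    # Collect the processes each function is asserted to be part of.
    part_of = {}
    for s, p, o in triples:
        if p == "part_of":
            part_of.setdefault(s, set()).add(o)

    own_processes = part_of.get(mf1, set())
    inferred = set()
    for s, p, o in triples:
        if s == mf1 and p in REGULATES:
            # Only processes that mf1 is outside of count as regulated.
            inferred |= part_of.get(o, set()) - own_processes
    return sorted(inferred)


# Hypothetical toy model: MF1 sits outside processA, MF3 sits inside it.
toy = [
    ("MF1", "negatively_regulates", "MF2"),
    ("MF3", "positively_regulates", "MF2"),
    ("MF2", "part_of", "processA"),
    ("MF3", "part_of", "processA"),
]
print(regulated_processes("MF1", toy))  # ['processA']
print(regulated_processes("MF3", toy))  # []
```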

@goodb goodb closed this as completed in e5f9967 Feb 15, 2019