inferring MFs from annotations to complex portal IDs to individual complex participants #1662

Closed
bmeldal opened this issue Oct 18, 2017 · 17 comments

Comments

bmeldal commented Oct 18, 2017

Follow up from GOC meeting in Cambridge 2017:

The working group agreed that the CC (but see #1639 with regards to the required qualifiers!) and BP annotations would be equally appropriate for the complex subunits as they are for the complex so we'll extract them in the new Complex Portal GDAP.

However, extrapolating to MF is tricky:

We decided that we could infer the activity of the active subunit if it has been annotated as 'enzyme' in the CP. However, could we also extrapolate something about the inactive complex members? In the past people used contributes_to, but it has been proposed that this relationship should be made obsolete, or at least used with great care (#1650 - proposal guidelines). Can we say anything else about the MF of an inactive complex member?

Please post your use cases and proposals here.

bmeldal commented Oct 18, 2017

Initial thoughts from breakout group:

  1. Remove contributes_to, replace with either enables_activity_in, part_of, localises_in or found_in,...
  2. In Noctua: GPs with function go into GPAD, those without known function (emerging function) will be filtered out.
  3. Known function: MF and AE with complex function
  4. For unknown/emerging function annotations use MF root term and AE with complex MF enabled_by CP ID
  5. Don’t annotate unknown/emerging function at all as it confuses users
  6. Use Complex Portal ID enables_activity_in subcellular location
  7. Consider combinatorial evidences

vanaukenk commented Oct 18, 2017

GO-CAM models of different types of complexes and associated functions of their respective members are here:

http://noctua.berkeleybop.org/editor/graph/gomodel:59c8885900000281

These are proposed models for different types of use cases, but we will likely want to model more before coming to any decisions.

Note that the rules for GPAD annotation outputs have not been formulated yet, so what is currently in the annotation preview is not final.

bmeldal commented Oct 19, 2017

bmeldal added this to the 2018-05 GOC meeting milestone Nov 7, 2017

bmeldal commented Dec 1, 2017

After discussing this for a little while yesterday (30/11/17) we decided it would be good to get user input on what they need/want.

  • set up short survey

bmeldal commented Dec 1, 2017

2017_12_01_complex_go_annotation_relationships

I've tried to summarise the situation regarding the way we extract annotations from complexes and from individual GPs:

Along the black lines, the top relationships are those used currently, the ones added in red pen are the ones we are discussing.

Points:

Please review the picture and post your MF class related comments here.

Please post CC class related comments in #1639

bmeldal commented Jan 15, 2018

geneontology/go-ontology#14847
an example of where contributes_to was discussed as a possible relationship

bmeldal commented Feb 13, 2018

Call from 13/2:

We need to have examples of a complex:
1: 1 catalytic subunit + other subunits --> Val wouldn't annotate to non-catalytic subunits
2: complex that requires all members to have catalytic activity --> "contributes_to" - but was tricky in its usage. #1650

bmeldal commented Feb 14, 2018

@ValWood @RLovering @ukemi @sylvainpoux @vanaukenk
You've all had comments on the subject, please add them if not already captured.

Collecting use cases here:
https://docs.google.com/document/d/1ZtAcjIyIQ_ycbuMHyvLA-KIJQtGenh82lxS-MKC6a_A/edit#
(that's the link Kimberly shared at the call yesterday)

Draft survey:
https://docs.google.com/document/d/1P_VLM9g13kj9lu3CRAgotAI3cUmWkS1yWaVg95u2Vbk/edit?usp=sharing

Thanks.

bmeldal commented Feb 23, 2018

Minutes from WG call on 22/2/18:

Present:
@judyblake , Li (sorry, don't have GH ID), @hdrabkin , @sandraorchard , @ValWood , @NancyCampbell , @tberardini , @RLovering , @deustp01 , @vanaukenk , @nataled , @ggeorghiou , Pascale, @edwong57 , @bmeldal (I hope I haven't forgotten anyone! - in no order!)

Slides:
2018_02_22_inferring_GOannotations_from_complex_to_GPs.pptx
[link updated on 27/2/18, had the wrong file!]

Complex Names:
Harold: Complex names can be very long and cumbersome which makes them hard to search for, can we find short forms?
Sandra: There are short labels in the DB but we don't display them.
Birgit: If a shorter form exists it's in the synonyms which are in the search index. Also can search with gene symbols (which make up the systematic name).

Use of contributes_to:
also #1650 - ticket for guidelines proposal for contributes_to
We spent most of the call on this!

  1. How do users use qualifiers and annotation extensions?
    Val (about PomBase users):
  • Enrichment usually run over BP, maybe CC, so loss of MF qualifiers for enrichment tools not too dramatic.
  • Qualifiers are displayed on Gene Pages where people can see them in context and is useful.
  • Searches for GPs with X function. If regulatory subunits$ are annotated with contributes_to MF and the qualifier is stripped, the resulting list is strange - users can filter them out but they need to know about it!

$ regulatory subunits: any GP (protein or otherwise) that has not been identified as carrying out the enzymatic activity of the complex but is consistently found as a complex member. They may or may not be essential (we don't distinguish essential subunits in the CP as most experiments don't go into that detail consistently).

  2. How GO annotators use the qualifier contributes_to:
  • only if >1 protein is required for the function. If function shown experimentally for the protein in isolation --> direct annotation to MF

  • Harold: Maybe need regulates for the regulatory subunits as a new term?

  • only where catalytic subunit not identified. Then add qualifier to all proteins of the complex.

  • Nancy's example of Telomerase: direct annotation to TERT enzyme subunit and contributes_to for telomeric RNA component
    image

Ruth: What about homodimers? Annotate directly.
Discussion highlighted the issue that we can never know if the function is carried out by the monomer or homodimer (or even homomultimer) if the protein self-assembles in solution.
AI: Birgit to add PDGF examples

Summary:
Different groups use slightly different guidelines (and it may even vary within groups) either annotating all regulatory subunits of a complex with contributes_to or only in cases where the catalytic subunit has not been identified.
Solution:
Draw up new annotation guidelines (#1650) and revisit all annotations.
Birgit: to provide a list of GPs whose biological role in CP complexes is NOT enzyme, as a guide (list won't be comprehensive as the DB has not got full coverage yet!).

  3. How can we automatically infer MF annotations to complex ACs from Complex Portal to GP?

Rationale (WHY I want to do this):

  • We spend a lot of time annotating the complexes and many of these annotations would get lost when users can't consume Complex Portal ACs. There's a lot we can infer from the complex annotations but we need rules as we need to do it by script (we don't have the facility to annotate to individual GPs in our Editor).
  • Many GPs are "moonlighters" and have different functions in different complexes so exporting the MF in context of the complex seemed like a reasonable thing to do. This is where Kimberly suggested to use AEs:

Summary from GOC meeting in Cambridge (Oct 2017):
image
(AE suggestion was Kimberly's)

Val: Annotations on Gene Pages link MFs to complexes using occurs_in
Birgit: Are the MF and CC annotations connected? If not we have a list of functions and a list of complexes but no link.
Can we have an example (screenshot) please, @ValWood ?

Ruth (initial gut feeling): export for catalytic subunits but not regulatory subunits.
Pascale/Kimberly: make the distinction between catalytic and regulatory subunits
Ruth: is there a clear line between what is a catalytic subunit?

Birgit: Who would use these annotations??? What do they really need???
Judy: may know some power users that may be able to make use of these complex annotations.

NO SOLUTION YET!!!
Options:

  • only export direct annotations to catalytic subunits
  • export annotations for all regulatory subunits with contributes_to
  • export annotations to GPs with some form of AE (occurs_in / part_of / enabled_by CC)

Going forward:

@judyblake (@hdrabkin /@deustp01 ) to pass on details of users to Birgit, Birgit to get some feedback before next call (8/3/18).

Birgit

(I've probably forgotten something or someone so please add your comments. I'll be updating the contributes_to guideline ticket later.)

bmeldal commented Feb 23, 2018

Ok, forgot one thing:

"X binding":

So far we have discussed what to do with catalytic activity but we also have MF annotations to "binding". We don't use "protein/complex" binding, that sort of data is captured by IntAct and exported from there, but any other type of "binding", e.g.:

image

Caveat:
We don't know which subunit binds the target, unfortunately, we haven't captured that yet (but I just got an idea how I could do it so I could go back and add it in if we want it!).
[Note to self: either by using the reference column with pipe or adding with/from as new field to our editor - which would be helpful anyway for creating our files.]

"Homework" for everyone:

Think about binding terms that the user might want/need. @RLovering can you think about this in the context of GREEKC, please?

bmeldal commented Feb 23, 2018

Not discussed on the call:

but I'll run the issues by the users as well when I speak to them.

bmeldal commented Feb 23, 2018

Example for homodimers:
https://www.ebi.ac.uk/complexportal/complex/EBI-2881436
Platelet-derived growth factor AA complex
Homodimer of P04085 PDGF subunit A
Only exists as a dimer and functions as a ligand for PDGF receptors
I haven't looked for the experiments for the activity as the complex evidence came from a crystal but from memory PDGF ligands are well described as obligate dimers.

PDGF ligands come in 5 flavours: AA, AB, BB, CC, DD.

And, the receptors (alpha-alpha, beta-beta or alpha-beta) don't dimerise until the ligand complex binds, forming an obligate heterotetrameric receptor-ligand complex!

So, the activities rely both on the dimeric ligand and the tetrameric receptor-ligand.

Food for thought how you would annotate that!

Added to google doc as well.

bmeldal commented Mar 7, 2018

Google sheet for list of GPs as participants of catalytic complexes and their biol role annotations in the CP:
https://docs.google.com/spreadsheets/d/1-9PdAJ8BvrjhPWLx5pB9N0rS6hDNgYifO4_3CmbplEo/edit?usp=sharing

bmeldal commented Mar 15, 2018

Summary from call on 15/3/18:

Present: Lauren-Philip, Pascale, Peter, Edith, Kimberly, Harold, Val, Tanya, Ruth, Sandra, Birgit

Lauren-Philip has a protein-protein interaction background and introduced his interest in complexes as drug targets.

  • We already link to ChEMBL www.ebi.ac.uk/chembl which might be a way of collecting drug data (ChEMBL incl links to Drugbank).
  • GO has a class "drug binding" but we don't use it in the CP.

We looked through the Google sheet that contains all GPs from complexes annotated to "catalytic activity" and split by their biological roles: enzyme, enzyme regulator and unspecified role.

Points to consider:

  • Users and most tools (even the GOC's own enrichment tool) mostly strip all relationship qualifiers, so inferring to the regulatory subunits with the qualifier contributes_to won't help; it would only introduce erroneous direct annotations to the MF (although plenty of such data already exists).
  • Most enrichment analyses are run over BP, not MF - do we overthink the usefulness of the inferences? Val mentioned this last time and that most users look at the MF annotations on the gene pages where qualifiers are retained.
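
To make the qualifier-stripping problem concrete, here is a minimal sketch. The `Annotation` record, the subunit names and the use of "enables" for a direct annotation are all assumptions made for illustration; they are not taken from any GOC or Complex Portal tool.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    gene_product: str
    qualifier: str  # assumed values: "enables" (direct) or "contributes_to"
    go_term: str


# Hypothetical annotations for a complex with one catalytic and one regulatory subunit.
annotations = [
    Annotation("catalytic_subunit", "enables", "GO:0004723"),
    Annotation("regulatory_subunit", "contributes_to", "GO:0004723"),
]


def gene_products_for_term(annots, go_term, keep_qualifiers):
    """Return gene products annotated to go_term.

    With keep_qualifiers=False every annotation is treated as direct,
    which is effectively what a qualifier-stripping tool does.
    """
    hits = []
    for a in annots:
        if a.go_term != go_term:
            continue
        if keep_qualifiers and a.qualifier == "contributes_to":
            continue  # a qualifier-aware search can filter these out
        hits.append(a.gene_product)
    return hits


qualifier_aware = gene_products_for_term(annotations, "GO:0004723", keep_qualifiers=True)
qualifier_stripped = gene_products_for_term(annotations, "GO:0004723", keep_qualifiers=False)
```

In the stripped case the regulatory subunit shows up as if it carried the activity itself, which is the "strange list" effect described on the previous call.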

Decision 1:
Infer MF only to GPs annotated with biological role=enzyme!
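
A rough sketch of what Decision 1 could look like in an export script. The `Participant` and `Complex` classes, the placeholder accessions and the complex ID are hypothetical; only the three biological role labels come from the Google sheet discussed above.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    accession: str        # placeholder accession, not a real identifier
    biological_role: str  # "enzyme", "enzyme regulator" or "unspecified role"


@dataclass
class Complex:
    complex_id: str
    mf_terms: list = field(default_factory=list)
    participants: list = field(default_factory=list)


def infer_mf_for_enzymes(cpx):
    """Return (accession, GO term, complex ID) only for participants with role 'enzyme'."""
    return [
        (p.accession, term, cpx.complex_id)
        for p in cpx.participants
        if p.biological_role == "enzyme"
        for term in cpx.mf_terms
    ]


# Hypothetical complex with one participant per biological role.
example = Complex(
    complex_id="CPX-EXAMPLE",
    mf_terms=["GO:0004723"],
    participants=[
        Participant("P_CAT", "enzyme"),
        Participant("P_REG", "enzyme regulator"),
        Participant("P_UNK", "unspecified role"),
    ],
)
inferred = infer_mf_for_enzymes(example)
```

Regulators and participants with an unspecified role get no MF annotation from this step.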

Thought: can we capture the "regulator activity" by walking through the ontology?

E.g. CPX-1001(EBI-13638510) Calcineurin-Calmodulin complex, gamma-R1 variant
has GO:0033192 calmodulin-dependent protein phosphatase activity.
Its parent GO:0004723 calcium-dependent protein serine/threonine phosphatase activity has a FUNCTION regulates child GO:0008597 calcium-dependent protein serine/threonine phosphatase regulator activity which would be applicable for the complex's 2 regulatory subunits, Calmodulin and Calcineurin.

Decision 2:
Try and infer MF to GPs annotated with biological role=enzyme regulator to the class of regulator activity
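
One possible way to do the ontology walk for Decision 2, sketched with a hard-coded slice of the ontology built from the calcineurin example above. The two dictionaries and the function name are assumptions for illustration; a real script would load the relationships from the GO release and would need to decide what to do when several regulator terms are reachable.

```python
from collections import deque

# Toy slice of the ontology, taken from the calcineurin example.
IS_A = {"GO:0033192": ["GO:0004723"]}
# Activity term -> regulator activity term that regulates it.
REGULATOR_OF = {"GO:0004723": "GO:0008597"}


def find_regulator_term(go_term, is_a=IS_A, regulator_of=REGULATOR_OF):
    """Walk up is_a parents breadth-first; return the first regulator activity term found, else None."""
    queue = deque([go_term])
    seen = set()
    while queue:
        term = queue.popleft()
        if term in seen:
            continue
        seen.add(term)
        if term in regulator_of:
            return regulator_of[term]
        queue.extend(is_a.get(term, []))
    return None


regulator_term = find_regulator_term("GO:0033192")
```

The term found this way would then be the candidate MF for participants with biological role=enzyme regulator.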

  1. X binding

Decision 3:
At the moment we can't infer these as we didn't annotate the X binding property to the specific complex member but only the complex.
Ruth: Could MODs annotate missing binding evidence from papers CP curators find as part of their curation? - add to GOC mtg agenda for Complex topic

Thank you all for your valuable input!!! I have something to work on now (well, Noe and Tony :) )

bmeldal commented Mar 19, 2018

Update: Rather than sending papers with missing GP annotation evidence to MODs we could add those directly through P2GO --> AI: Complex curators to be trained in P2GO!

bmeldal commented May 18, 2018

Update from NYU GOC mtg:

What do we do if there are 2 or more enzyme subunits and 2 or more function annotations on one complex? A script doesn't know which subunit has which function. --> Manually annotate in P2GO?
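
A script could at least flag these cases instead of guessing. The sketch below is an assumption about how that triage might work: the function name and return labels are made up, and treating "one enzyme with several functions" or "several enzymes with one function" as safe for automatic inference is my assumption, not something agreed at the meeting.

```python
def classify_for_mf_inference(enzyme_accessions, mf_terms):
    """Decide whether MF inference for a complex can be scripted.

    Returns "manual_curation" when there are 2+ enzyme subunits and 2+ MF terms,
    because the script cannot tell which subunit carries which function.
    """
    if not enzyme_accessions or not mf_terms:
        return "nothing_to_infer"
    if len(enzyme_accessions) > 1 and len(mf_terms) > 1:
        return "manual_curation"
    return "automatic"


# Placeholder accessions and GO IDs, for illustration only.
ambiguous = classify_for_mf_inference(["P_A", "P_B"], ["GO:0000001", "GO:0000002"])
simple = classify_for_mf_inference(["P_A"], ["GO:0000001", "GO:0000002"])
```

Complexes flagged as "manual_curation" would be the ones to pick up by hand in P2GO.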

pgaudet commented Jan 31, 2024

This translation is now done. GO MFs are NOT inferred from IntAct/Complex Portal annotations, only BPs.

@pgaudet pgaudet closed this as completed Jan 31, 2024