MF refactoring: edits to DOS's changes #14225

Closed
pgaudet opened this issue Sep 18, 2017 · 15 comments

@pgaudet
Contributor

pgaudet commented Sep 18, 2017

Hello,

@thomaspd and I worked on @dosumis's branch of the MF refactoring and made further edits.
@dosumis would you please have a look, and if that works for you we'll merge this and keep on editing from there.

see
1f9572f

Thanks, Pascale

@dosumis
Contributor

dosumis commented Sep 18, 2017

Hiya,

Do you have a link to a pull request with all changes in?

Cheers,
David

@dosumis
Contributor

dosumis commented Sep 18, 2017

OK Got it. I thought you'd branched again, but it seems to be on the same pull request (#14226), with commits starting from:

d502301

I'll comment properly later (I was surprised to see 'role' back - as I thought we'd agreed not to use that. MF terms are mini-processes, which are rather different from roles.)

CC @cmungall

@cmungall
Member

cmungall commented Sep 19, 2017 via email

@ValWood
Contributor

ValWood commented Sep 19, 2017

"system component function" will be meaningless to biologists. Do these MFs really need a grouping term other than "molecular function"?

@pgaudet
Contributor Author

pgaudet commented Sep 20, 2017

Hi,

Perhaps we can do without these new top-level classes? Here's what the top level of MF now looks like. The term with a blue background could probably be moved somewhere better. If we can remove some of these highlighted terms, then the top level is small enough that more grouping classes seem unnecessary.

Ideas:

  1. Create 'transcription factor activity' as parent of 'nucleic acid binding transcription factor activity' (perhaps as a grouping, 'do not annotate' term?) and 'transcription factor activity, protein binding'
  2. Translation regulator activity will probably go, see general query about "factor " terms #13536
  3. I wonder if 'nutrient reservoir activity' really is an activity (it's certainly passive!). There are only about 20 annotations. It seems these storage proteins are the target of a storage (and later utilization) process, but do they actively mediate it?
    @tberardini would there be a better way to describe this?
  4. 'toxin activity' also deserves to be reviewed

[Image: screenshot of the current top level of MF, with some terms highlighted]

Thoughts, @thomaspd @ValWood @cmungall @dosumis @ukemi @vanaukenk?

Thanks, Pascale

@tberardini
Contributor

Re: 'toxin activity', please see #12766 which documents this term's recent revival and history.

@tberardini
Contributor

Re: 'nutrient reservoir activity' I see this as very similar to 'structural molecule activity' in terms of it being a passive function.

@thomaspd
Contributor

I agree that we don't need terms above these, at least not for now.

Re 1 (transcription factor classes), we'd suggested a higher-level term called 'transcription regulator activity'.

Re 2 (translation factor) I don't think it needs to be obsoleted right away, if at all.

Re 3 and 4, I think these are OK for now. About nutrient reservoir, I can't think of a better way to describe egg proteins, or milk proteins. About toxin activity, it's an accepted term for a protein that evolved as a secreted toxin.

Let's merge in David's changes now so we don't accrue too many conflicts before making additional changes.

@ValWood
Contributor

ValWood commented Sep 25, 2017

I still think it is totally confusing for curators and users to need to select terms in 2 MF branches to represent TFs fully:

a "transcription factor" branch
e.g.
GO:0003700 transcription factor activity, sequence-specific DNA binding

and a "DNA binding" branch
e.g.
GO:0000977 - RNA polymerase II regulatory region sequence-specific DNA binding

I don't see how the "regulation of transcription" branch differs from a process, and the term names are only subtly different.

Even after a few years of using them, I still need to go back to look at the ontology every time I use one. I do a consistency check every few months to make sure our TFs are still annotated in both branches, and there is usually a little drift due to the confusion, even for experienced curators.

When you look at the high level TF terms do you know which is the "DNA binding" branch and which is the "transcription factor activity" branch?

It would be much simpler if we could select a single MF (a DNA-binding or protein-binding TF term)
and
part_of BP "transcription/regulation of transcription....."

This is one of the key terms to describe the MF of a DNA binding TF (DNA binding to a specific promoter region)
http://www.ebi.ac.uk/QuickGO/term/GO:0000978
and it is not related to any of the "transcription factor" high level MFs (which are only describing processes).
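
A minimal sketch of the kind of periodic consistency check described above, in Python. It assumes annotations are available as a mapping from gene identifier to a set of GO term IDs, and that branch membership has already been computed from the ontology. The function name, the gene identifiers, and the two small branch sets are hypothetical and only illustrate the idea.

```python
# Hypothetical branch membership. In practice these sets would hold all
# descendants of each branch root, computed from the ontology.
TF_ACTIVITY_BRANCH = {"GO:0003700"}
DNA_BINDING_BRANCH = {"GO:0000977", "GO:0000978"}


def find_branch_drift(annotations):
    """Return genes annotated in exactly one of the two branches.

    `annotations` maps a gene identifier to the set of GO term IDs it is
    annotated to. The result maps each drifting gene to the name of the
    branch it is missing.
    """
    drift = {}
    for gene, terms in annotations.items():
        has_tf = bool(terms & TF_ACTIVITY_BRANCH)
        has_dna = bool(terms & DNA_BINDING_BRANCH)
        if has_tf and not has_dna:
            drift[gene] = "DNA binding"
        elif has_dna and not has_tf:
            drift[gene] = "transcription factor activity"
    return drift


# Made-up gene identifiers, for illustration only.
example = {
    "geneA": {"GO:0003700", "GO:0000978"},  # annotated in both branches
    "geneB": {"GO:0003700"},                # missing a DNA binding term
    "geneC": {"GO:0000977"},                # missing a TF activity term
}
print(find_branch_drift(example))
```

Genes with no term in either branch are ignored, since the check only targets transcription factors that are annotated in one branch but not the other.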

@pgaudet
Contributor Author

pgaudet commented Sep 28, 2017

Changes:

  • new classes:
    -- GO:0104005 hijacked molecular function
    -- GO:0140104 molecular carrier activity (under discussion in MF refactoring: electron carrier activity #14267)
    -- New children of 'catalytic activity' (and moved terms under as appropriate)
    --- GO:0140096 catalytic activity, acting on a protein
    --- GO:0140097 catalytic activity, acting on DNA
    --- GO:0140098 catalytic activity, acting on RNA
    ---- GO:0140102 catalytic activity, acting on a rRNA
    ---- GO:0140101 catalytic activity, acting on a tRNA

@cmungall
Member

cmungall commented Jul 7, 2020

Just adding a note for posterity, since this term links to this ticket:

id: GO:0140096
name: catalytic activity, acting on a protein
namespace: molecular_function
def: "Catalytic activity that acts to modify a protein." [GOC:molecular_function_refactoring, GOC:pdt]
is_a: GO:0003824 ! catalytic activity

We don't have a logical definition for this, so this means terms will have to be manually classified here.

It also means we can't auto-infer annotations. If curators want to annotate to an enzyme mechanism that is acting on a protein, then either we need to instantiate protein-specific subclasses for all appropriate activities and train curators to use these subclasses, OR train curators to co-annotate. We'd want to do this retrospectively to ensure reasonable annotation completeness.
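
In OBO format, a logical definition is written as two or more `intersection_of` tags on the term. As a rough sketch, the Python below parses the stanza quoted above and reports whether it carries one. The helper names `parse_stanza` and `has_logical_definition` are hypothetical, and the parser handles only simple `tag: value` lines, not the full OBO syntax.

```python
STANZA = '''
id: GO:0140096
name: catalytic activity, acting on a protein
namespace: molecular_function
def: "Catalytic activity that acts to modify a protein." [GOC:molecular_function_refactoring, GOC:pdt]
is_a: GO:0003824 ! catalytic activity
'''


def parse_stanza(text):
    """Parse OBO tag-value lines into a dict mapping tag -> list of values."""
    tags = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and stanza headers such as [Term]
        tag, _, value = line.partition(": ")
        # Drop the trailing "! label" comment, if any.
        value = value.split(" ! ")[0].strip()
        tags.setdefault(tag, []).append(value)
    return tags


def has_logical_definition(tags):
    """True if the term has at least two intersection_of tags."""
    return len(tags.get("intersection_of", [])) >= 2


tags = parse_stanza(STANZA)
print(tags["id"][0], "has logical definition:", has_logical_definition(tags))
```

For the stanza above the check comes back negative, which is why child terms have to be placed under it by hand rather than by a reasoner.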

@deustp01

deustp01 commented Jul 7, 2020

Opening a whole new can of worms here three years too late: does it make sense to have top-level terms "acting on DNA / RNA / protein"? It might be better to have "acting on polypeptide / polynucleotide" with the latter having ribo- and deoxyribo- children. That would fit a lot better with the enzymology data, which is focused on the active sites and molecular mechanisms of enzymes and indifferent to the size of the substrate molecule or whether the substrate is genome-encoded or not. This isn't an argument that enzymes don't act on whole proteins, only that such enzymes functionally are specialized children of ones that go for a peptide bond or an amino acid side chain, indifferent to how big the molecule containing it is. @ukemi @hdrabkin

@ValWood
Contributor

ValWood commented Aug 1, 2020

Randomly, here are some terms from a file that I found on my desktop which are "acting on a protein" but do not have that parentage.

holocytochrome-c synthase activity (GO:0004408)
Catalysis of the reaction: holocytochrome c = apocytochrome c + heme.

deoxyhypusine monooxygenase activity (GO:0019135)
Catalysis of the reaction: protein N6-(4-aminobutyl)-L-lysine + donor-H2 + O2 = protein N6-((R)-4-amino-2-hydroxybutyl)-L-lysine + acceptor + H2O.

peptide-lysine-N-acetyltransferase activity (GO:0061733)
Catalysis of the reaction: acetyl-CoA + lysine in peptide = CoA + N-acetyl-lysine-peptide.

lipoyl(octanoyl) transferase activity (GO:0033819)
Catalysis of the reaction: octanoyl-[acyl-carrier protein] + protein = protein N6-(octanoyl)lysine + acyl-carrier protein.

dolichyl-phosphate-mannose-protein mannosyltransferase activity (GO:0004169)
Catalysis of the reaction: dolichyl phosphate D-mannose + protein = dolichyl phosphate + O-D-mannosylprotein.

ubiquitin-like modifier activating enzyme activity (GO:0008641)
Catalysis of the activation of small proteins, such as ubiquitin or ubiquitin-like proteins, through the formation of an ATP-dependent high-energy thiolester bond.
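
A list like this could in principle be produced by walking `is_a` links and flagging candidate terms that do not reach GO:0140096. The Python below is only a sketch of that idea over a toy parent map. The sole link taken from this thread is GO:0140096 is_a GO:0003824; the other parent links are placeholders, and GO:9999999 is a made-up ID standing in for a correctly classified term.

```python
# Toy is_a parent map. Only the GO:0140096 -> GO:0003824 link comes from
# this thread; every other link is a placeholder for illustration.
PARENTS = {
    "GO:0140096": {"GO:0003824"},
    "GO:0004408": {"GO:0003824"},  # placeholder parent
    "GO:0061733": {"GO:0003824"},  # placeholder parent
    "GO:9999999": {"GO:0140096"},  # made-up, correctly classified term
}


def ancestors(term, parents):
    """Return all is_a ancestors of `term`, not including `term` itself."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def missing_grouping(candidates, grouping, parents):
    """Return the candidate terms that do not have `grouping` as an ancestor."""
    return sorted(t for t in candidates if grouping not in ancestors(t, parents))


candidates = ["GO:0004408", "GO:0061733", "GO:9999999"]
print(missing_grouping(candidates, "GO:0140096", PARENTS))
```

How the candidate list is chosen (for example, by searching definitions for "protein") is left open here, since that step needs curator judgment.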

@ValWood
Contributor

ValWood commented Aug 1, 2020

Reopening because the logical def would fix this?

ValWood reopened this Aug 1, 2020

@ValWood
Contributor

ValWood commented Nov 5, 2020

I reopened this, but it can probably be closed?
If it stays open, it is only for the logical defs.

pgaudet added a commit that referenced this issue Nov 6, 2020