How ontology issues and over-inflation of annotation affect GO analyses #1869

ValWood · 2018-03-12T19:14:18Z

This table (for a paper on unknowns) shows changes in pombe slim over time:

The orange fluctuations are due to stricter annotation criteria, the peach ones to ontology changes. Lots of ontology problems are not picked up by these snapshots, because they were introduced and fixed between snapshots. Slim totals have been quite stable for the past 6-12 months, apart from an occasional ontology glitch.

This figure shows the "cellular process slim" totals for pombe, cerevisiae and human. It isn't part of the unknowns paper; I did it out of interest. Green shading marks cellular processes where the gene products are (largely) conserved 1:1:1. Blue marks those where I expected the human numbers to be higher (lots of one-to-many). Some term totals are a lot higher for human than expected. I have glanced through some of the lists and identified some obvious errors. There are a lot more.

Personally, I think it is critical that the annotation errors are addressed, because their presence will obscure true enrichments and make GO unusable for human analysis.

Overinflated numbers appear to be largely due to two recurring types of annotation error:
i) annotation of target genes to a process, and
ii) annotation of experimental readouts.

vanaukenk · 2018-03-12T20:31:40Z

@ValWood
This is really interesting. In terms of actionable items, I see there are already tickets to review some of the human annotations, but it would be nice to generate the second table for more species and have it available as an ongoing report that curators could systematically check.
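A report like this could start from something as small as the sketch below. It assumes the per-species slim totals have already been computed, and flags terms where one species' total exceeds another's by more than a chosen ratio. The function name, the species keys, the term names, the numbers and the threshold are all hypothetical placeholders, not values from the figure.

```python
def flag_inflated_terms(counts_by_species, reference, target, max_ratio):
    """Return slim terms whose target/reference gene-count ratio exceeds max_ratio.

    counts_by_species maps a species name to {slim_term: gene_count}.
    Each result row is (term, reference_count, target_count, ratio).
    """
    ref_counts = counts_by_species[reference]
    tgt_counts = counts_by_species[target]
    flagged = []
    for term in sorted(set(ref_counts) | set(tgt_counts)):
        ref_n = ref_counts.get(term, 0)
        tgt_n = tgt_counts.get(term, 0)
        if ref_n:
            ratio = tgt_n / ref_n
        elif tgt_n:
            # Term annotated only in the target species.
            ratio = float("inf")
        else:
            ratio = 0.0
        if ratio > max_ratio:
            flagged.append((term, ref_n, tgt_n, ratio))
    # Largest discrepancies first, so curators see the worst cases at the top.
    flagged.sort(key=lambda row: row[3], reverse=True)
    return flagged


# Toy numbers, invented purely for illustration.
toy_counts = {
    "pombe": {"term_a": 100, "term_b": 50},
    "human": {"term_a": 110, "term_b": 400},
}
report = flag_inflated_terms(toy_counts, "pombe", "human", max_ratio=3.0)
```

A flagged term is only a candidate for review: a high ratio may reflect genuine one-to-many expansion rather than an annotation error, so the threshold would probably need to differ per process.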

ValWood · 2018-03-12T20:40:07Z

I was thinking that too, @vanaukenk. I was going to tag you because I remembered you asking for evidence of ontology changes affecting analysis.

It's very easy to do: with GO Term Mapper you can import legacy GAFs to look at historical changes.
I waited until now to share because the pombe slim should not have so many large fluctuations from now on (there will be increases over time, but the ontology is much more stable for slim totals, at least for cellular processes).
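For anyone who wants to reproduce this kind of comparison outside GO Term Mapper, here is a minimal sketch of the counting step. It is not how GO Term Mapper works internally. It assumes GAF 2.x tab-separated lines (column 2 = gene product ID, column 4 = qualifier, column 5 = GO ID) and a precomputed `term_to_slim` mapping from each annotated term to its slim ancestor(s), which in practice would have to be derived from the ontology release matching each snapshot. The function names, gene IDs and the toy mapping are hypothetical.

```python
from collections import defaultdict


def slim_counts(gaf_lines, term_to_slim):
    """Count distinct gene products per slim term in one GAF snapshot."""
    genes_by_slim = defaultdict(set)
    for line in gaf_lines:
        # Skip blank lines and GAF header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue
        gene_id, qualifier, go_id = cols[1], cols[3], cols[4]
        # NOT annotations should not contribute to slim totals.
        if "NOT" in qualifier.split("|"):
            continue
        for slim_id in term_to_slim.get(go_id, ()):
            genes_by_slim[slim_id].add(gene_id)
    return {slim_id: len(genes) for slim_id, genes in genes_by_slim.items()}


def slim_changes(old_counts, new_counts):
    """Per-slim-term difference (new minus old) between two snapshots."""
    terms = sorted(set(old_counts) | set(new_counts))
    return {t: new_counts.get(t, 0) - old_counts.get(t, 0) for t in terms}


# Toy mapping and snapshots, invented for illustration only.
term_to_slim = {"GO:0006261": ["GO:0006260"], "GO:0006260": ["GO:0006260"]}
old_snapshot = [
    "!gaf-version: 2.1\n",
    "DB\tgeneA\tsymA\t\tGO:0006261\n",
    "DB\tgeneB\tsymB\tNOT\tGO:0006260\n",
]
new_snapshot = [
    "!gaf-version: 2.1\n",
    "DB\tgeneA\tsymA\t\tGO:0006261\n",
    "DB\tgeneA\tsymA\t\tGO:0006260\n",
    "DB\tgeneC\tsymC\t\tGO:0006260\n",
]
old_totals = slim_counts(old_snapshot, term_to_slim)
new_totals = slim_counts(new_snapshot, term_to_slim)
changes = slim_changes(old_totals, new_totals)
```

Counting sets of gene IDs rather than annotation lines matters here: a gene annotated to several descendants of one slim term should still count once.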

It's also really nice to have a slim constantly available on our website:
https://www.pombase.org/browse-curation/fission-yeast-go-slim-terms
It's one of our most highly accessed pages.
It makes sanity checking specific process lists very easy.

ValWood · 2018-03-12T20:41:54Z

I can add other species if I get the protein coding gene lists. That is the trickiest part!

pgaudet · 2018-06-06T12:41:03Z

ACTION ITEM NYC 2018 GOC: Create report based on species slims

ValWood · 2023-08-29T18:34:20Z

Should this be closed? The tool accompanying Marc's paper will help a lot to interrogate slim set differences.

ValWood · 2023-08-29T18:34:31Z

Closing...

vanaukenk added the annotation review label Mar 12, 2018

ValWood added the PomBase label Mar 13, 2018

This was referenced Mar 28, 2018

Unknowns figure :comparative histogram pombe/cerevisae/human pombase/curation#1960

Closed

[moved to annotation tracker] Provide a 'high confidence'/ 'fool-proof' GAF geneontology/go-ontology#15498

Closed

vanaukenk mentioned this issue Apr 13, 2018

Provide a 'high confidence'/ 'fool-proof' GAF #1932

Open

pgaudet assigned pgaudet, vanaukenk, ValWood and sylvainpoux Apr 18, 2018

pgaudet added the action Item from NYC-05-2018 label Jun 6, 2018

ValWood closed this as completed Aug 29, 2023
