log file check for ID prefixes #614

Closed
ValWood opened this issue Feb 22, 2017 · 8 comments

@ValWood
Member

ValWood commented Feb 22, 2017

A malformed GO ID was not reported in the logs:

/GO="aspect=P; term=spermidine acetylation; GOid=O:0032918; evidence=ISS; db_xref=GO_REF:0000024; with=UniProtKB:P21673; date=20140929"
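For illustration, a check along these lines would catch the ID above. This is only a minimal sketch, not the loader's actual code: it assumes the qualifier value is a `;`-separated list of `key=value` fields and that a valid GO ID is `GO:` followed by seven digits. The function names `parse_qualifier` and `check_go_id` are hypothetical.

```python
import re

# A well-formed GO ID is "GO:" followed by exactly seven digits.
GO_ID_RE = re.compile(r"GO:\d{7}")


def parse_qualifier(value):
    """Split a /GO= qualifier value into a dict of key=value fields."""
    fields = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:
            fields[key] = val.strip()
    return fields


def check_go_id(value):
    """Return a warning string for a bad GOid field, or None if it looks fine."""
    go_id = parse_qualifier(value).get("GOid")
    if go_id is None:
        return "missing GOid"
    if not GO_ID_RE.fullmatch(go_id):
        return f"malformed GO ID: {go_id}"
    return None


qualifier = (
    "aspect=P; term=spermidine acetylation; GOid=O:0032918; "
    "evidence=ISS; db_xref=GO_REF:0000024; "
    "with=UniProtKB:P21673; date=20140929"
)
print(check_go_id(qualifier))  # malformed GO ID: O:0032918
```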

ValWood added the logs label Jun 27, 2017
mah11 changed the title from "log file check" to "log file check for ID prefixes" Jul 13, 2017
@mah11
Member

mah11 commented Jul 13, 2017

also doesn't report "UniProt" instead of "UniProtKB" (led to problem in pombase/website#410)

@ValWood
Member Author

ValWood commented Jan 24, 2018

Would pombase/website#665 find GOid=O:0032918?

@ValWood
Member Author

ValWood commented Feb 14, 2018

Hi @kimrutherford, was this covered by the additional checks you recently added for the loaded Artemis data?

@kimrutherford
Member

kimrutherford commented Feb 14, 2018

> GOid=O:0032918

That should give a warning now.

> also doesn't report "UniProt" instead of "UniProtKB"

There is no checking of the "with" values except for making sure there is a prefix. It doesn't check that the prefix is sensible. I'll fix that.
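As a rough sketch of what checking for a sensible prefix could look like (not the actual implementation): the value is split at the first colon and the prefix is compared against an allowed set. The set below is only an illustrative subset, and the name `check_with_value` is hypothetical.

```python
# Illustrative subset of accepted prefixes; the real list would come from the curators.
ALLOWED_PREFIXES = {"UniProtKB", "PomBase", "SGD", "InterPro", "Pfam"}


def check_with_value(value, allowed=ALLOWED_PREFIXES):
    """Return a warning string for a bad "with" value, or None if it passes."""
    prefix, sep, local_id = value.partition(":")
    if not sep or not prefix or not local_id:
        # This is the only condition the existing check covers.
        return f"no prefix in 'with' value: {value}"
    if prefix not in allowed:
        # The comparison is case-sensitive on purpose, so "Panther" != "PANTHER".
        return f"unknown prefix '{prefix}' in 'with' value: {value}"
    return None


for value in ["UniProtKB:P21673", "UniProt:P21673", "P21673"]:
    print(value, "->", check_with_value(value))
```

With this sketch, `UniProtKB:P21673` passes, `UniProt:P21673` is flagged as an unknown prefix, and `P21673` is flagged as having no prefix.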

@kimrutherford
Member

Here is a list of the prefixes in Chado and how often each is used. Could you let me know whether the list looks OK and which of these prefixes are valid?

I'll add a query to the chado-checks that warns about any prefix not on your list. The query results will include the publication, session, gene and term of the annotation with the dodgy "with".

    prefix        | count
------------------+-------
AGI_LocusCode     |     5
CBS               |     6
dictyBase         |     1
EC                |   208
ECOGENE           |     7
FB                |     8
GO                |    95
InterPro          |  1598
MGI               |    19
Panther           |     1
PANTHER           |    32
PDB               |     2
Pfam              |   847
PomBase           |  3318
RGD               |     2
SGD               |  5368
SUPERFAMILY       |     1
UniPathway        |    86
UniProtKB         |  1547
UniProtKB-KW      |  1776
UniProtKB-SubCell |    73
UniRule           |     2
WB                |     2
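One thing a list like this makes easy to spot automatically is a prefix that appears with more than one capitalisation. A minimal sketch, using the counts from the table above: the function name `find_case_variants` is hypothetical, and treating the most frequent spelling as the intended one is an assumption, not a rule.

```python
from collections import defaultdict

# Prefix counts copied from the table above.
PREFIX_COUNTS = {
    "AGI_LocusCode": 5, "CBS": 6, "dictyBase": 1, "EC": 208, "ECOGENE": 7,
    "FB": 8, "GO": 95, "InterPro": 1598, "MGI": 19, "Panther": 1,
    "PANTHER": 32, "PDB": 2, "Pfam": 847, "PomBase": 3318, "RGD": 2,
    "SGD": 5368, "SUPERFAMILY": 1, "UniPathway": 86, "UniProtKB": 1547,
    "UniProtKB-KW": 1776, "UniProtKB-SubCell": 73, "UniRule": 2, "WB": 2,
}


def find_case_variants(counts):
    """Map each most-used spelling to the other spellings differing only in case."""
    groups = defaultdict(list)
    for prefix in counts:
        groups[prefix.lower()].append(prefix)

    variants = {}
    for spellings in groups.values():
        if len(spellings) > 1:
            # Assumption: the most frequent spelling is the intended one.
            spellings.sort(key=lambda p: counts[p], reverse=True)
            variants[spellings[0]] = spellings[1:]
    return variants


print(find_case_variants(PREFIX_COUNTS))  # {'PANTHER': ['Panther']}
```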

@ValWood
Member Author

ValWood commented Mar 13, 2018

I think all are correct except the mixed-case "Panther", which I will fix now. There are more occurrences than the count reported, possibly because of redundancy filtering.

chromosome1.contig:FT BDH1 ortholog; db_xref=Panther:PTHR43490; date=20170625"
chromosome1.contig:FT YKL128C; db_xref=Panther:PTHR43387; date=20170627"
chromosome1.contig:FT db_xref=Panther:PTHR43016:SF3; date=20170726"
chromosome1.contig:FT YKL128C; db_xref=Panther:PTHR43387; date=20170627"
chromosome2.contig:FT YBL029C-A; db_xref=Panther:PTHR28139; date=20170618"
chromosome3.contig:FT with=Panther:PTHR23029; date=20170625"
chromosome3.contig:FT db_xref=Panther:PTHR15892; date=20170227"
chromosome3.contig:FT in fungi; db_xref=Panther:PTHR10957; date=20170618"
chromosome3.contig:FT in metazoa; db_xref=Panther:PTHR10957; date=20170618"
chromosome3.contig:FT in vertebrates; db_xref=Panther:PTHR10957; date=20170618"
chromosome3.contig:FT in eukaryotes; db_xref=Panther:PTHR10957; date=20170618"
chromosome3.contig:FT in eukaryoties only; db_xref=Panther:PTHR10957;
chromosome3.contig:FT db_xref=Panther:PTHR10957; date=20170618"
chromosome3.contig:FT YPL161C; db_xref=Panther:PTHR10957; date=20170618"
chromosome3.contig:FT db_xref=Panther:PTHR10957; date=20170618"

@ValWood
Member Author

ValWood commented Mar 13, 2018

fixed

@kimrutherford
Member

I've added a Chado check to catch illegal "with" prefixes. It will report the problematic ID, the session ID and the PubMed ID.

kimrutherford added a commit to pombase/pombase-legacy that referenced this issue Dec 30, 2020