
Add motif predictions to parse_ergatis_euk_functional_pipeline.py #11

Open
mchibucos opened this issue Feb 18, 2014 · 0 comments

The euk functional annotation script (sandbox/jorvis/parse_ergatis_euk_functional_pipeline.py) could be extended to use additional evidence. I propose adding the following predictions:
SignalP
SecretomeP
TMHMM
TargetP
(More information can be found here: http://www.cbs.dtu.dk/services/ and there are additional prediction tools there, as well.)

With respect to how to handle the annotation name in column 9 of the GFF3 file, I propose adding information to those names that would otherwise be "Hypothetical protein" due to lack of significant matches to other evidence (e.g. no named BLAST hits from UniProt, nor any HMM results). For example, if a protein is putatively secreted, but otherwise has no annotation, we might call it "Hypothetical secreted protein", and if a protein localizes to the membrane, it could be called "Hypothetical transmembrane protein".
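
The naming rule above could be sketched roughly as follows. This is only an illustration: the function name and its boolean parameters are hypothetical and not part of the existing script, and it assumes the SignalP/SecretomeP/TMHMM results have already been parsed into simple flags. Preferring "transmembrane" over "secreted" when both flags are set is also an assumption, not something decided in this issue.

```python
def refine_hypothetical_name(product, is_secreted=False, has_tm_helix=False):
    """Return a more specific name for a protein with no other annotation.

    product      -- the current product name (GFF3 column 9)
    is_secreted  -- hypothetical flag, e.g. derived from SignalP/SecretomeP
    has_tm_helix -- hypothetical flag, e.g. derived from TMHMM
    """
    # Only touch proteins that have no name from other evidence.
    if product.strip().lower() != "hypothetical protein":
        return product

    # Assumption: a transmembrane call takes precedence over a secretion call.
    if has_tm_helix:
        return "Hypothetical transmembrane protein"
    if is_secreted:
        return "Hypothetical secreted protein"
    return product
```

Proteins that already carry a name from BLAST or HMM evidence pass through unchanged, so the motif predictions only refine names that would otherwise be uninformative.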

For database submissions, this might not be useful (as GenBank would reject annotations following such nomenclature), but we could rename those prior to submission to GenBank. (For example, all proteins called "Hypothetical" followed by any other text would be renamed "Hypothetical protein".)
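
A minimal sketch of that pre-submission cleanup, assuming product names are available as plain strings. The helper name is hypothetical, and matching case-insensitively is an assumption on my part.

```python
import re

# Matches any name that starts with the word "hypothetical",
# whatever text follows it (assumption: case-insensitive).
_HYPOTHETICAL_RE = re.compile(r"^hypothetical\b", re.IGNORECASE)


def genbank_safe_name(product):
    """Collapse any 'Hypothetical ...' name back to 'Hypothetical protein'."""
    if _HYPOTHETICAL_RE.match(product.strip()):
        return "Hypothetical protein"
    return product
```

For example, `genbank_safe_name("Hypothetical secreted protein")` gives `"Hypothetical protein"`, while a name such as `"Protein kinase"` is left as it is.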
