# QuickGOProteinAnnotation
Protein function annotations provided by [QuickGO](https://www.ebi.ac.uk/QuickGO/) can be used to
classify proteins specified by [UniProt](https://www.uniprot.org/) ID.
Class assignments are currently non-unique, so a protein can be assigned to more than one class.

## Installation in Conda
If not already installed, install **pip** and **git**:
```
conda install git
conda install pip
```
Then install via pip:
```
pip install git+https://github.com/c-feldmann/QuickGOProteinAnnotation
```

## Quickstart
### From Terminal
```
python annotate_protein_list.py -i demo_data/demo_uniprot_ids.tsv -o demo_data/demo_output.tsv -c "uniprot_id" -s tab
```
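To run the script on your own proteins, an input table is needed. The following is a minimal sketch of how such a file could be written, assuming (from the `-c "uniprot_id"` and `-s tab` flags above) that the script expects a tab-separated file with a header row naming the ID column. The helper `write_id_table` and the file name `my_uniprot_ids.tsv` are hypothetical and not part of this package.

```python
import csv
import tempfile
from pathlib import Path


def write_id_table(path, uniprot_ids, column="uniprot_id"):
    """Write one UniProt ID per row under a single header column, tab-separated.

    Hypothetical helper: the layout is inferred from the command-line flags,
    not taken from the package itself.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow([column])
        writer.writerows([uid] for uid in uniprot_ids)


# Written to a temporary directory so the sketch does not touch demo_data/.
input_path = Path(tempfile.mkdtemp()) / "my_uniprot_ids.tsv"
write_id_table(input_path, ["Q16512", "P30085", "P25774"])
print(input_path.read_text())
```

The resulting path could then be passed to the script via `-i`, with `-c` set to the column name used here.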
### In Python

```python
from go_protein_annotation import DefaultAnnotation
from go_protein_annotation import AllFunctionAnnotation

test_proteins = ["Q16512", "P30085", "P25774"]

default_annotation = DefaultAnnotation()
protein_class_df = default_annotation.annotate_proteins(test_proteins)
protein_class_df
```


| | uniprot_id | functions |
|---|---|---|
| 0 | Q16512 | Transcription regulator |
| 1 | Q16512 | Kinase |
| 2 | P30085 | Kinase |
| 3 | P25774 | Peptidase |
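Because class assignments are non-unique, a protein such as Q16512 appears in several rows. A minimal sketch of collapsing such a result to one entry per protein is given below. The rows are copied by hand from the table above into plain tuples so that the example runs without the package installed; `classes_per_protein` is a hypothetical helper, not part of this package.

```python
from collections import defaultdict

# Rows copied by hand from the example output above: (uniprot_id, function).
rows = [
    ("Q16512", "Transcription regulator"),
    ("Q16512", "Kinase"),
    ("P30085", "Kinase"),
    ("P25774", "Peptidase"),
]


def classes_per_protein(rows):
    """Group all assigned classes by UniProt ID, keeping the input order."""
    grouped = defaultdict(list)
    for uniprot_id, function in rows:
        grouped[uniprot_id].append(function)
    return dict(grouped)


for uniprot_id, functions in classes_per_protein(rows).items():
    print(uniprot_id, functions)
```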


## Details
QuickGO functions are ordered hierarchically. For example, an explicit annotation of
[peptidase activity](https://www.ebi.ac.uk/QuickGO/term/GO:0008233) implies
[hydrolase activity](https://www.ebi.ac.uk/QuickGO/term/GO:0016787) as well. The provided code extracts
all explicit functional annotations and extends them with the implicit ones.

```python
all_functions = AllFunctionAnnotation()
all_functions.get_protein_functions("Q16512")
```


| | uniprot_id | go_id | name |
|---|---|---|---|
| 0 | Q16512 | GO:0098772 | molecular function regulator |
| 1 | Q16512 | GO:0004672 | protein kinase |
| 2 | Q16512 | GO:0036094 | small molecule binding |
| 3 | Q16512 | GO:0019901 | protein kinase binding |
| 4 | Q16512 | GO:0032553 | ribonucleotide binding |
| 5 | Q16512 | GO:0043167 | ion binding |
| 6 | Q16512 | GO:0140110 | transcription regulator |
| 7 | Q16512 | GO:0003712 | transcription coregulator |
| 8 | Q16512 | GO:0009931 | calcium-dependent protein serine/threonine kinase |
| 9 | Q16512 | GO:0050681 | androgen receptor binding |
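
The idea of extending explicit annotations with implicit ones can be illustrated with a small ancestor traversal. This is only a sketch of the general technique, not the package's actual implementation: `TOY_PARENTS` is a hand-written excerpt of the hierarchy (the real one would come from QuickGO), and `extend_with_implicit` is a hypothetical name.

```python
from collections import deque

# Hand-written toy excerpt of the GO "is_a" hierarchy (child -> parents).
# Assumption for illustration only; the real hierarchy is much larger.
TOY_PARENTS = {
    "GO:0008233": ["GO:0016787"],  # peptidase activity -> hydrolase activity
    "GO:0016787": ["GO:0003824"],  # hydrolase activity -> catalytic activity
    "GO:0003824": [],              # treated as the top of this toy excerpt
}


def extend_with_implicit(explicit_terms, parents=TOY_PARENTS):
    """Return the explicit terms plus every ancestor reachable via `parents`."""
    seen = set(explicit_terms)
    queue = deque(explicit_terms)
    while queue:
        term = queue.popleft()
        for parent in parents.get(term, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# An explicit peptidase annotation also yields the hydrolase and catalytic terms.
print(sorted(extend_with_implicit(["GO:0008233"])))
```

Tracking visited terms in a set keeps the traversal correct when a term has several parents that share a common ancestor.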
