# How to get Descendants Count

  1. Download Ontologies, if necessary
  2. Choose a set of GO IDs to print
  3. GoSubDag contains descendants count
  4. Print the descendants count (_dcnt_)
  

## 1. Download Ontologies, if necessary

In [1]:
from goatools.base import get_godag
godag = get_godag("go-basic.obo")


  EXISTS: go-basic.obo
go-basic.obo: fmt(1.2) rel(2019-05-09) 47,407 GO Terms


## 2. Choose a set of GO IDs to print
### 2a. Choose GO IDs related to bacteria

In [2]:
# Choose leaf-level GO IDs whose names contain "bacteria"
DESC = 'bacteria'            # GO Term name contains this
NSPC = 'cellular_component'  # Desired namespace

# Create a chooser function which returns True or False
def chooser(goterm):
    """Choose a leaf-level GO term based on its name"""
    b_match = DESC in goterm.name
    # True if GO term is leaf-level (has no children)
    b_leaf = not goterm.children
    # True if GO term is in 'cellular_component' namespace (nspc)
    b_nspc = goterm.namespace == NSPC
    return b_match and b_leaf and b_nspc

# Get GO terms with desired name in desired GO DAG branch
go_ids_selected = set(o.item_id for o in godag.values() if chooser(o))

print('{N} {desc} GO terms'.format(N=len(go_ids_selected), desc=DESC))

24 bacteria GO terms
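The chooser above combines three tests: a substring match on the name, a leaf check, and a namespace check. The sketch below replays that logic on a few made-up terms so it runs without downloading the ontology. `ToyTerm`, the `T:` IDs, and the term names are hypothetical stand-ins, not goatools objects or real GO terms; only the attribute names (`item_id`, `name`, `namespace`, `children`) mirror the ones used in the notebook.

```python
from collections import namedtuple

# Hypothetical stand-in for a GO term: only the attributes the chooser reads
ToyTerm = namedtuple("ToyTerm", "item_id name namespace children")

toy_terms = [
    # Matches: name contains "bacteria", no children, right namespace
    ToyTerm("T:1", "bacterial toy complex", "cellular_component", set()),
    # Rejected: has a child, so it is not leaf-level
    ToyTerm("T:2", "bacterial toy parent", "cellular_component", {"T:1"}),
    # Rejected: wrong namespace
    ToyTerm("T:3", "bacterial toy process", "biological_process", set()),
    # Rejected: name does not contain "bacteria"
    ToyTerm("T:4", "toy membrane", "cellular_component", set()),
]

def is_selected(term, desc="bacteria", nspc="cellular_component"):
    """Return True for leaf-level terms in nspc whose name contains desc."""
    return desc in term.name and not term.children and term.namespace == nspc

selected = {term.item_id for term in toy_terms if is_selected(term)}
print(selected)
```

Because the name test is a plain substring match, a name containing "bacterial" also matches "bacteria".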


### 2b. Get the GO-DAG subset for your GO IDs

In [3]:
from goatools.gosubdag.gosubdag import GoSubDag
gosubdag = GoSubDag(go_ids_selected, godag)

INITIALIZING GoSubDag:  24 sources in  60 GOs rcnt(True). 0 alt GO IDs
             GoSubDag: namedtuple fields: NS level depth GO alt GO_name dcnt D1 id
             GoSubDag: relationships: set()


### 2c. Get the deepest GO ID in the GO DAG subset

In [4]:
go_id, go_term = max(gosubdag.go2obj.items(), key=lambda t: t[1].depth)

# Print the GO ID and name of the deepest GO term
print(go_id, go_term.name)

GO:1990061 bacterial degradosome


### 2d. Get all ancestors (parents at every level) of the deepest GO ID

In [5]:
go_ids_chosen = go_term.get_all_parents()
print('{N} ancestors for {GO} "{name}"'.format(
    N=len(go_ids_chosen), GO=go_term.item_id, name=go_term.name))

10 ancestors for GO:1990061 "bacterial degradosome"
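To make "all parents" concrete, here is a minimal sketch of an upward traversal on a toy graph. It illustrates the idea only and is not goatools' implementation; `toy_parents` and `all_ancestors` are hypothetical names, and the node labels are made up.

```python
# Toy graph: node -> set of its direct parents (hypothetical labels)
toy_parents = {
    "root": set(),
    "mid1": {"root"},
    "mid2": {"root"},
    "leaf": {"mid1", "mid2"},
}

def all_ancestors(node, parents):
    """Return every node reachable by following parent links upward."""
    found = set()
    stack = list(parents[node])
    while stack:
        cur = stack.pop()
        if cur not in found:
            found.add(cur)
            stack.extend(parents[cur])
    return found

leaf_ancestors = all_ancestors("leaf", toy_parents)
print(sorted(leaf_ancestors))
```

The result is a set, so a node reached along two paths (here `root`) is counted once, and the starting node itself is not included. That is consistent with the notebook, which adds the deepest GO ID to the ancestor set in a later cell.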


## 3. GoSubDag contains descendants count

### gosubdag.go2nt
The data member _**go2nt**_ of the class _**GoSubDag**_ is a dict that maps each GO ID to a namedtuple:
  * _**go**_ is the key, a GO ID (e.g., GO:1990061)
  * _**nt**_ is the value, a namedtuple describing that GO term


### The namedtuple field, _dcnt_, contains the descendants count.

Selected namedtuple fields:

| field   | description
|---------|-----------------
| NS      | Namespace: BP, MF, or CC
| level   | Length of the shortest path from the top of the branch
| depth   | Length of the longest path from the top of the branch
| GO      | GO ID
| GO_name | GO Name - A short description
| dcnt    | Descendants count
In [6]:
# Add the deepest GO ID to the set of its ancestors for printing
go_ids_chosen.add(go_id)
nts = [gosubdag.go2nt[go] for go in go_ids_chosen]

## 4. Print the descendants count (_dcnt_)

In [7]:
fmt_str = '{I:2}) {NS} {GO:10} {dcnt:11}        D{depth:02}  {GO_name}'

# Print selected GO information
print('IDX NS GO ID      Descendants Count Depth Name')
print('--- -- ---------- ----------------- ----- --------------------')
for idx, nt_go in enumerate(sorted(nts, key=lambda nt: nt.depth), 1):
    print(fmt_str.format(I=idx, **nt_go._asdict()))

IDX NS GO ID      Descendants Count Depth Name
--- -- ---------- ----------------- ----- --------------------
 1) CC GO:0005575        4206        D00  cellular_component
 2) CC GO:0032991        2116        D01  protein-containing complex
 3) CC GO:0044464        3321        D01  cell part
 4) CC GO:0044424        2376        D02  intracellular part
 5) CC GO:1902494         546        D02  catalytic complex
 6) CC GO:0044444        1271        D03  cytoplasmic part
 7) CC GO:1905354           8        D03  exoribonuclease complex
 8) CC GO:0044445          87        D04  cytosolic part
 9) CC GO:0000178           5        D04  exosome (RNase complex)
10) CC GO:0000177           2        D05  cytoplasmic exosome (RNase complex)
11) CC GO:1990061           0        D06  bacterial degradosome
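The print loop works because `_asdict()` turns each namedtuple into keyword arguments whose names match the placeholders in `fmt_str`. The sketch below reproduces that mechanism with a hand-built namedtuple so it runs without goatools. `Row` is a hypothetical type holding only the fields the format string needs; the two rows copy values from the table above.

```python
from collections import namedtuple

# Hypothetical stand-in for the GoSubDag namedtuple (subset of its fields)
Row = namedtuple("Row", "NS GO dcnt depth GO_name")

rows = [
    Row("CC", "GO:1990061", 0, 6, "bacterial degradosome"),
    Row("CC", "GO:0005575", 4206, 0, "cellular_component"),
]

fmt_str = '{I:2}) {NS} {GO:10} {dcnt:11}        D{depth:02}  {GO_name}'

# Sort by depth, number the rows from 1, and expand each row's fields
# into keyword arguments for the format string
lines = [
    fmt_str.format(I=idx, **row._asdict())
    for idx, row in enumerate(sorted(rows, key=lambda r: r.depth), 1)
]
for line in lines:
    print(line)
```

The width specifiers do the column alignment: `{dcnt:11}` right-aligns the integer in 11 characters and `{depth:02}` zero-pads the depth to two digits.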


$ ../scripts/go_plot.py GO:1990061
<img src="images/GO_1990061_bacterial_degradosome.png" width="70%">

Copyright (C) 2016-2019, DV Klopfenstein et al. All rights reserved