## Vignette for handling Ontobjs

The COBRA package implements a class ```Ontobj```, which functions as a container for preprocessing and trimming a given biological (regulatory) network. The class is named ```Ontobj``` because it was originally designed to store a hierarchical biological ontology such as Gene Ontology. However, the class can handle any kind of one-layer structure (transcription factor regulons, pathways) or multi-layer structure (GO) that consists of biological entities annotated to a set of genes. In the following, I demonstrate how to build a multi-layer or one-layer ```Ontobj```, and how to save and load it.

### Creating and saving an ontobj

In [None]:
# import packages
import sys
sys.path.append('/workspace')
from cobra_ai.module.ontobj import *

In [None]:
# initialize the Ontobj
ontobj = Ontobj(description='GO') # the user can give any description here to specify what kind of network is stored in the class

In [None]:
# initialize the ontology
ontobj.initialize_dag(obo='/workspace/cobra_ai/data/GO/go-basic.obo', # an obo file for a hierarchical ontology
                      gene_annot='/workspace/cobra_ai/data/GO/hgnc_goterm_mapping.txt', # a two-column, tab-separated annotation file (1st column: genes, 2nd column: terms)
                      filter_id='biological_process') # optional parameter: set it if the obo file should be filtered, here by 'biological_process'
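
The gene_annot file is described above as a plain two-column, tab-separated table with genes in the first column and terms in the second. As a rough illustration of that layout only (this is not the package's own parser), the sketch below reads such a table into a term-to-genes mapping. The gene and term names and the helper read_gene_annot are made up for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical file content: 1st column genes, 2nd column terms (tab-separated).
EXAMPLE_ANNOT = "geneA\tterm1\ngeneB\tterm1\ngeneA\tterm2\n"

def read_gene_annot(handle):
    """Collect the genes annotated to each term (illustrative helper, not part of COBRA)."""
    term_to_genes = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        gene, term = row[0], row[1]
        term_to_genes[term].add(gene)
    return dict(term_to_genes)

# An in-memory string stands in for the annotation file on disk.
term_to_genes = read_gene_annot(io.StringIO(EXAMPLE_ANNOT))
print(term_to_genes)
```

A gene may appear on several lines, once per term it is annotated to, which is why the mapping collects sets rather than single values.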

After the ontology has been initialized, it needs to be trimmed so that it can be incorporated into a Variational Autoencoder model. For trimming, the user chooses a bottom threshold and a top threshold, and only terms whose number of annotated genes lies strictly between the two are kept (bottom_thresh < no. of annotated genes < top_thresh). The count is based on descendant genes, i.e. genes that are annotated either to the term itself or to any of its descendant terms. If top_thresh and bottom_thresh are not specified, the defaults of 1000 and 30 are used. Be aware that these defaults were calibrated for Gene Ontology.
The user can also run the trimming several times with different thresholds; each version is then stored in the same ontobj.
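
To make the threshold rule concrete, here is a minimal sketch on a toy hierarchy. It covers only the counting of descendant genes and the strict-inequality filter described above, and does not attempt to reproduce what trim_dag() does to the ontology structure. All term and gene names, and both helper functions, are hypothetical.

```python
# Toy hierarchy: each term maps to its direct child terms (all names are made up).
children = {
    "root": ["mid1", "mid2"],
    "mid1": ["leaf1", "leaf2"],
    "mid2": ["leaf3"],
    "leaf1": [],
    "leaf2": [],
    "leaf3": [],
}

# Genes annotated directly to each term.
direct_genes = {
    "root": set(),
    "mid1": {"g1"},
    "mid2": {"g5"},
    "leaf1": {"g1", "g2"},
    "leaf2": {"g3", "g4"},
    "leaf3": {"g5"},
}

def descendant_genes(term):
    """Genes annotated to the term itself or to any of its descendant terms."""
    genes = set(direct_genes[term])
    for child in children[term]:
        genes |= descendant_genes(child)
    return genes

def terms_within_thresholds(top_thresh, bottom_thresh):
    """Keep terms with bottom_thresh < no. of descendant genes < top_thresh."""
    return sorted(
        term for term in children
        if bottom_thresh < len(descendant_genes(term)) < top_thresh
    )

# Small thresholds suit the toy data; the vignette's defaults are 1000 and 30.
kept = terms_within_thresholds(top_thresh=5, bottom_thresh=1)
print(kept)
```

With these toy thresholds, the root term is dropped because its five descendant genes reach the top threshold, and the terms with a single descendant gene are dropped because they do not exceed the bottom threshold.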

In [None]:
# trim the ontology
ontobj.trim_dag(top_thresh=1000, 
             bottom_thresh=30)

In [None]:
# create binary masks for decoder initialization
ontobj.create_masks()

In [None]:
# save ontobj
ontobj.save('/workspace/cobra_ai/data/GO/GO.ontobj')

### Loading an ontobj and accessing the data

In [None]:
# initialize Ontobj and load existing object
ontobj = Ontobj()
ontobj.load('/workspace/cobra_ai/data/GO/GO.ontobj')

In [None]:
# extract the ontology annotation for the chosen trimming thresholds
annot = ontobj.extract_annot(top_thresh=1000, bottom_thresh=30)

In [None]:
# extract ontology genes
genes = ontobj.extract_genes()

### Creating an ontobj with a one-layer network

If one wants to use a one-layer network such as TF regulons, the annotations are provided through the gene_annot parameter of the ```initialize_dag()``` function, and the obo parameter is not used.

In [None]:
# initialize a one-layer network, e.g. TF regulons
ontobj.initialize_dag(gene_annot='/workspace/cobra_ai/data/gene_tf_mapping.txt')