# Parsing the Causal Biological Network Database

**Author:** [Charles Tapley Hoyt](https://github.com/cthoyt/)

**Estimated Run Time:** 1 minute

This notebook outlines the process of parsing the JSON Graph File format used in the Causal Biological Network (CBN) Database. 

In [1]:
import json
import requests
import os
import time

import networkx as nx

import pybel
from pybel.constants import *
import pybel_tools
from pybel.io.jupyter import to_jupyter
# from pybel_tools.visualization import to_jupyter

In [2]:
pybel.version

<module 'pybel.version' from '/home/somya/miniconda3/lib/python3.7/site-packages/pybel/version.py'>

In [3]:
pybel_tools.version

<module 'pybel_tools.version' from '/home/somya/miniconda3/lib/python3.7/site-packages/pybel_tools/version.py'>

In [4]:
time.asctime()

'Wed Apr  8 13:06:22 2020'

## Data Acquisition

Data can be downloaded directly from the `GetJSONGraphFile` endpoint, which returns the network as JSON. In this notebook the download is commented out and a local copy, `COVID19.jgf`, is loaded instead.

In [5]:
# res = requests.get("http://causalbionet.com/Networks/GetJSONGraphFile?networkId=hox_2.0_hs").json()
# Use a context manager so the file handle is closed after loading
with open('COVID19.jgf') as f:
    covid_dict = json.load(f)
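
The two ways of obtaining the data (remote download and local file) can be wrapped in small helpers. This is a minimal sketch: the helper names `build_cbn_url` and `load_jgf` are hypothetical, the base URL is copied from the commented-out request above, and whether that endpoint is still reachable is not verified here.

```python
import json
from urllib.parse import urlencode

# Base URL copied from the commented-out request; its availability is an assumption.
CBN_ENDPOINT = "http://causalbionet.com/Networks/GetJSONGraphFile"


def build_cbn_url(network_id):
    """Build the download URL for one network identifier (hypothetical helper)."""
    return "{}?{}".format(CBN_ENDPOINT, urlencode({"networkId": network_id}))


def load_jgf(path):
    """Load a JSON Graph File from disk and close the file handle afterwards."""
    with open(path) as file:
        return json.load(file)
```

With these in place, `load_jgf('COVID19.jgf')` would produce the same dictionary as the cell above, and `requests.get(build_cbn_url('hox_2.0_hs')).json()` would correspond to the commented-out download.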

## Parsing

The structure is traversed, and the [BELParser](http://pybel.readthedocs.io/en/latest/parser.html#pybel.parser.parse_bel.BelParser) is manipulated directly. Normally, during BEL compilation, the usage of this class is hidden from the user.

In [10]:
graph = pybel.BELGraph()
parser = pybel.parser.BELParser(graph)
#parser = pybel.parser.BelParser(graph)

In [11]:
def get_citation(evidence):
    return {
        CITATION_NAME: evidence['citation']['name'],
        CITATION_TYPE: evidence['citation']['type'],
        CITATION_REFERENCE: evidence['citation']['id']
    }
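
`get_citation` assumes that every field is present. A more defensive variant might look like the sketch below. The string keys `'name'`, `'type'`, and `'reference'` are stand-ins for the `CITATION_NAME`, `CITATION_TYPE`, and `CITATION_REFERENCE` constants from `pybel.constants` (their actual values depend on the installed PyBEL version), the function name `get_citation_safe` is hypothetical, and the sample evidence is made up for illustration.

```python
# Stand-in keys; in the notebook these come from pybel.constants.
CITATION_NAME = 'name'
CITATION_TYPE = 'type'
CITATION_REFERENCE = 'reference'


def get_citation_safe(evidence):
    """Return a citation dict, or None if the type or identifier is missing."""
    citation = evidence.get('citation') or {}
    if not citation.get('type') or not citation.get('id'):
        return None
    return {
        CITATION_NAME: citation.get('name', ''),
        CITATION_TYPE: citation['type'],
        CITATION_REFERENCE: citation['id'],
    }


# Illustrative input only, not taken from the CBN data
sample = {'citation': {'name': 'Example title', 'type': 'PubMed', 'id': '12345'}}
print(get_citation_safe(sample))
print(get_citation_safe({'citation': {}}))
```

Returning `None` lets the caller skip an evidence entry without a separate membership check.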

In [12]:
annotation_map = {
    'tissue': 'Tissue',
    'disease': 'Disease',
    'species_common_name': 'Species'
}

In [13]:
species_map = {
    'human': '9606',
    'rat': '10116',
    'mouse': '10090'
}

In [14]:
annotation_value_map = {
    'Species': species_map
}
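
The three maps can drive a generic translation from a `biological_context` entry to BEL annotations. The following is a sketch under stated assumptions rather than the notebook's own method: the function name `normalize_context` is hypothetical, the maps are repeated so the block is self-contained, and unknown controlled values (for example, a species missing from `species_map`) are silently dropped instead of raising an error.

```python
annotation_map = {
    'tissue': 'Tissue',
    'disease': 'Disease',
    'species_common_name': 'Species',
}
species_map = {'human': '9606', 'rat': '10116', 'mouse': '10090'}
annotation_value_map = {'Species': species_map}


def normalize_context(context):
    """Translate CBN biological_context keys and values into BEL annotations."""
    result = {}
    for cbn_key, bel_key in annotation_map.items():
        value = context.get(cbn_key)
        if not value:
            continue  # skip missing or empty entries
        value_map = annotation_value_map.get(bel_key)
        if value_map is not None:
            # Look up controlled values case-insensitively; drop unknown ones
            value = value_map.get(value.lower())
            if value is None:
                continue
        result[bel_key] = value
    return result


# Illustrative input only
print(normalize_context({'tissue': 'lung', 'disease': '', 'species_common_name': 'Human'}))
```

Dropping unknown values is a design choice; raising an exception instead would surface gaps in `species_map` earlier.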

In [16]:
for edge in covid_dict['graph']['edges']:    
    for evidence in edge['metadata']['evidences']:
        if 'citation' not in evidence or not evidence['citation']:
            continue
        
        parser.control_parser.clear()
        parser.control_parser.citation = get_citation(evidence)
        parser.control_parser.evidence = evidence['summary_text'] 
        
        d = {}
        
        if 'biological_context' in evidence:
            annotations = evidence['biological_context']
        
            if annotations['tissue']:
                d['Tissue'] = annotations['tissue']

            if annotations['disease']:
                d['Disease'] = annotations['disease']

            if annotations['species_common_name']:
                d['Species'] = species_map[annotations['species_common_name'].lower()]
        
        parser.control_parser.annotations.update(d)
        bel = '{source} {relation} {target}'.format_map(edge)
        try:
            parser.parseString(bel)
        except Exception as e:
            print(e, bel)

KeyError: 'evidences'
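
The `KeyError: 'evidences'` above indicates that at least one edge's `metadata` in this file has no `evidences` key. One way to tolerate such edges is to read the nested keys with `dict.get` and skip what is missing. This is a minimal sketch: the generator name `iter_statements` is hypothetical, and the sample dictionary is invented to mimic the structure used in the loop, not taken from `COVID19.jgf`.

```python
def iter_statements(jgf):
    """Yield (bel_statement, evidence) pairs, skipping edges without usable evidence."""
    for edge in jgf.get('graph', {}).get('edges', []):
        metadata = edge.get('metadata') or {}
        for evidence in metadata.get('evidences', []):
            if not evidence.get('citation'):
                continue  # no citation, nothing to attach the statement to
            bel = '{source} {relation} {target}'.format_map(edge)
            yield bel, evidence


# Invented example data with one complete edge and two incomplete ones
sample = {
    'graph': {
        'edges': [
            {'source': 'p(HGNC:A)', 'relation': 'increases', 'target': 'p(HGNC:B)',
             'metadata': {'evidences': [{'citation': {'id': '1'}, 'summary_text': 'text'}]}},
            {'source': 'p(HGNC:B)', 'relation': 'decreases', 'target': 'p(HGNC:C)',
             'metadata': {}},
            {'source': 'p(HGNC:C)', 'relation': 'increases', 'target': 'p(HGNC:D)'},
        ]
    }
}

for bel, evidence in iter_statements(sample):
    print(bel, evidence['summary_text'])
```

Each yielded pair can then be handed to the parser in the same way as in the loop above.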

## Visualization

Finally, the graph is visualized directly in the notebook with `to_jupyter` (imported above from `pybel.io.jupyter`).

In [31]:
to_jupyter(graph)

<IPython.core.display.Javascript object>

In [32]:
pybel.to_database(graph)

ValueError: Can not upload a graph without a name

In [33]:
pybel.get_ver

AttributeError: module 'pybel' has no attribute 'get_ver'

## Using PyBEL Functions

This pipeline is implemented directly in PyBEL as `pybel.from_cbn_jgif`.

In [34]:
with open(os.path.join(os.environ['BMS_BASE'], 'cbn', 'Human-2.0', 'Hox-2.0-Hs.jgf')) as f:
    graph_jgif_dict = json.load(f)

KeyError: 'BMS_BASE'
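
The `KeyError: 'BMS_BASE'` above means the `BMS_BASE` environment variable is not set in this session. A minimal sketch of a lookup that reports this instead of raising is shown below; the helper name `resolve_jgf_path` is hypothetical, and the relative path is the one used in the cell above.

```python
import os


def resolve_jgf_path(env=None):
    """Return the path to Hox-2.0-Hs.jgf, or None if BMS_BASE is not set."""
    env = os.environ if env is None else env
    base = env.get('BMS_BASE')
    if base is None:
        return None
    return os.path.join(base, 'cbn', 'Human-2.0', 'Hox-2.0-Hs.jgf')


path = resolve_jgf_path()
if path is None:
    print('BMS_BASE is not set; cannot locate the example network')
else:
    print('Would load', path)
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.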

In [35]:
%%time
graph = pybel.from_cbn_jgif(graph_jgif_dict)

NameError: name 'graph_jgif_dict' is not defined

In [36]:
bel_lines = pybel.to_bel_lines(graph)

graph_reloaded = pybel.from_lines(bel_lines)

AttributeError: module 'pybel' has no attribute 'to_bel_lines'

In [37]:
to_jupyter(graph_reloaded)

NameError: name 'graph_reloaded' is not defined