# Loading BEL Documents

We'll always start by importing `pybel`.

In [1]:
import logging
import os
from urllib.request import urlretrieve

import pybel

logging.getLogger('pybel').setLevel(logging.DEBUG)
logging.basicConfig(level=logging.DEBUG)
logging.getLogger('urllib3').setLevel(logging.WARNING)
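
The DEBUG level configured above is quite verbose. As an optional sketch, the standard `logging` module can lower the verbosity of the `pybel` and `bel_resources` loggers, whose names appear in the log output of this notebook. The helper name `set_pybel_verbosity` is hypothetical and not part of PyBEL, and the outputs shown in the remaining cells were produced with the DEBUG settings.

```python
import logging


def set_pybel_verbosity(level=logging.INFO):
    """Set the level of the loggers that emit most of the messages here."""
    # Child loggers such as 'pybel.manager' inherit this level
    # unless they were given a level of their own.
    for name in ('pybel', 'bel_resources'):
        logging.getLogger(name).setLevel(level)


set_pybel_verbosity(logging.INFO)
```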

In [2]:
print(pybel.get_version())

0.15.0-dev


In [3]:
DESKTOP_PATH = os.path.join(os.path.expanduser('~'), 'Desktop')
manager = pybel.Manager(f'sqlite:///{DESKTOP_PATH}/pybel_example_database.db')

DEBUG:pybel.manager.base_manager:auto flush: True, auto commit: False, expire on commmit: True
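
The connection string above assumes that a `Desktop` folder exists in the home directory. A minimal sketch of a more portable alternative, assuming only the `sqlite:///` URL form already used above; the helper name `sqlite_url` is hypothetical and not part of PyBEL:

```python
from pathlib import Path


def sqlite_url(directory, filename='pybel_example_database.db'):
    """Build a SQLite connection string for a database file in ``directory``."""
    directory = Path(directory).expanduser().resolve()
    # Create the directory first so the database file can be created in it.
    directory.mkdir(parents=True, exist_ok=True)
    # An absolute POSIX path starts with '/', which yields the
    # four-slash form 'sqlite:////absolute/path/to/file.db'.
    return f'sqlite:///{(directory / filename).as_posix()}'


# Possible usage (not executed here):
# manager = pybel.Manager(sqlite_url('~/Desktop'))
```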


First, we'll download and parse a BEL document from the Human Brain Pharmacome project describing the 2018 paper from Boland *et al.*, "Promoting the clearance of neurotoxic proteins in neurodegenerative disorders of ageing".

In [4]:
url = 'https://raw.githubusercontent.com/pharmacome/conib/master/hbp_knowledge/tau/boland2018.bel'

A BEL document can be downloaded and parsed from a URL using `pybel.from_bel_script_url`. The first time a given BEL document is loaded, the BEL resources it references (namespaces and annotations) must be downloaded and cached. Be patient: this can take up to ten minutes.

In [5]:
boland_2018_graph = pybel.from_bel_script_url(url, manager=manager)

INFO:pybel.io.lines:Loading from url: https://raw.githubusercontent.com/pharmacome/conib/master/hbp_knowledge/tau/boland2018.bel
INFO:pybel.io.line_utils:Finished parsing document section in 0.00 seconds
INFO:pybel.io.line_utils:Finished parsing definitions section in 0.00 seconds
downloading namespaces:   0%|          | 0/13 [00:00<?, ?it/s]DEBUG:bel_resources.read_utils:getting resource: https://raw.githubusercontent.com/pharmacome/terminology/c328ad964c08967a0417a887510b97b965a62fa5/external/mesh-names.belns
downloading namespaces:   8%|▊         | 1/13 [00:00<00:09,  1.22it/s]DEBUG:bel_resources.read_utils:getting resource: https://raw.githubusercontent.com/pharmacome/terminology/b46b65c3da259b6e86026514dfececab7c22a11b/external/chebi-names.belns
downloading namespaces:  15%|█▌        | 2/13 [00:01<00:07,  1.42it/s]DEBUG:bel_resources.read_utils:getting resource: https://raw.githubusercontent.com/pharmacome/terminology/b46b65c3da259b6e86026514dfececab7c22a11b/external/drugbank-name

In [6]:
pybel.to_database(boland_2018_graph, manager=manager)

DEBUG:pybel.manager.cache_manager:inserting Promoting the clearance of neurotoxic proteins in neurodegenerative disorders of ageing v1.0.0
downloading namespaces: 100%|██████████| 13/13 [00:00<00:00, 847.36it/s]
INFO:pybel.manager.cache_manager:creating regex namespace: DBSNP:rs[0-9]+
INFO:pybel.manager.cache_manager:creating regex namespace: TAXONOMY:^\d+$
INFO:pybel.manager.cache_manager:creating regex namespace: PUBCHEM:^\d+$
downloading annotations: 100%|██████████| 10/10 [00:00<00:00, 946.99it/s]
DEBUG:pybel.manager.cache_manager:inserting Promoting the clearance of neurotoxic proteins in neurodegenerative disorders of ageing v1.0.0 into edge store
DEBUG:pybel.manager.cache_manager:building node models
nodes: 100%|██████████| 200/200 [00:01<00:00, 112.49it/s]
DEBUG:pybel.manager.cache_manager:built 200 node models in 1.78 seconds
DEBUG:pybel.manager.cache_manager:stored 200 node models in 0.01 seconds
DEBUG:pybel.manager.cache_manager:building edge models
edges: 100%|██████████| 4

Promoting the clearance of neurotoxic proteins in neurodegenerative disorders of ageing v1.0.0

The graph is loaded into an instance of the `pybel.BELGraph` class. We can use `pybel.BELGraph.summarize()` to print a brief summary of the graph.

In [7]:
boland_2018_graph.summarize()

---------------------  ---------------------------------------------------------------------------------------
Name                   Promoting the clearance of neurotoxic proteins in neurodegenerative disorders of ageing
Version                1.0.0
Authors                Esther Wollert, Sandra Spalek, and Charles Tapley Hoyt
Number of Nodes        200
Number of Namespaces   9
Number of Edges        400
Number of Annotations  3
Number of Citations    1
Number of Authors      0
Network Density        1.01E-02
Number of Components   3
---------------------  ---------------------------------------------------------------------------------------

Type (7)             Count  Example
-----------------  -------  -----------------------------------------------------------------
Protein                100  p(HGNC:CAPN2)
Abundance               59  a(CHEBI:cilostazol)
BiologicalProcess       21  bp(GO:"response to endoplasmic reticulum stress")
Pathology                7  path(MESH:"Frontotempo
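
The `Network Density` row can be checked by hand from the node and edge counts. A minimal sketch, assuming the summary uses the usual density of a directed graph, `m / (n * (n - 1))` for `n` nodes and `m` edges. This assumption reproduces the value reported above but is not taken from the PyBEL documentation, and the function name `directed_density` is hypothetical.

```python
def directed_density(n_nodes: int, n_edges: int) -> float:
    """Edges divided by the number of ordered node pairs (no self-loops)."""
    if n_nodes < 2:
        return 0.0
    return n_edges / (n_nodes * (n_nodes - 1))


# Node and edge counts copied from the summary above.
print(f'{directed_density(200, 400):.2E}')  # prints 1.01E-02
```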

Next, we'll parse a BEL document from the Human Brain Pharmacome project describing the 2018 paper from Caballero *et al.*, "Interplay of pathogenic forms of human tau with different autophagic pathways". This example first downloads the file with `urlretrieve()` to demonstrate how to load a BEL document from a local file path.

In [8]:
url = 'https://raw.githubusercontent.com/pharmacome/conib/master/hbp_knowledge/tau/caballero2018.bel'
path = os.path.join(DESKTOP_PATH, 'caballero2018.bel')

if not os.path.exists(path):
    urlretrieve(url, path)
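
The download-if-missing pattern in the cell above can be wrapped in a small reusable function. This is only a sketch: the name `ensure_download` and the `.part` suffix are hypothetical choices, and the temporary file is an addition that guards against an interrupted download leaving a partial file behind.

```python
import os
from urllib.request import urlretrieve


def ensure_download(url: str, path: str) -> str:
    """Download ``url`` to ``path`` unless the file already exists."""
    if os.path.exists(path):
        return path
    # Create the parent directory in case it is missing.
    os.makedirs(os.path.dirname(os.path.abspath(path)), exist_ok=True)
    # Download to a temporary name first so an interrupted transfer
    # does not leave a partial file that later looks complete.
    tmp_path = path + '.part'
    urlretrieve(url, tmp_path)
    os.replace(tmp_path, path)
    return path
```

With this helper, the cell above would reduce to `path = ensure_download(url, path)`.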

A BEL document can also be parsed from a local file path using `pybel.from_bel_script`. As before, we will summarize the graph after parsing it.

In [9]:
cabellero_2018_graph = pybel.from_bel_script(path, manager=manager)

cabellero_2018_graph.summarize()

INFO:pybel.io.lines:Reading BEL script at /Users/cthoyt/Desktop/caballero2018.bel
INFO:pybel.io.line_utils:Finished parsing document section in 0.00 seconds
INFO:pybel.io.line_utils:Finished parsing definitions section in 0.00 seconds
downloading namespaces:   0%|          | 0/13 [00:00<?, ?it/s]DEBUG:bel_resources.read_utils:getting resource: https://raw.githubusercontent.com/pharmacome/terminology/b46b65c3da259b6e86026514dfececab7c22a11b/external/mesh-names.belns
downloading namespaces:   8%|▊         | 1/13 [00:00<00:09,  1.20it/s]DEBUG:bel_resources.read_utils:getting resource: https://raw.githubusercontent.com/pharmacome/conso/501ceccdc9a27d97edbdc48a89ebe8e1dd3626e9/export/conso.belns
downloading namespaces:  62%|██████▏   | 8/13 [00:01<00:02,  1.68it/s]DEBUG:bel_resources.read_utils:getting resource: https://raw.githubusercontent.com/sorgerlab/famplex/e8ae9926ff95266032cb74f77973c84939bffbeb/export/famplex.belns
downloading namespaces: 100%|██████████| 13/13 [00:01<00:00,  9.21i

---------------------  -----------------------------------------------------------------------------
Name                   Interplay of pathogenic forms of human tau with different autophagic pathways
Version                1.0.1
Author                 Sandra Spalek
Number of Nodes        59
Number of Namespaces   7
Number of Edges        141
Number of Annotations  3
Number of Citations    1
Number of Authors      0
Network Density        4.12E-02
Number of Components   1
---------------------  -----------------------------------------------------------------------------

Type (6)             Count  Example
-----------------  -------  -----------------------------------------------------
Protein                 21  p(CONSO:CONSO00057)
Abundance               12  a(CHEBI:leupeptin)
BiologicalProcess        7  bp(GO:"response to oxidative stress")
Pathology                7  path(MESH:Tauopathies)
Composite                7  composite(a(CHEBI:thapsigargin), p(CONSO:CONSO00053))
Complex 

In [11]:
pybel.to_database(cabellero_2018_graph, manager=manager)

DEBUG:pybel.manager.cache_manager:inserting Interplay of pathogenic forms of human tau with different autophagic pathways v1.0.1
downloading namespaces: 100%|██████████| 13/13 [00:00<00:00, 771.01it/s]
downloading annotations: 100%|██████████| 10/10 [00:00<00:00, 901.07it/s]
DEBUG:pybel.manager.cache_manager:inserting Interplay of pathogenic forms of human tau with different autophagic pathways v1.0.1 into edge store
DEBUG:pybel.manager.cache_manager:building node models
nodes: 100%|██████████| 59/59 [00:00<00:00, 214.12it/s]
DEBUG:pybel.manager.cache_manager:built 59 node models in 0.28 seconds
DEBUG:pybel.manager.cache_manager:stored 59 node models in 0.01 seconds
DEBUG:pybel.manager.cache_manager:building edge models
edges: 100%|██████████| 141/141 [00:00<00:00, 221.44it/s]
DEBUG:pybel.manager.cache_manager:built 141 edge models in 0.64 seconds
DEBUG:pybel.manager.cache_manager:stored 141 edge models in 0.01 seconds
INFO:pybel.manager.cache_manager:inserted Interplay of pathogenic f

Interplay of pathogenic forms of human tau with different autophagic pathways v1.0.1

We can combine a list of two or more graphs into a single graph using `pybel.union`.

In [10]:
combined_graph = pybel.union([boland_2018_graph, cabellero_2018_graph])

combined_graph.summarize()

---------------------  ---------------------------------------------------------------------------------------
Name                   Promoting the clearance of neurotoxic proteins in neurodegenerative disorders of ageing
Version                1.0.0
Authors                Esther Wollert, Sandra Spalek, and Charles Tapley Hoyt
Number of Nodes        250
Number of Namespaces   10
Number of Edges        541
Number of Annotations  4
Number of Citations    2
Number of Authors      0
Network Density        8.69E-03
Number of Components   2
---------------------  ---------------------------------------------------------------------------------------

Type (7)             Count  Example
-----------------  -------  ----------------------------------------------------------------------
Protein                120  p(HGNC:MAPT, pmod(go:0006468 ! "protein phosphorylation"))
Abundance               70  a(PUBCHEM:46216556)
BiologicalProcess       24  bp(GO:"autophagosome assembly")
Pathology        

Note that the two graphs share some nodes but no edges: the union has fewer nodes than the two inputs combined, while its edge count is exactly the sum of theirs.
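
The size of the overlap follows from the counts in the three summaries. A minimal sketch, assuming `pybel.union` keeps a single copy of any node or edge that appears in both graphs, so that the inclusion-exclusion principle applies; the function name `shared_count` is hypothetical.

```python
def shared_count(size_a: int, size_b: int, size_union: int) -> int:
    """Inclusion-exclusion: shared = size of A + size of B - size of union."""
    return size_a + size_b - size_union


# Counts copied from the three summaries above.
print(shared_count(200, 59, 250))   # shared nodes, prints 9
print(shared_count(400, 141, 541))  # shared edges, prints 0
```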