<p></p><font size="7" face="courier" color="magenta">cx_assembler </font> **<font size="7" color=red>Documentation </font>**

This notebook explains the purpose and implementation of the [cx_assembler](http://indra.akintunde.org/notebooks/source/cx_assembler.ipynb) file. We explain how each <font color="magenta">aspect</font> in an output CX file is created. Below is a list of the 4 main aspects, as well as the other aspects:
* **<font color=magenta>nodes</font>** : Stores all of the nodes' identifiers
* **<font color=magenta>edges</font>** : Stores all of the edges' identifiers
* **<font color=magenta>nodeAttributes</font>** : Stores additional info about each node
* **<font color=magenta>edgeAttributes</font>** : Stores additional info about each edge
* **<font color=magenta>other</font>** : numberVerification, metaData, @Context, networkAttributes, status

### <font color=red>Issues and Future Work :</font>

* <font color=orange>Have statementCreator file call indraView to make code more concise</font>
* <font color=red>If there are no name ids in _get_agent_alias, we simply print "NA"</font>
* <font color=red>Too many of the nodes are of type "other"; check the function </font><font color=blue>_get_agent_type</font>
* <font color=red>Figure out how to deal with the following warnings:</font>
<img src="../data/Images/bel_warnings.png" width=700>

# Basic Summary

## What is an <font color=orange>Indra Statement?</font>

An <font color=orange>indra statement</font> can be thought of as an **edge** between $2$ **nodes**, containing info about both the **edge** and its $2$ **nodes**. <font color=orange>Indra statements</font> have a fairly complex class structure, with [10 different sub-classes](https://indra.readthedocs.io/en/latest/modules/statements.html) of <font color=orange>indra statements</font>. The [Indra Objects](#Indra-Objects) section goes into more detail about what the indra objects are and how they map to the **CX** format.

## How are the <font color=magenta>aspects</font> created?

The vast majority of the data in a `.cx` file is stored inside $4$ main aspects: <font color=magenta>nodes, edges, nodeAttributes, edgeAttributes</font>. The <font color=blue>_add...</font> functions are used by **Indra** to collect the information from all the <font color=orange>indra statements</font>. A `for` loop is run over each <font color=orange>indra statement</font>, adding its information to the $4$ main aspects.

The <font color=magenta>other</font> aspects store metadata about the `.cx` dataset. The values of these aspects are mostly hard-coded inside the functions <font color=blue>make_model</font> and <font color=blue>print_cx</font>.
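
The loop described above can be sketched roughly as follows. This is a minimal illustration and not the actual `CxAssembler` code: `Stmt`, `assemble`, the `'other'` type value, and the `po`/`n`/`v` attribute keys are all assumptions made for the example, and each statement is simplified to a single edge between two named nodes.

```python
from collections import namedtuple

# Hypothetical stand-in for an indra statement: one edge between two nodes.
Stmt = namedtuple('Stmt', ['source', 'target', 'kind'])

def assemble(statements):
    """Collect the four main aspects from a list of simplified statements."""
    cx = {'nodes': [], 'edges': [], 'nodeAttributes': [], 'edgeAttributes': []}
    node_ids = {}  # node name -> node id, so each node is added only once
    next_id = 0
    for stmt in statements:
        ends = []
        for name in (stmt.source, stmt.target):
            if name not in node_ids:
                node_ids[name] = next_id
                cx['nodes'].append({'@id': next_id, 'n': name, 'r': name})
                # Placeholder attribute; the real type comes from _get_agent_type.
                cx['nodeAttributes'].append({'po': next_id, 'n': 'type', 'v': 'other'})
                next_id += 1
            ends.append(node_ids[name])
        cx['edges'].append({'@id': next_id, 's': ends[0], 't': ends[1], 'i': stmt.kind})
        cx['edgeAttributes'].append({'po': next_id, 'n': 'mechanism', 'v': stmt.kind})
        next_id += 1
    return cx

cx = assemble([Stmt('MAP2K1', 'MAPK1', 'Phosphorylation'),
               Stmt('MAPK1', 'ELK1', 'Phosphorylation')])
print(len(cx['nodes']), len(cx['edges']))
```

Two statements that share a node yield three nodes and two edges here, because the shared node is only added once.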

## Video Tutorial <font color=red>-Not Implemented</font>

**Below is a video summary** of  how `.cx` files are made. 

In [1]:
# Show youtube video
from IPython.display import HTML, display
display(HTML("""
<div class="row"><div class="col-xs-12 col-md-offset-3 col-md-6">
<div class="embed-responsive embed-responsive-16by9">
  <iframe class="embed-responsive-item" 
  src='https://www.youtube.com/embed/eV8wWzhIN80'></iframe></div>
</div></div>
"""))

## `.cx` Example File

**Run the cell below** to see a very basic example of the <font color=blue>CX layout</font>

In [34]:
%less /root/Documents/data/CX_output/mini.cx

#  NDEx Common Network Format

The current implementation of the <font color=magenta>CxAssembler</font> class outputs a `.cx` file that conforms to the <font color=green>NC2</font> NDEx common network format. The document below gives an in-depth description of the specification for this data format.

## <font color=green>NC2:</font>

In [6]:
# Displays google doc of CX 
display(HTML("""
  <iframe class="embed-responsive-item" width="100%" height="900"
      src="https://docs.google.com/document/d/13ZKcFBH-E5oiJP2D5zrdFqxLlS9yGtGiiX5Lj92g4EU/edit#heading=h.1t5lf4irpgyj">
  </iframe>"""))


## <font color=green>NC3</font>

In [2]:
# Displays google doc of CX 
display(HTML("""
  <iframe class="embed-responsive-item" width="100%" height="900"
      src="https://docs.google.com/document/d/1EbYQKdsImKp93eh5LnwOsFsxdxs7Hte-g97AuNrT1mo">
  </iframe>"""))


# Indra Objects

 An <font color=orange>indra statement</font> can be thought of as an **edge** between $2$ **nodes**, containing info about both the **edge** and its $2$ **nodes**. There are [10 different sub-classes](https://indra.readthedocs.io/en/latest/modules/statements.html) of <font color=orange>indra statements</font> which can be mapped over to the **CX** format:

1. **Modification** <font color=gray>- represents modification of a protein</font>
1. **SelfModification** <font color=gray>- represents self-modification of a protein</font>
1. **Complex** <font color=gray>- represents set of proteins within a complex</font>
1. **RegulateAmount** <font color=gray>- operations on directed, two-element interactions</font>
1. **Gef** <font color=gray>- Exchange of GTP for GDP on a small GTPase protein mediated by a GEF</font>
1. **Gap** <font color=gray>- Acceleration of a GTPase protein’s GTP hydrolysis rate by a GAP</font>
1. **RegulateActivity**
1. **ActiveForm**
1. **Translocation**
1. **Conversion**

## Nodes and Edges

The first step in creating a `.cx` file from <font color=orange>indra statements</font> is to extract the $2$ nodes from each statement, as well as the edge between them. The <font color=orange>indra statement</font> sub-classes store node and edge data in slightly different ways, so how this info is retrieved depends on the type of statement. In the sections below we break down the process of extracting nodes and edges by statement type.

TO DO : Add node edge explanation:

### Modification

**Nodes**

* <font color=orange>Modification</font>.enz
* <font color=orange>Modification</font>.sub

**Edge**

$($ <font color=orange>Modification</font>.enz , <font color=orange>Modification</font>.sub  $)$ 

``` JSON
{
    "@id": "<some_id>",
    "n": "<entity name>",
    "r": "<the identifier>"
}
```

### SelfModification

**Node**

<font color=orange>SelfModification</font>.enz

**Edge**

$($ <font color=orange>SelfModification</font>.enz , <font color=orange>SelfModification</font>.enz  $)$ 

### Complex

The <font color=orange>Complex</font> sub-class represents a set of proteins in a complex. Each protein in the complex becomes a node, and an edge is drawn between every pair of proteins <font color=gray>(all nodes)</font> that are part of the complex.

**Nodes**

<font color=orange>Complex</font>.members$[\ i\ ]\ \leftarrow \forall \  i \in $ members  

**Edges**

$($ <font color=orange>Complex</font>.members$[\ i\ ]$ , <font color=orange>Complex</font>.members$[\ j\ ]\ )\ \ 
\leftarrow \forall \  i\neq j \in $ members  

#### Code

``` python
import itertools  # required at module level

def _add_complex(self, stmt):
    # Add one edge for every unordered pair of complex members
    for m1, m2 in itertools.combinations(stmt.members, 2):
        m1_id = self._add_node(m1)
        m2_id = self._add_node(m2)
        self._add_edge(m1_id, m2_id, 'Complex', stmt)

### RegulateActivity , RegulateAmount

Here <font color=orange>stmt</font> represents either  <font color=orange>RegulateActivity</font>  **or** <font color=orange>RegulateAmount</font> 

**Nodes**

* <font color=orange>stmt</font>.subj
* <font color=orange>stmt</font>.obj

**Edge**

$($ <font color=orange>stmt</font>.subj , <font color=orange>stmt</font>.obj  $)$ 

### Gef

**Nodes**

* <font color=orange>Gef</font>.gef
* <font color=orange>Gef</font>.ras

**Edge**

$($ <font color=orange>Gef</font>.gef , <font color=orange>Gef</font>.ras  $)$ 

### Gap

**Nodes**

* <font color=orange>Gap</font>.gap
* <font color=orange>Gap</font>.ras

**Edge**

$($ <font color=orange>Gap</font>.gap , <font color=orange>Gap</font>.ras  $)$ 
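
The per-type rules for the two-node statements can be summarised as a lookup table. The sketch below is only an illustration: `NODE_FIELDS` and `edge_endpoints` are hypothetical names, and the `namedtuple` classes stand in for the real INDRA statement classes.

```python
from collections import namedtuple

# Attribute names holding the two nodes, keyed by statement class name
# (taken from the sections above).
NODE_FIELDS = {
    'Modification': ('enz', 'sub'),
    'SelfModification': ('enz', 'enz'),
    'RegulateActivity': ('subj', 'obj'),
    'RegulateAmount': ('subj', 'obj'),
    'Gef': ('gef', 'ras'),
    'Gap': ('gap', 'ras'),
}

def edge_endpoints(stmt):
    """Return the (source, target) nodes of a two-node statement."""
    src_field, tgt_field = NODE_FIELDS[type(stmt).__name__]
    return getattr(stmt, src_field), getattr(stmt, tgt_field)

# Hypothetical stand-in classes holding plain strings instead of agents.
Gef = namedtuple('Gef', ['gef', 'ras'])
SelfModification = namedtuple('SelfModification', ['enz'])

gef_edge = edge_endpoints(Gef(gef='SOS1', ras='KRAS'))
self_edge = edge_endpoints(SelfModification(enz='EGFR'))
print(gef_edge)   # ('SOS1', 'KRAS')
print(self_edge)  # ('EGFR', 'EGFR')
```

`Complex` is left out of the table because it yields one edge per pair of members rather than a single edge.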

## Data Attributes

### <font color=magenta>Node Attributes</font>

Below we show how <font color=magenta>Node Attributes</font> are added for a single <font color=orange>node</font>, where a <font color=orange>node</font> can be any type of node defined in the [section above](#Nodes-and-Edges) 

<font color=green>type</font> $=$ <font color=blue>_get_agent_type</font> $($ <font color=orange>node</font> $)$

<font color=green>alias</font> $=$ <font color=blue>_get_agent_alias</font> $($ <font color=orange>node</font> $)$

### <font color=magenta>Edge Attributes</font>

Below we show how <font color=magenta>Edge Attributes</font> are added for a single <font color=orange>edge</font>. The following functions act on the <font color=orange>statement</font> which contains the corresponding <font color=orange>edge</font> of interest.

<font color=green>mechanism</font> $=$ <font color=orange>statement</font>.\__class__.\__name__

<font color=green>polarity</font> $=$ <font color=blue>_get_stmt_type</font> $($ <font color=orange>statement</font> $)[\ 1\ ]$

<font color=green>citations</font> $=$ <font color=blue>_get_stmt_citations</font> $($ <font color=orange>statement</font> $)$

<font color=gray>causality $=$ <font color=orange>statement</font>.evidence.epistemics 

 $\ \leftarrow\ $ leave out for now</font>
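
As a rough sketch of how the three edge attributes could be turned into entries of the <font color=magenta>edgeAttributes</font> aspect: the `edge_attributes` helper, the `Phosphorylation` stand-in class, the citation string, and the `po`/`n`/`v` keys are assumptions for illustration. Polarity and citations are passed in directly rather than computed, since the real values come from <font color=blue>_get_stmt_type</font> and <font color=blue>_get_stmt_citations</font>.

```python
class Phosphorylation:
    """Hypothetical stand-in for an INDRA statement class."""

def edge_attributes(edge_id, statement, polarity, citations):
    """Build one attribute entry per (name, value) pair for a single edge."""
    values = {
        'mechanism': statement.__class__.__name__,
        'polarity': polarity,
        'citations': citations,
    }
    # 'po' is assumed to point at the edge the attribute belongs to.
    return [{'po': edge_id, 'n': name, 'v': value}
            for name, value in values.items()]

# 'pmid:EXAMPLE' is a made-up placeholder, not a real citation.
attrs = edge_attributes(7, Phosphorylation(), 'positive', ['pmid:EXAMPLE'])
for attr in attrs:
    print(attr)
```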

# <font color=blue>Functions</font> for Indra to CX

## How <font color=magenta>Nodes</font> are Added


The <font color=blue>_add_node</font> function is used to add a node to the <font color=magenta>nodes</font> aspect. Each <font color=orange>indra statement</font> contains $2$ nodes that are added to <font color=magenta>nodes</font>. How a statement's nodes are added varies slightly [based on the sub-class](https://indra.readthedocs.io/en/latest/modules/statements.html) of the <font color=orange>indra statement</font>, as can be seen by looking at the various <font color=blue>_add...</font> functions. In every case, however, <font color=blue>_add_node</font> is the final function called to add a node to <font color=magenta>nodes</font>.

Path of <font color=blue>function calls</font> <font color=gray>(embedded)</font> that lead to the addition of a single node to the <font color=magenta>nodes</font> aspect:  
<p><center>
    make_model $\rightarrow$ print_cx $\rightarrow$ \_add_... $\rightarrow$ _add_node
</center></p>

### <font color=magenta>nodes</font> format:

Each node in <font color=magenta>nodes</font> has the following format:
``` python
node = {'@id': node_id,
        'n': agent.name,
        'r': agent.name} 
```

## How <font color=magenta>Node Attributes</font> are Added

When <font color=blue>_add_node</font> is called to add a node <font color=gray>(see above section)</font>, it calls a sub-function, <font color=blue>_add_node_metadata</font>, which puts the node's information in the <font color=magenta>nodeAttributes</font> aspect. <font color=blue>_add_node_metadata</font> adds the node's <font color=green>type</font> <font color=gray>(gene, protein,... )</font> as well as the node's <font color=green>alias</font> names <font color=gray>(uniprot, PubChem,... )</font> to <font color=magenta>nodeAttributes</font>.

Path of the <font color=blue>function calls</font> <font color=gray>(embedded)</font> that lead to the addition of a single node attribute to the <font color=magenta>nodeAttributes</font> aspect:  
<p><center>
make_model $\rightarrow$ print_cx $\rightarrow$ \_add_... $\rightarrow$ _add_node $\rightarrow$ _add_node_metadata
</center></p>

## How <font color=magenta>Edges</font> are Added

The <font color=blue>_add_edge</font> function is used to add an edge to the <font color=magenta>edges</font> aspect. Each <font color=orange>indra statement</font> has $1$ edge that is added to <font color=magenta>edges</font>. How a statement's edge is added varies slightly [based on the sub-class](https://indra.readthedocs.io/en/latest/modules/statements.html) of the <font color=orange>indra statement</font>, as can be seen by looking at the various <font color=blue>_add...</font> functions. In every case, however, <font color=blue>_add_edge</font> is the final function called to add an edge to <font color=magenta>edges</font>.


Path of the <font color=blue>function calls</font> <font color=gray>(embedded)</font> that lead to the addition of a single edge to the <font color=magenta>edges</font> aspect  
<p><center>
make_model $\rightarrow$ print_cx $\rightarrow$ \_add_... $\rightarrow$ _add_edge
</center></p>

### <font color=magenta>edges</font> format


Each edge in <font color=magenta>edges</font> has the following basic format:
``` python
edge = {'@id': edge_id,
        's': source,
        't': target,
        'i': interaction}
```

## How <font color=magenta>Edge Attributes</font> are Added

When <font color=blue>_add_edge</font> is called to add an edge <font color=gray>(see above section)</font>, it calls a sub-function, <font color=blue>_add_edge_metadata</font>, which puts the edge's information in the <font color=magenta>edgeAttributes</font> aspect. 

Path of the <font color=blue>function calls</font> <font color=gray>(embedded)</font> that lead to the addition of a single edge attribute to the <font color=magenta>edgeAttributes</font> aspect  

<p><center>
make_model $\rightarrow$ print_cx $\rightarrow$ \_add_... $\rightarrow$ _add_edge $\rightarrow$ _add_edge_metadata
</center></p>

## How <font color=magenta>Other</font> Aspects are Added

### <font color=magenta>numberVerification</font>

This aspect is manually hardcoded in the function <font color=blue>print_cx</font>. The code for this is as follows:

``` python
full_cx['numberVerification'] = [{'longNumber': 281474976710655}]
```
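
The constant is not explained in the code. Numerically it equals $2^{48} - 1$, which can be checked directly. The second check below reflects one plausible reading of its purpose (letting a consumer confirm that large integers survive parsing without loss), but that rationale is an assumption and is not stated in the source.

```python
long_number = 281474976710655

# The hard-coded value is exactly 2**48 - 1.
is_power_form = (long_number == 2**48 - 1)

# It is below 2**53, so it is also exactly representable as a 64-bit float.
survives_float = (int(float(long_number)) == long_number)

print(is_power_form, survives_float)
```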

### <font color=magenta>metaData</font>

<font color=magenta>metaData</font> stores summary metadata for the aspects <font color=magenta>@Context, networkAttributes, nodes, edges, nodeAttributes, edgeAttributes</font>. Its values are set in the function <font color=blue>make_model</font>, using the helper function <font color=blue>_get_aspect_metadata</font>, whose code can also be found within the <font color=blue>make_model</font> function.

``` python
aspects = ['@Context','networkAttributes','nodes', 'edges','nodeAttributes','edgeAttributes']
for aspect in aspects:
    metadata = _get_aspect_metadata(aspect)
```
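
What such a helper might compute can be sketched as below. This is a guess at the general shape and not the actual helper: the standalone `get_aspect_metadata` function, the `elementCount` field, and the decision to skip empty aspects are all assumptions.

```python
def get_aspect_metadata(cx, aspect):
    """Summarise one aspect; returns None for an empty or missing aspect."""
    elements = cx.get(aspect) or []
    if not elements:
        return None
    # 'elementCount' is an assumed field name for the number of entries.
    return {'name': aspect, 'elementCount': len(elements)}

# Hypothetical, tiny CX dictionary used only for this example.
example_cx = {'nodes': [{'@id': 0}, {'@id': 1}],
              'edges': [{'@id': 2}],
              'nodeAttributes': []}
aspects = ['@Context', 'networkAttributes', 'nodes', 'edges',
           'nodeAttributes', 'edgeAttributes']

meta_data = []
for aspect in aspects:
    metadata = get_aspect_metadata(example_cx, aspect)
    if metadata is not None:
        meta_data.append(metadata)
print(meta_data)
```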



### <font color=magenta>@Context</font>

<font color=magenta>@Context</font> is added in the initialisation <font color=gray>(\__init__)</font> of <font color=magenta>CxAssembler</font>. The code is as follows:

``` python
self.cx = {'@Context':[ _add_context() ],
```

### <font color=magenta>networkAttributes</font>

All the values of <font color=magenta>networkAttributes</font> are set in the function <font color=blue>make_model</font>. These values (<font color=green>name, description, version</font>) are passed as parameters to the <font color=blue>make_model</font> function:

``` python
def make_model(self, add_indra_json=True, 
               name='indra_assembled', description='An Indra Auto-Curated network', version='1.0'):
```
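
A minimal sketch of how these three parameters could end up as entries in the <font color=magenta>networkAttributes</font> aspect. The standalone `network_attributes` function and the `n`/`v` key layout are assumptions for illustration; only the parameter names and defaults come from the signature above.

```python
def network_attributes(name='indra_assembled',
                       description='An Indra Auto-Curated network',
                       version='1.0'):
    """Return one name/value entry per network-level attribute."""
    return [{'n': 'name', 'v': name},
            {'n': 'description', 'v': description},
            {'n': 'version', 'v': version}]

net_attrs = network_attributes(name='my_network')
print(net_attrs)
```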

### <font color=magenta>status</font>

This aspect is manually hardcoded in the function <font color=blue>print_cx</font>. The code for this is as follows:

``` python
full_cx['status'] = [{'error': '', 'success': True}]
```
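
Putting the pieces together, the final output step might look roughly like the sketch below. It assumes the `.cx` file is a JSON list with one single-key object per aspect, opening with <font color=magenta>numberVerification</font> and closing with <font color=magenta>status</font>; the node content is a made-up example, and the real <font color=blue>print_cx</font> may differ.

```python
import json
from collections import OrderedDict

full_cx = OrderedDict()
full_cx['numberVerification'] = [{'longNumber': 281474976710655}]
full_cx['nodes'] = [{'@id': 0, 'n': 'A', 'r': 'A'}]  # hypothetical content
full_cx['status'] = [{'error': '', 'success': True}]

# Assumed layout: one single-key object per aspect, in insertion order.
fragments = [{name: value} for name, value in full_cx.items()]
cx_text = json.dumps(fragments, indent=2)
print(cx_text)
```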