# Converting RNA dot-brackets to graphs

How do we convert an RNA secondary structure into a graph?
It all starts with dot-bracket notation.

## Dot-bracket notation

Consider the following dot-bracket string:

```
((((....((((((..((((.(((...(((((((((((((((((((.((((((((((((((((...............)))))))))))))))).))))))))))))))))))))))))......))...)))))))))).
```

We can express it as a graph.
Each character corresponds to one position in the sequence.
A dot marks an unpaired position,
while a matching pair of opening and closing parentheses indicates a base pair between those two positions.
The following function checks that a dot-bracket string is valid
for a given RNA sequence.
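
To make the idea of a validity check concrete, here is a minimal sketch of what such a check might involve. The name `validate_sketch` and the specific rules (equal lengths, only `(`, `)` and `.` characters, balanced parentheses) are assumptions for illustration; the actual `drosha_gnn.graph.validate` may check different things or raise errors instead of returning a boolean.

```python
def validate_sketch(dot_bracket: str, sequence: str) -> bool:
    """Hypothetical stand-in for a dot-bracket validity check."""
    # Assumption: there must be one structure character per nucleotide.
    if len(dot_bracket) != len(sequence):
        return False

    depth = 0  # number of currently unmatched opening parentheses
    for char in dot_bracket:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                # A closing parenthesis appeared before its opening partner.
                return False
        elif char != ".":
            # Assumption: only '(', ')' and '.' are allowed.
            return False

    # Every opening parenthesis must have been closed.
    return depth == 0
```

For instance, `validate_sketch("((..))", "GGAACC")` passes all three checks, while `validate_sketch("((..)", "GGAAC")` fails because one opening parenthesis is never closed.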

In [None]:
from drosha_gnn.graph import validate

validate??

In [None]:
dot_bracket = "(((((.(((((((((.....))))))).)).((((((...((((.(((((((..((.((.((((((((............)))))))).)).))..))))))).))))...))))))....((((....))))((....)).)))))."
sequence = "U" * len(dot_bracket)
validate(dot_bracket, sequence)

## Pairing parentheses

How do we write an algorithm that pairs up parentheses?

One way is to loop over the dot-bracket string
and collect the positions of opening parentheses,
appending each position to a list.
As soon as we hit a closing parenthesis,
we pop the most recent opening position off the list,
pair it with the current position of the closing parenthesis,
and store the pair in a pairing list.

At the end, `openings` should be empty,
while `pairings` will contain the so-called edge list of our graph.
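
The steps above can be sketched as a small stack-based function. This is an illustrative version, not necessarily how `base_pair_edges` is implemented; the function name `pair_parentheses` and the choice to raise `ValueError` on unbalanced input are assumptions.

```python
from typing import List, Tuple


def pair_parentheses(dot_bracket: str) -> List[Tuple[int, int]]:
    """Return (open_position, close_position) pairs for a dot-bracket string."""
    openings: List[int] = []               # positions of unmatched '('
    pairings: List[Tuple[int, int]] = []   # completed base pairs

    for position, char in enumerate(dot_bracket):
        if char == "(":
            openings.append(position)
        elif char == ")":
            if not openings:
                raise ValueError(f"Unmatched ')' at position {position}")
            # The most recently opened parenthesis is the partner.
            pairings.append((openings.pop(), position))
        # Dots (unpaired positions) are skipped.

    if openings:
        raise ValueError(f"Unmatched '(' at positions {openings}")
    return pairings


print(pair_parentheses("((..))"))  # [(1, 4), (0, 5)]
```

Because the list is used as a stack, inner pairs are emitted before the outer pairs that enclose them.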

In [None]:
from drosha_gnn.graph import base_pair_edges
base_pair_edges??

In [None]:
e = base_pair_edges(dot_bracket)

Besides the base-pair edges, we also need the backbone edges, which connect each position to the next one along the strand:

In [None]:
from drosha_gnn.graph import backbone_edges

backbone_edges??
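
As a rough illustration, backbone edges can be generated by linking every position to its immediate successor. The sketch below assumes zero-based positions and that only the length of the string matters; the name `backbone_edges_sketch` is hypothetical, and the real `backbone_edges` may take different arguments or return a different container.

```python
from typing import List, Tuple


def backbone_edges_sketch(dot_bracket: str) -> List[Tuple[int, int]]:
    """Link each position i to position i + 1 along the strand."""
    # A string of length n yields n - 1 backbone edges.
    return [(i, i + 1) for i in range(len(dot_bracket) - 1)]


print(backbone_edges_sketch("(..)"))  # [(0, 1), (1, 2), (2, 3)]
```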

## NetworkX Graph

Once we have the pairing list, we can construct a NetworkX graph.
The graph definition is as follows:

1. Nodes are integers, each representing a position in the sequence.
2. Edges are pairs of integers, which represent base pairing.

This is an undirected graph,
as there is no semantically interpretable notion of directionality
in a base pairing system.
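
A minimal sketch of the construction, assuming that base-pair edges and backbone edges go into the same undirected `nx.Graph`. The function name `to_networkx_sketch` and the `kind` edge attribute are hypothetical and added only to tell the two edge types apart; the actual `to_networkx` may store edges differently.

```python
import networkx as nx


def to_networkx_sketch(dot_bracket: str) -> nx.Graph:
    """Build an undirected graph from a balanced dot-bracket string."""
    G = nx.Graph()

    # One node per position, including unpaired ones.
    G.add_nodes_from(range(len(dot_bracket)))

    # Backbone edges between consecutive positions.
    G.add_edges_from(
        ((i, i + 1) for i in range(len(dot_bracket) - 1)),
        kind="backbone",
    )

    # Base-pair edges, found with a stack of opening positions.
    # Assumes the input is balanced; an unmatched ')' raises IndexError.
    openings = []
    for position, char in enumerate(dot_bracket):
        if char == "(":
            openings.append(position)
        elif char == ")":
            G.add_edge(openings.pop(), position, kind="base_pair")

    return G


small = to_networkx_sketch("((..))")
print(small.number_of_nodes(), small.number_of_edges())  # 6 7
```

Since `nx.Graph` holds at most one edge per node pair, a base pair between two adjacent positions would overwrite the backbone edge's `kind` in this sketch.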

In [None]:
from drosha_gnn.graph import to_networkx

to_networkx??

In [None]:
import networkx as nx
G = to_networkx(dot_bracket)
nx.draw(G)