ODGI Python Tutorial

In this tutorial we use the Python bindings to explore some basic concepts. Note that the Python bindings may be slow for larger graphs.

Creating Graphs

Let's say that you wanted to create the following graph with ODGI:

/img/exampleGraph.png

This graph is a combination of nodes (labelled as n0, n1, ..., n9) and directed edges (arrows).

ODGI Objects

Edges and nodes are accessed through an :class:`odgi.graph` object. Individual nodes in the graph are pointed to by :class:`odgi.handle` objects.

Paths in the graph are accessed through :class:`odgi.path_handle`, which is a series of :class:`odgi.step_handle` linked together. Each :class:`odgi.step_handle` points to the node in that step, and also contains directional information regarding the nodes preceding and following it.

Handles are pointers to specific pieces of the graph, and it is not possible to operate on them directly, aside from comparing whether two handles are equal. To get information about the object that a handle points to, use the corresponding get accessor methods in :class:`odgi.graph`.

Reference materials for these methods can be found at the :ref:`api`, as well as the :ref:`glossary`, which contains lists sorted by object type for :ref:`accessor`, :ref:`mutator`, and :ref:`iterator`.
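
The handle pattern described in this section can be illustrated without ODGI at all. The following is a minimal plain-Python sketch, not ODGI's implementation: ``ToyHandle`` and ``ToyGraph`` are hypothetical names invented for this illustration. It shows handles that support only equality comparison, with all information retrieved through the graph object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyHandle:
    """Hypothetical opaque handle: only equality comparison is meaningful."""
    _key: int


class ToyGraph:
    """Hypothetical graph that owns the data the handles point to."""

    def __init__(self):
        self._sequences = []

    def create_handle(self, sequence):
        # Store the sequence and hand back a pointer to it.
        self._sequences.append(sequence)
        return ToyHandle(len(self._sequences) - 1)

    def get_sequence(self, handle):
        # Accessors live on the graph, not on the handle.
        return self._sequences[handle._key]


toy = ToyGraph()
a = toy.create_handle("CGA")
b = toy.create_handle("TTGG")
same_as_a = ToyHandle(0)

print(a == same_as_a, a == b)
print(toy.get_sequence(a))
```

The real bindings follow the same division of labour: the handle identifies a node, and the graph answers questions about it.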

Making a Graph

First, we must create the graph, then make each node and keep track of their handles.

import odgi

gr = odgi.graph()
seq = ["CGA", "TTGG", "CCGT", "C", "GT", "GATAA", "CGG", "ACA", "GCCG", "ATATAAC"]
# n[i] holds the handle of node n<i>
n = []
for s in seq:
        n.append(gr.create_handle(s))

Now we link together these nodes using their handles. Note that each of these handles is directional, so in order to create the bidirectional edge between n5 and n8 we use create_edge twice.

gr.create_edge(n[0], n[1])
gr.create_edge(n[1], n[2])
gr.create_edge(n[2], n[3])
gr.create_edge(n[2], n[4])
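gr.create_edge(n[3], n[5])
# n4 rejoins the main branch at n5; the path defined later in this tutorial steps from n4 to n5
gr.create_edge(n[4], n[5])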
gr.create_edge(n[5], n[6])
gr.create_edge(n[5], n[8])
gr.create_edge(n[6], n[7])
gr.create_edge(n[6], n[8])
gr.create_edge(n[7], n[9])
gr.create_edge(n[8], n[9])
gr.create_edge(n[8], n[5])
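
To see what the pair of opposite edges between n5 and n8 means for traversal, the sketch below rebuilds a subset of the edges above as a plain-Python successor map. It does not call ODGI; the node indices stand in for handles, and ``is_bidirectional`` is a hypothetical helper written for this illustration.

```python
from collections import defaultdict

# A subset of the edges created above, written as (source, target) node indices.
edges = [(5, 6), (5, 8), (6, 7), (6, 8), (7, 9), (8, 9), (8, 5)]

# Map each node to the nodes reachable by following one directed edge.
successors = defaultdict(list)
for source, target in edges:
    successors[source].append(target)


def is_bidirectional(a, b):
    """True only if there is an edge a -> b and an edge b -> a."""
    return b in successors[a] and a in successors[b]


print(successors[5])
print(successors[8])
print(is_bidirectional(5, 8), is_bidirectional(5, 6))
```

Only the n5/n8 pair is connected in both directions, because it is the only pair for which two edges were created.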

Traversing Edges

If we wanted to traverse these edges, we could do it using the iterator method :func:`odgi.graph.follow_edges`.

def next_node_list(handle):
        lis = []
        gr.follow_edges(handle, False, lambda y: lis.append(y))
        return lis

print(f'n0: {gr.get_sequence(n[0])}')
next_node = next_node_list(n[0])[0]
print(f'n1: {gr.get_sequence(next_node)}')
next_node = next_node_list(next_node)[0]
print(f'n2: {gr.get_sequence(next_node)}')

Which will output the following:

n0: CGA
n1: TTGG
n2: CCGT

A map of the data can be generated using :func:`odgi.graph.to_gfa`.

print(gr.to_gfa())
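
For readers unfamiliar with GFA, the following plain-Python sketch writes GFA 1 style header (``H``), segment (``S``), and link (``L``) records for the first three nodes. It does not call ODGI, ``toy_gfa`` is a hypothetical helper, and the 1-based node ids are an assumption; the exact text produced by :func:`odgi.graph.to_gfa` may differ.

```python
# The first three node sequences and the first two edges from this tutorial.
sequences = ["CGA", "TTGG", "CCGT"]
links = [(0, 1), (1, 2)]  # list indices, not node ids


def toy_gfa(sequences, links):
    """Build GFA 1 style text: one S record per node, one L record per edge."""
    lines = ["H\tVN:Z:1.0"]
    # Assumption: node ids are 1-based and follow creation order.
    for index, sequence in enumerate(sequences):
        lines.append(f"S\t{index + 1}\t{sequence}")
    for source, target in links:
        # Both ends in forward orientation, with no overlap between segments.
        lines.append(f"L\t{source + 1}\t+\t{target + 1}\t+\t0M")
    return "\n".join(lines)


print(toy_gfa(sequences, links))
```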

Creating a Path

Generating a linear sequence from this graph could be done in infinitely many ways, due to the internal loop between n5, n6, and n8. If we wanted to define a single consensus sequence, we would do this by defining a path.
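
The effect of the loop can be checked with a small plain-Python count of the walks from n5 to n9. This sketch does not call ODGI; it uses node indices for the edges among n5 through n9, and ``count_walks`` is a hypothetical helper. Each increase in the allowed number of edges admits more trips around the loop, so the count keeps growing.

```python
# Edges among n5..n9 from this tutorial, as a successor map of node indices.
successors = {5: [6, 8], 6: [7, 8], 7: [9], 8: [9, 5], 9: []}


def count_walks(node, target, max_edges):
    """Count walks from node to target that use at most max_edges edges."""
    total = 1 if node == target else 0
    if max_edges == 0:
        return total
    for following in successors[node]:
        total += count_walks(following, target, max_edges - 1)
    return total


for limit in (3, 6, 9):
    print(limit, count_walks(5, 9, limit))
```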

/img/exampleGraphPath.png

To create the highlighted path, we would need to create a :class:`odgi.path_handle` in the graph, and then append each :class:`odgi.handle` to the end of the path.

path = gr.create_path_handle("path")
gr.append_step(path, n[0])
gr.append_step(path, n[1])
gr.append_step(path, n[2])
gr.append_step(path, n[4])
gr.append_step(path, n[5])
gr.append_step(path, n[6])
gr.append_step(path, n[7])
gr.append_step(path, n[9])

Warning

:func:`odgi.graph.append_step` will not stop you from appending nodes that are not connected to the preceding node.

# the following code runs without error
badpath = gr.create_path_handle("badpath")
gr.append_step(badpath, n[0])
gr.append_step(badpath, n[3])
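
Because the check is not done for you, it can be worth validating a node order before appending it. The sketch below shows the idea in plain Python on a subset of the tutorial's edges. It does not call ODGI, the node indices stand in for handles, and ``unconnected_steps`` is a hypothetical helper written for this illustration.

```python
# A subset of the edges created earlier, as (source, target) node indices.
edges = {(0, 1), (1, 2), (2, 3), (2, 4), (3, 5)}


def unconnected_steps(path_nodes, edges):
    """Return the consecutive node pairs in a path that have no edge between them."""
    return [
        (a, b)
        for a, b in zip(path_nodes, path_nodes[1:])
        if (a, b) not in edges
    ]


# Every step follows an existing edge, so nothing is reported.
print(unconnected_steps([0, 1, 2, 3, 5], edges))
# The node order used for "badpath" jumps from n0 to n3.
print(unconnected_steps([0, 3], edges))
```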

Traversing a Path

To traverse a path, we need to fetch a series of :class:`odgi.step_handle` from the graph. Note that although we are effectively asking the path for the steps it contains, all accessor methods belong to the :class:`odgi.graph` object.

step = gr.path_begin(path)
while gr.has_next_step(step):
        # get the node handle from the step handle
        current_node_handle = gr.get_handle_of_step(step)
        # ask the node handle for the sequence
        print(gr.get_sequence(current_node_handle))
        # progress to the next step
        step = gr.get_next_step(step)
# the loop exits at the last step without printing it, so print it here
current_node_handle = gr.get_handle_of_step(step)
print(gr.get_sequence(current_node_handle))

Which will output the following:

CGA
TTGG
CCGT
GT
GATAA
CGG
ACA
ATATAAC
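
Joining the per-step sequences in order gives the linear sequence that the path spells. The following plain-Python sketch does not call ODGI; it simply starts from the eight strings printed above.

```python
# Sequences printed by the traversal above, in path order.
step_sequences = ["CGA", "TTGG", "CCGT", "GT", "GATAA", "CGG", "ACA", "ATATAAC"]

# Concatenate them into the single sequence spelled by the path.
path_sequence = "".join(step_sequences)

print(path_sequence)
print(len(path_sequence))
```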

Saving and Loading ODGI Graphs

Graphs can be saved and loaded through the :func:`odgi.graph.serialize` and :func:`odgi.graph.load` methods.

Graph File Example

If you wish to save the graph from the above session, that can be done with:

gr.serialize("example_graph.odgi")

This can be loaded into a new Python session by using:

gr = odgi.graph()
gr.load("example_graph.odgi")

Loading in Pre-Existing Data

Provided that data has been serialized in ODGI format, it is possible to read it directly from a file. Download a *.odgi file and load it into Python with:

brca2 = odgi.graph()
brca2.load("cactus-brca2.odgi")

We can poke around this data and get the sequence of the first path in the graph with:

path_handle = []
handles = []
brca2.for_each_path_handle(lambda y: path_handle.append(y))
brca2.for_each_step_in_path(path_handle[0],
        lambda y: handles.append(brca2.get_handle_of_step(y)))
sequence = ""
for handle in handles:
        sequence += brca2.get_sequence(handle)
print(sequence)

Reading in a Graph from a Different Format

Graph assemblies can be created with VG. Currently the method to convert to ODGI format is broken, but graphs can be converted to JSON format and subsequently converted to ODGI format with :download:`this script <../../exdata/jsoner.py>`.