<span style="float:left;">Licence CC BY-NC-ND</span><span style="float:right;">François Rechenmann &amp; Thierry Parmentelat&nbsp;<img src="media/inria-25.png" style="display:inline"></span><br/>

# Revisited walking

In this notebook, we slightly modify the `walk` algorithm that we saw in a previous sequence, so that it performs the *abridged* walk described in the video.

As always, we start with our magic formulas:

In [None]:
# this is so that we can use print() in python2 like in python3
from __future__ import print_function
# with this, division will behave in python2 like in python3
from __future__ import division

### The `enumerate` function

But before that, we need another piece of the python toolset: the `enumerate` function is very useful for scanning a structure with a `for` loop, with the additional benefit that it gives access to the **index in the loop**. Let us see this right away on an example:

In [None]:
# starting from a simple string
string = "abc"

# when we scan it with a for loop
for c in string:
    # here we cannot know the index
    # within the loop
    print(c)    

A first, naive solution would be to compute the length of the string, and then to loop over the integers smaller than that length, which would give us this:

In [None]:
# len is a builtin function in python
size = len(string)

# range is a builtin function too
for index in range(size):
    print("at index", index, "we have the object", string[index])

Note, by the way, that **indices start at `0` in python** and not at `1` as was assumed in the video.

This way of writing loops to access the loop index, although it works, is **not recommended**. It is much preferable to use the `enumerate` function, which gives us **more legible** and **more efficient** code:

In [None]:
# we can achieve the same result with 
# a more legible and faster code by using
# the enumerate function that is designed for
# exactly that kind of usage
for index, char in enumerate(string):
    print("at index", index, "we have the object", char)
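
To see what the loop above actually receives, here is a minimal sketch that turns the result of `enumerate` into a list. The names `pairs` and `pairs_from_1` are only used for this illustration. The optional second argument of `enumerate` sets the first index, which can help if you prefer to count from `1` as in the video:

```python
string = "abc"

# enumerate yields (index, item) pairs;
# list() lets us look at all of them at once
pairs = list(enumerate(string))
print(pairs)

# the optional second argument is the value of the first index
pairs_from_1 = list(enumerate(string, 1))
print(pairs_from_1)
```

The first list starts at index `0`, the second one at index `1`; the characters are the same in both.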

### `path_x_y` revisited

We are now going to write another version of the function that we had called `path_x_y` and that, remember, returns two lists holding the X and Y coordinates of the dots along the path we want to draw.

We want to abridge the path by keeping only one point every so often. The first change to our function is an additional integer parameter, `step`, that specifies how often we keep a point.

The second change is then to collect in the lists of dots only those whose index is a multiple of `step`.

In python, we will use the `%` operator, which returns the remainder of integer division:

In [None]:
# 20 is a multiple of 5, so the remainder is 0
20 % 5

In [None]:
# but 20 is not a multiple of 6, because the remainder is 2
20 % 6
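
As a small illustration of how this test selects indices, the sketch below keeps, among the integers from 0 to 9, only the multiples of a `step` of 3. The values 10 and 3 are arbitrary and only serve the example:

```python
step = 3

# keep only the indices whose remainder
# in the division by step is 0
kept = [index for index in range(10) if index % step == 0]
print(kept)
```

Only `0`, `3`, `6` and `9` pass the test; note that `0` is a multiple of any `step`.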

We can now write the following code, which adds a point to the walk only for indices that are a multiple of `step`, the new argument to this function:

In [None]:
moves = {
    'C' : [1, 0],
    'A' : [0, 1],
    'G' : [-1, 0],
    'T' : [0, -1],
    }

# an algorithm that computes the two paths in x and y
# by moving along the string, and that keeps only
# the points whose index is a multiple of 'step'
def path_x_y_abridged(dna, step):
    # init results
    path_x, path_y = [], []
    # start on the center
    x, y = 0, 0
    # starting point is on the path
    path_x.append(x)
    path_y.append(y)

    # scan whole DNA
    for index, nucleotide in enumerate(dna):
        # what move ?
        delta_x, delta_y = moves[nucleotide]
        # do it
        x += delta_x
        y += delta_y
        # store this dot in results 
        # only for multiples of step
        if index % step == 0:
            path_x.append(x)
            path_y.append(y)

    return path_x, path_y
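
To make the effect of `step` concrete, here is a small sketch on a made-up six-letter string, which is not one of the course samples. It repeats the `moves` table and a condensed copy of the function above, so that it can run on its own:

```python
# same moves table as above
moves = {'C': [1, 0], 'A': [0, 1], 'G': [-1, 0], 'T': [0, -1]}

# condensed copy of the function above
def path_x_y_abridged(dna, step):
    # the starting point is on the path
    path_x, path_y = [0], [0]
    x, y = 0, 0
    for index, nucleotide in enumerate(dna):
        delta_x, delta_y = moves[nucleotide]
        x += delta_x
        y += delta_y
        # keep this dot only for multiples of step
        if index % step == 0:
            path_x.append(x)
            path_y.append(y)
    return path_x, path_y

# a made-up string, only for illustration
tiny = "CCAAGG"

# with step=1 every dot is kept
full_x, full_y = path_x_y_abridged(tiny, 1)
print(full_x, full_y)

# with step=2 only the dots for indices 0, 2 and 4 are kept,
# in addition to the starting point
short_x, short_y = path_x_y_abridged(tiny, 2)
print(short_x, short_y)
```

With a `step` of 1 we get the complete path, that is 7 dots for 6 nucleotides. With a `step` of 2 only 4 dots remain, and the last position, reached at index 5, is not part of the abridged path.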

### The algorithm in action

We still need `matplotlib`:

In [None]:
# so that the graphics appear inside the notebook
%matplotlib inline
# importing the library
import matplotlib.pyplot as pyplot

# finally: the sizes to use when drawing figures
import pylab
pylab.rcParams['figure.figsize'] = 8., 8.

And for convenience, we define a shortcut that does all the work:

In [None]:
def walk_abridged(dna, step):
    # this time we use the new function and
    # pass step along
    X, Y = path_x_y_abridged(dna, step)
    # we can now draw
    pyplot.plot(X, Y)
    pyplot.show()

We can now see the outcome on the sample that we used with the first, complete version of this algorithm:

In [None]:
from samples import sample_week1_sequence7
walk_abridged(sample_week1_sequence7, 10)

You can compare this with the output of the initial, complete algorithm, which produced this:

![](media/prom-slide-17.png)

### A little more interactive

As we saw with the first version of the walking algorithm, it is easy to add interactive capabilities with the `mpld3` library:

In [None]:
# the extra interactive layer on top of matplotlib
import mpld3

def zoomable_walk_abridged(dna, step):
    # like above but with mpld3
    # so the result gets browsable
    X, Y = path_x_y_abridged(dna, step)
    # and draw it 
    pyplot.plot(X, Y)
    return mpld3.display()    

In [None]:
zoomable_walk_abridged(sample_week1_sequence7, 10)