# Exercises course 2 - Optimizing Python Code for Better Performance
------------------------------

This notebook contains exercises for Intermediate Python Course 2 - Optimizing Python Code for Better Performance.

<br>
<br>

# Exercises Notebook 2 - optimizing python code
------------------------------------------------------------------------



## Exercise 2.1

Your task is to optimize the following functions and benchmark your optimized version against the original one.

In [None]:
def compute_sequence_similarity(seqA, seqB):
    """Compute similarity between 2 sequences as the fraction of
    positions where they have the same value.
    """
    l = len(seqA)
    similar = 0
    for i in range(l):
        if seqA[i] == seqB[i]:
            similar += 1
    return similar / l

def compute_sequence_similarity_mat(lseq):
    """Compute similarity between all sequence pairs."""
    sim = np.zeros((len(lseq), len(lseq)))
    for i, s1 in enumerate(lseq):
        for j, s2 in enumerate(lseq):
            sim[i, j] = compute_sequence_similarity(s1, s2)
    return sim


* You can use the following line of code to generate some data to benchmark the functions:

In [None]:
import numpy as np

lseq = [ ''.join(np.random.choice(list("ATGC"), 500)) for x in range(100) ]

* And this is how you can run the benchmark:

In [None]:
%timeit -n 3 -r 7 _ = compute_sequence_similarity_mat(lseq)
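
The `%timeit` magic only works inside IPython / Jupyter. If you want to benchmark from a plain Python script, the standard library `timeit` module can be used instead. The sketch below is one possible way to do it; the `benchmark` helper is a hypothetical name introduced here for illustration, and it reports the best round rather than the mean and standard deviation that `%timeit` prints, so the numbers are not directly comparable.

```python
import timeit

def benchmark(func, *args, number=3, repeat=7):
    """Return the best average time per call, in seconds.

    Hypothetical helper, roughly mirroring `%timeit -n 3 -r 7`.
    """
    # timeit.repeat runs `number` calls per round, for `repeat` rounds,
    # and returns the total elapsed time of each round.
    totals = timeit.repeat(lambda: func(*args), number=number, repeat=repeat)
    return min(totals) / number

# Stand-in workload, for illustration only.
best = benchmark(sum, range(10_000))
print(f"best round: {best:.2e} s per call")
```

With the exercise functions defined, the equivalent call would be `benchmark(compute_sequence_similarity_mat, lseq)`.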

<br>

**Warning:** this exercise is *not* necessarily easy.
* You will likely have to try different approaches and dig into the libraries' online
  documentation to get good results.
* Here are some **hints:**
  * **Numpy hint:** to transform a string `s` into an array of characters: `np.array(list(s))`.
  * **Cython hint:**
    * Simple solution: type the strings as `str`.
    * More complex solution: use C types such as `char*`. In that case you first need to convert the Python `str`
      to a `bytes` object, for instance with: `c_compatible_string = python_string.encode("UTF-8")`
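
To make the hints above a bit more concrete, here is a small illustration on two made-up strings (`s1` and `s2` are example values, not part of the exercise data). It only shows what the two conversions produce and how an element-wise comparison behaves; it is not the full solution.

```python
import numpy as np

# Two short example sequences (made up for illustration).
s1 = "ATGCA"
s2 = "ATGGA"

# Numpy hint: one character per array element.
a1 = np.array(list(s1))
a2 = np.array(list(s2))

# Element-wise comparison gives a boolean array; its mean is the
# fraction of positions where both sequences have the same value.
matches = a1 == a2
similarity = matches.mean()
print(similarity)  # 0.8 (4 of the 5 positions match)

# Cython hint: str.encode returns a bytes object, which is the Python
# type that Cython can convert to a C `char*`.
b1 = s1.encode("UTF-8")
print(type(b1).__name__, b1)  # bytes b'ATGCA'
```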

<br>
<br>

### Solution:

Uncomment and run the cells below to show the solution.

* **Numba** solution:

In [None]:
# %load solutions/solution_22_numba.py

* **Numpy** solution:

In [None]:
# %load solutions/solution_22_numpy.py

* **Cython, simple** solution:

In [None]:
# %load solutions/solution_22_cython_simple.py

* **Cython, more complex** solution:

In [None]:
# %load solutions/solution_22_cython_complex.py

<br>
<br>
<br>

# Exercises Notebook 3 - working with processes / threads
--------------------------------------------------------------------------------------

## Exercise 3.1

 * Re-think the `integrate_f_native` function given below (the same one we saw in Notebook 3)
   so that it can be parallelized as a few large tasks, rather than as the many small tasks we
   used in Notebook 3.
 * Implement your chosen solution.
 

In [None]:
def f_native(x):
    return x ** 2 - x

def integrate_f_native(a, b, N):
    s = 0
    dx = (b - a) / N
    for i in range(N):
        s += f_native(a + i * dx)
    return s * dx

# Test run and benchmark our function.
print(integrate_f_native(0, 2, 100))
%timeit -n 3 -r 7 _ = integrate_f_native(0, 2, 1000000)

<br>

Your implementation here...

<br>
<br>

### Solution:

Uncomment and run the cells below to show the solution.

* Concept / Idea:

In [None]:
# %load -r -4 solutions/solution_31_multiprocessing.py

* Function definitions:

In [None]:
# %load -r 5-35 solutions/solution_31_multiprocessing.py

* Application:

In [None]:
# %load -r 36- solutions/solution_31_multiprocessing.py
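
For readers who cannot load the solution file, the general idea of "a few large tasks" can be sketched as follows. This is one possible approach under stated assumptions, not necessarily what `solutions/solution_31_multiprocessing.py` contains. The names `integrate_chunk`, `integrate_f_chunked`, `n_tasks` and `mapper` are hypothetical and introduced here for illustration. The sum over `N` steps is split into `n_tasks` contiguous index ranges, each range is summed by one task, and the partial sums are added at the end.

```python
def f_native(x):
    return x ** 2 - x

def integrate_chunk(task):
    """Sum f_native over grid steps start..stop-1 (one large task)."""
    a, dx, start, stop = task
    s = 0.0
    for i in range(start, stop):
        s += f_native(a + i * dx)
    return s

def integrate_f_chunked(a, b, N, n_tasks=4, mapper=map):
    """Same left Riemann sum as integrate_f_native, split into n_tasks chunks.

    `mapper` is a hypothetical hook: any callable with the call pattern
    mapper(func, iterable). The built-in `map` runs the chunks serially.
    """
    dx = (b - a) / N
    # Chunk boundaries: n_tasks contiguous index ranges covering 0..N-1.
    bounds = [N * k // n_tasks for k in range(n_tasks + 1)]
    tasks = [(a, dx, bounds[k], bounds[k + 1]) for k in range(n_tasks)]
    # Each task returns a partial sum; dx is applied once at the end.
    return sum(mapper(integrate_chunk, tasks)) * dx

# Serial check: should agree with integrate_f_native(0, 2, 100)
# up to floating-point rounding.
print(integrate_f_chunked(0, 2, 100))
```

To actually run the chunks in parallel, the `map` method of a `multiprocessing.Pool` can be passed as `mapper`, for example `integrate_f_chunked(0, 2, 1000000, n_tasks=4, mapper=pool.map)` inside a `with multiprocessing.Pool(4) as pool:` block. On platforms where worker processes are spawned rather than forked, functions defined in a notebook cell may need to be moved to an importable module for this to work.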