# Profilers

A quick introduction to profiling within a Jupyter notebook.

We'll use Paul's prime sieve code from an earlier (or possibly later) training module.

In [None]:
import math
def sieve_primes(n):
    a = [True for x in range(n + 1)]
    i = 2
    while i <= math.sqrt(n):
        if a[i]:
            for j in range(i*i, n + 1, i):
                a[j] = False
        i += 1
    return [i for i in range(2, len(a)) if a[i]]

In [None]:
#Check it's working OK
sieve_primes(30)

Now we'll measure `sieve_primes` using the `%time` and `%timeit` magic commands.

In [None]:
#Time it for all primes less than 5 million
N = 5000000
%time p = sieve_primes(N);

In [None]:
%timeit p = sieve_primes(N);

Note the difference in execution time between `%time` and `%timeit`. `%time` runs the statement once, whereas `%timeit` runs it repeatedly and reports the mean and standard deviation, at the expense of waiting longer for the result. In general it's important to execute code a few times when timing and profiling, to ensure the data is representative of typical runs. Code execution times may vary significantly for different input data sets or different initial conditions.

There's more information about the `%timeit` line magic and `%%timeit` cell magic in the [IPython docs](https://ipython.readthedocs.io/en/stable/interactive/magics.html). For example, the `-o` flag can be used to return an object containing the results.
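Outside a notebook, similar statistics can be gathered with the standard library `timeit` module. The following is a minimal sketch rather than a reproduction of what `%timeit` does internally; the problem size `N_SMALL` and the repeat counts are assumptions chosen so the example runs quickly.

```python
import math
import statistics
import timeit


def sieve_primes(n):
    """Same sieve as above, repeated so this block runs on its own."""
    a = [True] * (n + 1)
    i = 2
    while i <= math.sqrt(n):
        if a[i]:
            for j in range(i * i, n + 1, i):
                a[j] = False
        i += 1
    return [i for i in range(2, len(a)) if a[i]]


N_SMALL = 200_000  # assumed size, much smaller than the 5 million used above
CALLS_PER_RUN = 3  # assumed number of calls per timed run

# Five timed runs, each making CALLS_PER_RUN calls to sieve_primes
raw = timeit.repeat(lambda: sieve_primes(N_SMALL), repeat=5, number=CALLS_PER_RUN)
per_call = [t / CALLS_PER_RUN for t in raw]

print(f"mean  = {statistics.mean(per_call):.4f} s per call")
print(f"stdev = {statistics.stdev(per_call):.4f} s")
print(f"best  = {min(per_call):.4f} s")
```

Reporting the spread as well as the mean makes it easier to spot runs that were disturbed by other activity on the machine.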

### Let's use the Python profiler

In [None]:
%prun p = sieve_primes(N)

If we want to apply the profiler to a whole cell, we'd use the `%%prun` cell magic.

For this example the data tells us `sieve_primes` is the function with the largest `tottime`, and gives us some insight into the internals of Python, but doesn't really tell us anything we can use to improve performance. However, the Python profiler is a powerful tool whenever we have multiple functions and want to find out where time is spent.
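To illustrate the multi-function case, here is a minimal sketch using the standard library `cProfile` and `pstats` modules directly. The helper functions `count_twin_primes` and `analyse` are hypothetical, added only so there is more than one function to compare, and the problem size is an assumption chosen to keep the run short.

```python
import cProfile
import io
import math
import pstats


def sieve_primes(n):
    """Same sieve as above, repeated so this block runs on its own."""
    a = [True] * (n + 1)
    i = 2
    while i <= math.sqrt(n):
        if a[i]:
            for j in range(i * i, n + 1, i):
                a[j] = False
        i += 1
    return [i for i in range(2, len(a)) if a[i]]


def count_twin_primes(primes):
    """Hypothetical helper: count adjacent primes that differ by 2."""
    return sum(1 for p, q in zip(primes, primes[1:]) if q - p == 2)


def analyse(n):
    """Hypothetical driver that calls both functions."""
    primes = sieve_primes(n)
    return len(primes), count_twin_primes(primes)


profiler = cProfile.Profile()
profiler.enable()
result = analyse(200_000)  # assumed size, smaller than the 5 million used above
profiler.disable()

# Sort by internal time and show only the five most expensive entries
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("tottime").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by `cumtime` instead of `tottime` would rank `analyse` first, since cumulative time includes everything a function calls.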

### Now let's try the line profiler
This isn't part of core Python, **so make sure the `line_profiler` package is installed** e.g. using `pip` or `conda`. The extension needs to be loaded using the `%load_ext` magic command.

In [None]:
%load_ext line_profiler

Also, we need to tell the profiler which function(s) to profile with the `-f` flag. In this example we want to profile `sieve_primes`, so we need `-f sieve_primes`. If we weren't running in a Jupyter Notebook, we'd need to use the `@profile` decorator instead.

We can also set the time units with `-u`, e.g. `-u 1e-3` for ms. We'll use units of seconds.

In [None]:
%lprun -u 1 -f sieve_primes p = sieve_primes(N)

The `Total time` is much larger than the unprofiled run time, so are these values useful? Sometimes when profiling, there's no clear answer to questions like this!
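Outside a notebook, `line_profiler` is usually driven by its `kernprof` command, which makes a `profile` decorator available while the script runs. The sketch below assumes a script saved under the hypothetical name `sieve_script.py`; the no-op fallback is a common convenience so the same file also runs under plain `python`, and is not something `line_profiler` requires.

```python
# Hypothetical file name: sieve_script.py
# Profile it from a shell with:  kernprof -l -v sieve_script.py
import builtins
import math

# kernprof -l injects `profile` into builtins. When the script is run
# without kernprof, define a do-nothing decorator so it still works.
if not hasattr(builtins, "profile"):
    def profile(func):
        return func


@profile
def sieve_primes(n):
    """Same sieve as above, now marked for line-by-line profiling."""
    a = [True] * (n + 1)
    i = 2
    while i <= math.sqrt(n):
        if a[i]:
            for j in range(i * i, n + 1, i):
                a[j] = False
        i += 1
    return [i for i in range(2, len(a)) if a[i]]


if __name__ == "__main__":
    primes = sieve_primes(200_000)  # assumed size to keep the run short
    print(f"Found {len(primes)} primes")
```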

### What next?
As you might expect, there are plenty of Python packages out there aiming to improve on these profilers. For example, SnakeViz is a browser-based graphical viewer which can be used to display the output of Python's cProfile module, and can be run within notebooks. You'll need to install `snakeviz` first using `pip` or `conda`, and then load the extension.

In [None]:
%load_ext snakeviz

Next we'll run SnakeViz. 

SnakeViz has two views:

* Icicle
 * Fraction of time in a function is represented by width of a rectangle
 * The first function to be called is the top-most rectangle, with the functions it calls directly below it, etc.
 * Internal time (i.e. tottime) for each function is shown as an additional child function 
* Sunburst 
 * Fraction of time in a function is represented by the angular extent of an arc
 * The first function to be called is the innermost circle, with functions it calls around it, etc.
 
Clicking on a function in the visualisation will zoom in. 

SnakeViz also displays the same data we saw above with `%prun` in a table at the bottom. Clicking on a function in this table makes it the root function in the visualization.

For more information, e.g. on Cutoff and Depth, see [the SnakeViz docs](https://jiffyclub.github.io/snakeviz/).

In [None]:
%snakeviz p = sieve_primes(N)
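SnakeViz can also be used outside a notebook by pointing its command line tool at a saved profile file. The following is a minimal sketch of producing such a file with `cProfile`; the file name `sieve_primes.prof` and its location in the temporary directory are assumptions made for illustration.

```python
import cProfile
import math
import os
import pstats
import tempfile


def sieve_primes(n):
    """Same sieve as above, repeated so this block runs on its own."""
    a = [True] * (n + 1)
    i = 2
    while i <= math.sqrt(n):
        if a[i]:
            for j in range(i * i, n + 1, i):
                a[j] = False
        i += 1
    return [i for i in range(2, len(a)) if a[i]]


# Hypothetical output location for the profile data
profile_path = os.path.join(tempfile.gettempdir(), "sieve_primes.prof")

profiler = cProfile.Profile()
profiler.runcall(sieve_primes, 200_000)  # assumed size to keep the run short
profiler.dump_stats(profile_path)

# Read the file back as a sanity check before handing it to a viewer
stats = pstats.Stats(profile_path)
print(f"Wrote {profile_path} ({stats.total_calls} calls recorded)")

# Then, from a shell:  snakeviz <path printed above>
```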

## Exercise

 1. Try different `prun` sort options using the `-s` flag, as described in the [IPython docs](https://ipython.readthedocs.io/en/stable/interactive/magics.html).
 1. Test the profilers on your own code. 