# Line-profiler

This tutorial covers the basics of the [line_profiler](https://github.com/rkern/line_profiler) Python package.  I'll cover what exactly line-profiler is (and why you should care) and how to use it.


## What is line-profiler

As the name suggests, line-profiler profiles the runtime of a Python script line by line. From the profile you can identify **hotspots**, i.e., the lines of code that take the longest to run. While sometimes this is of purely academic interest, a hotspot often marks a place where an alternative algorithm could lead to significant time savings.  If a loop is executed a million times and you save 10 ms on each iteration, congratulations: your code now finishes nearly three hours sooner!
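The arithmetic behind that estimate is quick to check. A minimal sketch, using the loop count and per-iteration saving quoted above (the variable names are my own):

```python
# Back-of-the-envelope estimate of the time saved by a faster loop body.
n_iterations = 1000000        # Number of loop iterations.
saving_per_iteration = 10e-3  # Seconds saved per iteration (10 ms).

total_saving_s = n_iterations * saving_per_iteration
total_saving_h = total_saving_s / 3600.0

print("Saved {:.0f} s, i.e. about {:.2f} hours".format(total_saving_s, total_saving_h))
```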

## Installing line-profiler

Provided you have a working Python distribution, installing should be as easy as 
```
$ pip install line_profiler
```

If this fails, you can also clone the GitHub repository:

```
$ git clone https://github.com/rkern/line_profiler.git
```

And then install from inside the cloned directory using

```
$ cd line_profiler
$ python setup.py install
```

Remember that you may need the ```--user``` or ```--prefix=/directory/to/local/python/site-packages/``` argument if you're on a cluster such as ```g2```.


## Using line-profiler with Jupyter Notebook

When using line-profiler there is a slight twist depending on whether you're running the Python script directly from the terminal (e.g., ``python my_script.py``) or from inside a Jupyter notebook.  I'll go over both of these methods before showing some examples from my own code (which is all executed via command line).

Loading line-profiler into your notebook is one easy command:

```python
%load_ext line_profiler
```

Next we need some functions to test our run-time on.  

For the first example, say we are given particle positions ``x``, ``y`` and ``z`` and we wish to calculate the pair-wise distance between each particle. The pair-wise distance is a very useful property which allows us to compute further statistics such as the correlation function or power spectrum, and is necessary for things such as force calculations. 

```python
from __future__ import print_function # Always do this >:(
from __future__ import division
import numpy as np

pos = np.random.random((1000, 3)) # Generate 1000 random 3D positions, each coordinate between 0 and 1.
print(pos[0])
print(pos[0, 0])
```

```
[ 0.39347777  0.82736576  0.15859656]
0.393477774111
```


First we'll do my favourite method: the crass, brute-force method that involves nested ``for`` loops.

```python
def pairwise_brute(pos):
    npart = pos.shape[0]
    ndim = pos.shape[1]
    distance = np.empty((npart, npart), dtype=np.float64)

    # Note that the distance between pair i-j is the same as j-i.
    for i in range(npart):
        for j in range(npart):
            d = 0.0
            for k in range(ndim):
                d += pow(pos[i, k] - pos[j, k], 2) # Gets the (square) distance in one dimension between particles i and j.
            distance[i, j] = np.sqrt(d)
        assert distance[i, i] == 0 # A particle should have zero distance from itself.
    return distance
```


```python
distance = pairwise_brute(pos)
print(distance[0, 100]) # This is the distance between particle 0 and 100.
```

```
0.212877826851
```
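The comment inside ``pairwise_brute`` points out that the distance for pair i-j is the same as for j-i, yet the loops still visit every pair twice. As a hedged sketch (the name ``pairwise_brute_symmetric`` and the small demonstration array are my own, not part of the profiled code), one could compute only the pairs with j > i and mirror the result:

```python
import numpy as np

def pairwise_brute_symmetric(pos):
    """Brute-force pairwise distances, computing each pair only once."""
    npart, ndim = pos.shape
    distance = np.zeros((npart, npart), dtype=np.float64)  # Diagonal stays zero.

    for i in range(npart):
        for j in range(i + 1, npart):  # Only visit pairs with j > i.
            d = 0.0
            for k in range(ndim):
                d += (pos[i, k] - pos[j, k]) ** 2
            distance[i, j] = np.sqrt(d)
            distance[j, i] = distance[i, j]  # Mirror into the lower triangle.
    return distance

# Small demonstration array (hypothetical size, chosen so this runs quickly).
demo_pos = np.random.random((50, 3))
demo_distance = pairwise_brute_symmetric(demo_pos)
print(demo_distance[0, 10])
```

This roughly halves the number of inner-loop iterations, but it is still a pure-Python triple loop, so it is a variation on the brute-force approach rather than a replacement for it.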


Then we run ```line-profiler``` on our code.  The ``-f`` argument names the function that we want to profile; it is followed by the statement to execute, so we need to remember to provide the input arguments.

```python
%lprun -f pairwise_brute pairwise_brute(pos)
```

Be careful: in the table that ``%lprun`` prints, the **Time** column is expressed in timer units, which default to $10^{-6}s$, i.e., $\mu s$.  The **Hits** column is the number of times each line was executed, the **Time** column tells us how many $\mu s$ we spend on each line in total, the **Per Hit** column tells us the average time spent on each execution of that line (**Time** divided by **Hits**), and finally the **% Time** column is the percentage of time spent on that line (relative to the total amount of time spent in the function).
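To make the relationship between the columns concrete, here is a minimal sketch that recomputes **Per Hit** and **% Time** from **Hits** and **Time**. The three rows are made-up numbers for illustration only; they are not output from the profile above.

```python
# Hypothetical (hits, time) pairs for three lines of a profiled function.
# Time is in timer units (microseconds by default); the numbers are made up.
rows = {
    "line A": (1000, 500.0),
    "line B": (1000000, 400000.0),
    "line C": (1000, 1900.0),
}

total_time = sum(line_time for _, line_time in rows.values())

summary = {}
for name, (hits, line_time) in rows.items():
    per_hit = line_time / hits                 # Average time per execution of the line.
    pct_time = 100.0 * line_time / total_time  # Share of the function's total time.
    summary[name] = (per_hit, pct_time)
    print("{}: Per Hit = {:.2f}, % Time = {:.1f}".format(name, per_hit, pct_time))
```

In this toy example "line C" is the most expensive line per hit, yet "line B" dominates the total simply because it is executed so many more times.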

From this table a number of things become apparent.  Firstly, I enjoy writing long comments.  Secondly, one of the most expensive lines of code per hit is line 13, where we perform our ``assert`` error check.  This really highlights the power of a tool like line-profiler: whilst this line is expensive per hit (taking $1.9\mu s$), the **relative** amount of time spent on this operation is small, because it is only executed once per particle. The honour of most expensive operation goes to line 11, on which we spend $44\%$(!!) of our time.  This is a result of our nested ``for`` loops; we are hitting line 11 a grand total of $4031247$ times.

Let's see if we can do better.

```python
def pairwise_numpy(pos):
    padded_pos = pos[:, None, :] # Insert a new axis so that broadcasting gives every i-j difference, with shape (N, N, 3).
    distance = np.sqrt(np.sum(np.square(padded_pos - pos), axis=-1)) # Square the differences, sum over the coordinate axis, then take the square root (Pythagoras).
    return distance
```
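Before comparing timings, it is worth confirming that the vectorised version gives the same answer as an explicit loop over pairs. A minimal sketch, assuming a small test array (``check_pos`` and ``reference`` are names I've introduced) and using ``np.linalg.norm`` for the reference values:

```python
import numpy as np

def pairwise_numpy(pos):
    padded_pos = pos[:, None, :]
    return np.sqrt(np.sum(np.square(padded_pos - pos), axis=-1))

check_pos = np.random.random((20, 3))  # Small hypothetical test set.
check_distance = pairwise_numpy(check_pos)

# Reference values from an explicit double loop over particle pairs.
reference = np.array([[np.linalg.norm(a - b) for b in check_pos] for a in check_pos])

print(np.allclose(check_distance, reference))
```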

```python
%lprun -f pairwise_numpy pairwise_numpy(pos)
```

Huzzah, we have a remarkable speed-up relative to the brute-force method.  
**Beware:** When using NumPy broadcasting, temporary arrays are created, potentially causing memory issues for (very) large arrays.
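To get a feel for how quickly this becomes a problem, here is a rough estimate of the size of the ``(N, N, 3)`` difference array alone. The helper ``broadcast_temp_bytes`` is a name I've made up for this sketch, and the estimate ignores any further temporaries (for example the result of ``np.square``), so treat it as a lower bound.

```python
import numpy as np

def broadcast_temp_bytes(npart, ndim=3, dtype=np.float64):
    """Size in bytes of the (npart, npart, ndim) temporary created by broadcasting."""
    return npart * npart * ndim * np.dtype(dtype).itemsize

for npart in (1000, 10000, 100000):
    print("{:>6d} particles: {:.3g} GB".format(npart, broadcast_temp_bytes(npart) / 1e9))

# Sanity check against a real (small) temporary array.
small_pos = np.random.random((100, 3))
small_diff = small_pos[:, None, :] - small_pos
print(small_diff.nbytes == broadcast_temp_bytes(100))
```

The memory grows with the square of the particle count, so a tenfold increase in particles costs a hundredfold increase in memory.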

## Using line-profiler with terminal

If you're like me and behind the times, you can also run line-profiler from the terminal.  For this, all you do is add the decorator ``@profile`` before the function you want to profile.  Then you run the profiler with

```
$ kernprof -l my_script.py <Script Arguments>
```
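For reference, a decorated script might look like the sketch below. The function ``sum_of_roots`` is a made-up example, and the ``try``/``except`` fallback is a common idiom rather than something line_profiler requires: ``kernprof`` makes ``profile`` available when it runs the script, and the fallback simply lets the same file run under plain ``python`` too.

```python
# my_script.py: a hypothetical script to run with `kernprof -l my_script.py`.
import math

# When run under kernprof, `profile` already exists. Otherwise fall back to a
# do-nothing decorator so the script still runs with plain `python`.
try:
    profile
except NameError:
    def profile(func):
        return func

@profile
def sum_of_roots(n):
    total = 0.0
    for i in range(n):
        total += math.sqrt(i)
    return total

if __name__ == "__main__":
    print(sum_of_roots(100000))
```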

If you get the error '``bash: kernprof: command not found``', then specify the full path to ``kernprof.py`` in the directory where you installed ``line_profiler``. On my machine, for example:

```
$ python3 /home/jseiler/.local/lib/python3.5/site-packages/kernprof.py -l rewrite_files.py 15 15
```

Once the profiling has completed you can view the results via 

```
$ python3 -m line_profiler my_script.py.lprof
```

As an example from my own code (I've only included the lines with $>0.0$ in the **% Time** column):


From this we can see that a massive $97.3\%$ of time is spent doing two ```np.nonzero``` calculations.  This really puts things into perspective: if I wanted to optimize my code, these are **the** lines I would start with.  It isn't even worth thinking about optimizing the other parts of my code when they form such a negligible percentage of the time spent.

## Wrapping Up

Hopefully this tutorial has given a basic demonstration of how to use ```line_profiler```. More importantly, I hope it's outlined the use of such a tool: namely, identifying hotspots within your code so you know **where** to start optimization.  Blindly going through every line in your code and asking "Could I make this faster?" is a painful exercise, and you often won't see a noticeable speedup in your code.

As always, the [GitHub page](https://github.com/rkern/line_profiler) for the package will contain a wealth of information on running the code and some tips and tricks.