# Profiling

When you optimize code, you need to know how well it performs and whether your changes actually improve it. That means measuring before and after every change.

Profiling is the practice of measuring how a particular piece of code performs, most often how long it takes to run.


## The IPython timeit magic

In a Jupyter notebook running the IPython kernel, you can use a ‘magic’ function to time code: `%timeit` times a single statement, and `%%timeit` times all the statements in a cell.

For example:

In [2]:
import numpy as np

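# 10e3, 10e4 and 10e5 are floats (10 000, 100 000 and 1 000 000), hence the int() cast
for size in [10e3, 10e4, 10e5]:
    X = np.random.rand(int(size))  # 1-D array, so X.T is the same as X
    %timeit np.dot(X, X.T)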

3.24 µs ± 81.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
16.8 µs ± 454 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
220 µs ± 16 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
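
Outside a notebook the `%timeit` magic is not available, but the standard-library `timeit` module offers a comparable measurement. The following is a minimal sketch, not the notebook's own method: the array size and the `repeat` and `number` values are assumptions chosen for illustration.

```python
import timeit

import numpy as np

# Hypothetical array size, chosen only for illustration
X = np.random.rand(10_000)

# repeat() returns one total time (in seconds) per run;
# each run executes the statement `number` times
number = 1_000
totals = timeit.repeat(lambda: np.dot(X, X), repeat=5, number=number)

# Convert totals to per-call times and report the fastest run
per_call = [t / number for t in totals]
print(f"best of {len(per_call)} runs: {min(per_call) * 1e6:.2f} µs per call")
```

Reporting the minimum rather than the mean is a common choice here, because slower runs are usually caused by other activity on the machine rather than by the code being timed.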


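A slightly more elaborate example uses the `%%timeit` cell magic to time a whole cell, here a triple loop that computes the Euclidean distance between every pair of rows of `X`: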

In [4]:
%%timeit
X = np.random.rand(100, 100)
D = np.empty((100, 100))

M = X.shape[0]
N = X.shape[1]
for i in range(M):
    for j in range(M):
        d = 0.0
        for k in range(N):
            tmp = X[i, k] - X[j, k]
            d += tmp * tmp
        D[i, j] = np.sqrt(d)

463 ms ± 10.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
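
To illustrate measuring before and after a change, the sketch below times the same triple loop against a broadcasting-based variant. The variant `distance_broadcast`, the function names, and the reduced 30 x 30 input are all assumptions introduced for this example; they are not part of the original notebook.

```python
import timeit

import numpy as np


def distance_loops(X):
    # Triple loop from the cell above, taking X as an argument
    M, N = X.shape
    D = np.empty((M, M))
    for i in range(M):
        for j in range(M):
            d = 0.0
            for k in range(N):
                tmp = X[i, k] - X[j, k]
                d += tmp * tmp
            D[i, j] = np.sqrt(d)
    return D


def distance_broadcast(X):
    # Hypothetical alternative: compute all pairwise differences at once
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


X = np.random.rand(30, 30)  # smaller than 100 x 100 so the loops finish quickly

# Check that both versions agree before comparing their speed
same = np.allclose(distance_loops(X), distance_broadcast(X))

t_loops = min(timeit.repeat(lambda: distance_loops(X), repeat=3, number=1))
t_broadcast = min(timeit.repeat(lambda: distance_broadcast(X), repeat=3, number=1))
print(f"results agree: {same}")
print(f"loops: {t_loops * 1e3:.2f} ms, broadcast: {t_broadcast * 1e3:.3f} ms")
```

Checking that the two versions return the same result first matters: a timing comparison only means something if the faster code still computes the right answer.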


But if you want to know how each line affects performance, you'll need to use a line-by-line profiler.

In [7]:
%load_ext line_profiler

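ModuleNotFoundError: No module named 'line_profiler'

The error above means that the `line_profiler` package is not installed in this environment. Install it first (for example with `pip install line_profiler`) and rerun the cell.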

Line profilers operate on functions, so let's put the code into a function:

In [5]:
def distance():
    X = np.random.rand(100, 100)
    D = np.empty((100, 100))

    M = X.shape[0]
    N = X.shape[1]
    for i in range(M):
        for j in range(M):
            d = 0.0
            for k in range(N):
                tmp = X[i, k] - X[j, k]
                d += tmp * tmp
            D[i, j] = np.sqrt(d)

The line profiler needs to know which function you are profiling, and that function has to be called (possibly indirectly) by the statement you run:

    %lprun -f function_to_be_profiled function_to_be_called()
    
In this particular case, the function to be profiled and the function to be called are one and the same:

In [6]:
%lprun -f distance distance()

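UsageError: Line magic function `%lprun` not found.

This error appears because the `line_profiler` extension was never loaded. Once the package is installed and `%load_ext line_profiler` succeeds, `%lprun` reports the time spent on each line of `distance`.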
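
When `line_profiler` is not available, the standard-library `cProfile` module can still show where time goes, although only per function call, not per line. The following is a minimal sketch of that swapped-in technique, not a replacement for `%lprun`; the size parameter `n` and its small default are assumptions added so the example runs quickly.

```python
import cProfile
import io
import pstats

import numpy as np


def distance(n=20):
    # Same triple loop as above; n is a hypothetical size parameter
    X = np.random.rand(n, n)
    D = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            d = 0.0
            for k in range(n):
                tmp = X[i, k] - X[j, k]
                d += tmp * tmp
            D[i, j] = np.sqrt(d)
    return D


# Record every function call made while distance() runs
profiler = cProfile.Profile()
profiler.enable()
D = distance()
profiler.disable()

# Write the report to a string buffer, sorted by cumulative time,
# and keep only the five most expensive entries
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Because the loop body is plain Python arithmetic and array indexing, most of the time is attributed to `distance` itself, which is exactly the situation where a line-by-line profiler is more informative.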
