## Profiling and Timing Code
Eber David Gaytan Medina


In the process of developing code and creating data processing pipelines, there are often trade-offs you can make between various implementations. Early in developing your algorithm, it can be counterproductive to worry about such things. As Donald Knuth famously quipped, "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."

But once you have your code working, it can be useful to dig into its efficiency a bit. Sometimes it's useful to check the execution time of a given command or set of commands; other times it's useful to dig into a multiline process and determine where the bottleneck lies in some complicated series of operations. IPython provides access to a wide array of functionality for this kind of timing and profiling of code. Here we'll discuss the following IPython magic commands:

- `%time`: Time the execution of a single statement
- `%timeit`: Time repeated execution of a single statement for more accuracy
- `%prun`: Run code with the profiler
- `%lprun`: Run code with the line-by-line profiler
- `%memit`: Measure the memory use of a single statement
- `%mprun`: Run code with the line-by-line memory profiler

#### Timing Code Snippets: `%timeit` and `%time`

We saw the `%timeit` line magic and the `%%timeit` cell magic in the introduction to magic functions in "IPython Magic Commands"; they can be used to time the repeated execution of snippets of code:
```bash
%timeit sum(range(100))
100000 loops, best of 3: 1.54 µs per loop
```
Note that because this operation is so fast, `%timeit` automatically does a large number of repetitions. For slower commands, `%timeit` will automatically adjust and perform fewer repetitions:
```bash
%%timeit
total = 0
for i in range(1000):
    for j in range(1000):
        total += i * (-1) ** j
1 loops, best of 3: 407 ms per loop
```
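
Outside IPython, the standard-library `timeit` module behaves in a similar way. The following is a minimal sketch in plain Python rather than with the `%timeit` magic: `Timer.autorange()` increases the loop count until one measurement takes at least 0.2 seconds, so fast statements get many loops and slow ones get few. The nested loop is scaled down to 200 × 200 (an assumption made here so the example finishes quickly).

```python
import timeit

# A fast statement: autorange() has to run it many times per measurement.
fast = timeit.Timer("sum(range(100))")
fast_loops, fast_total = fast.autorange()

# A slower statement: far fewer loops are needed to reach the same time budget.
slow_stmt = """
total = 0
for i in range(200):
    for j in range(200):
        total += i * (-1) ** j
"""
slow = timeit.Timer(slow_stmt)
slow_loops, slow_total = slow.autorange()

print(f"fast: {fast_loops} loops, {fast_total / fast_loops * 1e6:.2f} us per loop")
print(f"slow: {slow_loops} loops, {slow_total / slow_loops * 1e3:.2f} ms per loop")
```

The exact loop counts and times depend on the machine, but the fast statement should always be assigned many more loops than the slow one.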
Sometimes repeating an operation is not the best option. For example, if we have a list that we'd like to sort, we might be misled by a repeated operation. Sorting a pre-sorted list is much faster than sorting an unsorted list, so the repetition will skew the result:
```bash
import random
L = [random.random() for i in range(100000)]
%timeit L.sort()
```
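
In a case like this a single timed run is more informative, which is what the `%time` magic provides inside IPython. As a rough plain-Python sketch of the same idea, using `time.perf_counter` as a stand-in for `%time` (the helper name `time_once` is hypothetical), we can time one sort of the unsorted list and then one sort of the already-sorted list:

```python
import random
import time

def time_once(func):
    """Run func a single time and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    func()
    return time.perf_counter() - start

random.seed(0)  # fixed seed so the list is the same on every run
L = [random.random() for _ in range(100000)]

unsorted_time = time_once(L.sort)   # first call sorts random data
presorted_time = time_once(L.sort)  # second call sees an already-sorted list

print(f"unsorted:  {unsorted_time * 1e3:.2f} ms")
print(f"presorted: {presorted_time * 1e3:.2f} ms")
```

With `%timeit L.sort()`, every repetition after the first measures the presorted case, so the reported time says little about sorting random data.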

#### Profiling Full Scripts: %prun

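A program is made up of many statements, and timing them in context is often more useful than timing them in isolation. The `%prun` magic runs a statement under the profiler. As an example, we define a simple function that does some calculations:

```python
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total
```

Now we can call `%prun` with a function call to see the profiled results:

```bash
%prun sum_of_lists(1000000)
```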
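
A similar function-level report can be collected in a plain script. The following is a minimal sketch, assuming the standard-library `cProfile` and `pstats` modules and a smaller `N` than above so it finishes quickly; it is an illustration, not a reproduction of what `%prun` prints.

```python
import cProfile
import io
import pstats

def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

# Collect profiling data only while the function call is running.
profiler = cProfile.Profile()
profiler.enable()
result = sum_of_lists(100000)
profiler.disable()

# Sort by total time spent inside each function and show the top entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("tottime").print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists each function with its call count and the time spent in it, which shows where to start looking for a bottleneck.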

#### Line-By-Line Profiling with %lprun

Line-by-line profiling is not built into Python or IPython, but the `line_profiler` package provides it. Start by using pip to install the package:

```bash
$ pip install line_profiler
```

Next, you can use IPython to load the `line_profiler` IPython extension, offered as part of this package:

```bash
%load_ext line_profiler
```

Now the `%lprun` command will do a line-by-line profiling of any function. In this case, we need to tell it explicitly which functions we're interested in profiling:

```bash
%lprun -f sum_of_lists sum_of_lists(5000)
```

#### Profiling Memory Use: %memit and %mprun

Another aspect of profiling is the amount of memory an operation uses. This can be evaluated with another IPython extension, the memory_profiler. As with the line_profiler, we start by pip-installing the extension:
```bash
$ pip install memory_profiler
```

Then we can use IPython to load the extension:

```bash
%load_ext memory_profiler
```

The memory profiler extension contains two useful magic functions: the `%memit` magic (which offers a memory-measuring equivalent of `%timeit`) and the `%mprun` function (which offers a memory-measuring equivalent of `%lprun`). The `%memit` function can be used rather simply:

```bash
%memit sum_of_lists(1000000)
peak memory: 100.08 MiB, increment: 61.36 MiB
```
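
If `memory_profiler` is not available, the standard-library `tracemalloc` module gives a related measurement. This sketch is a different technique from `%memit`: it counts only memory allocated by Python objects while tracing is active, not the total memory of the process, so its numbers will not match the output above. The helper name `peak_memory_mib` is hypothetical, and `N` is reduced so the example runs quickly.

```python
import tracemalloc

def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

def peak_memory_mib(func, *args):
    """Return (result, peak MiB of Python allocations made while func ran)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()  # always stop tracing, even if func raises
    return result, peak / 2**20

result, peak = peak_memory_mib(sum_of_lists, 100000)
print(f"peak traced memory: {peak:.2f} MiB")
```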

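For a line-by-line description of memory use, we can use the `%mprun` magic. It works only for functions defined in separate modules rather than in the notebook itself, so we start by using the `%%file` magic to create a simple module called `mprun_demo.py` that contains our `sum_of_lists` function with one added line:

```bash
%%file mprun_demo.py
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
        del L # remove reference to L
    return total
Overwriting mprun_demo.py
```

We can now import the new version of this function and run the memory line profiler:

```bash
from mprun_demo import sum_of_lists
%mprun -f sum_of_lists sum_of_lists(1000000)
```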
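
The effect of the `del L` line is that the previous list is released before the next one is built, so at most one list is alive at a time. The sketch below checks this with the standard-library `tracemalloc` module rather than `%mprun` (a different tool, which counts only Python-level allocations). The names `without_del`, `with_del`, and `traced_peak` are hypothetical, and `N` is reduced so the example runs quickly.

```python
import tracemalloc

def without_del(N):
    total = 0
    for i in range(5):
        # The old list is still bound to L while the new one is being built.
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

def with_del(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
        del L  # release the list before the next iteration allocates a new one
    return total

def traced_peak(func, N):
    """Peak bytes of Python allocations recorded while func(N) runs."""
    tracemalloc.start()
    try:
        func(N)
        return tracemalloc.get_traced_memory()[1]
    finally:
        tracemalloc.stop()

peak_without = traced_peak(without_del, 100000)
peak_with = traced_peak(with_del, 100000)
print(f"without del: {peak_without / 2**20:.2f} MiB")
print(f"with del:    {peak_with / 2**20:.2f} MiB")
```

Both versions return the same total; only the peak memory differs, because the version without `del` briefly holds two lists at once.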