# Profiling and Timing Code

IPython provides access to a wide array of functionality for timing and profiling code:

- __``%time``__: Time the execution of a single statement
- __``%timeit``__: Time repeated execution of a single statement for more accuracy
- __``%prun``__: Run code with the profiler
- __``%lprun``__: Run code with the line-by-line profiler
- __``%memit``__: Measure the memory use of a single statement
- __``%mprun``__: Run code with the line-by-line memory profiler

The last three commands are not bundled with IPython; you'll need to install the ``line_profiler`` and ``memory_profiler`` extensions, which we will discuss in the following sections.

### Timing Code Snippets: ``%timeit`` and ``%time``

The ``%timeit`` line magic and the ``%%timeit`` cell magic time the repeated execution of snippets of code:

In [1]:
%timeit sum(range(100))

875 ns ± 2.31 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


Note: because this operation is so fast, __``%timeit`` automatically does a large number of repetitions.__
For slower commands, ``%timeit`` will automatically adjust and perform fewer repetitions:

In [1]:
%%timeit
total = 0
for i in range(1000):
    for j in range(1000):
        total += i * (-1) ** j

443 ms ± 2.97 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
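Outside IPython, roughly the same measurement can be made with the standard-library ``timeit`` module. The following is a minimal sketch, not what ``%timeit`` does internally: the loop counts are chosen by hand rather than adjusted automatically, the helper name ``alternating_sum`` is made up for this illustration, and the loop size is reduced to 100 so the example runs quickly.

```python
import timeit

def alternating_sum(n):
    """Same nested loop as the cell above, wrapped in a (hypothetical) helper."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += i * (-1) ** j
    return total

# repeat=7 mirrors the "7 runs" in the %timeit report;
# number is how many loops are executed inside each run.
fast_loops = 10_000
fast = timeit.repeat("sum(range(100))", repeat=7, number=fast_loops)
slow = timeit.repeat(lambda: alternating_sum(100), repeat=7, number=1)

# Each entry is the total time for `number` loops, so divide to get per-loop time.
print(f"fast: {min(fast) / fast_loops * 1e9:.0f} ns per loop (best of 7)")
print(f"slow: {min(slow) * 1e3:.2f} ms per loop (best of 7)")
```

The fast statement needs many loops per run to rise above timer resolution, while the slow one is measurable with a single loop. That is the trade-off ``%timeit`` resolves automatically.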


__Sometimes repeating an operation is not the best option.__
For example, sorting a pre-sorted list is much faster than sorting an unsorted list, so the repetition will skew the result:

In [2]:
import random
L = [random.random() for i in range(100000)]
%timeit L.sort()

1.82 ms ± 11.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In such cases, __the ``%time`` magic function may be a better choice__. It is also a good choice for longer-running commands, when short, system-related delays are unlikely to affect the result.

Let's time the sorting of an unsorted and a presorted list:

In [3]:
import random
L = [random.random() for i in range(100000)]
print("sorting an unsorted list:")
%time L.sort()

sorting an unsorted list:
CPU times: user 44 ms, sys: 0 ns, total: 44 ms
Wall time: 43.1 ms


In [4]:
print("sorting an already sorted list:")
%time L.sort()

sorting an already sorted list:
CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 5.25 ms


__Notice also how much longer the timing takes with ``%time`` versus ``%timeit``, even for the presorted list!__

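This is because ``%timeit`` takes steps under the hood to keep system-related delays from interfering with the timing. For example, it disables garbage collection (the cleanup of unused Python objects) while the statement runs.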
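If repeated timings are still wanted for an in-place operation such as ``sort``, one option is to give every timing its own unsorted copy. The sketch below uses the standard-library ``timeit`` module instead of the magics. It relies on ``timeit.repeat`` re-running ``setup`` before each repetition, and the variable names are chosen for this illustration only.

```python
import random
import timeit

random.seed(0)  # fixed seed so the example is repeatable
data = [random.random() for _ in range(100_000)]

# `setup` runs before each of the 7 timings, and number=1 means each timing
# sorts a freshly copied, still-unsorted list exactly once.
unsorted_times = timeit.repeat(
    stmt="work.sort()",
    setup="work = data.copy()",
    repeat=7,
    number=1,
    globals={"data": data},
)

# For contrast: time sorting copies of a list that is already in order.
presorted = sorted(data)
presorted_times = timeit.repeat(
    stmt="work.sort()",
    setup="work = presorted.copy()",
    repeat=7,
    number=1,
    globals={"presorted": presorted},
)

print(f"unsorted:  {min(unsorted_times) * 1e3:.2f} ms")
print(f"presorted: {min(presorted_times) * 1e3:.2f} ms")
```

Because the copy happens in ``setup``, its cost is excluded from the measured time.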

Both ``%time`` and ``%timeit`` accept the double-percent-sign cell magic syntax for multiline scripts:

In [5]:
%%time
total = 0
for i in range(1000):
    for j in range(1000):
        total += i * (-1) ** j

CPU times: user 416 ms, sys: 0 ns, total: 416 ms
Wall time: 415 ms
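The two figures that ``%time`` reports, CPU time and wall time, can be approximated in plain Python with the standard ``time`` module. This is a rough sketch under that assumption, not IPython's implementation. The helper name ``timed`` is hypothetical, and it reports only a combined CPU figure rather than separate user and sys values.

```python
import time

def timed(func, *args, **kwargs):
    """Run func once and return (result, cpu_seconds, wall_seconds)."""
    wall_start = time.perf_counter()  # high-resolution wall-clock timer
    cpu_start = time.process_time()   # user + system CPU time of this process
    result = func(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, cpu, wall

result, cpu, wall = timed(sorted, range(100_000, 0, -1))
print(f"CPU time: {cpu * 1e3:.1f} ms")
print(f"Wall time: {wall * 1e3:.1f} ms")
```

Wall time includes any period the process spends waiting (for example on I/O or other processes), so it can exceed CPU time noticeably for code that is not CPU-bound.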


### Profiling Full Scripts: ``%prun``

IPython offers a way to use Python's built-in profiler with __``%prun``__:

In [6]:
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

In [7]:
%prun sum_of_lists(1000000)

 

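In the notebook, the output is printed to the pager and looks something like the report below. Functions are listed in order of internal time (``tottime``), so the most expensive calls appear first; here the list comprehension inside ``sum_of_lists`` dominates.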

```
14 function calls in 0.714 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    0.599    0.120    0.599    0.120 <ipython-input-19>:4(<listcomp>)
        5    0.064    0.013    0.064    0.013 {built-in method sum}
        1    0.036    0.036    0.699    0.699 <ipython-input-19>:1(sum_of_lists)
        1    0.014    0.014    0.714    0.714 <string>:1(<module>)
        1    0.000    0.000    0.714    0.714 {built-in method exec}
```
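A comparable report can be produced without the magic by calling the standard-library ``cProfile`` and ``pstats`` modules directly. This is a minimal sketch. It uses a smaller ``N`` than the cell above to keep it quick, and the exact rows depend on the Python version (on recent versions the comprehension may not appear as a separate row).

```python
import cProfile
import io
import pstats

def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

# Collect profiling data only while the call of interest is running.
profiler = cProfile.Profile()
profiler.enable()
sum_of_lists(100_000)
profiler.disable()

# Sort by internal time, as in the report above, and keep the top 5 rows.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("tottime").print_stats(5)
report = buffer.getvalue()
print(report)
```

Writing the report to a string buffer rather than standard output makes it easy to save or post-process.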

### Line-By-Line Profiling with ``%lprun``

The function-by-function profiling of ``%prun`` is useful, but __sometimes it's more convenient to have a line-by-line profile report__.
This is not built into Python or IPython, but there is a ``line_profiler`` package available for installation that can do this.
Start by using Python's packaging tool, ``pip``, to install the ``line_profiler`` package:

```
$ pip install line_profiler
```

Next, you can use IPython to load the ``line_profiler`` IPython extension, offered as part of this package:

In [8]:
%load_ext line_profiler

Now the ``%lprun`` command will do a line-by-line profiling of any function. __We need to tell it explicitly which functions we're interested in profiling__:

In [9]:
%lprun -f sum_of_lists sum_of_lists(5000)

Here's an example report:

```
Timer unit: 1e-06 s

Total time: 0.009382 s
File: <ipython-input-19-fa2be176cc3e>
Function: sum_of_lists at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def sum_of_lists(N):
     2         1            2      2.0      0.0      total = 0
     3         6            8      1.3      0.1      for i in range(5):
     4         5         9001   1800.2     95.9          L = [j ^ (j >> i) for j in range(N)]
     5         5          371     74.2      4.0          total += sum(L)
     6         1            0      0.0      0.0      return total
```

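The information at the top gives us the key to reading the results: the timer unit is 1e-06 s, so times are reported in microseconds. From the ``% Time`` column we can see where the program is spending the most time: line 4, the list comprehension, accounts for about 96% of the run.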
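To see how the columns relate, the derived values can be recomputed from the ``Hits`` and ``Time`` columns of the example report. The sketch below is plain arithmetic on numbers copied from that report; it does not call ``line_profiler``.

```python
# (line number, hits, time in timer units) copied from the example report
rows = [(2, 1, 2), (3, 6, 8), (4, 5, 9001), (5, 5, 371), (6, 1, 0)]
timer_unit = 1e-06  # seconds per unit, from the "Timer unit" header

total_units = sum(t for _, _, t in rows)
print(f"Total time: {total_units * timer_unit:.6f} s")

for lineno, hits, t in rows:
    per_hit = t / hits            # "Per Hit" = Time / Hits
    pct = 100 * t / total_units   # "% Time" = share of the function's total
    print(f"{lineno:>6} {hits:>9} {t:>12} {per_hit:>8.1f} {pct:>8.1f}")
```

The per-line times add up to 9382 units, which matches the ``Total time`` of 0.009382 s in the header.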

### Profiling Memory Use: ``%memit`` and ``%mprun``

__Another aspect of profiling is the amount of memory an operation uses.__ This can be evaluated with the ``memory_profiler`` extension.

Start by ``pip``-installing the extension:

```
$ pip install memory_profiler
```

Then use IPython to load the extension:

In [10]:
%load_ext memory_profiler

The memory profiler contains two useful magic functions: __the ``%memit`` magic (a memory-measuring equivalent of ``%timeit``)__ and __the ``%mprun`` magic (a memory-measuring equivalent of ``%lprun``)__.

In [11]:
%memit sum_of_lists(1000000)

peak memory: 130.64 MiB, increment: 76.29 MiB


We see that the process peaked at about 130 MiB during this call, of which about 76 MiB (the increment) was added by running the function.
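When ``memory_profiler`` is not available, the standard-library ``tracemalloc`` module gives a related measurement. Note that this is a different technique: it tracks memory allocated through Python's allocators rather than the memory of the whole process, so its numbers are not directly comparable to the ``%memit`` output. A minimal sketch, with a smaller ``N`` to keep it quick:

```python
import tracemalloc

def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

tracemalloc.start()
sum_of_lists(100_000)
# current = bytes still allocated, peak = highest value seen since start()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"peak traced memory: {peak / 2**20:.2f} MiB")
print(f"still allocated:    {current / 2**20:.2f} MiB")
```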

__For a line-by-line description of memory use__, we can use the ``%mprun`` magic.

__Unfortunately, this magic works only for functions defined in separate modules__ rather than in the notebook itself. __Start by using ``%%file`` to create a module called ``mprun_demo.py``, which contains our ``sum_of_lists`` function__, with one addition that will make our memory profiling results clearer:

In [12]:
%%file mprun_demo.py
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
        del L # remove reference to L
    return total

Writing mprun_demo.py


Now import the new version of this function and run the memory line profiler:

In [16]:
from mprun_demo import sum_of_lists
%mprun -f sum_of_lists sum_of_lists(10000)




Here is an example report:
```
Filename: ./mprun_demo.py

Line #    Mem usage    Increment   Line Contents
================================================
     4     71.9 MiB      0.0 MiB           L = [j ^ (j >> i) for j in range(N)]


Filename: ./mprun_demo.py

Line #    Mem usage    Increment   Line Contents
================================================
     1     39.0 MiB      0.0 MiB   def sum_of_lists(N):
     2     39.0 MiB      0.0 MiB       total = 0
     3     46.5 MiB      7.5 MiB       for i in range(5):
     4     71.9 MiB     25.4 MiB           L = [j ^ (j >> i) for j in range(N)]
     5     71.9 MiB      0.0 MiB           total += sum(L)
     6     46.5 MiB    -25.4 MiB           del L # remove reference to L
     7     39.1 MiB     -7.4 MiB       return total
```

The ``Increment`` column tells us how much each line affects the total memory budget: observe that the list ``L`` accounts for about 25 MiB of memory usage, which is added when line 4 creates it and released when line 6 deletes it.

This is on top of the background memory usage from the Python interpreter itself.
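The rise and fall around creating and deleting ``L`` can also be observed with the standard-library ``tracemalloc`` module. As a sketch under the assumption that only Python-level allocations matter here, the following records traced memory before the list exists, after it is built, and after ``del``. The variable names are chosen for this illustration.

```python
import tracemalloc

tracemalloc.start()
baseline, _ = tracemalloc.get_traced_memory()

L = [j ^ (j >> 1) for j in range(100_000)]
after_create, _ = tracemalloc.get_traced_memory()

del L  # drop the only reference so the list can be freed
after_delete, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"increment on create: {(after_create - baseline) / 2**20:+.2f} MiB")
print(f"increment on delete: {(after_delete - after_create) / 2**20:+.2f} MiB")
```

The two increments have opposite signs and similar magnitude, which is the same pattern as lines 4 and 6 of the ``%mprun`` report.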