# Profiling

Profiling should be considered a mandatory prerequisite to code optimization; without solid data on how your program runs, you are likely to misidentify the bottlenecks and waste valuable time and effort on the wrong sections of code. Profiling also tells you what the bottleneck actually is. Given how easy profiling is in Python, it is not a step to skip.

## Timing your code
The first port of call is almost always timing how long your code takes to run. Python's `time` module lets you record a timestamp before and after your code runs; the difference between the two is the elapsed time. We show an example of this below.

In [15]:
import time

def fib(n):
    """Compute the nth Fibonacci number."""
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)

start_time = time.time()
result = fib(30)
end_time = time.time()
print(f"fib(30) = {result}, computed in {end_time - start_time:.3f} seconds")

fib(30) = 832040, computed in 0.088 seconds


On our computer, calling `fib(30)` gave us the result 832040, in 0.088 seconds.

**QUESTION: How long does the `fib` function take with the inputs n = 20, 35, and 40? Write the answers down. How about other numbers? Is there a pattern?**

<details>
<summary>My answers</summary>
On my machine:

- `fib(20)` ran in 0.01 seconds
- `fib(30)` ran in 0.088 seconds
- `fib(35)` ran in 0.905 seconds
- `fib(40)` ran in 9.861 seconds

</details>

**QUESTION: Without editing the code provided, how long does `fib(10)` take?**

However, an operating system is complex, and many other things are likely to be running at the same time as your code, any of which could affect how it runs. A single sample can often give you the right ballpark, but running the code repeatedly allows the random effects of your OS environment to be averaged out. Of course, *non-random* effects may not be averaged out.
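The repeated-sampling idea can be sketched by hand with the standard library. This is only an illustration: the helper name `time_repeatedly` and the choice of 5 repeats are our own assumptions, and `time.perf_counter()` is used instead of `time.time()` because it is a monotonic, high-resolution clock intended for measuring short durations.

```python
import statistics
import time


def fib(n):
    """Compute the nth Fibonacci number (same definition as above)."""
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)


def time_repeatedly(func, arg, repeats=5):
    """Return a list of wall-clock durations (in seconds) for func(arg)."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        func(arg)
        durations.append(time.perf_counter() - start)
    return durations


samples = time_repeatedly(fib, 20, repeats=5)
print(f"mean = {statistics.mean(samples):.6f} s, "
      f"std. dev. = {statistics.stdev(samples):.6f} s, "
      f"min = {min(samples):.6f} s")
```

The spread between the samples gives a rough feel for how noisy a single measurement is on your machine.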

If you're using IPython or a Jupyter Notebook, you can use a helpful little bit of magic. The `%timeit` command, written before a line of Python code, will run that line multiple times and report back the mean and standard deviation of the time. It will also adjust the relevant time unit on the fly, so you don't have to adjust the requested precision manually. For example:

In [16]:
%timeit fib(32)

212 ms ± 2.62 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


If you want to time an entire code block, put the `%%timeit` cell magic (note the two % symbols) at the start of the cell.

In [17]:
%%timeit
fib(32)
fib(33)

548 ms ± 5.07 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


For numerical work, of course, you probably aren't using a notebook to run your code. The equivalent for a Python script is the `timeit` function from the `timeit` module, which is part of the standard library. To use it, we call:

In [21]:
import timeit
print(timeit.timeit('fib(32)', globals = globals(), number=10))

2.1009196689997225


Let's explain that code a little. The `timeit` function from the `timeit` module runs a statement, provided here as the first argument in the form of a string, which it executes. You can specify the number of repetitions with the `number=` argument; here we've specified 10 because the default is 1_000_000 and we know this function call will take far more than a few microseconds. The `globals = globals()` argument passes the global namespace to the function, in case any of the code relies on something defined there.

The function returns the *total* time in float seconds that the entire set of repetitions took. You can easily divide by the number of repetitions to get the mean.
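As a small sketch of that division (the variable names are illustrative, and `fib(20)` is used only to keep the run short), the mean time per call can be computed as follows. The sketch also uses `timeit.repeat`, which runs the whole measurement several times and returns a list of totals; taking the minimum of those is a common way to reduce the influence of other activity on the machine.

```python
import timeit


def fib(n):
    """Compute the nth Fibonacci number (same definition as above)."""
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)


number = 10  # how many times the statement is executed per measurement

# timeit.timeit returns the TOTAL time for all `number` executions.
total = timeit.timeit("fib(20)", globals=globals(), number=number)
print(f"total: {total:.4f} s, mean per call: {total / number:.6f} s")

# timeit.repeat performs the measurement `repeat` times and returns a list
# of totals, one per measurement.
totals = timeit.repeat("fib(20)", globals=globals(), number=number, repeat=5)
best_per_call = min(totals) / number
print(f"best of {len(totals)} measurements: {best_per_call:.6f} s per call")
```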

The documentation for the `timeit` module can be found [here](https://docs.python.org/3/library/timeit.html).

**QUESTION: What happens if you delete the `globals = globals()` argument?**

You can also run `timeit` from the command line interface (CLI), using `python -m timeit`, if you wish to test short snippets of code.

**NOTE** The `timeit` module turns off Garbage Collection during the timing, to avoid the performance impact of that process. This makes timing repetitions more independent, but if you want to test with real-world conditions or GC is an important part of the function being tested, it can be turned back on (see documentation).
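A minimal sketch of turning garbage collection back on, following the approach described in the `timeit` documentation: re-enable it in the `setup` statement, which runs once before the timed loop. The statement being timed here (building many small lists) is just an assumed example of allocation-heavy code, and no particular difference between the two results is guaranteed.

```python
import timeit

# An allocation-heavy statement, chosen only for illustration.
stmt = "[[i] for i in range(1000)]"

# Default behaviour: garbage collection is disabled while timing.
gc_off = timeit.timeit(stmt, number=200)

# Re-enable garbage collection in the setup code, which runs before the
# timed loop starts.
gc_on = timeit.timeit(stmt, setup="import gc; gc.enable()", number=200)

print(f"GC disabled: {gc_off:.4f} s, GC enabled: {gc_on:.4f} s")
```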

## More informative profiling
Timing your code is a good starting point which can provide information as to whether it is even worth it to optimize your code, and as you proceed it can tell you whether you have successfully achieved speedup of your code. However, it generally doesn't offer enough information to identify *why* the code is slower than desired. To do that, we have to turn to other packages, which provide more information.

### cProfile
The first of these is `cProfile`. This module (provided with Python) also provides a time estimate, but note that the reported time will likely be larger than the one you measured with `timeit`. `cProfile` is a deterministic profiler which systematically records *every* function call. Consequently, using it adds overhead to every function call. As it is written in C (hence the name `cProfile`), the overhead is usually reasonable, but you should be aware of it.

As with `timeit`, to use `cProfile` we provide the code we want to profile as a string.

In [24]:
import cProfile
cProfile.run('fib(30)', sort = "cumulative")

         1664082 function calls (4 primitive calls) in 0.242 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.242    0.242 {built-in method builtins.exec}
        1    0.000    0.000    0.242    0.242 <string>:1(<module>)
1664079/1    0.242    0.000    0.242    0.242 1582476348.py:3(fib)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}




In our example, we order by "cumulative" to sort the output table by `cumtime`, i.e. the cumulative time.

`cProfile` reports the total number of function calls and the overall time taken. A primitive call is one that was not induced by recursion, i.e. the function was not already running further up the call stack when it was called. Here there were 4 primitive calls out of 1,664,082 calls in total. The first column of the table, `ncalls`, shows where they came from: each of the four functions listed had 1 primitive call, and the entry `1664079/1` for `fib` means it was called 1,664,079 times in total, only 1 of which was primitive; the remaining 1,664,078 were recursive calls. `tottime` is the time spent in a function itself, excluding the functions it calls, while `cumtime` also includes the time spent in those sub-calls. In `tottime` we see that essentially all the time was spent in the `fib()` function, although there were so many calls that the `percall` time (`tottime` divided by `ncalls`) still rounds to 0.000.
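To make the `ncalls` entry more concrete, here is a hand-rolled sketch that counts total and primitive calls for `fib` without any profiler. It is not how `cProfile` works internally; the helper `count_calls` and its `depth` counter are our own illustration of the idea that a call is primitive when no other `fib` call is currently active.

```python
def count_calls(n):
    """Return (total_calls, primitive_calls) made while computing fib(n)."""
    total = 0
    primitive = 0
    depth = 0  # number of fib calls currently active on the call stack

    def fib(k):
        nonlocal total, primitive, depth
        total += 1
        if depth == 0:  # not called from inside another fib call
            primitive += 1
        depth += 1
        try:
            if k <= 2:
                return 1
            return fib(k - 1) + fib(k - 2)
        finally:
            depth -= 1  # this call is finished, whatever it returned

    fib(n)
    return total, primitive


total, primitive = count_calls(20)
print(f"{total}/{primitive}")  # prints 13529/1
```

Calling `count_calls(30)` should reproduce the 1,664,079 total calls reported by `cProfile` above, though it takes noticeably longer to run.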

We can also write the results of `cProfile` to a file for use with the `pstats` module, which lets us save our output and do more complex manipulations of it.

In [None]:
import pstats
cProfile.run('fib(30)', 'fib_stats') # save the stats to fib_stats
p = pstats.Stats('fib_stats') # read the fib_stats file
p.sort_stats('cumulative').print_stats("fib") # print only the stats for the fib function

Tue Jun 24 22:15:47 2025    fib_stats

         1664082 function calls (4 primitive calls) in 0.249 seconds

   Ordered by: cumulative time
   List reduced from 4 to 1 due to restriction <'fib'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
1664079/1    0.249    0.000    0.249    0.249 /tmp/ipykernel_5892/1582476348.py:3(fib)




<pstats.Stats at 0x71076ad113c0>
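If you would rather not write a stats file to disk, one possible alternative is to drive the profiler directly and hand it to `pstats`. The sketch below assumes you only want to profile a specific region of a script; `cProfile.Profile`, its `enable()`/`disable()` methods, and the `stream=` argument of `pstats.Stats` are all part of the standard library, while the names `buffer` and `report` are just illustrative.

```python
import cProfile
import io
import pstats


def fib(n):
    """Compute the nth Fibonacci number (same definition as above)."""
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)


# Profile only the region between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
fib(20)
profiler.disable()

# Send the report to an in-memory text buffer instead of standard output.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
# strip_dirs() shortens file paths; "fib" restricts the rows that are printed.
stats.strip_dirs().sort_stats("cumulative").print_stats("fib")

report = buffer.getvalue()
print(report)
```

Because the report ends up in an ordinary string, it can be logged, searched, or written to a file of your choosing.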

### line_profiler

## Profiling Exercises