# 27.4. The Python Profilers  

https://docs.python.org/3/library/profile.html

<b>Profilers: profile, cProfile, and pstats</b> – Performance analysis of Python programs.


## 27.4.1. Introduction to the profilers

`cProfile` and `profile` provide deterministic profiling of Python programs. 

A profile is a set of statistics that describes how often and for how long various parts of the program executed. These statistics can be formatted into reports via the `pstats` module.

The Python standard library provides two different implementations of the same profiling interface:

1. `cProfile` is recommended for most users; it’s a C extension with reasonable overhead that makes it suitable for profiling long-running programs. Based on `lsprof`, contributed by Brett Rosen and Ted Czotter.


2. `profile`, a pure Python module whose interface is imitated by `cProfile`, but which adds significant overhead to profiled programs. If you’re trying to extend the profiler in some way, the task might be easier with this module. Originally designed and written by Jim Roskind.

    
    

## 27.4.2. Instant User’s Manual

This section is provided for users that “don’t want to read the manual.” 

It provides a very brief overview, and allows a user to rapidly perform profiling on an existing application.

To profile a function that takes a single argument, run the cell below. Use `profile` instead of `cProfile` if the latter is not available on your system.

The cell runs `re.compile()` under the profiler and prints the profile results:


In [None]:
import cProfile
import re
cProfile.run('re.compile("foo|bar")')
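
A minimal sketch of the fallback described in this section: it imports `cProfile` when available and `profile` otherwise, then profiles a single call through the `Profile` class, whose `runcall()` method both modules provide. The alias `profiler` and the variables `pattern` and `report` are names chosen for this illustration; the report is captured in a string only so that it can be inspected afterwards.

```python
import io
import pstats
import re

# Prefer the C implementation; fall back to the pure Python one.
try:
    import cProfile as profiler
except ImportError:
    import profile as profiler

pr = profiler.Profile()
# runcall() profiles one function call and returns that call's result.
pattern = pr.runcall(re.compile, "foo|bar")

# Format the collected statistics into a string instead of stdout.
buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```

Using `runcall()` rather than `run()` avoids passing the code as a string, which can be convenient when the function and its argument are already available as Python objects.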

Instead of printing the output at the end of the profile run, you can <b>save the results to a file</b> by specifying a filename to the `run()` function:

In [None]:
import cProfile
import re
cProfile.run('re.compile("foo|bar")', 'restats')
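
Once the results are saved, they can be read back with `pstats`. The sketch below is one possible way to do that, under two assumptions that are not in the original: the stats file is written to a temporary directory (the `stats_path` variable), and `cProfile.runctx()` is used with an explicit namespace so the snippet does not depend on what is defined in `__main__`.

```python
import cProfile
import os
import pstats
import re
import tempfile

# Hypothetical location for the stats file; the text above uses 'restats'.
stats_path = os.path.join(tempfile.mkdtemp(), "restats")

# runctx() behaves like run() but takes the namespaces explicitly.
cProfile.runctx('re.compile("foo|bar")', {"re": re}, {}, stats_path)

# Load the saved data, shorten file paths, and show the 10 entries
# with the largest cumulative time.
stats = pstats.Stats(stats_path)
stats.strip_dirs().sort_stats("cumulative").print_stats(10)
```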

## 27.4.7. Calibration

The profiler of the `profile` module <b>subtracts a constant</b> from each event handling time to compensate for the overhead of calling the time function, and socking away the results. 

By default, the constant is 0.

The following procedure can be used to obtain a better constant for a given platform (see Limitations).


In [None]:
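import profile

pr = profile.Profile()
# Repeat the calibration to check that the result is consistent.
for i in range(5):
    print(pr.calibrate(10000))

# Keep one measurement to use as the bias in the cells below.
your_computed_bias = pr.calibrate(10000)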
    

The method executes the number of Python calls given by the argument, directly and again under the profiler, measuring the time for both. It then computes the hidden overhead per profiler event, and returns that as a float. 

The object of this exercise is to get a fairly consistent result. If your computer is very fast, or your timer function has poor resolution, you might have to pass 100000, or even 1000000, to get consistent results.

When you have a consistent answer, there are <b>three ways</b> you can use it:


In [None]:
import profile

# 1. Apply computed bias to all Profile instances created hereafter.
profile.Profile.bias = your_computed_bias

# 2. Apply computed bias to a specific Profile instance.
pr = profile.Profile()
pr.bias = your_computed_bias

# 3. Specify computed bias in instance constructor.
pr = profile.Profile(bias=your_computed_bias)

If you have a choice, you are <b>better off choosing a smaller constant</b>, and then your results will <b>“less often”</b> show up as negative in profile statistics.
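
One way to act on the advice about preferring a smaller constant is to calibrate a few times and keep the minimum. This is only a sketch: taking the minimum is an assumption of this example, not a rule from the `profile` documentation, and the names `samples`, `bias`, and `calibrated` are illustrative.

```python
import profile

pr = profile.Profile()

# Collect several calibration measurements (each is a float, in seconds
# of overhead per profiler event).
samples = [pr.calibrate(10000) for _ in range(3)]
print("samples:", samples)

# Assumption: use the smallest measurement so that reported times
# are less likely to become negative.
bias = min(samples)

# Apply the bias to one specific profiler instance.
calibrated = profile.Profile(bias=bias)
calibrated.runcall(sorted, [3, 1, 2])
calibrated.print_stats()
```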

## Example: profiling fibonacci

The most basic starting point in the profile module is `run()`. It takes <b>a string statement</b> as its argument and creates a report of the time spent in each function called while running the statement.

This recursive version of a Fibonacci sequence calculator is especially useful for demonstrating the profiler because its performance can be improved so much. The standard report format shows a summary and then details for each function executed.
    

In [None]:
import profile

def fib(n):
    # from http://en.literateprograms.org/Fibonacci_numbers_(Python)
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

def fib_seq(n):
    seq = [ ]
    if n > 0:
        seq.extend(fib_seq(n-1))
    seq.append(fib(n))
    return seq

print('RAW')
print('=' * 80)
# The trailing "; print" was a Python 2 leftover and did nothing in Python 3.
profile.run('print(fib_seq(20))')


As you can see, it takes 57356 separate function calls and 3/4 of a second to run. Since there are only 66 primitive calls, we know that the vast majority of those 57k calls were recursive. 

The details about where time was spent are broken out by function in the listing showing the number of calls, total time spent in the function, time per call (tottime/ncalls), cumulative time spent in a function, and the ratio of cumulative time to primitive calls.

Not surprisingly, most of the time here is spent calling `fib()` repeatedly. 


<img src="./img/recursion_without_cache.png"/> 


It’s clear this is a very inefficient algorithm: the number of function calls grows exponentially with n, because the function recomputes values that it has already calculated again and again.
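
The growth can be checked with a little arithmetic. The sketch below counts how many times the naive `fib()` is invoked, using the recurrence calls(n) = 1 + calls(n-1) + calls(n-2) with calls(0) = calls(1) = 1. The helper name `naive_fib_calls` is introduced here purely for illustration.

```python
def naive_fib_calls(n):
    """Count invocations made by the naive recursive fib(n),
    including the top-level call."""
    if n < 2:
        return 1
    two_back, one_back = 1, 1  # call counts for fib(0) and fib(1)
    for _ in range(2, n + 1):
        two_back, one_back = one_back, 1 + one_back + two_back
    return one_back

# Calls needed for a single fib(20).
print(naive_fib_calls(20))

# fib_seq(20) calls fib(n) once for every n from 0 to 20.
total_fib_calls = sum(naive_fib_calls(n) for n in range(21))
print(total_fib_calls)
```

The second figure comes to 57291, so calls to `fib()` account for nearly all of the 57356 function calls quoted above.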

I needed to speed up a lot of my recursive algorithms, and decorators really came to the rescue in the form of memoization (https://en.wikipedia.org/wiki/Memoization).

The easy way to optimize this is to cache the results in a dictionary and check whether the value for a given n has been computed previously. If it has, return its value from the dictionary; if not, call the function and store the result. This is memoization. Let’s look at our memoize class:


In [None]:
class memoize:
    
    # from http://avinashv.net/2008/04/python-decorators-syntactic-sugar/
    def __init__(self, function):
        self.function = function
        # a dictionary, `self.memoized`, that acts as our cache
        self.memoized = {}

    def __call__(self, *args):
        try:
            return self.memoized[args]
        except KeyError:
            self.memoized[args] = self.function(*args)
            return self.memoized[args]

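The class keeps a dictionary, `self.memoized`, that acts as our cache. `__call__` first looks up the arguments in that dictionary and catches `KeyError`, which is raised when a key doesn’t exist in a dictionary; only in that case is the wrapped function actually called and its result stored.
This class is generalized, and will work for any recursive function that could benefit from memoization, provided its arguments are hashable so they can serve as dictionary keys.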
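
For comparison, the standard library offers `functools.lru_cache`, which provides the same kind of caching. This is a different technique from the hand-written `memoize` class used in this document, shown here only as a sketch; the function name `fib_cached` is chosen for this example.

```python
import functools

@functools.lru_cache(maxsize=None)  # unbounded cache, like the dict above
def fib_cached(n):
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

result = fib_cached(20)

# cache_info() reports how often the cache was hit or missed.
info = fib_cached.cache_info()
print(result, info)
```

Each value from 0 to 20 is computed exactly once (21 misses), and the remaining 18 lookups are served from the cache.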

We can add <b>a memoize decorator</b> to reduce the number of recursive calls and have a big impact on the performance of this function.

In [2]:
import profile

@memoize
def fib(n):
    # from http://en.literateprograms.org/Fibonacci_numbers_(Python)
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

def fib_seq(n):
    seq = [ ]
    if n > 0:
        seq.extend(fib_seq(n-1))
    seq.append(fib(n))
    return seq

if __name__ == '__main__':
    print('MEMOIZED')
    print('=' * 80)
    # The trailing "; print" was a Python 2 leftover and did nothing in Python 3.
    profile.run('print(fib_seq(20))')


MEMOIZED
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
         161 function calls (103 primitive calls) in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       21    0.000    0.000    0.000    0.000 :0(append)
        1    0.000    0.000    0.000    0.000 :0(exec)
       20    0.000    0.000    0.000    0.000 :0(extend)
        2    0.000    0.000    0.000    0.000 :0(getpid)
        2    0.000    0.000    0.000    0.000 :0(isinstance)
        1    0.000    0.000    0.000    0.000 :0(print)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
        2    0.000    0.000    0.000    0.000 :0(time)
        2    0.000    0.000    0.000    0.000 :0(write)
    59/21    0.000    0.000    0.000    0.000 <ipython-input-1-fd330b0425df>:8(__call__)
     21/1    0.000    0.000    0.000    0.000 <ipython-input-2-f8d1eaf275e9>:13(fib_seq)
       21    0.000    0.000    0.000 

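By remembering the <b>Fibonacci</b> value at each level we can avoid most of the recursion and drop down to 161 calls (103 primitive) that take too little time to register (0.000 seconds in the report above). Also notice that the ncalls count for `fib()` shows that it never recurses directly: the remaining recursion goes through the decorator’s `__call__`, which is listed as 59/21.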

## pstats: Saving and Working With Statistics

The standard report created by the profile functions is not very flexible. 

If it doesn’t meet your needs, you can produce your own reports by saving the raw profiling data from `run()` and processing it separately with the `Stats` class from `pstats`.

For example, to run several iterations of the same test and combine the results, you could do something like this:



In [None]:
import profile
import pstats

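# Assumes the memoized example above is saved as profile_fibonacci_memoized.py
from profile_fibonacci_memoized import fib, fib_seq

# Create 5 sets of stats
for i in range(5):
    filename = 'profile_stats_%d.stats' % i
    # print() must be called as a function in Python 3
    profile.run('print(%d, fib_seq(20))' % i, filename)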

# Read all 5 stats files into a single object
stats = pstats.Stats('profile_stats_0.stats')
for i in range(1, 5):
    stats.add('profile_stats_%d.stats' % i)

# Clean up filenames for the report
stats.strip_dirs()

# Sort the statistics by the cumulative time spent in the function
stats.sort_stats('cumulative')
stats.print_stats()
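
The same combine-and-report workflow can be sketched in a self-contained form. The version below makes several assumptions that differ from the cell above: it defines the unmemoized `fib()` and `fib_seq()` inline instead of importing them, uses `cProfile.Profile` with `runcall()` and `dump_stats()`, writes to a temporary directory, and restricts the report to entries matching `"fib"`. The variable names are illustrative.

```python
import cProfile
import io
import os
import pstats
import tempfile

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def fib_seq(n):
    seq = []
    if n > 0:
        seq.extend(fib_seq(n - 1))
    seq.append(fib(n))
    return seq

# Profile three separate runs and save each one to its own file.
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(3):
    path = os.path.join(tmpdir, "fib_%d.stats" % i)
    pr = cProfile.Profile()
    pr.runcall(fib_seq, 10)
    pr.dump_stats(path)
    paths.append(path)

# Merge all runs into one Stats object and write the report to a string.
buf = io.StringIO()
stats = pstats.Stats(paths[0], stream=buf)
stats.add(*paths[1:])

# The string argument to print_stats() is a regular expression that
# limits the report to matching "filename:lineno(function)" entries.
stats.strip_dirs().sort_stats("cumulative").print_stats("fib")
report = buf.getvalue()
print(report)
```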
