# profile and pstats — Performance Analysis

**Purpose**: Performance analysis of Python programs.

The profile module provides APIs for collecting and analyzing statistics about how Python source consumes processor resources.

## Running the Profiler

The most basic starting point in the `profile` module is `run()`. It takes a statement as a string argument, executes it, and prints a report of the time spent in each function called while the statement runs.

In [9]:
import profile


def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)


def fib_seq(n):
    seq = []
    if n > 0:
        seq.extend(fib_seq(n - 1))
    seq.append(fib(n))
    return seq


profile.run('print(fib_seq(20)); print()')

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]

         57391 function calls (101 primitive calls) in 0.094 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     21/1    0.000    0.000    0.094    0.094 1139073391.py:13(fib_seq)
 57291/21    0.094    0.000    0.094    0.004 1139073391.py:4(fib)
        3    0.000    0.000    0.000    0.000 :0(__exit__)
        1    0.000    0.000    0.000    0.000 :0(acquire)
       22    0.000    0.000    0.000    0.000 :0(append)
        1    0.000    0.000    0.094    0.094 :0(exec)
       20    0.000    0.000    0.000    0.000 :0(extend)
        3    0.000    0.000    0.000    0.000 :0(getpid)
        3    0.000    0.000    0.000    0.000 :0(isinstance)
        3    0.000    0.000    0.000    0.000 :0(len)
        2    0.000    0.000    0.000    0.000 :0(print)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
        3    0.000    
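
The report above shows `fib()` being entered 57291 times to build a 21-element sequence. As a rough illustration of how the call counts in a report can guide an optimization, the sketch below profiles a plain recursive version next to a memoized one. The names `fib_memoized` and `total_calls` are hypothetical helpers introduced for this example, `cProfile` is used in place of `profile` for speed, and the counts are read from the `stats` mapping on `pstats.Stats`, which is an implementation detail rather than a documented report format.

```python
import cProfile
import functools
import pstats


def fib(n):
    """Plain recursive Fibonacci, equivalent to the example above."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


@functools.lru_cache(maxsize=None)
def fib_memoized(n):
    """Same recursion, but each distinct n is computed only once."""
    if n < 2:
        return n
    return fib_memoized(n - 1) + fib_memoized(n - 2)


def total_calls(target, name, n=20):
    """Profile target(k) for k in 0..n and return the calls recorded for `name`."""
    profiler = cProfile.Profile()
    profiler.enable()
    for k in range(n + 1):
        target(k)
    profiler.disable()
    stats = pstats.Stats(profiler)
    # stats.stats maps (filename, lineno, funcname) to a tuple whose
    # second item is the total number of calls, recursive ones included.
    return sum(
        entry[1] for key, entry in stats.stats.items() if key[2] == name
    )


plain_calls = total_calls(fib, 'fib')
memo_calls = total_calls(fib_memoized, 'fib_memoized')
print(plain_calls, memo_calls)
```

With a cold cache, the memoized function body should run once per distinct argument, because repeated arguments are answered by the cache wrapper and never reach the profiled Python code.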

## pstats: Saving and Working With Statistics

The standard report created by the `profile` functions is not very flexible. However, custom reports can be produced by saving the raw profiling data from `run()` and `runctx()` to a file and processing it separately with the `pstats.Stats` class.

This example runs several iterations of the same test and combines the results:

In [10]:
import cProfile as profile
import pstats
from profile_fibonacci_memoized import fib, fib_seq

# Create 5 sets of stats
for i in range(5):
    filename = f'profile_stats_{i}.stats'
    profile.run(f'print({i}, fib_seq(20))', filename)

# Read all 5 stats files into a single object
stats = pstats.Stats('profile_stats_0.stats')
for i in range(1, 5):
    stats.add(f'profile_stats_{i}.stats')

# Clean up filenames for the report
stats.strip_dirs()

# Sort the statistics by the cumulative time spent
# in the function
stats.sort_stats('cumulative')

stats.print_stats()

0 [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
1 [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
2 [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
3 [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
4 [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
Sat Feb  4 13:14:51 2023    profile_stats_0.stats
Sat Feb  4 13:14:51 2023    profile_stats_1.stats
Sat Feb  4 13:14:51 2023    profile_stats_2.stats
Sat Feb  4 13:14:51 2023    profile_stats_3.stats
Sat Feb  4 13:14:51 2023    profile_stats_4.stats

         498 function calls (398 primitive calls) in 0.003 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    0.001    0.000    0.003    0.001 {built-in method builtins.exec}
        5    0.000    0.000    0.0

<pstats.Stats at 0x232501e65d0>
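
The same save-then-load workflow also works with `runctx()`, which takes explicit globals and locals instead of relying on the `__main__` namespace. The following is a minimal sketch under stated assumptions: the iterative `fib_seq` is a stand-in for the imported module, `namespace` and `stats_path` are illustrative names, and the report is written to an in-memory buffer through the `stream` argument of `pstats.Stats` so it can be inspected as a string.

```python
import cProfile
import io
import os
import pstats
import tempfile


def fib_seq(n):
    """Iterative stand-in that returns the first n + 1 Fibonacci numbers."""
    seq = [0, 1]
    while len(seq) <= n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n + 1]


with tempfile.TemporaryDirectory() as tmpdir:
    stats_path = os.path.join(tmpdir, 'fib_seq.stats')

    # The statement is evaluated in this dictionary, so it only sees
    # the names placed here (plus the builtins).
    namespace = {'fib_seq': fib_seq, 'n': 20}
    cProfile.runctx('result = fib_seq(n)', namespace, namespace, stats_path)

    # Load the saved data and send the report to a buffer, not stdout.
    buffer = io.StringIO()
    stats = pstats.Stats(stats_path, stream=buffer)
    stats.strip_dirs().sort_stats('cumulative').print_stats()

report = buffer.getvalue()
print(report)
```

Because the statement assigns to `result` inside `namespace`, the computed sequence stays available as `namespace['result']` after profiling.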

## Limiting Report Contents

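The output can be restricted by function. This version limits the report to entries for `fib()` and `fib_seq()` by using a regular expression to match the desired `filename:lineno(function)` values. Only functions that were actually recorded in the stats files can match, which is why the report below lists `fib_seq()` alone.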

In [11]:
import profile
import pstats
from profile_fibonacci_memoized import fib, fib_seq

# Read all 5 stats files into a single object
stats = pstats.Stats('profile_stats_0.stats')
for i in range(1, 5):
    stats.add(f'profile_stats_{i}.stats')
stats.strip_dirs()
stats.sort_stats('cumulative')

# Limit output to lines with "(fib" in them; the raw string keeps
# the backslash from being treated as an invalid escape sequence
stats.print_stats(r'\(fib')

Sat Feb  4 13:14:51 2023    profile_stats_0.stats
Sat Feb  4 13:14:51 2023    profile_stats_1.stats
Sat Feb  4 13:14:51 2023    profile_stats_2.stats
Sat Feb  4 13:14:51 2023    profile_stats_3.stats
Sat Feb  4 13:14:51 2023    profile_stats_4.stats

         498 function calls (398 primitive calls) in 0.003 seconds

   Ordered by: cumulative time
   List reduced from 23 to 1 due to restriction <'\\(fib'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    105/5    0.000    0.000    0.001    0.000 profile_fibonacci_memoized.py:24(fib_seq)

<pstats.Stats at 0x232501f7710>
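
Restrictions can also be combined, and they apply to the caller report as well. The sketch below assumes a small unmemoized `fib`/`fib_seq` pair defined inline (not the module imported above) and profiles it in memory with `Profile.runcall()`. Passing a regular expression followed by an integer applies the restrictions in order: first filter by pattern, then keep the top entry of the sorted list.

```python
import cProfile
import io
import pstats


def fib(n):
    """Unmemoized recursive Fibonacci, used only to generate calls."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def fib_seq(n):
    """Return the first n + 1 Fibonacci numbers."""
    return [fib(k) for k in range(n + 1)]


profiler = cProfile.Profile()
profiler.runcall(fib_seq, 10)

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.strip_dirs().sort_stats('cumulative')

# Restrictions are applied left to right: keep entries matching
# "(fib", then keep only the first line of what remains.
stats.print_stats(r'\(fib', 1)

# Show which functions called fib() itself.
stats.print_callers(r'\(fib\)')

report = buffer.getvalue()
print(report)
```

Since the list is sorted by cumulative time before the restrictions are applied, the single remaining entry in the first report should be `fib_seq()`, whose cumulative time includes every call to `fib()`.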