# High Performance Python Notes
> My notes for the book "High Performance Python"
- toc: true 
- badges: false
- comments: true
- categories: [programming]
<!-- - image: images/normal-dist.jpg -->

## Profiling to find bottlenecks

Suppose we want to measure how long a function `foo` in our program takes to run. We can do this with the `time` module by recording the time immediately before and after each call to the function:

In [None]:
import time

start = time.time()  # note the start time

foo(*args, **kwargs)  # call the function being timed

end = time.time()  # note the end time

print(f'The function {foo.__name__} took {end - start} seconds to run')

This approach, however, requires us to repeat the timing code everywhere the function is called. If the function is called in many places, this clutters the program.

A better approach is to use a decorator.

### Using Decorators

In [2]:
def timer_func(func):
    
    def time_measurer(*args, **kwargs):
        start = time.time()
        
        result = func(*args, **kwargs)
        
        end = time.time()
        
        print(f'The function {func.__name__} took {end-start} seconds to run')
        
        return result
    
    return time_measurer

Now we only need to "decorate" the function as follows:

In [5]:
@timer_func
def foo(*args, **kwargs):
    ...
    
    

The above code snippet is just a fancy way of saying `foo = timer_func(foo)`. With this approach, we write the timing code only once, and the decorator converts `foo` into a function that prints the time taken and returns the result. Moreover, we can time any function using this decorator.
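
To make the equivalence concrete, here is a minimal sketch that decorates one function manually and another with the `@` syntax. The functions `add` and `multiply` are hypothetical examples, not from the book, and the sketch uses `time.perf_counter` instead of `time.time`, since it is the clock intended for measuring short intervals.

```python
import time


def timer_func(func):
    """Wrap func so that every call prints how long it took."""

    def time_measurer(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        end = time.perf_counter()
        print(f"The function {func.__name__} took {end - start:.6f} seconds to run")
        return result

    return time_measurer


def add(a, b):  # hypothetical example function
    return a + b


# Manual decoration: this is what the @ syntax does for us.
add = timer_func(add)


@timer_func
def multiply(a, b):  # hypothetical example function
    return a * b


total = add(2, 3)         # prints the timing line, then returns 5
product = multiply(2, 3)  # prints the timing line, then returns 6
```

Both calls go through `time_measurer`, so both print a timing line before returning the original result.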

But there's one problem with this approach. After decoration, the name `foo` refers to the inner function, so `foo.__name__` returns `'time_measurer'` and the original docstring of `foo` is lost. This is because `timer_func` returns a function named `time_measurer`. (The printed message is unaffected, since it uses `func.__name__`, the name of the original function.) We can fix this with a small change.

In [6]:
from functools import wraps

def timer_func(func):
    
    @wraps(func)
    def time_measurer(*args, **kwargs):
        start = time.time()
        
        result = func(*args, **kwargs)
        
        end = time.time()
        
        print(f'The function {func.__name__} took {end-start} seconds to run')
        
        return result
    
    return time_measurer

The `wraps` decorator copies metadata such as `__name__` and `__doc__` from `func` onto `time_measurer`, so the decorated function still identifies itself as the original.
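
The effect of `wraps` is easiest to see by comparing two otherwise identical decorators. This is a small sketch; the names `plain_decorator`, `wrapping_decorator`, `foo` and `bar` are made up for illustration.

```python
from functools import wraps


def plain_decorator(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


def wrapping_decorator(func):
    @wraps(func)  # copy __name__, __doc__, etc. from func onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


@plain_decorator
def foo():
    """Docstring of foo."""


@wrapping_decorator
def bar():
    """Docstring of bar."""


print(foo.__name__, foo.__doc__)  # wrapper None
print(bar.__name__, bar.__doc__)  # bar Docstring of bar.
```

Without `wraps`, the metadata of the inner function leaks out. With it, tools that rely on `__name__` or `__doc__` (such as `help()`) keep working on the decorated function.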

### Using magic commands

In Jupyter notebooks we can use the `%timeit` magic command to time a function. It reports the mean and standard deviation of the run time over several runs of the statement:

In [11]:
import julia_set

%timeit julia_set.calc_pure_python(desired_width = 1000, max_iterations = 300)

6.73 s ± 112 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


`timeit` can also be run from the command line:

In [9]:
! python -m timeit -n 5 -r 2 -s "import julia_set" "julia_set.calc_pure_python(desired_width = 1000, max_iterations = 300)"

5 loops, best of 2: 6.96 sec per loop


> Note: The `%timeit` magic reports the mean over all runs, while running `timeit` from the command line reports the time of the best run.
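
The difference between the two summaries can be reproduced with the `timeit` module directly. The sketch below is only illustrative: `sum_of_squares` is a hypothetical stand-in workload (not the Julia set code), and the loop and repeat counts are arbitrary.

```python
import statistics
import timeit


def sum_of_squares(n=1000):  # hypothetical stand-in workload
    return sum(i * i for i in range(n))


LOOPS = 100  # calls per run (like -n on the command line)
RUNS = 5     # number of runs (like -r on the command line)

# Each entry is the total time of one run of LOOPS calls.
run_totals = timeit.repeat(sum_of_squares, number=LOOPS, repeat=RUNS)
per_loop = [total / LOOPS for total in run_totals]

best = min(per_loop)              # what the command line reports
mean = statistics.mean(per_loop)  # what %timeit reports
print(f"best: {best:.2e} s per loop, mean: {mean:.2e} s per loop")
```

The best run is never slower than the mean, so the two tools can report noticeably different numbers for the same code on a noisy machine.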

### Using cProfile Module

`cProfile` is the built-in profiling tool in the standard library. It gives more detailed information than the timing approaches above, at the cost of greater overhead. It can be used from the command line as follows:

In [1]:
!python -m cProfile -s cumulative julia_set.py

         36221990 function calls in 11.859 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   11.859   11.859 {built-in method builtins.exec}
        1    0.025    0.025   11.859   11.859 julia_set.py:1(<module>)
        1    0.471    0.471   11.834   11.834 julia_set.py:21(calc_pure_python)
        1    7.066    7.066   11.150   11.150 julia_set.py:7(calculate_z_serial_purepython)
 34219980    4.084    0.000    4.084    0.000 {built-in method builtins.abs}
  2002000    0.207    0.000    0.207    0.000 {method 'append' of 'list' objects}
        1    0.006    0.006    0.006    0.006 {built-in method builtins.sum}
        4    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
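
A similar report can also be produced from inside a Python session, which is handy when only part of a program needs profiling. The following is a minimal sketch using `cProfile.Profile`; the workload `count_large_abs` is a hypothetical stand-in, not the Julia set code.

```python
import cProfile


def count_large_abs(values, threshold):  # hypothetical workload
    """Count the values whose absolute value exceeds threshold."""
    count = 0
    for v in values:
        if abs(v) > threshold:
            count += 1
    return count


profiler = cProfile.Profile()

profiler.enable()   # start collecting data
n_large = count_large_abs(range(-50_000, 50_000), 10_000)
profiler.disable()  # stop collecting data

# Print the same kind of table as the command-line run,
# sorted by cumulative time.
profiler.print_stats(sort="cumulative")
```

Only the code between `enable()` and `disable()` is measured, so imports and setup code stay out of the report.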




To get more control over the results of cProfile we can write the results into a statistics file as below:

In [2]:
! python -m cProfile -o profile.stats julia_set.py

The above command writes the results of cProfile to a file named `profile.stats`. We can analyze this file in a separate program using the `pstats` module:

In [3]:
import pstats

In [4]:
p = pstats.Stats("profile.stats")

In [7]:
p.sort_stats("cumulative")

<pstats.Stats at 0x24c7bdfbc70>

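The call above sorts the functions by cumulative time, i.e. the time spent in a function including all the functions it calls. It returns the `Stats` object itself, which is why the notebook echoes it.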

In [8]:
p.print_stats()

Wed Jun 30 18:58:11 2021    profile.stats

         36221990 function calls in 11.664 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   11.664   11.664 {built-in method builtins.exec}
        1    0.028    0.028   11.664   11.664 julia_set.py:1(<module>)
        1    0.451    0.451   11.637   11.637 julia_set.py:21(calc_pure_python)
        1    6.911    6.911   10.985   10.985 julia_set.py:7(calculate_z_serial_purepython)
 34219980    4.075    0.000    4.075    0.000 {built-in method builtins.abs}
  2002000    0.195    0.000    0.195    0.000 {method 'append' of 'list' objects}
        1    0.006    0.006    0.006    0.006 {built-in method builtins.sum}
        4    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}




<pstats.Stats at 0x24c7bdfbc70>

Similarly, we can sort by `tottime`, the time spent inside a function itself, excluding the functions it calls:

In [11]:
p.sort_stats("tottime")

<pstats.Stats at 0x24c7bdfbc70>

In [12]:
p.print_stats()

Wed Jun 30 18:58:11 2021    profile.stats

         36221990 function calls in 11.664 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    6.911    6.911   10.985   10.985 julia_set.py:7(calculate_z_serial_purepython)
 34219980    4.075    0.000    4.075    0.000 {built-in method builtins.abs}
        1    0.451    0.451   11.637   11.637 julia_set.py:21(calc_pure_python)
  2002000    0.195    0.000    0.195    0.000 {method 'append' of 'list' objects}
        1    0.028    0.028   11.664   11.664 julia_set.py:1(<module>)
        1    0.006    0.006    0.006    0.006 {built-in method builtins.sum}
        1    0.000    0.000   11.664   11.664 {built-in method builtins.exec}
        4    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}




<pstats.Stats at 0x24c7bdfbc70>
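
Because `sort_stats` returns the `Stats` object, the calls can be chained, and `print_stats` accepts a number that limits how many rows are printed. The sketch below assumes a small hypothetical workload (`inner` and `outer`) profiled in-process rather than the `profile.stats` file above, and writes the report to an in-memory buffer.

```python
import cProfile
import io
import pstats
from pstats import SortKey


def inner(n):  # hypothetical workload
    return sum(range(n))


def outer():  # hypothetical workload
    results = []
    for _ in range(50):
        results.append(inner(10_000))
    return results


profiler = cProfile.Profile()
profiler.runcall(outer)  # profile a single call to outer()

buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)

# strip_dirs() shortens file paths; SortKey.TIME sorts by internal time (tottime).
returned = stats.strip_dirs().sort_stats(SortKey.TIME)
returned.print_stats(2)  # only print the top 2 rows

print(buf.getvalue())
```

Limiting the row count keeps the report readable for larger programs, where the full table can run to hundreds of lines.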

To see which functions called each of the profiled functions, we can use the `print_callers()` method:

In [13]:
p.print_callers()

   Ordered by: internal time

Function                                          was called by...
                                                      ncalls  tottime  cumtime
julia_set.py:7(calculate_z_serial_purepython)     <-       1    6.911   10.985  julia_set.py:21(calc_pure_python)
{built-in method builtins.abs}                    <- 34219980    4.075    4.075  julia_set.py:7(calculate_z_serial_purepython)
julia_set.py:21(calc_pure_python)                 <-       1    0.451   11.637  julia_set.py:1(<module>)
{method 'append' of 'list' objects}               <- 2002000    0.195    0.195  julia_set.py:21(calc_pure_python)
julia_set.py:1(<module>)                          <-       1    0.028   11.664  {built-in method builtins.exec}
{built-in method builtins.sum}                    <-       1    0.006    0.006  julia_set.py:21(calc_pure_python)
{built-in method builtins.exec}                   <- 
{built-in method builtins.len}                    <-       2    0.000    0.000  juli

<pstats.Stats at 0x24c7bdfbc70>

To see which functions each function called, i.e., the information in the previous output cell flipped around, we can use `p.print_callees()`:

In [14]:
p.print_callees()

   Ordered by: internal time

Function                                          called...
                                                      ncalls  tottime  cumtime
julia_set.py:7(calculate_z_serial_purepython)     -> 34219980    4.075    4.075  {built-in method builtins.abs}
                                                           2    0.000    0.000  {built-in method builtins.len}
{built-in method builtins.abs}                    -> 
julia_set.py:21(calc_pure_python)                 ->       1    6.911   10.985  julia_set.py:7(calculate_z_serial_purepython)
                                                           2    0.000    0.000  {built-in method builtins.len}
                                                           1    0.006    0.006  {built-in method builtins.sum}
                                                     2002000    0.195    0.195  {method 'append' of 'list' objects}
{method 'append' of 'list' objects}               -> 
julia_set.py:1(<module>)            

<pstats.Stats at 0x24c7bdfbc70>
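
Both `print_callers()` and `print_callees()` accept the same restrictions as `print_stats()`; a string is treated as a regular expression matched against the printed function name. A minimal sketch, assuming a hypothetical pair of functions `helper` and `driver` instead of the Julia set code:

```python
import cProfile
import io
import pstats


def helper(x):  # hypothetical function
    return abs(x)


def driver():  # hypothetical function that calls helper
    total = 0
    for i in range(-100, 100):
        total += helper(i)
    return total


profiler = cProfile.Profile()
total = profiler.runcall(driver)

buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)

# Only show the callers of functions whose name matches "helper".
stats.print_callers("helper")
callers_report = buf.getvalue()
print(callers_report)
```

Filtering like this is useful when the full caller table is long and we only care about who calls one hot function.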

### Visualising cProfile Output using SnakeViz

We can use the `snakeviz` visualizer to visualize the output of the cProfile profiler:

In [None]:
!pip install snakeviz

In [16]:
! snakeviz profile.stats

^C
