# Why is profiling needed?

Profiling shows which parts of a program consume the most resources. This lets us either improve the slow code segments or look for alternatives. CPU time is not the only thing that can be measured: memory (RAM), disk I/O, and network operations can also be profiled to find a program's bottlenecks.

The basic way to identify bottlenecks is to measure how much time each section of the program takes. In a Jupyter notebook we can use the `%%timeit` magic, `time.time()`, or a timing decorator. To test these techniques we will define a function that computes the `Julia set`, a non-linear, time-consuming task that is heavily CPU bound and uses little memory. More technically, the Julia set is a fractal, and the function generates a complex output image.


<center><image src="./img/1.jpg" width="200px" /></center>

The basic pseudocode for the calculation is as follows. Here the coordinates are complex numbers, and `max_iter` is a predefined variable for the function (as are the threshold `thres` and the constant `c`).
<pre style='color:yellow'> 
coordinates = []

for z in coordinates:
    for _ in range(max_iter):
        
        if (abs(z)< thres):
            z = z*z + c
        else:
            break
</pre>

For the sake of testing various scenarios, however, the following implementation has a few other parts added to it.

In [1]:
import time
import cv2

# area of imaginary space to calculate pixel values
x1, x2, y1, y2 = -1.8, 1.8, -1.8, 1.8
c_real, c_img = -0.62772, -.42193

In [64]:
def display_img(arr):
    '''
    Function to display the generated output as an image.
    '''
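    import numpy as np
    # The input is a flat Python list, so convert it before reshaping into a square image.
    arr = np.asarray(arr, dtype=np.float64)
    side = int(len(arr) ** 0.5)
    arr = arr.reshape((side, side), order='C')

    # Scale iteration counts into 0-255 so values above 255 do not overflow uint8.
    arr = (255 * arr / max(arr.max(), 1)).astype(np.uint8)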
    cv2.imshow("Julia set", arr)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

In [50]:
def calculate_juliaset_serial(maxiter, zs, cs):
    """Calculate output list using Julia update rule"""
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        while abs(z) < 2 and n < maxiter:
            z = z * z + c
            n += 1
        output[i] = n
    return output

In [51]:
def calc_juliaset_time(desired_width, max_iterations):
    """Create a list of complex coordinates (zs) and complex parameters (cs), to build Julia set"""

    x_step = (x2 - x1) / desired_width
    y_step = (y1 - y2) / desired_width

    x = []
    y = []

    ycoord = y2
    while ycoord > y1:
        y.append(ycoord)
        ycoord += y_step
    
    xcoord = x1
    while xcoord < x2:
        x.append(xcoord)
        xcoord += x_step

    zs = []
    cs = []
    for ycoord in y:
        for xcoord in x:
            zs.append(complex(xcoord, ycoord))
            cs.append(complex(c_real, c_img))
    
    print("Length of x:", len(x))
    print("Total elements:", len(zs))

    start_time = time.time()
    output = calculate_juliaset_serial(max_iterations, zs, cs)
    end_time = time.time()

    secs = end_time - start_time
    print(calculate_juliaset_serial.__name__ + " took", secs, "seconds")

    return output

In [53]:
# reasonable defaults for a laptop
val = calc_juliaset_time(desired_width=1000, max_iterations=300)
display_img(val)

Length of x: 1000
Total elements: 1000000
calculate_juliaset_serial took 6.129579544067383 seconds


As shown above, we can use the Julia set as a baseline task for checking performance. Here we used the good old print statement with a time difference to measure it. However, this measurement changes with the other processes running on the computer, and scattering print statements like this becomes inconvenient in the long run. Instead, we can use a decorator to measure and print the time (or, in Jupyter notebooks, magic functions :D).

In [65]:
from functools import wraps

def timefn(fn):

    @wraps(fn)
    def measure_time(*args, **kwargs):

        t1 = time.time()
        result = fn(*args, **kwargs)
        t2 = time.time()

        print(f"@timefn: {fn.__name__} took {t2 - t1} seconds")
        return result
    
    return measure_time


In [69]:
@timefn
def calc_juliaset_time(desired_width, max_iterations):
    """Create a list of complex coordinates (zs) and complex parameters (cs), to build Julia set"""

    x_step = (x2 - x1) / desired_width
    y_step = (y1 - y2) / desired_width

    x = []
    y = []

    ycoord = y2
    while ycoord > y1:
        y.append(ycoord)
        ycoord += y_step
    
    xcoord = x1
    while xcoord < x2:
        x.append(xcoord)
        xcoord += x_step

    zs = []
    cs = []
    for ycoord in y:
        for xcoord in x:
            zs.append(complex(xcoord, ycoord))
            cs.append(complex(c_real, c_img))
    
    print("Length of x:", len(x))
    print("Total elements:", len(zs))

    output = calculate_juliaset_serial(max_iterations, zs, cs)

    return output

In [70]:
x = calc_juliaset_time(1000, 300)

Length of x: 1000
Total elements: 1000000
@timefn: calc_juliaset_time took 6.439051389694214 seconds


Besides the methods above, we can use Python's built-in `timeit` module. It gives more control over timing by running the code in loops and repeating the measurement.
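As a minimal sketch of how `timeit` loops and repetitions fit together, the example below times the Julia update rule for a single point. The helper `julia_point` is a hypothetical, cut-down version of the inner loop of `calculate_juliaset_serial`, and the `repeat` and `number` values are arbitrary choices for illustration.

```python
import timeit


def julia_point(z, c, maxiter=300):
    """Hypothetical helper: apply the Julia update rule to one point."""
    n = 0
    while abs(z) < 2 and n < maxiter:
        z = z * z + c
        n += 1
    return n


c = complex(-0.62772, -0.42193)

# number=1000 calls the function 1000 times per measurement (the "loops");
# repeat=5 takes five such measurements (the "repetitions").
timings = timeit.repeat(lambda: julia_point(0j, c), repeat=5, number=1000)

# The minimum is usually the measurement least disturbed by other processes.
best = min(timings)
print(f"best of 5: {best / 1000 * 1e6:.2f} microseconds per call")
```

In a notebook, `%timeit` and `%%timeit` wrap the same module, so the numbers they report can be read the same way.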

It is also important to keep track of other processes on the computer, because they can cause sudden spikes in CPU usage that affect our profiling.
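One way to see how much a measurement is affected by anything other than our own code is to compare wall-clock time with the CPU time of the current process. The sketch below is an illustration under assumptions: `busy_work` is a hypothetical stand-in for the Julia set calculation, and `time.sleep` stands in for time during which the process is not running.

```python
import time


def busy_work(n):
    """Hypothetical CPU-bound loop standing in for the real calculation."""
    total = 0
    for i in range(n):
        total += i * i
    return total


wall_start = time.perf_counter()   # wall-clock time, includes waiting
cpu_start = time.process_time()    # CPU time used by this process only

busy_work(200_000)
time.sleep(0.2)  # simulated delay: counted by the wall clock, not as CPU time

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print(f"wall clock: {wall_elapsed:.3f} s")
print(f"CPU time:   {cpu_elapsed:.3f} s")
```

A large gap between the two numbers suggests that the wall-clock figure includes time the program spent waiting rather than computing.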

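Outside of Python, we can use OS-provided tools such as the UNIX `time` command to measure program execution time. Note that the `--verbose` flag belongs to GNU `time` (typically `/usr/bin/time` on Linux), not to the `time` keyword built into most shells.

<center>

__/usr/bin/time --verbose python python_script_name.py__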

</center>

Another way to profile is to use the `cProfile` or `profile` modules provided in the standard library.

<center>Eg:-

__python -m cProfile -s cumulative python_script_name.py__

</center>
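cProfile can also be driven from inside Python, which is handy in a notebook where there is no script to pass on the command line. The following is a minimal sketch, assuming a much smaller grid than the one used above so that it finishes quickly; the grid size and `maxiter=50` are arbitrary choices for illustration.

```python
import cProfile
import io
import pstats


def calculate_juliaset_serial(maxiter, zs, cs):
    """Calculate output list using Julia update rule."""
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        while abs(z) < 2 and n < maxiter:
            z = z * z + c
            n += 1
        output[i] = n
    return output


# A small 36 x 36 grid over roughly the same area of the complex plane.
zs = [complex(x / 10, y / 10) for y in range(-18, 18) for x in range(-18, 18)]
cs = [complex(-0.62772, -0.42193)] * len(zs)

# Profile only the call we care about.
profiler = cProfile.Profile()
profiler.enable()
output = calculate_juliaset_serial(50, zs, cs)
profiler.disable()

# Collect the report in a string, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Wrapping only the call of interest between `enable()` and `disable()` keeps setup code, such as building the coordinate lists, out of the report.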

Below is an example run of cProfile on the Julia set calculation script. As we can see, it measures the execution time and the number of calls for each function, with a little overhead added by the profiler itself. This is a comparatively descriptive way of analyzing our code to identify bottlenecks.

<center><image src="./img/2.jpg" width="500px" /></center>

Note that cProfile reports details per function call, not per line, so it can be a bit harder to pinpoint the problematic location in the code.

We can also write the cProfile output to a statistics file that can later be read by Python itself. This way we can analyze the details of our program further.

<center>Eg:-

__python -m cProfile -o profile.stats python_script_name.py__

</center>

The above command writes a file named `profile.stats`, which we can use in Python as shown below.

In [72]:
import pstats

p = pstats.Stats('scripts/profile.stats')
p.sort_stats('cumulative')
p.print_stats()

Fri Apr 29 12:13:13 2022    scripts/profile.stats

         36221991 function calls in 10.417 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   10.417   10.417 {built-in method builtins.exec}
        1    0.020    0.020   10.417   10.417 juliset.py:2(<module>)
        1    0.405    0.405   10.396   10.396 juliset.py:18(calc_juliaset_time)
        1    6.102    6.102    9.842    9.842 juliset.py:5(calculate_juliaset_serial)
 34219980    3.740    0.000    3.740    0.000 {built-in method builtins.abs}
  2002000    0.149    0.000    0.149    0.000 {method 'append' of 'list' objects}
        2    0.000    0.000    0.000    0.000 {built-in method builtins.print}
        4    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}




<pstats.Stats at 0x1cbf891d2b0>

In [73]:
p.print_callers()

   Ordered by: cumulative time

Function                                          was called by...
                                                      ncalls  tottime  cumtime
{built-in method builtins.exec}                   <- 
juliset.py:2(<module>)                            <-       1    0.020   10.417  {built-in method builtins.exec}
juliset.py:18(calc_juliaset_time)                 <-       1    0.405   10.396  juliset.py:2(<module>)
juliset.py:5(calculate_juliaset_serial)           <-       1    6.102    9.842  juliset.py:18(calc_juliaset_time)
{built-in method builtins.abs}                    <- 34219980    3.740    3.740  juliset.py:5(calculate_juliaset_serial)
{method 'append' of 'list' objects}               <- 2002000    0.149    0.149  juliset.py:18(calc_juliaset_time)
{built-in method builtins.print}                  <-       2    0.000    0.000  juliset.py:18(calc_juliaset_time)
{built-in method builtins.len}                    <-       2    0.000    0.000  juliset.py

<pstats.Stats at 0x1cbf891d2b0>
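
The `pstats` report can also be trimmed to the most relevant rows. The sketch below is self-contained, so it writes its own small statistics file to a temporary directory instead of reading `scripts/profile.stats`; the driver function `run_small_grid` and the grid size are hypothetical choices for illustration.

```python
import cProfile
import os
import pstats
import tempfile


def calculate_juliaset_serial(maxiter, zs, cs):
    """Calculate output list using Julia update rule."""
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z, c = zs[i], cs[i]
        while abs(z) < 2 and n < maxiter:
            z = z * z + c
            n += 1
        output[i] = n
    return output


def run_small_grid():
    """Hypothetical driver: build a tiny grid and run the calculation."""
    zs = [complex(x / 10, y / 10) for y in range(-18, 18) for x in range(-18, 18)]
    cs = [complex(-0.62772, -0.42193)] * len(zs)
    return calculate_juliaset_serial(50, zs, cs)


# Write a statistics file, as `python -m cProfile -o ...` would.
stats_path = os.path.join(tempfile.mkdtemp(), "profile.stats")
profiler = cProfile.Profile()
profiler.runcall(run_small_grid)
profiler.dump_stats(stats_path)

p = pstats.Stats(stats_path)
p.strip_dirs()             # drop directory prefixes from file names
p.sort_stats("tottime")    # sort by time spent inside each function itself
p.print_stats(3)           # show only the top three rows
p.print_callees("calculate_juliaset_serial")  # what this function calls
```

Sorting by `tottime` instead of `cumulative` highlights functions that are slow in their own body, while `print_callees` complements `print_callers` by showing the call graph in the other direction.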