# Portfolio of Evidence:
### High Performance Python

#### Part 1. Benchmarking and Profiling

1. Read the sections “Introducing the Julia Set” and “Calculating the Full Julia Set” in Chapter 2, “Profiling to Find Bottlenecks,” of M. Gorelick & I. Ozsvald (2020), High Performance Python: Practical Performant Programming for Humans, Second Edition. United States of America: O’Reilly Media, Inc. Implement the chapter's functions (Examples 2-1, 2-2, 2-3, and 2-4) in Python to calculate the Julia set. Render the result in both false grayscale and pure grayscale.

In [None]:
from PIL import Image
import array


x1,x2,y1,y2 = -1.8, 1.8, -1.8, 1.8
c_real, c_imag = -0.62772, -.42193

def show_greyscale(output, width, height, max_iterations):
    """Scale the iteration counts to [0, 255] and display them as an 8-bit image."""
    # Scale by the largest count actually observed, not the max_iterations argument.
    scale_factor = float(max(output))
    print("Max iterations:", scale_factor)
    print("Scale factor:", scale_factor)
    scaled = [int(o / scale_factor * 255) for o in output]
    # 'B' = unsigned char: one byte per pixel, matching PIL mode "L".
    pixels = array.array('B', scaled)
    img = Image.frombytes("L", (width, height), pixels.tobytes())
    img.show()

def show_false_greyscale(output, width, height, max_iterations):
    assert width * height == len(output)
    max_value = float(max(output))
    output_raw_limited = [int(float(o) / max_value * 255) for o in output]
    output_rgb = ((o+(256*o)+(256**2)*o) for o in output_raw_limited)
    output_rgb = array.array('I', output_rgb)
    img = Image.new("RGB", (width, height))
    img.frombytes(output_rgb.tobytes(), 'raw', "RGBX", 0, -1)
    img.show()

def calculate_z_serial_purepython(maxiter, zs, cs):
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        while abs(z) < 2 and n < maxiter:
            z = z * z + c
            n += 1
        output[i] = n
    return output

def calc_pure_python(desired_width, max_iterations):
    x_step = (float(x2-x1) / float(desired_width))
    y_step = (float(y1-y2) / float(desired_width))
    x = []
    y = []
    ycoord = y2
    while ycoord > y1:
        y.append(ycoord)
        ycoord += y_step
    xcoord = x1
    while xcoord < x2:
        x.append(xcoord)
        xcoord += x_step
    zs = []
    cs = []
    for ycoord in y:
        for xcoord in x:
            zs.append(complex(xcoord, ycoord))
            cs.append(complex(c_real, c_imag))
    print("Total of %d elements" % len(zs))
    print("Length of x:", len(x))
    output = calculate_z_serial_purepython(max_iterations, zs, cs)

    assert sum(output) == 33219980

    show_greyscale(output, len(x), len(y), max_iterations)
    show_false_greyscale(output, len(x), len(y), max_iterations)

if __name__ == "__main__":
    calc_pure_python(desired_width=1000, max_iterations=300)
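The expression `o+(256*o)+(256**2)*o` in show_false_greyscale copies one 8-bit grey level into three adjacent bytes of a 32-bit integer. The sketch below checks that arithmetic on two values. The helper name `pack_grey` is hypothetical, and the byte layout shown assumes a little-endian machine, which is what lets PIL read each integer as R, G, B plus one padding byte ("RGBX").

```python
import struct


def pack_grey(o):
    """Replicate an 8-bit grey level o into the low three bytes of an integer."""
    return o + (256 * o) + (256 ** 2) * o


white = pack_grey(255)
mid = pack_grey(128)

# "<I" forces little-endian unsigned int, so the bytes come out as (R, G, B, pad).
mid_bytes = struct.pack("<I", mid)

print(hex(white), hex(mid), mid_bytes)
```

The code above uses `array.array('I', ...)`, which writes integers in the machine's native byte order, so on a big-endian machine the channel order would differ.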

2. Define a new function, timefn, which takes a function as an argument. Its inner function, measure_time, takes *args (a variable number of positional arguments) and **kwargs (a variable number of key/value arguments) and passes them through to fn for execution. Decorate calculate_z_serial_purepython with @timefn to profile it. Implement Example 2-5 and adapt your current source code.

In [3]:
from functools import wraps
import time

def timefn(fn):
    @wraps(fn)
    def measure_time(*args, **kwargs):
        t1 = time.time()
        result = fn(*args, **kwargs)
        t2 = time.time()
        print("@timefn: %s took %0.3f ms" % (fn.__name__, (t2-t1)*1000.0))
        return result
    return measure_time

@timefn
def calculate_z_serial_purepython(maxiter, zs, cs):
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        while abs(z) < 2 and n < maxiter:
            z = z * z + c
            n += 1
        output[i] = n
    return output

def calc_pure_python(desired_width, max_iterations):
    x_step = (float(x2-x1) / float(desired_width))
    y_step = (float(y1-y2) / float(desired_width))
    x = []
    y = []
    ycoord = y2
    while ycoord > y1:
        y.append(ycoord)
        ycoord += y_step
    xcoord = x1
    while xcoord < x2:
        x.append(xcoord)
        xcoord += x_step
    zs = []
    cs = []
    for ycoord in y:
        for xcoord in x:
            zs.append(complex(xcoord, ycoord))
            cs.append(complex(c_real, c_imag))
    print("Total of %d elements" % len(zs))
    print("Length of x:", len(x))
    output = calculate_z_serial_purepython(max_iterations, zs, cs)

    assert sum(output) == 33219980


if __name__ == "__main__":
    calc_pure_python(desired_width=1000, max_iterations=300)

Total of 1000000 elements
Length of x: 1000
@timefn: calculate_z_serial_purepython took 3857.429 ms
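The same decorator pattern can be tried on a much smaller function. The sketch below is a variant, not the book's Example 2-5: it assumes `time.perf_counter` in place of `time.time`, and `slow_sum` is a hypothetical stand-in workload. It also shows that `functools.wraps` keeps the decorated function's original name.

```python
from functools import wraps
import time


def timefn(fn):
    """Variant of timefn that reads a monotonic, high-resolution clock."""
    @wraps(fn)  # copies fn.__name__ and fn.__doc__ onto the wrapper
    def measure_time(*args, **kwargs):
        t1 = time.perf_counter()
        result = fn(*args, **kwargs)
        t2 = time.perf_counter()
        print("@timefn: %s took %0.3f ms" % (fn.__name__, (t2 - t1) * 1000.0))
        return result
    return measure_time


@timefn
def slow_sum(n):
    """Hypothetical stand-in workload: sum the integers below n."""
    total = 0
    for i in range(n):
        total += i
    return total


result = slow_sum(100_000)
print(slow_sum.__name__)
```

Without `@wraps`, `slow_sum.__name__` would report `measure_time`, which makes profiler output harder to read.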


3. Use the timeit module to get a coarse measurement of the execution speed of the CPU-bound function. Run 10 loops with 5 repetitions. Show how to do the measurement on the command line and in a Jupyter Notebook.

In [4]:
%timeit -n10 -r5 calc_pure_python(desired_width=1000, max_iterations=300)

Total of 1000000 elements
Length of x: 1000
@timefn: calculate_z_serial_purepython took 3950.275 ms
Total of 1000000 elements
Length of x: 1000


KeyboardInterrupt: 
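The interrupted run above is consistent with the cost of the request: at about 3.95 s per call, 10 loops × 5 repetitions is 50 calls, or roughly 200 s. The sketch below shows the same measurement through the `timeit` module's Python API on a deliberately tiny grid so that it finishes quickly. `julia_counts` is a hypothetical, simplified stand-in for calc_pure_python, and the module name `julia1` in the closing comment is an assumption about how the code would be saved to disk.

```python
import timeit


def julia_counts(width, maxiter, c=complex(-0.62772, -0.42193)):
    """Hypothetical small-grid stand-in: return the iteration count per point."""
    x1, x2, y2 = -1.8, 1.8, 1.8
    step = (x2 - x1) / width
    output = []
    for row in range(width):
        for col in range(width):
            z = complex(x1 + col * step, y2 - row * step)
            n = 0
            while abs(z) < 2 and n < maxiter:
                z = z * z + c
                n += 1
            output.append(n)
    return output


# number=10 and repeat=5 mirror %timeit -n10 -r5: five timings, each of ten calls.
timings = timeit.repeat(
    "julia_counts(20, 50)",
    globals=globals(),
    number=10,
    repeat=5,
)
best_per_call = min(timings) / 10
print("best of 5: %.3f ms per call" % (best_per_call * 1000.0))

# Command-line form, assuming the code is saved as a module named julia1.py:
#   python -m timeit -n 10 -r 5 -s "import julia1" \
#       "julia1.calc_pure_python(desired_width=1000, max_iterations=300)"
```

Taking the minimum of the repeats is a common choice because the slower timings mostly reflect interference from other processes.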

4. Use the cProfile module to profile the source code (.py). Sort the results by the time spent inside each function; this gives a view into the slowest parts. Analyze the output and write a synthesis of the findings. Show how to use the cProfile module on the command line and in a Jupyter Notebook.

In [13]:
from cProfile import Profile, run
from pstats import Stats

profiler = Profile()
profiler.run('calc_pure_python(desired_width=1000, max_iterations=300)')
stats = Stats(profiler)
stats.strip_dirs()
stats.sort_stats('cumulative')
stats.print_stats()

Total of 1000000 elements
Length of x: 1000
@timefn: calculate_z_serial_purepython took 8051.111 ms
         36222101 function calls in 8.600 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    8.600    8.600 {built-in method builtins.exec}
        1    0.061    0.061    8.600    8.600 <string>:1(<module>)
        1    0.384    0.384    8.539    8.539 693206192.py:27(calc_pure_python)
        1    0.000    0.000    8.051    8.051 693206192.py:5(measure_time)
        1    5.946    5.946    8.051    8.051 693206192.py:14(calculate_z_serial_purepython)
 34219980    2.105    0.000    2.105    0.000 {built-in method builtins.abs}
  2002000    0.099    0.000    0.099    0.000 {method 'append' of 'list' objects}
        1    0.004    0.004    0.004    0.004 {built-in method builtins.sum}
        3    0.000    0.000    0.000    0.000 {built-in method builtins.print}
        8    0.000    0.000    0.000 

<pstats.Stats at 0x7243fc7c7190>
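Reading the table: of the 8.600 s total, 5.946 s is tottime inside calculate_z_serial_purepython itself, 2.105 s goes to 34,219,980 calls to abs, and the 2,002,000 list appends cost only 0.099 s. The timed call also rose from about 3.9 s in the earlier cells to 8.051 s here, most likely because of the profiler's own overhead. The cell sorts by cumulative time. A minimal sketch of sorting by time spent inside each function (`tottime`) follows, using hypothetical stand-in functions `escape_count` and `workload` so that it runs in well under a second.

```python
import cProfile
import io
import pstats


def escape_count(z, c, maxiter):
    """Hypothetical stand-in for the inner loop of calculate_z_serial_purepython."""
    n = 0
    while abs(z) < 2 and n < maxiter:
        z = z * z + c
        n += 1
    return n


def workload():
    """Hypothetical stand-in for calc_pure_python on a 30 x 30 grid."""
    c = complex(-0.62772, -0.42193)
    return [
        escape_count(complex(-1.8 + col * 0.12, 1.8 - row * 0.12), c, 50)
        for row in range(30)
        for col in range(30)
    ]


profiler = cProfile.Profile()
profiler.runcall(workload)

# Capture the report in a string so it can be printed or inspected.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.strip_dirs().sort_stats("tottime").print_stats(5)  # top 5 rows only
report = buffer.getvalue()
print(report)

# Command-line form, assuming the code is saved as julia1.py:
#   python -m cProfile -s tottime julia1.py
```

With `tottime` ordering, the function whose own body consumes the most time is listed first, regardless of how deep it sits in the call chain.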

5. Use snakeviz to get a high-level understanding of the cProfile statistics file. Analyze the output and write a synthesis of the findings.

In [None]:
%pip install snakeviz

In [None]:
from cProfile import run  # repeated from the cProfile cell so this cell runs on its own
run('calc_pure_python(desired_width=1000, max_iterations=300)', 'calc_stats')
!snakeviz "calc_stats"
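The statistics file that snakeviz opens is an ordinary cProfile dump, so it can also be reloaded in text mode with `pstats`. The sketch below writes such a file to a temporary directory and reads it back. `workload` is a hypothetical stand-in for calc_pure_python, and the file name `calc_stats` simply mirrors the cell above.

```python
import cProfile
import os
import pstats
import tempfile


def workload():
    """Hypothetical stand-in for calc_pure_python."""
    return sum(abs(complex(i, i)) for i in range(10_000))


profiler = cProfile.Profile()
profiler.runcall(workload)

with tempfile.TemporaryDirectory() as tmpdir:
    stats_path = os.path.join(tmpdir, "calc_stats")
    profiler.dump_stats(stats_path)    # write the binary statistics file
    stats = pstats.Stats(stats_path)   # reload it for text-mode inspection
    total_calls = stats.total_calls

print("function calls recorded:", total_calls)
```

Keeping the file on disk means one profiling run can be inspected several times, in snakeviz or in `pstats`, without re-running the slow code.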

6. Use line_profiler and its kernprof script to profile the function calculate_z_serial_purepython line by line. Analyze the output and write a synthesis of the findings.

In [None]:
%pip install line_profiler

In [None]:
%load_ext line_profiler
%lprun -f calculate_z_serial_purepython calc_pure_python(desired_width=1000, max_iterations=300)

7. Use memory_profiler to diagnose memory usage. Analyze the output and write a synthesis of the findings.

In [None]:
%pip install memory_profiler

In [None]:
%load_ext memory_profiler

%memit calc_pure_python(desired_width=1000, max_iterations=300)
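As a cross-check that needs no extra install, the standard library's `tracemalloc` can report how much memory Python objects allocate while the input lists are built. This is a different technique from memory_profiler: it traces allocations made through Python's allocator rather than the process's resident memory, so its numbers are not expected to match `%memit`. `build_inputs` is a hypothetical stand-in for the list-building part of calc_pure_python, run here on a 100 × 100 grid.

```python
import tracemalloc


def build_inputs(width):
    """Hypothetical stand-in: build the zs and cs lists for a width x width grid."""
    zs = []
    cs = []
    for row in range(width):
        for col in range(width):
            zs.append(complex(col, row))
            cs.append(complex(-0.62772, -0.42193))
    return zs, cs


tracemalloc.start()
zs, cs = build_inputs(100)
current, peak = tracemalloc.get_traced_memory()  # both values are in bytes
tracemalloc.stop()

print("current: %.1f KiB, peak: %.1f KiB" % (current / 1024, peak / 1024))
```

Because every grid point stores two separate complex objects, one in `zs` and one in `cs`, the traced memory grows with the square of the width.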