# High Performance Python

### Activity 3



### Part 1 Benchmarking and Profiling

1. Read the sections “Introducing the Julia Set” and “Calculating the Full Julia Set” in Chapter 2, “Profiling to Find Bottlenecks,” of the book: M. Gorelick & I. Ozsvald (2020). High Performance Python: Practical Performant Programming for Humans. Second Edition. United States of America: O’Reilly Media, Inc. Implement the chapter functions (Examples 2-1, 2-2, 2-3 and 2-4) in Python to calculate the Julia set. Render the result in both false greyscale and pure greyscale.

In [1]:
# Libraries

import time
from PIL import Image # Pillow
import array


In [2]:
x1, x2, y1, y2 = -1.8, 1.8, -1.8, 1.8
c_real, c_imag = -0.62772, -.42193


def calculate_z_serial_purepython(maxiter, zs, cs):
    """Calculate output list using Julia update rule"""
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        while abs(z) < 2 and n < maxiter:
            z = z * z + c
            n += 1
        output[i] = n
    return output


def calc_pure_python(draw_output, desired_width, max_iterations):
    """Create a list of complex co-ordinates (zs) and complex parameters (cs), build Julia set and display"""
    x_step = (float(x2 - x1) / float(desired_width))
    y_step = (float(y1 - y2) / float(desired_width))
    x = []
    y = []
    ycoord = y2
    while ycoord > y1:
        y.append(ycoord)
        ycoord += y_step
    xcoord = x1
    while xcoord < x2:
        x.append(xcoord)
        xcoord += x_step

    zs = []
    cs = []
    for ycoord in y:
        for xcoord in x:
            zs.append(complex(xcoord, ycoord))
            cs.append(complex(c_real, c_imag))

    print("Length of x:", len(x))
    print("Total elements:", len(zs))
    start_time = time.time()
    output = calculate_z_serial_purepython(max_iterations, zs, cs)
    end_time = time.time()
    secs = end_time - start_time
    print(calculate_z_serial_purepython.__name__ + " took", secs, "seconds")



if __name__ == "__main__":
    
    calc_pure_python(draw_output=False, desired_width=1000, max_iterations=300)

Length of x: 1000
Total elements: 1000000
calculate_z_serial_purepython took 5.008078098297119 seconds
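
To see what the update rule in `calculate_z_serial_purepython` does for an individual pixel, the inner loop can be pulled out into a helper. The sketch below is illustrative only: `escape_count` is a hypothetical name that does not appear in the book, and the test points are chosen so the result is easy to verify by hand.

```python
def escape_count(z, c, maxiter):
    """Return how many Julia updates z survives before |z| reaches 2."""
    n = 0
    while abs(z) < 2 and n < maxiter:
        z = z * z + c
        n += 1
    return n


# A point that starts on the escape radius is never updated.
print(escape_count(2 + 0j, 0j, 300))   # 0
# With c = 0 the origin is a fixed point, so the loop runs to maxiter.
print(escape_count(0j, 0j, 300))       # 300
```

Pixels that reach `maxiter` are treated as belonging to the set; lower counts mean the point escaped sooner.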


In [3]:
x1, x2, y1, y2 = -1.8, 1.8, -1.8, 1.8
c_real, c_imag = -0.62772, -.42193


def show_greyscale(output_raw, width, height, max_iterations):
    """Convert list to array, show using PIL"""
   
    max_iterations = float(max(output_raw))
    print(max_iterations)
    scale_factor = float(max_iterations)
    scaled = [int(o / scale_factor * 255) for o in output_raw]
    output = array.array('B', scaled)  # array of unsigned bytes (0-255)
    im = Image.new("L", (width, height))
    im.frombytes(output.tobytes(), "raw", "L", 0, -1)
    im.show(title="Greyscale Julia Set")

    


def show_false_greyscale(output_raw, width, height, max_iterations):
    """Convert list to array, show using PIL"""  
    assert width * height == len(output_raw)
    max_value = float(max(output_raw))
    output_raw_limited = [int(float(o) / max_value * 255) for o in output_raw]
    output_rgb = (
        (o + (256 * o) + (256 ** 2) * o) * 16 for o in output_raw_limited)  # fancier
    output_rgb = array.array('I', output_rgb)
    im = Image.new("RGB", (width, height))
    im.frombytes(output_rgb.tobytes(), "raw", "RGBX", 0, -1)
    im.show(title="False Greyscale Julia Set")
    


def calculate_z_serial_purepython(maxiter, zs, cs):
    """Calculate output list using Julia update rule"""
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        while abs(z) < 2 and n < maxiter:
            z = z * z + c
            n += 1
        output[i] = n
    return output


def calc_pure_python(draw_output, desired_width, max_iterations):
    """Create a list of complex co-ordinates (zs) and complex parameters (cs), build Julia set and display"""
    x_step = (float(x2 - x1) / float(desired_width))
    y_step = (float(y1 - y2) / float(desired_width))
    x = []
    y = []
    ycoord = y2
    while ycoord > y1:
        y.append(ycoord)
        ycoord += y_step
    xcoord = x1
    while xcoord < x2:
        x.append(xcoord)
        xcoord += x_step
    width = len(x)
    height = len(y)

    zs = []
    cs = []
    for ycoord in y:
        for xcoord in x:
            zs.append(complex(xcoord, ycoord))
            cs.append(complex(c_real, c_imag))

    print("Length of x:", len(x))
    print("Total elements:", len(zs))
    start_time = time.time()
    output = calculate_z_serial_purepython(max_iterations, zs, cs)
    end_time = time.time()
    secs = end_time - start_time
    print(calculate_z_serial_purepython.__name__ + " took", secs, "seconds")



    if draw_output:
        show_false_greyscale(output, width, height, max_iterations)
        show_greyscale(output, width, height, max_iterations)
        



if __name__ == "__main__":
    
    calc_pure_python(draw_output=True, desired_width=1000, max_iterations=300)

Length of x: 1000
Total elements: 1000000
calculate_z_serial_purepython took 4.836010217666626 seconds
300.0
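
Both display functions first rescale the raw iteration counts to the 0-255 range of an 8-bit channel. A minimal sketch of that step on three hand-picked counts follows; `scale_to_bytes` is a hypothetical helper introduced here, not part of the book's code.

```python
import array


def scale_to_bytes(output_raw):
    """Map iteration counts onto 0..255 and pack them as unsigned bytes."""
    max_value = float(max(output_raw))
    scaled = [int(o / max_value * 255) for o in output_raw]
    return array.array("B", scaled)  # 'B' = unsigned char, one byte per pixel


pixels = scale_to_bytes([0, 150, 300])
print(list(pixels))            # [0, 127, 255]
print(len(pixels.tobytes()))   # 3
```

`int()` truncates, so the midpoint 127.5 becomes 127. The helper assumes at least one count is nonzero; otherwise the division would fail.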


2. Define a new function, timefn, which takes a function as an argument: the inner function, measure_time, takes *args (a variable number of positional arguments) and **kwargs (a variable number of key/value arguments) and passes them through to fn for execution. Decorate calculate_z_serial_purepython with @timefn to profile it. Implement Example 2-5 and adapt your current source code.

In [4]:
# Libraries
import time
from functools import wraps

In [5]:
def timefn(function):
    @wraps(function)
    def measure_time(*args, **kwargs):
        t1 = time.time()
        result = function(*args, **kwargs)
        t2 = time.time()
        total_time = t2 - t1
        print(f"{function.__name__} took {total_time} seconds")
        return result
    return measure_time
    
@timefn
def calculate_z_serial_purepython(maxiter, zs, cs):
    """Calculate output list using Julia update rule"""
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        while abs(z) < 2 and n < maxiter:
            z = z * z + c
            n += 1
        output[i] = n
    return output


In [6]:
if __name__ == "__main__":
    calc_pure_python(draw_output=False, desired_width=1000, max_iterations=300)

Length of x: 1000
Total elements: 1000000
calculate_z_serial_purepython took 5.007472991943359 seconds
calculate_z_serial_purepython took 5.007472991943359 seconds
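
`time.time()` follows the system clock, whose resolution can be coarse on some platforms. A hedged variant of the decorator that uses `time.perf_counter()` instead is sketched below; `timefn_perf` and `slow_sum` are hypothetical names used only for this illustration.

```python
import time
from functools import wraps


def timefn_perf(fn):
    """Like timefn, but timed with the monotonic perf_counter clock."""
    @wraps(fn)  # keep fn.__name__ and fn.__doc__ on the wrapper
    def measure_time(*args, **kwargs):
        t1 = time.perf_counter()
        result = fn(*args, **kwargs)
        t2 = time.perf_counter()
        print(f"@timefn_perf: {fn.__name__} took {t2 - t1:.6f} seconds")
        return result
    return measure_time


@timefn_perf
def slow_sum(n):
    """Add the integers below n in a plain loop."""
    total = 0
    for i in range(n):
        total += i
    return total


print(slow_sum(100_000))   # 4999950000
print(slow_sum.__name__)   # slow_sum
```

Prefixing the message with the decorator name makes it easier to tell the decorator's line apart from the timing line that `calc_pure_python` prints itself.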


3. Use the timeit module to get a coarse measurement of the execution speed of the CPU-bound function. Run 10 loops with 5 repetitions. Show how to do the measurement on the command line and in a Jupyter Notebook.

In [7]:
%timeit -r 5 -n 10 calc_pure_python(draw_output=False,desired_width=1000, max_iterations=300)

Length of x: 1000
Total elements: 1000000
calculate_z_serial_purepython took 4.978557586669922 seconds
calculate_z_serial_purepython took 4.978557586669922 seconds
Length of x: 1000
Total elements: 1000000
calculate_z_serial_purepython took 5.046737194061279 seconds
calculate_z_serial_purepython took 5.046737194061279 seconds
Length of x: 1000
Total elements: 1000000
calculate_z_serial_purepython took 5.035212993621826 seconds
calculate_z_serial_purepython took 5.035212993621826 seconds
Length of x: 1000
Total elements: 1000000


KeyboardInterrupt: 
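
The run above was interrupted: 10 loops times 5 repetitions means 50 full calculations, and at roughly 5 seconds each that is on the order of four minutes. On the command line the same measurement would look like `python -m timeit -n 10 -r 5 -s "import julia1" "julia1.calc_pure_python(draw_output=False, desired_width=1000, max_iterations=300)"`, assuming the code is saved as a module named `julia1.py` (a hypothetical file name here). The sketch below shows the equivalent `timeit.repeat` call from plain Python on a deliberately small stand-in workload (`small_workload` is hypothetical), so it finishes quickly.

```python
import timeit


def small_workload():
    """Stand-in for calc_pure_python: a short CPU-bound loop."""
    z = 0j
    for _ in range(10_000):
        z = z * z + (-0.62772 - 0.42193j)
        if abs(z) >= 2:
            z = 0j  # restart so the value never overflows
    return z


# number=10 matches -n 10, repeat=5 matches -r 5.
timings = timeit.repeat(small_workload, number=10, repeat=5)
best_per_loop = min(timings) / 10
print(f"best of 5: {best_per_loop:.6f} s per loop")
```

Each entry in `timings` is the total for 10 calls, so dividing the minimum by 10 gives a per-call figure comparable to what `%timeit` reports.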

4. Use the cProfile module to profile the source code (.py). Sort the results by the time spent inside each function. This will give a view into the slowest parts. Analyze the output and write a synthesis of the findings. Show how to use the cProfile module on the command line and in a Jupyter Notebook.

In [10]:
import cProfile
import pstats

In [11]:
profiler = cProfile.Profile()
profiler.enable()
calc_pure_python(draw_output=False, desired_width=1000, max_iterations=300)
profiler.disable()

Length of x: 1000
Total elements: 1000000
calculate_z_serial_purepython took 11.127159357070923 seconds
calculate_z_serial_purepython took 11.127159357070923 seconds


In [12]:
stats = pstats.Stats(profiler).sort_stats('cumulative')
stats.print_stats()

         36222181 function calls in 11.834 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        2    0.000    0.000   11.875    5.938 c:\Users\juanj\anaconda3\envs\HPC\Lib\site-packages\IPython\core\interactiveshell.py:3517(run_code)
        2    0.000    0.000   11.875    5.937 {built-in method builtins.exec}
        1    0.553    0.553   11.834   11.834 C:\Users\juanj\AppData\Local\Temp\ipykernel_19840\1047894795.py:48(calc_pure_python)
        1    0.000    0.000   11.127   11.127 C:\Users\juanj\AppData\Local\Temp\ipykernel_19840\957609554.py:2(measure_time)
        1    8.402    8.402   11.127   11.127 C:\Users\juanj\AppData\Local\Temp\ipykernel_19840\957609554.py:12(calculate_z_serial_purepython)
 34219980    2.725    0.000    2.725    0.000 {built-in method builtins.abs}
  2002000    0.154    0.000    0.154    0.000 {method 'append' of 'list' objects}
        4    0.000    0.000    0.000    0.000 {built-in method

<pstats.Stats at 0x1b46ddef210>
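
In the table above, `tottime` for `calculate_z_serial_purepython` is 8.402 s and the 34,219,980 calls to `abs` add another 2.725 s; together they make up the function's 11.127 s cumulative time. To rank by time spent inside each function, excluding callees, the stats can be sorted by `tottime` instead of `cumulative`. On the command line the usual form is `python -m cProfile -s cumulative julia1.py`, where `julia1.py` is an assumed file name. The sketch below profiles a small hypothetical function, `count_escapes`, and prints the top entries sorted by `tottime`.

```python
import cProfile
import io
import pstats


def count_escapes(n_points, maxiter):
    """Toy CPU-bound loop that calls abs() many times."""
    total = 0
    for i in range(n_points):
        z = complex(i / n_points, 0.0)
        n = 0
        while abs(z) < 2 and n < maxiter:
            z = z * z + (-0.62772 - 0.42193j)
            n += 1
        total += n
    return total


profiler = cProfile.Profile()
profiler.enable()
count_escapes(2_000, 50)
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("tottime").print_stats(5)  # at most five entries
print(stream.getvalue())
```

Writing to a `StringIO` stream keeps the report in a string, which is convenient when the output should be saved or filtered rather than printed directly.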

5. Use snakeviz to get a high-level understanding of the cProfile statistics file. Analyze the output and write a synthesis of the findings.

In [18]:
%reload_ext snakeviz

if __name__ == "__main__":
    cProfile.run('calc_pure_python(draw_output=False, desired_width=1000, max_iterations=300)', 'stats')
!snakeviz "stats"

Length of x: 1000
Total elements: 1000000
calculate_z_serial_purepython took 11.314344882965088 seconds
calculate_z_serial_purepython took 11.314344882965088 seconds


'snakeviz' is not recognized as an internal or external command,
operable program or batch file.
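
The shell could not find the `snakeviz` executable here, so no visualization was produced. Because the extension itself loaded, running the `%snakeviz` line magic on the same call may work where the `!snakeviz` shell escape did not, though that depends on the local installation. Independently of snakeviz, the statistics file it would read can be written and inspected with the standard library, as in the sketch below; `tiny_julia` and the temporary file location are hypothetical.

```python
import cProfile
import os
import pstats
import tempfile


def tiny_julia(width, maxiter):
    """Small stand-in for calc_pure_python on a width x width grid."""
    c = complex(-0.62772, -0.42193)
    output = []
    for row in range(width):
        for col in range(width):
            z = complex(-1.8 + 3.6 * col / width, 1.8 - 3.6 * row / width)
            n = 0
            while abs(z) < 2 and n < maxiter:
                z = z * z + c
                n += 1
            output.append(n)
    return output


stats_path = os.path.join(tempfile.mkdtemp(), "stats")

profiler = cProfile.Profile()
result = profiler.runcall(tiny_julia, 50, 100)  # profile a single call
profiler.dump_stats(stats_path)                 # write the binary stats file

loaded = pstats.Stats(stats_path)               # read it back from disk
loaded.strip_dirs().sort_stats("cumulative").print_stats(3)
```

`runcall` is used instead of `cProfile.run` so the example does not depend on the function being defined in the `__main__` namespace.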


6. Use line_profiler and kernprof to profile the function calculate_z_serial_purepython line by line. Analyze the output and write a synthesis of the findings.

In [14]:
%load_ext line_profiler
%lprun -f calculate_z_serial_purepython calc_pure_python(draw_output=False, desired_width=1000, max_iterations=300)

  profile = LineProfiler(*funcs)


Length of x: 1000
Total elements: 1000000
calculate_z_serial_purepython took 25.5150089263916 seconds
calculate_z_serial_purepython took 25.5150089263916 seconds


Timer unit: 1e-07 s

Total time: 25.5156 s
File: C:\Users\juanj\AppData\Local\Temp\ipykernel_5440\957609554.py
Function: measure_time at line 2

Line #      Hits         Time  Per Hit   % Time  Line Contents
     2                                               @wraps(function)
     3                                               def measure_time(*args, **kwargs):
     4         1          9.0      9.0      0.0          t1 = time.time()
     5         1  255152818.0    3e+08    100.0          result = function(*args, **kwargs)
     6         1         50.0     50.0      0.0          t2 = time.time()
     7         1          8.0      8.0      0.0          total_time = t2 - t1
     8         1       2938.0   2938.0      0.0          print(f"{function.__name__} took {total_time} seconds")
     9         1          9.0      9.0      0.0          return result
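
The report above describes `measure_time`, not the Julia loop: once `@timefn` is applied, the name `calculate_z_serial_purepython` refers to the wrapper, so that is the function whose lines get timed. `functools.wraps` stores the original function on the wrapper as `__wrapped__`, so one possible fix is to pass `calculate_z_serial_purepython.__wrapped__` to `%lprun -f` (assuming the magic accepts an attribute expression there), or to remove `@timefn` and decorate the function with `@profile` before running `kernprof -l -v` on the script. The sketch below only demonstrates the `__wrapped__` mechanism, using hypothetical names.

```python
from functools import wraps


def passthrough(fn):
    """Minimal decorator standing in for timefn."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper


@passthrough
def square(x):
    return x * x


# The public name now points at the wrapper's code...
print(square.__code__.co_name)               # wrapper
# ...while the undecorated function is still reachable.
print(square.__wrapped__.__code__.co_name)   # square
print(square.__wrapped__(7))                 # 49
```

A line profiler attaches to a function's code object, which is why the wrapper's five lines appear in the report even though `__name__` still reads `calculate_z_serial_purepython`.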

7. Use memory_profiler to diagnose memory usage. Analyze the output and write a synthesis of the findings.

In [7]:
%reload_ext memory_profiler 
%memit calc_pure_python(draw_output=False, desired_width=1000, max_iterations=300)

Length of x: 1000
Total elements: 1000000
calculate_z_serial_purepython took 4.972938776016235 seconds
calculate_z_serial_purepython took 4.972938776016235 seconds
peak memory: 162.79 MiB, increment: 85.00 MiB
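
One way to sanity-check the reported increment is to estimate the size of the three large lists (`zs`, `cs` and `output`), each with 1,000,000 entries. The sketch below uses `sys.getsizeof`; it ignores the integer objects referenced by `output` and any allocator overhead, so it is a rough lower bound rather than a measurement. `estimate_list_mib` is a hypothetical helper.

```python
import sys

N = 1_000_000  # "Total elements" printed by calc_pure_python


def estimate_list_mib(n_items, sample=None):
    """Approximate MiB for a list of n_items, plus one object like `sample` per slot."""
    list_bytes = sys.getsizeof([None] * n_items)  # header + one pointer per slot
    object_bytes = n_items * sys.getsizeof(sample) if sample is not None else 0
    return (list_bytes + object_bytes) / 2**20


zs_mib = estimate_list_mib(N, complex(0.0, 0.0))   # one complex object per pixel
cs_mib = estimate_list_mib(N, complex(0.0, 0.0))   # cs also builds a new complex each time
output_mib = estimate_list_mib(N)                  # pointers only
print(f"zs ~ {zs_mib:.1f} MiB, cs ~ {cs_mib:.1f} MiB, output ~ {output_mib:.1f} MiB")
print(f"total ~ {zs_mib + cs_mib + output_mib:.1f} MiB")
```

On a typical 64-bit CPython build the total tends to come out close to the 85 MiB increment reported by `%memit`, which would suggest that the three lists dominate memory use; the exact figure depends on the interpreter build.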

