In [None]:
%matplotlib inline

import time
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

The Julia set (https://en.wikipedia.org/wiki/Julia_set) is a CPU-bound problem in which a fractal sequence generates a complex output image. The implementation in this book is deliberately suboptimal so that we can identify both memory-consuming and slow instructions. 

Each coordinate is expressed as a complex number. For each coordinate z, we apply the following function:

f(z) = z^2 + c

until z escapes towards infinity (the code below tests this as abs(z) reaching 2) or a maximum number of iterations has been executed. The colour assigned to each coordinate depends on the number of iterations performed at that coordinate: from black (the point escapes almost immediately) to white (the maximum number of iterations is reached).
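
As a minimal sketch of the rule above, the following counts the iterations for a single coordinate. The helper name `escape_count` and the two sample points are assumptions made for illustration; the bound of 2 and the value of c are the ones used by the notebook code.

```python
def escape_count(z, c, max_iterations):
    # Hypothetical helper: apply z = z*z + c until abs(z) reaches 2
    # or max_iterations updates have been performed
    n = 0
    while abs(z) < 2 and n < max_iterations:
        z = z * z + c
        n += 1
    return n

c = complex(-0.62772, -0.42193)

# A corner of the grid already lies outside the bound, so no update runs
corner_count = escape_count(complex(1.8, 1.8), c, 300)

# With c = 0 the origin never moves, so the iteration cap is reached
capped_count = escape_count(complex(0, 0), complex(0, 0), 300)

print(corner_count, capped_count)
```

The two points mark the extremes of the colour scale: a coordinate that escapes at once and one that never escapes.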

In [None]:
# Area of the complex space to investigate
x1, x2, y1, y2 = -1.8, 1.8, -1.8, 1.8
c_real, c_imag = -0.62772, -.42193

In [None]:
# Calculating input parameters
def calc_pure_python(desired_width, max_iterations):
    '''
    Create a list of complex coordinates (zs) and complex 
    parameters (cs), build Julia set, and display
    '''
    x_step = (float(x2 - x1) / float(desired_width))
    y_step = (float(y1 - y2) / float(desired_width))
    x = []
    y = []
    ycoord = y2
    while ycoord > y1:
        y.append(ycoord)
        ycoord += y_step
    xcoord = x1
    while xcoord < x2:
        x.append(xcoord)
        xcoord += x_step
    # Build a list of coordinates and the initial condition for each cell
    # The initial condition is a constant and we could use a single value
    # instead of an array, but the aim is to simulate a real-world scenario
    # with several inputs to our function
    zs = []
    cs = []
    for ycoord in y:
        for xcoord in x:
            zs.append(complex(xcoord, ycoord))
            cs.append(complex(c_real, c_imag))
            
    print("Length of x: " + str(len(x)))
    print("Total elements: " + str(len(zs)))
    start_time = time.time()
    output = calculate_z_serial_purepython(max_iterations, zs, cs)
    end_time = time.time()
    secs = end_time - start_time
    print('calculate_z_serial_purepython took ' + str(secs) + ' seconds')
    
    # This sum is expected for a 1000^2 grid with 300 iterations.
    # It catches minor errors we might introduce when we are
    # working on a fixed set of inputs
    assert(sum(output) == 33219980)
    
    return output

In [None]:
def calculate_z_serial_purepython(maxiter, zs, cs):
    '''
    Calculate output list using Julia update rule
    '''
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        while abs(z) < 2 and n < maxiter:
            z = z * z + c
            n += 1
        output[i] = n
    return output

In [None]:
# Main method
# Calculate the Julia set using a pure Python solution with
# reasonable defaults for a laptop
output = calc_pure_python(desired_width = 1000, max_iterations = 300)

In [None]:
output = np.array(output).reshape((-1, 1000))
output = output.astype(float)

In [None]:
plt.imshow(output, cmap='hot')

The version of the code above relies on the `time` module and `print` statements. This is the simplest way to measure the execution time of a piece of code, but it can quickly become unmanageable. 

### Using a decorator

A decorator is a cleaner approach: the timing logic lives in one place and is applied to a function with a single line. 

In [None]:
from functools import wraps

def timefn(fn):
    @wraps(fn)
    def measure_time(*args, **kwargs):
        t1 = time.time()
        result = fn(*args, **kwargs)
        t2 = time.time()
        print('@timefn: ' + fn.__name__ + ' took ' + str(t2 - t1) + ' seconds')
        return result
    return measure_time

In [None]:
@timefn
def calculate_z_serial_purepython(maxiter, zs, cs):
    '''
    Calculate output list using Julia update rule
    '''
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        while abs(z) < 2 and n < maxiter:
            z = z * z + c
            n += 1
        output[i] = n
    return output

In [None]:
output = calc_pure_python(desired_width = 1000, max_iterations = 300)

There is a tiny difference in the measured time because every call to `calculate_z_serial_purepython` now goes through an extra wrapper function. 
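
One way to look at that wrapper is sketched below. `functools.wraps` keeps the original function name and stores the undecorated function in the `__wrapped__` attribute, so the function can still be called without timing. The function `square_all` is a hypothetical stand-in for the Julia code, and `time.perf_counter` is swapped in for `time.time` as an assumption (it is intended for measuring short durations).

```python
import time
from functools import wraps

def timefn(fn):
    @wraps(fn)
    def measure_time(*args, **kwargs):
        t1 = time.perf_counter()
        result = fn(*args, **kwargs)
        t2 = time.perf_counter()
        print('@timefn: ' + fn.__name__ + ' took ' + str(t2 - t1) + ' seconds')
        return result
    return measure_time

@timefn
def square_all(values):
    # Hypothetical stand-in for calculate_z_serial_purepython
    return [v * v for v in values]

# Goes through measure_time and prints the timing line
timed_result = square_all([1, 2, 3])

# Bypasses the wrapper: no timing, no extra function call
untimed_result = square_all.__wrapped__([1, 2, 3])

print(square_all.__name__)
```

Both calls return the same list; only the first one pays for the extra call and the `print`.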

### Using the timeit module

The `timeit` module measures a piece of code by running it several times. Note that this module temporarily disables the garbage collector while timing. The measured time may therefore differ from the real execution time if garbage collection would normally be triggered.
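
If garbage collection should be part of the measurement, the `timeit` documentation suggests re-enabling it in the setup statement. The sketch below assumes a small allocation-heavy function, `build_lists`, as a hypothetical stand-in for the Julia code.

```python
import timeit

def build_lists():
    # Hypothetical allocation-heavy workload
    return [[i] for i in range(10000)]

# Default behaviour: garbage collection is switched off while timing
gc_off = timeit.timeit(build_lists, number=20)

# The setup statement runs once before the timed loop and turns
# collection back on, so the timing includes any GC work
gc_on = timeit.timeit(build_lists, setup="import gc; gc.enable()", number=20)

print("GC disabled: " + str(gc_off) + " seconds")
print("GC enabled:  " + str(gc_on) + " seconds")
```

How much the two figures differ depends on the workload and on the machine, so no particular result should be expected.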

In [None]:
%%timeit -n 5 -r 5
calc_pure_python(desired_width = 1000, max_iterations = 300)

The statement is executed 5 times in a row and the average is taken (number = 5); this process is then repeated 5 times (repeat = 5) to calculate the mean and standard deviation. Higher times are probably caused by other processes running in the background and taking processing time from the CPU. We should run the measurement several times. If the results differ widely, there may be too many other processes running in the background. 
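
Outside a notebook, the same number/repeat scheme can be sketched with `timeit.repeat`. The function `small_workload` is a hypothetical stand-in for `calc_pure_python`, kept tiny so that the example runs quickly.

```python
import timeit

def small_workload():
    # Hypothetical stand-in for calc_pure_python
    return sum(i * i for i in range(1000))

NUMBER = 5  # calls per measurement
REPEAT = 5  # number of measurements

# Each entry is the total time of NUMBER consecutive calls
timings = timeit.repeat(small_workload, number=NUMBER, repeat=REPEAT)

# Divide by NUMBER to obtain the average time of a single call
per_call = [t / NUMBER for t in timings]

print("per-call times: " + str(per_call))
print("fastest: " + str(min(per_call)) + " seconds")
```

The fastest measurement is usually the one least disturbed by background processes, which is why it is often reported alongside the mean.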

### Using the cProfile module

The `cProfile` module is a built-in profiling tool in the standard library. It measures the time each function takes to run. This adds significant overhead, but provides much more detailed insight. 

The following piece of code shows the cumulative running time of each function, together with caller information (which function is calling which).

In [None]:
import cProfile, io, pstats

pr = cProfile.Profile()
pr.enable()

calc_pure_python(desired_width = 1000, max_iterations = 300)

pr.disable()

s = io.StringIO()
sortby = 'cumulative'
ps = pstats.Stats(pr, stream = s).sort_stats(sortby)
ps.print_stats()
ps.print_callers()
print(s.getvalue())

The `calculate_z_serial_purepython` function is the most time-consuming one. Inside this function, the most time-consuming operation is `abs`, which is called a total of 34219980 times. 
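
The call count can be cross-checked by hand: `abs` runs once for every test of the `while` condition, that is, once per iteration plus one final failing test per coordinate. This appears consistent with the figures above, since 34219980 exceeds the checksum 33219980 by 1000000, one extra call for each point of a 1000 by 1000 grid. The sketch below reproduces the relationship on a tiny, hypothetical input; the name `count_iterations` is an assumption and the loop is copied from the notebook.

```python
import cProfile, io, pstats

def count_iterations(maxiter, zs, cs):
    # Same update rule as calculate_z_serial_purepython
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        while abs(z) < 2 and n < maxiter:
            z = z * z + c
            n += 1
        output[i] = n
    return output

# Hypothetical input: 10 points that never escape, capped at 5 iterations
zs = [complex(0, 0)] * 10
cs = [complex(0, 0)] * 10

pr = cProfile.Profile()
pr.enable()
output = count_iterations(5, zs, cs)
pr.disable()

# One abs call per iteration plus the final failing test for each point
expected_abs_calls = sum(output) + len(zs)

s = io.StringIO()
ps = pstats.Stats(pr, stream=s).sort_stats('ncalls')
ps.print_stats('abs')  # restrict the report to entries matching "abs"
report = s.getvalue()
print(report)
print("expected abs calls: " + str(expected_abs_calls))
```

The `ncalls` column of the printed report can be compared with the expected value on the last line.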

Figuring out what is happening on a line-by-line basis is very hard because we only get profile information for the function calls themselves, not each line within the functions. 