# Profiling and Optimisations

- Programmer time is usually more valuable than computer time.
- So optimise only until the runtime is no longer the limiting factor in your research.

"To optimise, or not to optimise. That is the question."
Profiling answers that question.

We want to optimise the most expensive parts first.

Things to keep in mind:
1. DO NOT ASSUME
2. DO NOT ASSUME
3. DO NOT ASSUME

Computers are fast, and library writers are usually better at optimising than you are. The step you expect to be the slowest is, in most cases, not the slowest one.

## Tools for profiling

In this talk, we will mainly discuss using line_profiler from a Jupyter notebook.

The same can be done in a standard Python script, but Jupyter is usually easier to work with for this.

In [1]:
%load_ext line_profiler

Now one can profile functions line by line.

### Example

In [2]:
def read_files(filename1, filename2):
    # Reading from the files, assuming newline separated floats
    arr1 = []
    arr2 = []
    with open(filename1, 'r') as file1:
        for x in file1:
            arr1.append(float(x))
    with open(filename2, 'r') as file2:
        for x in file2:
            arr2.append(float(x))
    
    assert len(arr1) == len(arr2) # Just to guarantee that the sizes are the same
    
    return arr1, arr2

def multiply_arrs(arr1, arr2):
    out = []
    for i in range(len(arr1)):
        out.append(arr1[i]*arr2[i])
    return out

def add_arr(out):
    ret = 0
    for x in out:
        ret += x
    return ret

In [3]:
def read_from_two_files_then_multiply_and_sum(filename1, filename2):
    arr1, arr2 = read_files(filename1, filename2)
    multiplied_arr = multiply_arrs(arr1, arr2)
    output = add_arr(multiplied_arr)
    return output
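
The talk does not show how `file1.txt` and `file2.txt` were produced. To reproduce the measurements, something like the sketch below could generate suitable inputs. The helper name `write_random_floats`, the number of values, the seeds, and the uniform distribution are all assumptions made for illustration, not the original data.

```python
import os
import random
import tempfile


def write_random_floats(filename, n, seed):
    # Hypothetical helper: writes n uniform random floats in [0, 1), one per line
    rng = random.Random(seed)
    with open(filename, 'w') as f:
        for _ in range(n):
            f.write(f"{rng.random()!r}\n")


# Written to a temporary directory here; point these at the working
# directory instead to use them with the cells in this notebook.
out_dir = tempfile.mkdtemp()
path1 = os.path.join(out_dir, 'file1.txt')
path2 = os.path.join(out_dir, 'file2.txt')
write_random_floats(path1, 1000, seed=1)
write_random_floats(path2, 1000, seed=2)
```

Increase `n` until the runtime is large enough to measure comfortably.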

In [4]:
%timeit read_from_two_files_then_multiply_and_sum('file1.txt', 'file2.txt')

989 ms ± 39.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [5]:
%lprun -f read_from_two_files_then_multiply_and_sum print(read_from_two_files_then_multiply_and_sum('file1.txt', 'file2.txt'))

249916.7908467609


So we see that read_files takes the majority of the time, which makes it the first thing to rewrite.
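
Without line_profiler at hand, the same question (which stage dominates?) can be answered roughly by timing each stage by hand. The sketch below uses only the standard library. The `timed` helper, the `timings` dictionary, and the small throwaway input files are assumptions for illustration.

```python
import os
import tempfile
import time

timings = {}


def read_floats(filename):
    # One float per line, as in read_files above
    with open(filename, 'r') as f:
        return [float(line) for line in f]


def timed(label, func, *args):
    # Hypothetical helper: run func(*args) and record elapsed wall-clock seconds
    start = time.perf_counter()
    result = func(*args)
    timings[label] = time.perf_counter() - start
    return result


# Small throwaway input files so the sketch runs on its own
tmp_dir = tempfile.mkdtemp()
paths = []
for name in ('file1.txt', 'file2.txt'):
    path = os.path.join(tmp_dir, name)
    with open(path, 'w') as f:
        f.write('\n'.join(str(i / 1000) for i in range(1000)) + '\n')
    paths.append(path)

arr1 = timed('read file1', read_floats, paths[0])
arr2 = timed('read file2', read_floats, paths[1])
products = timed('multiply', lambda a, b: [x * y for x, y in zip(a, b)], arr1, arr2)
total = timed('sum', sum, products)

# Most expensive stage first
for label, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:>10}: {seconds * 1e3:.3f} ms")
```

A single run like this is noisy, so treat it as a first look and confirm with `%timeit` or the line profiler.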

#### Introducing numpy

In [6]:
import numpy as np

In [7]:
# Rewriting to make this faster
def read_files(filename1, filename2):
    # Reading from the files; np.loadtxt splits on any whitespace,
    # so newline separated floats are read correctly
    arr1 = np.loadtxt(filename1, unpack=True)
    arr2 = np.loadtxt(filename2, unpack=True)
    assert len(arr1) == len(arr2) # Just to guarantee that the sizes are the same
    
    return arr1, arr2

In [8]:
%lprun -f read_from_two_files_then_multiply_and_sum print(read_from_two_files_then_multiply_and_sum('file1.txt', 'file2.txt'))

249916.7908467609


In [9]:
%timeit read_from_two_files_then_multiply_and_sum('file1.txt', 'file2.txt')

4.54 s ± 179 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


So, for the NumPy enthusiasts: NumPy can be slower than pure Python!
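
In the spirit of not assuming, the comparison is worth repeating on your own machine, since the outcome may well depend on the NumPy version installed (newer releases are reported to have a considerably faster `loadtxt`). The sketch below is one way to do that outside the notebook. The file contents and size are arbitrary assumptions.

```python
import os
import tempfile
import timeit

import numpy as np

# Throwaway file of newline-separated floats (size is an arbitrary assumption)
path = os.path.join(tempfile.mkdtemp(), 'values.txt')
with open(path, 'w') as f:
    f.write('\n'.join(repr(i / 7) for i in range(20000)) + '\n')


def read_pure_python(filename):
    with open(filename, 'r') as f:
        return [float(line) for line in f]


def read_loadtxt(filename):
    return np.loadtxt(filename)


# Both readers must agree before their timings are worth comparing
assert np.allclose(read_pure_python(path), read_loadtxt(path))

# Best of three repeats, three calls per repeat
t_python = min(timeit.repeat(lambda: read_pure_python(path), number=3, repeat=3))
t_numpy = min(timeit.repeat(lambda: read_loadtxt(path), number=3, repeat=3))
print(f"pure Python: {t_python:.4f} s, np.loadtxt: {t_numpy:.4f} s")
```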

#### Introducing Pandas

In [10]:
import pandas as pd

In [11]:
# Rewriting to make this faster
def read_files(filename1, filename2):
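    # Reading from the files, assuming newline separated floats
    # NOTE: read_csv treats the first line as a header by default, so the first
    # value of each file is dropped here (pass header=None to keep every row).
    # This is probably why the sum printed below differs slightly from earlier runs.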
    arr1 = np.array(pd.read_csv(filename1))
    arr2 = np.array(pd.read_csv(filename2))
    arr1 = arr1.T[0]
    arr2 = arr2.T[0]
    assert len(arr1) == len(arr2) # Just to guarantee that the sizes are the same
    
    return arr1, arr2

In [12]:
%lprun -f read_from_two_files_then_multiply_and_sum print(read_from_two_files_then_multiply_and_sum('file1.txt', 'file2.txt'))

249916.4665736269


In [13]:
%timeit read_from_two_files_then_multiply_and_sum('file1.txt', 'file2.txt')

668 ms ± 34.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
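
The printed sum changed slightly once pandas was introduced (249916.79... became 249916.46...). A likely cause is that `pd.read_csv` uses the first line of the file as column names unless told otherwise, so one value per file is silently dropped. The sketch below shows the difference on a made-up three-line input.

```python
import io

import pandas as pd

text = "0.5\n0.25\n0.125\n"  # three newline-separated floats

# Default: the first line becomes the column name, leaving two data rows
with_default = pd.read_csv(io.StringIO(text))

# header=None: every line is treated as data, giving three rows
with_no_header = pd.read_csv(io.StringIO(text), header=None)

print(len(with_default), len(with_no_header))  # prints: 2 3

values = with_no_header[0].to_numpy()  # column 0 as a 1-D NumPy array
```

Checking the output against the known-good result after every optimisation step catches this kind of slip early.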


Next, optimising the multiplication:

In [14]:
def multiply_arrs(arr1, arr2):
    return arr1*arr2

In [15]:
%lprun -f read_from_two_files_then_multiply_and_sum print(read_from_two_files_then_multiply_and_sum('file1.txt', 'file2.txt'))

249916.4665736269


In [16]:
%timeit read_from_two_files_then_multiply_and_sum('file1.txt', 'file2.txt')

464 ms ± 47.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


Optimising the addition:

In [17]:
def add_arr(out):
    return out.sum()

In [18]:
%lprun -f read_from_two_files_then_multiply_and_sum print(read_from_two_files_then_multiply_and_sum('file1.txt', 'file2.txt'))

249916.46657362505


In [19]:
%timeit read_from_two_files_then_multiply_and_sum('file1.txt', 'file2.txt')

380 ms ± 27.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
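
The multiply and sum steps together compute a dot product, so a possible further step is to let NumPy do both in one call with `np.dot`. The talk did not measure this variant, so whether it helps here is something to profile rather than assume. The function name `multiply_and_sum` and the small arrays are made up for illustration.

```python
import numpy as np


def multiply_and_sum(arr1, arr2):
    # Sum of element-wise products in a single call
    return np.dot(arr1, arr2)


a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Reference result from an explicit Python loop
loop_result = 0.0
for x, y in zip(a, b):
    loop_result += x * y

assert np.isclose(multiply_and_sum(a, b), loop_result)
assert np.isclose((a * b).sum(), loop_result)
```

The comparison uses `np.isclose` rather than `==` because changing the order of floating-point additions can change the last few digits, as the printed sums above already show after switching to `out.sum()`.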


Calling upon your unused CPU cores: read the two files in parallel, one of them in a separate thread.

In [20]:
import threading

In [23]:
def read_file_in_arr(arr, index, filename):
    # Store the result in arr[index], since a thread target cannot return a value
    arr[index] = np.array(pd.read_csv(filename)).T[0]

def read_files(filename1, filename2):
    # Reading from the files, assuming newline separated floats
    arr = [0,0]
    
    thread_read = threading.Thread(target = read_file_in_arr, args=(arr,0,filename1))
    thread_read.start()
    
    read_file_in_arr(arr,1,filename2)
    thread_read.join()
    
    assert len(arr[0]) == len(arr[1]) # Just to guarantee that the sizes are the same
    
    return arr[0], arr[1]

In [24]:
%lprun -f read_from_two_files_then_multiply_and_sum print(read_from_two_files_then_multiply_and_sum('file1.txt', 'file2.txt'))

249916.46657362505


In [25]:
%timeit read_from_two_files_then_multiply_and_sum('file1.txt', 'file2.txt')

212 ms ± 30.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
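
The same idea can be written with `concurrent.futures.ThreadPoolExecutor` from the standard library, which handles starting and joining the threads and returns the results directly. This is a sketch, not the version timed above: the `loader` parameter and the pure-Python stand-in `load_floats` are assumptions so the block runs without pandas. Threads only give a speed-up when the loader releases Python's global interpreter lock, which the timing above suggests the pandas reader does; the pure-Python stand-in will not.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def load_floats(filename):
    # Stand-in loader; swap in the pandas-based reader from the cell above
    with open(filename, 'r') as f:
        return [float(line) for line in f]


def read_files_threaded(filename1, filename2, loader=load_floats):
    # Each file is loaded in its own worker thread; map preserves input order
    with ThreadPoolExecutor(max_workers=2) as pool:
        arr1, arr2 = pool.map(loader, [filename1, filename2])
    assert len(arr1) == len(arr2)  # Just to guarantee that the sizes are the same
    return arr1, arr2


# Small throwaway input files so the sketch runs on its own
tmp_dir = tempfile.mkdtemp()
paths = []
for name, scale in (('file1.txt', 1.0), ('file2.txt', 2.0)):
    path = os.path.join(tmp_dir, name)
    with open(path, 'w') as f:
        f.write('\n'.join(repr(i * scale) for i in range(100)) + '\n')
    paths.append(path)

arr1, arr2 = read_files_threaded(*paths)
```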
