# Performance Python

In physics, we want to do calculations either to analyze existing data or to model physical processes. Numerical methods courses very often use a single programming language to teach the methods for these calculations. The issue is that research groups use a variety of languages, and the specifics of a numerical technique do not always translate from one language to another.

In this short talk, we will take a look at an easy problem. We will use matrix multiplication as our sample problem to see what happens when you don't take the idiosyncrasies of your language into account.

## Naive code reuse

Many texts give code examples for various techniques in older, lower-level languages such as C or FORTRAN. These languages are statically typed and have no syntax that easily supports techniques like object-oriented programming. When you carry these numerical techniques over into more modern languages, you may see some unintended consequences.

We will start by creating our initial two matrices for the rest of the talk. We will use the numpy module to generate a couple of large matrices with random floating point numbers in them.

In [63]:
import numpy as np
rows = 200
cols = 200
A = np.random.random((rows,cols))
B = np.random.random((rows,cols))

From here, we might be tempted to just do the naive copy-paste of some matrix multiplication routine from C. When we do that, we might get something that looks like the following:

In [64]:
%%timeit
C = np.zeros((rows,cols))

i = 0
while (i < rows):
    j = 0
    while (j < cols):
        k = 0
        while (k < rows):
            C[i,j] += A[i,k] * B[k,j]
            k += 1
        j += 1
    i += 1


5.29 s ± 271 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [65]:
C[1,1]

8297.66274749814

Jupyter has built-in magic commands for timing code. This is the `%%timeit` at the top of the timed code cells in this talk.
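Outside of Jupyter, a similar measurement can be taken with the standard library's `timeit` module. The following is a minimal sketch, not the setup used for the timings in this talk: the helper name `matmul_loops` and the small matrix size `n = 20` are assumptions chosen so that the example runs quickly.

```python
import timeit

import numpy as np

# Small matrices so the sketch runs quickly; the talk uses 200 x 200.
n = 20
A = np.random.random((n, n))
B = np.random.random((n, n))


def matmul_loops(A, B):
    """Triple-loop matrix product (hypothetical helper for this sketch)."""
    rows, inner = A.shape
    cols = B.shape[1]
    C = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                C[i, j] += A[i, k] * B[k, j]
    return C


# Three separate runs, each calling the function once; keep the best one.
times = timeit.repeat(lambda: matmul_loops(A, B), number=1, repeat=3)
best = min(times)
print(f"best of {len(times)} runs: {best:.4f} s")
```

Whichever tool takes the measurement, the figure that matters here is the one reported for the triple loop above: several seconds for a single 200 x 200 product.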

This is far from performant. The issue with this type of code is the object model in Python. Core Python is a dynamically typed language: types belong to objects, not to variable names, so any name can refer to any type of object. Whenever you use a variable in your code, Python has to look up what the name refers to and whether the operation you want can be applied to that type of object. Python therefore repeats these checks on every iteration of the loops above.
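To make the point about names and types concrete, here is a small sketch. The function `add` and the variable `value` are hypothetical names used only for illustration.

```python
def add(x, y):
    # Nothing here says what x and y are. Python works out how to
    # perform "+" for their types each time this line runs.
    return x + y


print(add(1, 2))        # integers
print(add(1.5, 2.0))    # floats
print(add("ab", "cd"))  # strings: "+" means concatenation here

value = 3               # the name refers to an int ...
print(type(value).__name__)
value = "three"         # ... and now the same name refers to a str
print(type(value).__name__)
```

The same function body works for all three calls because the decision about what `+` means is made at run time, and that decision has to be made again on every pass through a loop.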

We can remove some of these checks by doing the loops in a more Pythonic way.

In [66]:
%%timeit
C = np.zeros((rows,cols))

for i in range(rows):
    for j in range(cols):
        for k in range(rows):
            C[i][j] += A[i][k] * B[k][j]


6.5 s ± 56.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [67]:
C[1,1]

8297.66274749814

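Going by the timings above, this did not actually speed things up (6.5 s versus 5.29 s for the `while` version), possibly because the chained indexing `C[i][j]` builds an intermediate row object on every access. Either way, we are nowhere near as fast as we could be.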
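The two loop versions also index their arrays differently: the first uses `C[i,j]`, the second `C[i][j]`. A minimal sketch of what the chained form does, using a small illustrative array rather than the matrices from the talk:

```python
import numpy as np

# Small illustrative array (not the matrices from the talk).
A = np.arange(12.0).reshape(3, 4)

# A[1][2] happens in two steps: first A[1] builds a 1-D view of row 1 ...
row = A[1]
print(type(row).__name__, row.shape)

# ... and only then is element 2 taken from that temporary row.
chained = row[2]

# A[1, 2] asks for the element directly, with no intermediate array.
direct = A[1, 2]

print(chained == direct)         # same value either way
print(np.shares_memory(row, A))  # the row is a view onto A, not a copy
```

Both forms return the same element, so the choice does not affect the result, only the amount of work done per access inside the innermost loop.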


## Using modules

A huge advantage of Python is the large environment of modules made available. We already 