# Using Python and NumPy more efficiently

As with any programming language, there are more and less efficient ways to write code that has the same functional behavior.  In Python, it can be particularly jarring that `for` loops carry a relatively high per-iteration cost.  For simple `for` loops, alternative approaches in plain Python can be both faster and easier to read.  For numerical calculations, NumPy provides additional capabilities that can dramatically improve performance.

In [126]:
# Math libraries
import math
import numpy as np

In [99]:
# Create a convenience function for using the Python `timeit` module
import timeit

def ms_from_timeit(function_as_string, argument_as_string, runs=100, repeat=10):
    """Returns the milliseconds per function call"""
    timer = timeit.Timer(function_as_string+'('+argument_as_string+')',
                         setup='from __main__ import '+function_as_string+', '+argument_as_string)
    return min(timer.repeat(repeat, runs)) / runs * 1000
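
The helper above builds its statement from strings and imports the names from `__main__`. As a minimal sketch of an alternative, assuming it is acceptable to pass the function object directly, `timeit.Timer` also accepts a zero-argument callable. The name `ms_from_callable` is hypothetical and is not used elsewhere in this notebook.

```python
import math
import timeit

def ms_from_callable(func, *args, runs=100, repeat=10):
    """Return the best-case milliseconds per call of func(*args)."""
    # timeit.Timer accepts a zero-argument callable as the statement to time.
    timer = timeit.Timer(lambda: func(*args))
    # repeat() returns the total seconds for each batch of `runs` calls;
    # the minimum is the batch least disturbed by other activity.
    return min(timer.repeat(repeat=repeat, number=runs)) / runs * 1000

elapsed_ms = ms_from_callable(math.sqrt, 2.0)
print("math.sqrt takes {0:.6f} ms per call".format(elapsed_ms))
```

The extra `lambda` adds a small, constant call overhead to every measurement, so this variant is better suited to comparing functions than to measuring very short calls in absolute terms.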

## Calling a function on 10,000 values

Let's start with a simple task: calculate the square root of 10,000 randomly generated values.

In [59]:
# Create a list of 10000 random floats in [0, 1)
import random
random_list = [random.random() for i in range(10000)]

### Using a `for` loop
A simple implementation is to use a `for` loop to step through the input list and append each square-root value to an output list.

In [102]:
def sqrt_python_loop(python_list):
    result = []
    for value in python_list:
        result.append(math.sqrt(value))
    return result

print("Using a Python loop takes {0:5.3f} ms".format(ms_from_timeit('sqrt_python_loop', 'random_list')))

Using a Python loop takes 4.030 ms


### Using list comprehension

For `for` loops that only operate element by element, a list comprehension gives a noticeable performance boost: about 2.7 ms here, compared with about 4.0 ms for the explicit loop.

In [101]:
def sqrt_python_list_comprehension(python_list):
    result = [math.sqrt(value) for value in python_list]
    return result

print("Using Python list comprehension takes {0:5.3f} ms".format(ms_from_timeit('sqrt_python_list_comprehension', 'random_list')))

Using Python list comprehension takes 2.732 ms


### Using `map`

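The built-in function `map` can be faster still, although it may be less readable than a list comprehension.  In Python 3, `map` returns a lazy iterator, so the result is wrapped in `list` to make sure the square roots are actually computed inside the timed call.

In [100]:
def sqrt_python_map(python_list):
    result = list(map(math.sqrt, python_list))
    return result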

print("Using Python map takes {0:5.3f} ms".format(ms_from_timeit('sqrt_python_map', 'random_list')))

Using Python map takes 1.995 ms
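
Before comparing timings, it is worth confirming that the three plain-Python approaches really are interchangeable. The following is a small self-contained check, using a shorter list than the benchmark (the variable names are chosen only for this illustration).

```python
import math
import random

values = [random.random() for _ in range(1000)]

# Explicit loop with append
loop_result = []
for value in values:
    loop_result.append(math.sqrt(value))

# List comprehension
comprehension_result = [math.sqrt(value) for value in values]

# In Python 3, map returns a lazy iterator, so list() forces evaluation
map_result = list(map(math.sqrt, values))

# All three call math.sqrt on the same inputs in the same order
print(loop_result == comprehension_result == map_result)  # True
```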


## Calling a numerical function on 10,000 numbers

The examples above carry significant overhead because they stick to "vanilla" Python, where every value is handled as a separate Python object.  For numerical calculations, use NumPy.

In [124]:
# Create a NumPy ndarray equivalent for the same list of random floats
random_ndarray = np.array(random_list)

### Using NumPy incorrectly

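While NumPy is quite powerful, it's entirely possible to use it sub-optimally.  The following example sticks with `map`: calling `np.sqrt` once per element and then converting the results back into an ndarray adds overhead that completely dominates the run time.  (In Python 3, the `map` iterator must be passed through `list` first; otherwise `np.array` wraps the iterator object itself instead of its values.)

In [123]:
def sqrt_numpy_map(numpy_array):
    result = np.array(list(map(np.sqrt, numpy_array)))
    return result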

print("Using NumPy with map takes {0:5.3f} ms".format(ms_from_timeit('sqrt_numpy_map', 'random_ndarray')))

Using NumPy with map takes 18.031 ms
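
One way to see where the time goes is to look at what `map` actually hands to `np.sqrt`. The sketch below uses a three-element array (an illustrative stand-in for `random_ndarray`) to show that iteration produces individual NumPy scalars, each of which goes through a separate `np.sqrt` call before the results are copied into a new array.

```python
import numpy as np

small_array = np.array([0.25, 1.0, 4.0])

# Iterating over a 1-D float array yields NumPy scalars one at a time
first_element = next(iter(small_array))
print(type(first_element).__name__)  # float64

# Each np.sqrt call handles a single scalar, so the call overhead
# is paid once per element rather than once per array
per_element = [np.sqrt(value) for value in small_array]

# Building an ndarray from the list of scalars is one more copy
rebuilt = np.array(per_element)
print(rebuilt.dtype, rebuilt.shape)
```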


### Using NumPy correctly

Most of NumPy's functions are already designed to act element-wise on NumPy arrays, so there's actually no need to use `map`.

In [103]:
def sqrt_numpy_ufunc(numpy_array):
    result = np.sqrt(numpy_array)
    return result

print("Using NumPy universal function takes {0:5.3f} ms".format(ms_from_timeit('sqrt_numpy_ufunc', 'random_ndarray')))

Using NumPy universal function takes 0.062 ms
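
As a quick sanity check on a tiny input, the sketch below confirms that a single `np.sqrt` call matches the element-by-element `math.sqrt` results and preserves the shape of its input, including for a two-dimensional array. The sample values are arbitrary.

```python
import math
import numpy as np

values = [0.0, 0.25, 1.0, 2.0, 9.0, 16.0]
array_2d = np.array(values).reshape(2, 3)

# One call operates on every element and keeps the input shape
vectorized = np.sqrt(array_2d)

# Reference values computed one element at a time in plain Python
reference = np.array([math.sqrt(v) for v in values]).reshape(2, 3)

print(vectorized.shape == array_2d.shape)   # True
print(np.allclose(vectorized, reference))   # True
```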


## Using NumPy on two-dimensional arrays

Many NumPy reductions, including `np.std`, accept an `axis` argument that applies the operation along one dimension of an array.  The two cells below compare an explicit Python loop over slices of the array with a single call that uses `axis`.

In [94]:
# Create a 2D NumPy ndarray from the same list of random floats
random_ndarray_2d = np.array(random_list).reshape(100, 100)

In [121]:
def std_1d(numpy_2d_array):
    # Standard deviation of each column, computed one column at a time
    result = np.zeros(numpy_2d_array.shape[1])
    for index in range(numpy_2d_array.shape[1]):
        result[index] = np.std(numpy_2d_array[:, index])
    return result

print("Using NumPy avoiding `axis` takes {0:5.3f} ms".format(ms_from_timeit('std_1d', 'random_ndarray_2d')))

Using NumPy avoiding `axis` takes 4.915 ms


In [122]:
def std_1d_axis(numpy_2d_array):
    result = np.std(numpy_2d_array, axis=0)
    return result

print("Using NumPy using `axis` takes {0:5.3f} ms".format(ms_from_timeit('std_1d_axis', 'random_ndarray_2d')))

Using NumPy using `axis` takes 0.133 ms
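
Because the benchmark array is square, it is easy to lose track of which dimension `axis` refers to. The following sketch uses a small non-square array (values chosen only for illustration) to show that `axis=0` produces one result per column and `axis=1` one result per row, and that each matches the corresponding explicit loop.

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [5.0, 6.0, 11.0]])

column_std = np.std(data, axis=0)  # collapses the rows: one value per column
row_std = np.std(data, axis=1)     # collapses the columns: one value per row

# Equivalent explicit loops over columns and rows
column_loop = np.array([np.std(data[:, j]) for j in range(data.shape[1])])
row_loop = np.array([np.std(data[i, :]) for i in range(data.shape[0])])

print(column_std.shape, row_std.shape)       # (3,) (2,)
print(np.allclose(column_std, column_loop))  # True
print(np.allclose(row_std, row_loop))        # True
```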
