# Performance Comparison

It is difficult to produce a direct performance comparison between NumPy/SciPy and non-NumPy/SciPy implementations of an operation, because there are typically many ways of carrying out a calculation. This is particularly true when NumPy/SciPy are not being used and when the calculation is complex.

However, this notebook makes a few comparisons as an illustration of how much faster NumPy and SciPy can be. Where possible I have tried to use comparable, near-optimal implementations in plain Python, although better approaches may exist.

## Sequence Creation

We can create a few sequences (lists or NumPy arrays) and compare how long each takes to create. Because the native Python versions use simple built-in operations with very little interpreted Python code, their times can be comparable to NumPy's: in the run below the list of zeroes is within a factor of about 3 of the NumPy time, although the ascending sequence is more than 30 times slower. They also produce lists, which are less functional than NumPy arrays.

In [None]:
import time
import numpy as np

# Operations will be repeated a large number of times to get a good length of time
repetitions = 1000

start_time = time.time()
for i in range(repetitions):
  # Non-NumPy zeroes
  a = [0]*100000
print('Non-NumPy zeroes:', time.time() - start_time)

start_time = time.time()
for i in range(repetitions):
  # NumPy zeroes
  a = np.zeros(100000)
print('NumPy zeroes:', time.time() - start_time)

start_time = time.time()
for i in range(repetitions):
  # Non-NumPy ascending sequence
  a = list(range(100000))
print('Non-NumPy ascending sequence:', time.time() - start_time)

start_time = time.time()
for i in range(repetitions):
  # NumPy ascending sequence
  a = np.arange(100000)
print('NumPy ascending sequence:', time.time() - start_time)

Non-NumPy zeroes: 0.06846880912780762
NumPy zeroes: 0.02041029930114746
Non-NumPy ascending sequence: 0.7871990203857422
NumPy ascending sequence: 0.021414518356323242
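
The loops above time each operation with `time.time()`. As a hedged alternative, the standard-library `timeit` module can run the same statements repeatedly and report the best of several runs, which tends to be less sensitive to other activity on the machine. The sketch below assumes a helper name, `best_time`, and smaller repetition counts; neither comes from the original cell.

```python
import timeit

import numpy as np

SIZE = 100_000  # same sequence length as the cell above


def best_time(func, number=100, repeat=5):
    """Return the fastest total time (seconds) for `number` calls of func.

    `best_time` is a hypothetical helper, not part of the original notebook.
    """
    return min(timeit.repeat(func, number=number, repeat=repeat))


timings = {
    'Non-NumPy zeroes': best_time(lambda: [0] * SIZE),
    'NumPy zeroes': best_time(lambda: np.zeros(SIZE)),
    'Non-NumPy ascending sequence': best_time(lambda: list(range(SIZE))),
    'NumPy ascending sequence': best_time(lambda: np.arange(SIZE)),
}

for label, seconds in timings.items():
    print(f'{label}: {seconds:.6f}')
```

Taking the minimum of several repeats is a common convention for micro-benchmarks, since slower runs usually reflect interference rather than the cost of the code itself.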


## Dot Product

We can compare the time taken to calculate the dot product of two vectors. In the run below the NumPy version is roughly 90 times faster, and it is also easier to read.

In [5]:
import time
import numpy as np

# Repeat the operations a large number of times to get a good length of time
repetitions = 1000
  
# Create the list and NumPy array to be used
a = list(range(10000))
b = np.arange(10000)

start_time = time.time()
for i in range(repetitions):
  # Non-NumPy dot product (unpack each pair rather than reusing the loop variable i)
  c = sum(x * y for x, y in zip(a, a))
print('Non-NumPy dot product:', time.time() - start_time)

start_time = time.time()
for i in range(repetitions):
  # NumPy dot product
  c = np.dot(b, b)
print('NumPy dot product:', time.time() - start_time)

Non-NumPy dot product: 0.3973982334136963
NumPy dot product: 0.004291534423828125
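
Before comparing speed, it is worth confirming that both approaches return the same value. The following is a minimal sketch, assuming a shorter vector than the timed cell and an explicit `dtype=np.int64` (my assumption, to avoid integer overflow on platforms where NumPy's default integer is 32-bit).

```python
import numpy as np

n = 1000  # smaller than the timed example, purely for illustration

a = list(range(n))
b = np.arange(n, dtype=np.int64)  # explicit 64-bit integers (assumption)

# Pure-Python dot product: multiply element pairs and sum the products
python_dot = sum(x * y for x, y in zip(a, a))

# NumPy dot product: a single call into compiled code
numpy_dot = np.dot(b, b)

# Both should match the closed form for 0^2 + 1^2 + ... + (n-1)^2
closed_form = (n - 1) * n * (2 * n - 1) // 6

print(python_dot, int(numpy_dot), closed_form)
```

The pure-Python version uses arbitrary-precision integers, so it cannot overflow; the NumPy version works in fixed-width integers, which is part of why it is fast.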


## Log Function

We can compare the time taken to calculate the logarithm of a single value or of an array of values. For a single value, the function in the ```math``` module is faster because it is designed to operate on one number, whilst the ```numpy``` version is a more general function that works on arrays of data and carries extra overhead per call. For an array of values the position reverses: a single ```np.log``` call processes the whole array and is faster than applying ```math.log``` to each element of a list.

In [8]:
import time
import math
import numpy as np

# Repeat the operations a large number of times to get a good length of time
repetitions = 1000000
  
# Create the list and NumPy array to be used
a = list(range(1, 10000))
b = np.arange(1, 10000)

start_time = time.time()
for i in range(repetitions):
  # Non-NumPy single log
  c = math.log(2)
print('Non-NumPy single log:', time.time() - start_time)

start_time = time.time()
for i in range(repetitions):
  # NumPy single log
  c = np.log(2)
print('NumPy single log:', time.time() - start_time)

# Repeat the array operations a smaller number of times
repetitions = 1000

start_time = time.time()
for i in range(repetitions):
  # Non-NumPy log of list
  c = list(map(math.log, a))
print('Non-NumPy log of list:', time.time() - start_time)

start_time = time.time()
for i in range(repetitions):
  # NumPy log of array
  c = np.log(b)
print('NumPy log of array:', time.time() - start_time)

Non-NumPy single log: 0.051450252532958984
NumPy single log: 0.69820237159729
Non-NumPy log of list: 0.22849440574645996
NumPy log of array: 0.03303098678588867
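
The single-value result is easier to interpret with the return types in view. The sketch below uses illustrative values rather than those from the timed cell. It shows that `np.log` accepts a plain Python number but returns a NumPy scalar, and that the array call gives the same values as mapping `math.log` over a list.

```python
import math

import numpy as np

# Single value: same number, different return types
single_math = math.log(2)
single_numpy = np.log(2)
print(type(single_math).__name__, type(single_numpy).__name__)

# Sequence: one vectorised call versus one Python-level call per element
values = list(range(1, 11))
logs_python = list(map(math.log, values))
logs_numpy = np.log(np.array(values))

# The two approaches agree to floating-point tolerance
print(np.allclose(logs_python, logs_numpy))
```

A reasonable rule of thumb, consistent with the timings above, is to use `math` for isolated scalars and NumPy once the data is already in an array.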


## Sinc Function

The sinc function is defined as:

$f(x) = \frac{\sin(\pi x)}{\pi x}$

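The ```math``` module does not provide a sinc function, so we must write our own. The simple version below does not handle $x = 0$, where the function is defined as 1, but the test values avoid that case. Again, SciPy is slower for a single calculation but much quicker for a calculation on a sequence.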

In [11]:
import time
import math
from scipy.special import sinc
import numpy as np

# Define a non-SciPy sinc
def sinc_non_scipy(x):
  return math.sin(math.pi * x) / (math.pi * x)


# Create the non-NumPy list and NumPy array to be used
a = list(range(1, 1000))
b = np.arange(1, 1000)

# Repeat the operations a large number of times to get a good length of time
repetitions = 100000

start_time = time.time()
for i in range(repetitions):
  # Non-SciPy single sinc
  c = sinc_non_scipy(1)
print('Non-SciPy single sinc:', time.time() - start_time)

start_time = time.time()
for i in range(repetitions):
  # SciPy single sinc
  c = sinc(1)
print('SciPy single sinc:', time.time() - start_time)

# Repeat the array operations a smaller number of times
repetitions = 1000

start_time = time.time()
for i in range(repetitions):
  # Non-SciPy sinc of list
  c = list(map(sinc_non_scipy, a))
print('Non-SciPy sinc of list:', time.time() - start_time)

start_time = time.time()
for i in range(repetitions):
  # SciPy sinc of array
  c = sinc(b)
print('SciPy sinc of array:', time.time() - start_time)

Non-SciPy single sinc: 0.02415180206298828
SciPy single sinc: 0.3493537902832031
Non-SciPy sinc of list: 0.10430097579956055
SciPy sinc of array: 0.009147405624389648
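
The hand-written `sinc_non_scipy` above raises a `ZeroDivisionError` when `x` is 0, where the sinc function is conventionally defined as 1. The following is a minimal sketch of a guarded version, checked against `np.sinc` (NumPy's normalised sinc, used here in place of the SciPy import in the cell above). The name `sinc_safe` and the sample points are hypothetical.

```python
import math

import numpy as np


def sinc_safe(x):
    """Normalised sinc, sin(pi*x)/(pi*x), with the x == 0 case handled."""
    if x == 0:
        return 1.0  # limiting value of sin(pi*x)/(pi*x) as x -> 0
    return math.sin(math.pi * x) / (math.pi * x)


# Non-integer sample points, so the results are not all (near) zero
points = [0.0, 0.25, 0.5, 1.5, 2.75]

python_values = [sinc_safe(x) for x in points]
numpy_values = np.sinc(np.array(points))

# The two implementations agree to floating-point tolerance
print(np.allclose(python_values, numpy_values))
```

The timed cell evaluates sinc only at non-zero integers, where the true value is 0, so non-integer points like these give a more informative check of correctness.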
