## This demo shows how the same operation can have drastically different execution times depending on the NumPy floating-point precision used.

The following data was generated on an Intel Core i5-6500 3.20 GHz quad-core processor with AVX2. The `@` symbol is the NumPy matrix multiplication operator. 
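As a quick illustration of the `@` operator mentioned above, the sketch below checks that `a @ b` gives the same result as calling `np.matmul(a, b)` directly, and that the result keeps the dtype of its inputs. The 4x4 size, the seed, and the variable names are arbitrary choices made for this example and are not part of the original benchmark.

```python
import numpy as np

# Small arbitrary matrices; the seed only makes the example repeatable.
rng = np.random.default_rng(0)
a = rng.random((4, 4))
b = rng.random((4, 4))

# For NumPy arrays, the @ operator performs matrix multiplication,
# the same operation as np.matmul.
product_operator = a @ b
product_function = np.matmul(a, b)
same = np.allclose(product_operator, product_function)

# The result dtype follows the input dtype, so float16 inputs
# produce a float16 product.
dtype16 = (a.astype(np.float16) @ b.astype(np.float16)).dtype

print(same, dtype16)
```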

In [1]:
import numpy as np
n = 500

In [2]:
a = np.random.random((n,n)).astype(dtype=np.float16)
b = np.random.random((n,n)).astype(dtype=np.float16)

float16_time = %timeit -o a @ b

423 ms ± 6.88 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [3]:
a = np.random.random((n,n)).astype(dtype=np.float32)
b = np.random.random((n,n)).astype(dtype=np.float32)

float32_time = %timeit -o a @ b

1.23 ms ± 238 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [4]:
a = np.random.random((n,n)).astype(dtype=np.float64)
b = np.random.random((n,n)).astype(dtype=np.float64)

float64_time = %timeit -o a @ b

1.75 ms ± 112 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [5]:
print('float32 is {0:.1f} times faster than float16'.format(float16_time.average/float32_time.average))
print('float64 is {0:.1f} times faster than float16'.format(float16_time.average/float64_time.average))
print('float32 is {0:.1f} times faster than float64'.format(float64_time.average/float32_time.average))

float32 is 343.6 times faster than float16
float64 is 241.5 times faster than float16
float32 is 1.4 times faster than float64
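
The `%timeit` magic used above is only available inside IPython or Jupyter. To repeat the comparison from a plain Python script, one option is the standard-library `timeit` module. The following is a minimal sketch under stated assumptions: the helper `mean_matmul_time`, the matrix size `n=200`, and the repeat count are hypothetical choices made to keep the run short, and the timings it prints will differ from the figures above depending on the machine and the NumPy build.

```python
import timeit

import numpy as np


def mean_matmul_time(dtype, n=200, repeats=5):
    """Return the mean wall-clock seconds for one n-by-n `a @ b` in `dtype`.

    Hypothetical helper for this sketch; not part of the original notebook.
    """
    rng = np.random.default_rng(0)
    a = rng.random((n, n)).astype(dtype)
    b = rng.random((n, n)).astype(dtype)
    a @ b  # one untimed warm-up call
    total = timeit.timeit(lambda: a @ b, number=repeats)
    return total / repeats


# Time the same multiplication at each precision.
times = {
    np.dtype(dt).name: mean_matmul_time(dt)
    for dt in (np.float16, np.float32, np.float64)
}

for name, seconds in times.items():
    print(f"{name}: {seconds * 1e3:.3f} ms per multiplication")

# A ratio above 1 means float16 took longer than float32.
print(f"float16 / float32 time ratio: {times['float16'] / times['float32']:.1f}")
```

Dividing the slower time by the faster one, as in the last line, is the same arithmetic the notebook uses for its speed-up figures.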
