It's useful to have a *rough* idea of the speed of arithmetic operations.  Let's do some *numpy* timings using the *timeit* feature.  We'll give it some larger arrays to work with so that we use system resources adequately.

The usual preamble

In [1]:
import numpy as np
from math import pow
from math import log
from math import exp
from math import fsum
import sys
from scipy.stats import gmean as gm
import random

In [1]:
%%timeit a=np.random.rand(1000);b=np.random.rand(1000); 
np.add(a,b)

NameError: global name 'np' is not defined

In [4]:
%%timeit a=np.random.rand(1000);b=np.random.rand(1000);
np.multiply(a,b)

The slowest run took 16.00 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 879 ns per loop


In [5]:
%%timeit a=np.random.rand(1000);b=np.random.rand(1000)
a>b

The slowest run took 18.10 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 724 ns per loop


In [6]:
%%timeit a=np.random.rand(1000);b=np.random.rand(1000);
c=np.divide(a,b)

The slowest run took 5.08 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.77 µs per loop


In [8]:
%%timeit a=np.random.rand(1000);b=np.random.rand(1000);
c=np.power(a,b)


The slowest run took 5.62 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 20.8 µs per loop


In [9]:
%%timeit a=np.random.rand(1000)
c=np.log(a)

100000 loops, best of 3: 8.69 µs per loop
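
The timings above rely on the IPython `%%timeit` cell magic. Outside a notebook, roughly the same measurements can be made with the standard-library `timeit` module. The following is a minimal sketch, not the method used to produce the numbers above: the helper name `best_time_us` and the repeat counts are assumptions chosen for illustration, and wrapping each operation in a `lambda` adds a small call overhead to every measurement.

```python
import timeit
import numpy as np

def best_time_us(func, repeat=3, number=1000):
    """Hypothetical helper: best time per call in microseconds."""
    # timeit.repeat returns one total time (seconds) per repetition
    times = timeit.repeat(func, repeat=repeat, number=number)
    return min(times) / number * 1e6

a = np.random.rand(1000)
b = np.random.rand(1000)

# Same operations as the cells above, timed on the same pair of arrays
results = {
    "add": best_time_us(lambda: np.add(a, b)),
    "multiply": best_time_us(lambda: np.multiply(a, b)),
    "greater": best_time_us(lambda: a > b),
    "divide": best_time_us(lambda: np.divide(a, b)),
    "power": best_time_us(lambda: np.power(a, b)),
    "log": best_time_us(lambda: np.log(a)),
}

for name, t in results.items():
    print("%-8s %8.2f us per call" % (name, t))
```

The absolute numbers depend on the machine and the numpy build, so only the relative ordering is worth comparing with the cells above.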


Let's look at a more interesting timing.  We compare two equivalent expressions for the **geometric mean**:
$$
\left(\prod_{i=1}^n x_i \right)^{1/n}\quad \mathrm{and} \quad \prod_{i=1}^n x_i^{\frac{1}{n}} $$
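
Before timing them, it may help to confirm on a tiny input that the two expressions agree. This is a small sketch with hand-picked values (an assumption made for illustration, not data from the timings): for $x = (1, 2, 4, 8)$ the product is $64$, so both forms should give $64^{1/4} = 2^{1.5}$.

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 8.0])
b = 1.0 / x.size

# Multiply first, then take one n-th root: (1*2*4*8)^(1/4)
mult_first = np.power(np.prod(x), b)

# Take the n-th root of every element, then multiply
power_first = np.prod(np.power(x, b))

print("Direct Multiply: %.12f" % mult_first)
print("Distributing the Power: %.12f" % power_first)
```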


In [10]:
%%timeit numbers=1000;x=np.random.rand(numbers);b=1.0/numbers
np.power(np.prod(x),b)

The slowest run took 10.72 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 6.54 µs per loop


We expect the second expression to be much slower: it makes $n$ *power* calls and $n-1$ multiplies, whereas the first makes $n-1$ multiplies and a single *power* call.

In [11]:
%%timeit numbers=1000;x=np.random.rand(numbers);b=1.0/numbers
np.prod(np.power(x,b))


10000 loops, best of 3: 28.4 µs per loop


But what about accuracy?  Let's actually compute them.

In [12]:
numbers=1000
x=np.random.rand(numbers)
b=1.0/numbers
mult_first=np.power(np.prod(x),b)
power_first=np.prod(np.power(x,b))

print 'Direct Multiply:',mult_first
print 'Distributing the Power:',power_first

Direct Multiply: 0.0
Distributing the Power: 0.379101060529
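
A result of exactly 0.0 for the direct multiply is consistent with floating-point underflow. For values drawn uniformly from $(0,1)$ the expected value of $\ln x_i$ is $-1$, so the product of 1000 of them is around $e^{-1000}$, far below the smallest positive double. The sketch below illustrates this with a seeded generator; the seed and the use of `np.random.default_rng` (which needs a newer numpy than the cells above appear to use) are assumptions made so the example is reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only so the sketch is reproducible
x = rng.random(1000)

# Natural log of the true product, computed safely as a sum of logs
log_product = np.sum(np.log(x))

# Natural log of the smallest positive (subnormal) double
log_smallest = np.log(np.nextafter(0.0, 1.0))

product = np.prod(x)

print("log of true product:     %.2f" % log_product)
print("log of smallest double:  %.2f" % log_smallest)
print("np.prod(x):              %r" % float(product))
```

Because the log of the true product lies well below the log of the smallest representable double, the running product rounds to zero partway through and stays there.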


Can we do better?  Recall that  
$$ \log( x_1 x_2 x_3 x_4 \ldots x_n) = \log x_1 + \log x_2 + \ldots + \log x_n$$
so 
$$ \left( \prod_{i=1}^n x_i \right)^{\frac{1}{n}}= \exp\left(\frac{1}{n}\sum_{i=1}^n \log x_i \right) $$
That is, the geometric mean of a set of positive numbers is the exponential of the *mean* of their logarithms.

In [14]:
%%timeit numbers=1000;x=np.random.rand(numbers);b=1.0/numbers
np.exp(np.mean(np.log(x)))

The slowest run took 5.42 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 21 µs per loop


What about accuracy?

In [13]:
log_mult=np.exp(np.mean(np.log(x)))

print "Logarithmic Addition:",log_mult

Logarithmic Addition: 0.379101060529
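
The preamble imports `fsum` from `math`, although none of the cells use it. One possible use is as a cross-check: `fsum` tracks the rounding error while summing, so it gives an independent value for the sum of logarithms. The following is a sketch under that assumption, using a seeded array rather than the `x` from the cells above.

```python
import numpy as np
from math import exp, fsum

rng = np.random.default_rng(0)   # seeded only so the sketch is reproducible
x = rng.random(1000)

logs = np.log(x)

# Log strategy as in the cell above
log_mean = np.exp(np.mean(logs))

# Same formula, with the sum of logs computed by math.fsum
log_fsum = exp(fsum(logs) / x.size)

print("np.mean of logs: %.15f" % log_mean)
print("fsum of logs:    %.15f" % log_fsum)
print("difference:      %.3e" % abs(log_mean - log_fsum))
```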


Can we do even better?  Yes. The weakness of the direct multiply (the fastest method) is the risk of underflow/overflow in the product, which won't (typically) happen for small values of $n$.  Here we multiply within **chunks** of 10 numbers and apply the log strategy to the 100 chunk products.

In [16]:
%%timeit x=np.random.rand(100,10);b=1/1000.0
np.exp(np.sum(np.log(np.prod(x,axis=1)))*b)

The slowest run took 6.50 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 12 µs per loop


And the result?

In [15]:
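# View the 1000 values in x as 100 chunks (rows) of 10 numbers each
y=x.reshape(100,10)
# Multiply within each chunk, then apply the log strategy to the 100 products (b=1/1000)
chunk_log=np.exp(np.sum(np.log(np.prod(y,axis=1)))*b)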

print('With chunking: ',chunk_log)

('With chunking: ', 0.37910106052885856)
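
The chunking idea can be wrapped in a function that accepts any array length. The sketch below is one way to do it, not the notebook's own code: the name `chunked_gmean` and the default chunk size of 10 are assumptions, and the input is padded with ones (which leave each product unchanged) when its length is not a multiple of the chunk size.

```python
import numpy as np

def chunked_gmean(x, chunk=10):
    """Hypothetical helper: geometric mean via products over small chunks."""
    x = np.asarray(x, dtype=float).ravel()
    n = x.size
    # Pad with ones so the length is a multiple of the chunk size
    pad = (-n) % chunk
    if pad:
        x = np.concatenate([x, np.ones(pad)])
    # One product per chunk, then the log strategy on those products
    partial = np.prod(x.reshape(-1, chunk), axis=1)
    # Divide by the original count n, not the padded length
    return np.exp(np.sum(np.log(partial)) / n)

rng = np.random.default_rng(0)   # seeded only so the sketch is reproducible
x = rng.random(1003)             # deliberately not a multiple of 10

print("With chunking:        %.12f" % chunked_gmean(x))
print("Logarithmic Addition: %.12f" % np.exp(np.mean(np.log(x))))
```

The chunk size trades speed against safety: larger chunks mean fewer `log` calls, but each chunk product comes closer to underflowing or overflowing.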
