 ### Geometric mean

The _geometric mean_ $GM$ of nonnegative numbers $x_1, x_2, \dots, x_n$ is the $n$th root of their product. Specifically,
$$
   GM(x_1, x_2, \dots, x_n) = \left( \prod_{k=1}^n x_k \right)^{1/n}
$$

One property of the geometric mean is the identity $GM(x,x) = x$, which holds for all nonnegative numbers $x$.

Another property of the geometric mean is that if any one of the numbers $x_1, x_2, \dots, x_n$ is zero, then the geometric mean of these numbers is zero.
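
Both properties follow directly from the definition. As a short worked check for $x \ge 0$, and for a list in which some entry $x_j$ is zero:
$$
   GM(x,x) = (x \cdot x)^{1/2} = \sqrt{x^2} = |x| = x,
   \qquad
   x_j = 0 \implies \prod_{k=1}^n x_k = 0 \implies GM(x_1, x_2, \dots, x_n) = 0^{1/n} = 0
$$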

Our first attempt at a Julia function for the geometric mean is little more than a direct translation of the definition.
The Julia function `prod` returns the product of the members of an array.

In [None]:
"""
    geometric_mean(a::Array)

Compute the geometric mean of the elements in the input array `a`. When the array `a` is empty or when
the array contains a negative entry, throw an ArgumentError.
"""
function geometric_mean(a::Array)
    isempty(a) && throw(ArgumentError("Input to `geometric_mean` must not be empty."))
    all(x -> x >= 0, a) || throw(ArgumentError("Input to `geometric_mean` must contain only nonnegative values."))
    prod(a)^(1/length(a))
end

A few simple tests show that our function works as expected, including the case of a zero entry and the two error cases:

In [None]:
(geometric_mean([4,45]), sqrt(4*45))

In [None]:
(geometric_mean([1,2,3,4]), (1*2*3*4)^(1/4))

In [None]:
(geometric_mean([0,1,2,3,4]), (0*1*2*3*4)^(1/5))

In [None]:
geometric_mean([])

In [None]:
geometric_mean([-1])

But a bit more testing shows that the product can overflow, giving a wrong result; for example:

In [None]:
geometric_mean([1.0e155, 1.0e155])

The product $10^{310}$ exceeds the largest `Float64` value (about $1.8 \times 10^{308}$), so `prod` returns `Inf`, and the result violates the identity $GM(x,x) = x$. We really should do better. A typical way to fix this overflow
problem is to use the fact that the logarithm of a product of positive numbers is the sum of the logarithms. This
gives an alternative formula for the geometric mean:
$$
   GM(x_1, x_2, \dots, x_n) = \exp\left( \frac{1}{n} \sum_{k=1}^n \ln(x_k) \right)
$$
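
For positive $x_1, x_2, \dots, x_n$, the step behind this formula can be written out by taking the logarithm of the definition and then exponentiating:
$$
   \ln GM(x_1, x_2, \dots, x_n) = \frac{1}{n} \ln\left( \prod_{k=1}^n x_k \right) = \frac{1}{n} \sum_{k=1}^n \ln(x_k)
$$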
A simple implementation of this method is:

In [None]:
"""
    geometric_mean_log(a)

Compute the geometric mean of the elements in the input array `a` using a logarithmic transformation to avoid overflow. When the 
array `a` is empty or when the array contains a negative entry, throw an ArgumentError.
"""
function geometric_mean_log(a)
    isempty(a) && throw(ArgumentError("Input to `geometric_mean_log` must not be empty."))
    all(x -> x >= 0, a) || throw(ArgumentError("Input to `geometric_mean_log` must contain only nonnegative values."))
    exp(sum(map(log,a))/length(a))
end;

In Julia, `log` is the natural logarithm, which mathematics texts often write as $\ln$. The Julia function `map` applies a function to each member of an array, and the Julia function `sum` adds the members of an array. Finally,
`exp` is the natural exponential function.

In [None]:
(geometric_mean_log([4,45]), sqrt(4*45))

In [None]:
(geometric_mean_log([1,2,3,4]), (1*2*3*4)^(1/4))

We have resolved the overflow problem, but arguably our function isn't as accurate as it might be; the result need not be exactly `1.0e155`:

In [None]:
geometric_mean_log([1.0e155, 1.0e155])
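
One plausible source of the inaccuracy, sketched roughly: the logarithm of `1.0e155` is about 357, and `Float64` values near 357 are spaced about $5.7 \times 10^{-14}$ apart. Rounding the logarithm therefore introduces an absolute error of up to that order, which `exp` turns into a relative error of similar size. The variable names below are illustrative only.

```julia
x = 1.0e155
y = log(x)                # about 356.9, rounded to the nearest Float64
ulp = eps(y)              # spacing of Float64 values near y
approx = exp((y + y)/2)   # what the log-based formula computes for [x, x]
rel_err = abs(approx - x)/x
(y, ulp, rel_err)
```

The relative error is tiny, but it can be nonzero, which is why the identity $GM(x,x) = x$ may hold only approximately for the log-based version.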

Because $\log(0)$ is undefined, you might think that `geometric_mean_log` misbehaves when one or more arguments are zero, but it doesn't.

In [None]:
geometric_mean_log([0,1,2,3])

To see what happens, we can work through the calculation one step at a time:

In [None]:
x = map(log,[0,1,2,3])

In [None]:
x = sum(x)

In [None]:
x = x/4

In [None]:
exp(x)

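In Julia, `log(0)` returns `-Inf` and `exp(-Inf)` returns `0.0`, so the sum of the logarithms is `-Inf` and the final result is zero, as it should be.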

For an array with many elements, computing a logarithm for every element is fairly expensive. Can we avoid overflow without using the
logarithm trick? Sure. We'll loop through the array members, and after each partial product we'll extract
the binary exponent and the significand of the partial product. We keep only the significand (a number in $[1,2)$) as the running product, and we keep a running sum of the exponents as a 64-bit integer.
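
As a quick illustration of the two Julia functions this approach relies on, the sketch below splits a `Float64` into its significand and binary exponent and then reassembles it with `ldexp`. The variable names are illustrative only.

```julia
# Decompose a Float64 as x == m * 2^p with m in [1, 2).
x = 1.0e155
m = significand(x)            # fractional part, in [1, 2) for finite nonzero x
p = exponent(x)               # integer p with 2^p <= |x| < 2^(p+1)
reconstructed = ldexp(m, p)   # m * 2^p; scaling by a power of two is exact here
(m, p, reconstructed == x)
```

Because the running product is reset to its significand after every multiplication, it stays close to 1, while the magnitude is carried separately in the integer exponent sum.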

In [None]:
"""
    geometric_mean(a::Array)

Compute the geometric mean of the elements in the input array `a`. When the array `a` is empty or when the array contains a negative entry,
throw an ArgumentError.
"""
function geometric_mean(a::Array)
    isempty(a) && throw(ArgumentError("Input to `geometric_mean` must not be empty."))
    all(x -> x >= 0, a) || throw(ArgumentError("Input to `geometric_mean` must contain only nonnegative values."))
    e = 0                         # running sum of the binary exponents
    s = float(one(eltype(a)))     # running significand, kept in [1, 2)
    for x in a
        iszero(x) && return zero(s)   # `exponent` throws a DomainError for zero
        s *= x
        e += exponent(s)
        s = significand(s)
    end
    n = length(a)
    2^(e/n)*s^(1/n)
end

Two simple checks for overflow:

In [None]:
geometric_mean([1.0e155, 1.0e155])

In [None]:
geometric_mean([1.0e308, 1.0e308, 1.0e308])

The StatsBase package provides `geomean`, its own function for the geometric mean. To use it, we first need the package manager to download and install the package. Once that has been done, we only need to load it (along with BenchmarkTools, which we use for timing):
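
The one-time installation step might look like the following sketch. It assumes network access and the default package environment; the package names are the ones loaded in the next cell.

```julia
# One-time setup: download and install the packages used in the cells that follow.
using Pkg
Pkg.add(["StatsBase", "BenchmarkTools"])
```

After this has been run once, later sessions can skip it and go straight to loading the packages.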

In [None]:
using StatsBase, BenchmarkTools

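Let's run a race between our function and `geomean` on a ten-million-element array of random numbers. Julia has a just-in-time compiler, so the first call to a function is slow because the code has to be compiled; after that, it should be fast. The `@btime` macro from BenchmarkTools runs the expression repeatedly and reports the minimum time, so the compilation cost does not distort the comparison.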

In [None]:
L = rand(Float64,10^7);

In [None]:
@btime geometric_mean($L)   # `$L` interpolates the global so its lookup is not timed

In [None]:
@btime geomean($L)   # `$L` interpolates the global so its lookup is not timed