Skip to content

The mle_fit method of the Laplace distribution is not maximum likelihood #1276

@WilfridSKendall

Description

@WilfridSKendall

Executive summary

The mle_fit method of the Laplace distribution is not actually a maximum likelihood fit, and is asymptotically biased.

The mle_fit method of the Laplace distribution from Distributions.jl (version 0.23.9) does not return the maximum likelihood estimate (MLE) of the scale parameter, and in fact produces a scale parameter which has bias which does not tend to zero as sample size tends to infinity. (The estimate of the location parameter is the maximum likelihood estimate.) The problem with scale parameter estimation can be addressed by a single simple change: replace the StatsBase.mad! call by applying StatsBase.mean to the vector of absolute deviations, resulting in a valid MLE for scale. Simulation shows that this removes the large-sample bias problem and reduces the mean square error.

Details

The current implementation of mle_fit for Laplace runs as follows:

using Distributions, StatsBase ;

function fit_mle_current(::Type{<:Laplace}, x::Array)
    xc = copy(x)
    a = StatsBase.median!(xc)
    Laplace(a, StatsBase.mad!(xc, center=a))
end ;

Up at least to v0.33.2, the module StatsBase(xc, center=a)) implements mad! as follows: it computes median absolute deviations of the vector xc from the prescribed center a (here actually the median value of xc) and then by default multiplies the result by η = 1 / quantile(Normal(), 3/4) ≈ 1.4826, in order to obtain a consistent estimator of the standard deviation under the assumption that the data is normally distributed. (My understanding is that in due course this default multiplication
will be removed.)

Thus mle_fit_current estimates the scale parameter by a multiple η of the median of absolute deviations from the location parameter estimate. For a large sample from the Laplace distribution, the median of absolute deviations from the location parameter estimate will converge to the "half-life" of the corresponding Exponential distribution. Consequently multiplication by η leads to non-zero asymptotic bias. To obtain an unbiased estimator of the scale parameter, this median would need to be multiplied by 1 / log(2) rather than by η.

Kotz, Kozubowski and Podgorski (2001; Section 2.6) show that the simultaneous MLE for location and scale is constructed from the median (for the location) and the mean of absolute deviations from the median (for the scale). Thus the current implementation of mle_fit for Laplace should be replaced by fit_mle_proposed as follows:

function fit_mle_proposed(::Type{<:Laplace}, x::Array)
    xc = copy(x)
    a = StatsBase.median!(xc)
    Laplace(a, mean(abs.(x.-a)))
end ;

I am rather new to Julia, but it seems to me that the proposed replacement would be advantageous. Would it be worth my developing an amendment to the relevant part of (a fork of) Distributions.jl and then submitting a pull request, or have I missed important points?

Some relevant numerics.

We compare the fit_mle_current and fit_mle_proposed methods empirically, using an instance of the Laplace distribution.

const µ, θ = 0.5, 2 ;
d = Laplace(µ, θ)  ;
nn = 1000000 ;

Comparative performance for a very large sample (n=1000000) demonstrates the superiority of the MLE method:

import Cairo, Fontconfig ;
using Gadfly ;
using DataFrames ;
using Printf ;
using Random ;
rng = MersenneTwister(1234) ;
Random.seed!(0) ; 

results = Dict() ;
for fit in [fit_mle_current, fit_mle_proposed]
    results[fit] = [scale(fit(typeof(d), rand(d, nn))) for i in 1:100]
    end ;

df1 = DataFrame(
    current   = results[fit_mle_current],
    proposed  = results[fit_mle_proposed],
    ) ;

df2 = stack(df1, 
    variable_name="Method for fit", value_name="Estimates of scale"
    ) ;

println("Five-number summaries for the two fits")
println("| Method    | Minimum   | 1st quart | Median    | 3rd quart | Maximum   |")
println("| --------- | --------- | --------- | --------- | --------- | --------- |")
for z in [("| Current   |", fit_mle_current), ("| Proposed  |", fit_mle_proposed)]
    r = results[z[2]]
    println(z[1], join([@sprintf(" %1.4f    |", x) for x in 
        [minimum(r), quantile(r, 0.25), median(r), quantile(r, 0.75), maximum(r)]]))
    end ;
Five-number summaries for the two fits
| Method    | Minimum   | 1st quart | Median    | 3rd quart | Maximum   |
| --------- | --------- | --------- | --------- | --------- | --------- |
| Current   | 2.0484    | 2.0539    | 2.0556    | 2.0570    | 2.0640    |
| Proposed  | 1.9951    | 1.9988    | 2.0000    | 2.0014    | 2.0045    |

Boxplots illustrate the marked superiority of the fit_mle_proposed method for this (very large) sample: the following Gadfly code produces boxplots which do not overlap: the boxplot derived using the proposed method is tighter and covers the target value of the scale parameter,

figure3p1 = layer(df2,
    yintercept = [scale(d)],
    Geom.hline(color = "red")
    ) ;

figure3p2 = layer(df2, 
    x = "Method for fit", y = "Estimates of scale", 
    Geom.boxplot
    ) ;

plot(figure3p1, figure3p2)

laplace-bug

Further numerical experiments on a corrected version of fit_mle_current, multiplying the median of absolute deviations by 1 / log(2), show (as theory would predict) that this is comparable to the fit_mle_proposed method in terms of bias, but delivers a slightly larger mean square error. In practice one might prefer to use the corrected version of fit_mle_current rather than fit_mle_proposed, because the corrected version of fit_mle_current would offer protection against outlier effects. However I believe that this would lie beyond the scope of Distributions.jl.

References

Kotz, Samuel, Tomaz J Kozubowski, and Krzysztof Podgórski. 2001. The Laplace Distribution and Generalizations. Boston, MA: Birkhäuser Boston. https://doi.org/10.1007/978-1-4612-0173-1.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions