Fast Computation of Normal CDF
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
src
README.Rmd
README.md
test.cpp

README.md

Fast Normal CDF

Introduction

The standard normal CDF Φ(x) is an important function in a broad range of statistical problems. When we need to evaluate the function many times (for example in numerical integration), the computation performance may become an issue.

One way to fast evaluate the function is to use a look-up table, that is, we pre-compute a set of pairs (x[i], Φ(x[i])) and then use interpolation to approximate the function value of a given x.

This simple library calculates the Φ(x) function using piecewise linear interpolation. The approximation error is guaranteed to be no greater than ε = 1e-7.

Algorithm

We need to first determine the knots x[i] that we want to pre-compute. Since Φ(-x) = 1 - Φ(x), we only need to consider non-negative x[i]'s.

For x > Φ^(-1)(1 - 1e-7) = 5.199338, we set Φ(x) = 1 and hence the error is bounded by ε. Let x[0] = 0, x[i] = i * h, i = 0, 1, ..., N, where N is the smallest integer such that N * h > 5.199338. Then we need to determine the interval width h to satisfy the error bound.

For piecewise linear interpolation, the error is bounded by

E(t) ≤ 1/8 * ||f''||∞ * h^2

(Source http://pages.cs.wisc.edu/~amos/412/lecture-notes/lecture09.pdf)

Since Φ''(x) = φ'(x) = -x * φ(x), it can be shown that ||Φ''||∞ = φ(1) = 0.2419707.

Therefore h can be calculated as

h = sqrt(8 / dnorm(1) * 1e-7)
h
## [1] 0.001818292

So the x and y values are

x = seq(0, qnorm(1 - 1e-7) + h, by = h)
length(x)
## [1] 2861
y = pnorm(x)

We write the data to a header file fastncdf_data.h:

op = options(digits = 15)
f = "src/fastncdf_data.h"
wrt = function(...) cat(..., "\n", sep = "", file = f, append = TRUE)

cat("static const double fastncdf_max = ", x[length(x)], ";\n", sep = "", file = f)
wrt("static const double fastncdf_hinv = ", 1.0 / h, ";")
wrt("static const double fastncdf_x [] = {")
con = textConnection("xdata", "w")
write(x, file = con, ncolumns = 5, sep = ", ")
close(con)
wrt(paste(xdata, collapse = ",\n"))
wrt("};")
wrt("static const double fastncdf_y [] = {")
con = textConnection("ydata", "w")
write(y, file = con, ncolumns = 5, sep = ", ")
close(con)
wrt(paste(ydata, collapse = ",\n"))
wrt("};")

options(op)

Performance

We compare the speed of fastncdf() and pnorm() in R.

library(Rcpp)
sourceCpp("test.cpp")

x = seq(-6, 6, by = 1e-6)
system.time(y <- pnorm(x))
##    user  system elapsed
##   1.038   0.023   1.059
system.time(fasty <- fastncdf(x))
##    user  system elapsed
##   0.069   0.020   0.090
max(abs(y - fasty))
## [1] 9.99999e-08