Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boundary corrected KDE values and flag: ppc_loo_pit_overlay() #235

Merged
merged 41 commits into from
Oct 23, 2020
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
Show all changes
41 commits
Select commit Hold shift + click to select a range
dfc156f
boundary corrected KDE flag for ppc_loo_pit_overlay
ecoronado92 Aug 13, 2020
ee095cc
added missing , in geom_line
ecoronado92 Aug 13, 2020
18abff9
fix doc additions given build check errors
ecoronado92 Aug 13, 2020
5ed24be
attempt fix at rep_id global var build error
ecoronado92 Aug 13, 2020
921df6d
Edit parameter description R/ppc-loo.R
ecoronado92 Aug 17, 2020
629ad09
Update R/ppc-loo.R
ecoronado92 Aug 17, 2020
11f6bf8
R styling R/ppc-loo.R
ecoronado92 Aug 17, 2020
3174ded
R styling R/ppc-loo.R
ecoronado92 Aug 17, 2020
1bf7972
changed bc_kde flag, length consistency issues, and strict sanity che…
ecoronado92 Aug 17, 2020
ed455bc
add test for boundary_correction=TRUE
jgabry Aug 18, 2020
015ca3a
changes to fix vector ifelse warning bc_pvals
ecoronado92 Aug 18, 2020
67d0886
add @ecoronado92 as a contributor and update NEWS.md
jgabry Aug 18, 2020
cf28e91
add ppc_loo_pit_data function to separate plotting from data
jgabry Aug 18, 2020
05ba44b
remove browser()
jgabry Aug 18, 2020
01adc6d
speed up boundary correction w custom cpp
ecoronado92 Aug 22, 2020
0922087
add suggested_package rstanarm to load Rcpp
ecoronado92 Aug 22, 2020
feb7c97
reverting to non cpp code
ecoronado92 Aug 22, 2020
25a2199
updated cpp code
ecoronado92 Aug 22, 2020
73f7c4d
changed boundary correction approach to one based on reflection trick…
ecoronado92 Aug 30, 2020
14d1d55
fixed missing comma on parameters in ppc_loo_data()
ecoronado92 Aug 30, 2020
059395e
edit man docs to reflect new grid_len param
ecoronado92 Aug 30, 2020
7264ef4
edit man docs to reflect new grid_len param
ecoronado92 Aug 30, 2020
2e5ca83
Added flags that current version only works on continuous data and ot…
ecoronado92 Sep 27, 2020
d065700
add missing comma to binary flag warning
ecoronado92 Sep 27, 2020
3f5ba74
expect warning in unit test rather than silent for checks to pass
ecoronado92 Sep 27, 2020
3a13526
minor doc edits
jgabry Oct 1, 2020
eab8117
warn for binary also if boundary_correction is FALSE
jgabry Oct 1, 2020
f1dd453
Merge branch 'master' into bc_ppc_loo_overlay
jgabry Oct 1, 2020
6768b81
update DESCRIPTION
jgabry Oct 5, 2020
c72606d
set boundary correction to TRUE as default
ecoronado92 Oct 12, 2020
95fc399
set boundary correction TRUE as default
ecoronado92 Oct 12, 2020
13a3ffb
move boundary_correction argument higher
jgabry Oct 15, 2020
c4ea8da
Update NEWS.md
jgabry Oct 15, 2020
bd355d2
removed TODO from code on convolution NAs, added as new issues on github
ecoronado92 Oct 16, 2020
95275d7
TODO comment merge conflict fixed
ecoronado92 Oct 16, 2020
ad18ea7
add reflection method to help text of ppc_loo_pit_overlay()
ecoronado92 Oct 17, 2020
6a60492
add Boneva et al citation
jgabry Oct 22, 2020
3a76b7d
xlim -> [0,1] if using boundary correction
jgabry Oct 22, 2020
4b79f92
Update ppc-loo.R
jgabry Oct 22, 2020
4e4b46c
use prettier x-axis labels
jgabry Oct 22, 2020
a649e20
Also test with boundary_correction=FALSE
jgabry Oct 23, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
202 changes: 157 additions & 45 deletions R/ppc-loo.R
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,8 @@
#' # marginal predictive check using LOO probability integral transform
#' color_scheme_set("orange")
#' ppc_loo_pit_overlay(y, yrep, lw = lw)
#' # ppc_loo_pit_overlay(y, yrep, lw = lw,
#' bc_kde = TRUE) # slower, boundary corrected KDE
#'
#' ppc_loo_pit_qq(y, yrep, lw = lw)
#' ppc_loo_pit_qq(y, yrep, lw = lw, compare = "normal")
Expand Down Expand Up @@ -117,6 +119,9 @@ NULL
#' standard normal distribution.
#' @param trim Passed to [ggplot2::stat_density()].
#' @template args-density-controls
#' @param bc_kde For `ppc_loo_pit_overlay()`, when set to TRUE function will
#' compute `"boundary corrected"` density values using beta kernel estimators.
#' Although preferable, the current implementation is slow (default = FALSE).
ecoronado92 marked this conversation as resolved.
Show resolved Hide resolved
ppc_loo_pit_overlay <- function(y,
yrep,
lw,
Expand All @@ -129,7 +134,8 @@ ppc_loo_pit_overlay <- function(y,
bw = "nrd0",
adjust = 1,
kernel = "gaussian",
n_dens = 1024) {
n_dens = 1024,
bc_kde = FALSE) {
check_ignored_arguments(...)

if (!missing(pit)) {
Expand All @@ -144,50 +150,78 @@ ppc_loo_pit_overlay <- function(y,
}

unifs <- matrix(runif(length(pit) * samples), nrow = samples)

data <- ppc_data(pit, unifs)

ggplot(data) +
aes_(x = ~ value) +
stat_density(
aes_(group = ~ rep_id, color = "yrep"),
data = function(x) dplyr::filter(x, !.data$is_y),
geom = "line",
position = "identity",
size = size,
alpha = alpha,
trim = trim,
bw = bw,
adjust = adjust,
kernel = kernel,
n = n_dens,
na.rm = TRUE) +
stat_density(
aes_(color = "y"),
data = function(x) dplyr::filter(x, .data$is_y),
geom = "line",
position = "identity",
lineend = "round",
size = 1,
trim = trim,
bw = bw,
adjust = adjust,
kernel = kernel,
n = n_dens,
na.rm = TRUE) +
scale_color_ppc_dist(labels = c("PIT", "Unif")) +
scale_x_continuous(
limits = c(.1, .9),
expand = expansion(0, 0),
breaks = seq(from = .1, to = .9, by = .2)) +
scale_y_continuous(
limits = c(0, NA),
expand = expansion(mult = c(0, .25))) +
bayesplot_theme_get() +
yaxis_title(FALSE) +
xaxis_title(FALSE) +
yaxis_text(FALSE) +
yaxis_ticks(FALSE)


if (bc_kde){
pit <- .bc_pvals(x = pit, bw = bw)
unifs <- t(apply(unifs, 1, function(x) .bc_pvals(x, bw = bw)))

data <- ppc_data(pit, unifs) %>%
dplyr::arrange(rep_id) %>%
ecoronado92 marked this conversation as resolved.
Show resolved Hide resolved
mutate(xx = rep(seq(0, 1, length.out = length(pit)),
times = samples + 1))

p <- ggplot(data) +
aes_(x = ~ xx, y = ~ value) +
geom_line(
aes_(group = ~ rep_id, color = "yrep"),
data = function(x) dplyr::filter(x, !.data$is_y),
alpha = alpha,
size = size,
na.rm = TRUE) +
geom_line(
aes_(color = "y"),
data = function(x) dplyr::filter(x, .data$is_y),
size = 1,
lineend = "round",
na.rm = TRUE)

} else {
data <- ppc_data(pit, unifs)

p <- ggplot(data) +
aes_(x = ~ value) +
stat_density(
aes_(group = ~ rep_id, color = "yrep"),
data = function(x) dplyr::filter(x, !.data$is_y),
geom = "line",
position = "identity",
size = size,
alpha = alpha,
trim = trim,
bw = bw,
adjust = adjust,
kernel = kernel,
n = n_dens,
na.rm = TRUE) +
stat_density(
aes_(color = "y"),
data = function(x) dplyr::filter(x, .data$is_y),
geom = "line",
position = "identity",
lineend = "round",
size = 1,
trim = trim,
bw = bw,
adjust = adjust,
kernel = kernel,
n = n_dens,
na.rm = TRUE)
}

p + scale_color_ppc_dist(labels = c("PIT", "Unif")) +
scale_x_continuous(
limits = c(0, 1),
expand = expansion(0, 0),
breaks = seq(from = .1, to = .9, by = .2)) +
scale_y_continuous(
limits = c(0, NA),
expand = expansion(mult = c(0, .25))) +
bayesplot_theme_get() +
yaxis_title(FALSE) +
xaxis_title(FALSE) +
yaxis_text(FALSE) +
yaxis_ticks(FALSE)
}


Expand Down Expand Up @@ -456,3 +490,81 @@ ppc_loo_ribbon <-
return(psis_object)
}

## Boundary correction based on code by Yang Hu and Carl Scarrott in evmix package

# Boundary correction KDE helper function
.bc_dunif <- function(xs, pvals, b, xmax = 1){
# Function based on biased-corrected (modified) beta kernel
# Chen, Song Xi. "Beta kernel estimators for density functions."
# Computational Statistics & Data Analysis 31.2 (1999): 131-145.

# Re-scaling inputs based on upper_bound given beta kernel is defined [0,1]
xs <- xs/xmax
pvals <- pvals/xmax
b <- b/xmax # smoothing parameter (i.e. bw)

d <- vapply(X = xs,
FUN = .bc_kde_calc,
b = b,
pvals = pvals,
FUN.VALUE = 0)

return(d/xmax)

}

# Bias correction function (Chen 1999)
.rho <- function(x, b) {
return(2*b^2 + 2.5 - sqrt(4*b^4 + 6*b^2 + 2.25 - x^2 - x/b))
}

# Piecewise kernel value estimation (Chen 1999)
.bc_kde_calc <- function(xs, b, pvals){

if ((xs >= 2*b) & (xs <= (1 - 2*b))) {
d <- mean(dbeta(pvals, xs/b, (1 - xs)/b))
} else if ((xs >= 0) & (xs < 2*b)) {
d <- mean(dbeta(pvals, .rho(xs, b), (1 - xs)/b))
} else if ((xs > (1 - 2*b)) & (xs <= 1)) {
d <- mean(dbeta(pvals, xs/b, .rho(1 - xs, b)))
} else {
d <- 0
}
return(d)
}

# Wrapper function, transforms pvals / PIT vals into bc values
.bc_pvals <- function(x, bw = "nrd0"){

# Set-up
xs <- seq(0, 1, length.out = length(x))
d <- xs
xmax <- 1 + 1e-9
bw <- density(x, bw = bw)$bw # extract bw

# get only valid pvals
valid_pvals <- x[is.finite(x)]

# Some sanity checks
if (abs(xmax - max(valid_pvals)) >= 1e-1) {
stop("largest PIT value must be below 1")
ecoronado92 marked this conversation as resolved.
Show resolved Hide resolved
}

# Ignore zeros to avoid problems during KDE estimation
if (any(valid_pvals == 0)) {
warning(paste("Ignored", sum(valid_pvals == 0),
"PIT values == 0, they are invalid for beta KDE method"))
valid_pvals = valid_pvals[valid_pvals != 0]
}

# Boundary corrected KDE values
bc_vals <- .bc_dunif(xs = xs, pvals=valid_pvals, b =bw)

# Set any negative values to zero and output bc density values
bc_vals[which(bc_vals < 0)] = 0
ecoronado92 marked this conversation as resolved.
Show resolved Hide resolved
d[ifelse(!is.na(xs), (xs >= 0) & (xs <= 1), FALSE)] = bc_vals
ecoronado92 marked this conversation as resolved.
Show resolved Hide resolved

return(d)
}


8 changes: 7 additions & 1 deletion man/PPC-loo.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.