Overview. Data transformations are a useful companion for parametric regression models. A well-chosen or learned transformation can greatly enhance the applicability of a given model, especially for data with irregular marginal features (e.g., multimodality, skewness) or various data domains (e.g., real-valued, positive, or compactly-supported data).
Given paired data $(x_i, y_i)$, SeBR implements efficient and fully Bayesian inference for semiparametric regression models that incorporate (1) an unknown data transformation $g$ and (2) a useful parametric regression model for the transformed data, with unknown parameters $\theta$.
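To make this two-part structure concrete, the following base-R sketch simulates data from one such model. Everything specific here is an assumption chosen for illustration and is not SeBR code: the Gaussian linear model on the transformed scale, the coefficient values, and the choice $g = \log$, so that $y = \exp(z)$ is positive and right-skewed.

```r
# Illustrative simulation (base R only; not part of SeBR).
# Assumed model: g(y) = z with g = log, and z following a Gaussian linear model.
set.seed(1)
n <- 200
x <- cbind(1, rnorm(n))                       # intercept and one covariate
theta <- c(0.5, 1)                            # assumed "true" regression parameters
z <- drop(x %*% theta) + rnorm(n, sd = 0.5)   # data on the transformed scale
y <- exp(z)                                   # observed data: y = g^{-1}(z)

# With g known, a linear fit on the transformed scale recovers theta approximately
fit_z <- lm(log(y) ~ x[, 2])
theta_hat <- unname(coef(fit_z))
print(round(theta_hat, 2))
```

In this sketch $g$ is known by construction. SeBR addresses the harder case in which $g$ is unknown and must be inferred along with $\theta$.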
Examples. We focus on the following important special cases of the parametric regression model:
- The linear model is a natural starting point. The transformation $g$ extends its applicability to, for example, positive or compactly-supported data.
- The quantile regression model replaces the Gaussian assumption in the linear model with an asymmetric Laplace distribution (ALD) to target the $\tau$th quantile of the transformed response.
- The Gaussian process (GP) model generalizes the linear model to include a nonparametric regression function, which is modeled as a GP.
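For reference, the three cases can be written compactly as follows. The notation is an assumption introduced here for illustration: $z_i = g(y_i)$ denotes the transformed response, $\epsilon_i$ the errors, $\sigma_\epsilon$ the error scale, and $f_\theta$ the GP regression function.

$$
\begin{aligned}
\text{Linear model:} \quad & g(y_i) = z_i = x_i'\theta + \epsilon_i, \quad \epsilon_i \stackrel{iid}{\sim} N(0, \sigma_\epsilon^2) \\
\text{Quantile regression:} \quad & g(y_i) = z_i = x_i'\theta + \epsilon_i, \quad \epsilon_i \stackrel{iid}{\sim} \mathrm{ALD}(\tau) \\
\text{GP model:} \quad & g(y_i) = z_i = f_\theta(x_i) + \epsilon_i, \quad \epsilon_i \stackrel{iid}{\sim} N(0, \sigma_\epsilon^2)
\end{aligned}
$$

Only the error distribution or the regression function changes between cases; the transformation $g$ plays the same role in each.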
Challenges: The goal is to provide fully Bayesian posterior inference for the unknowns $(g, \theta)$.
Innovations: Our approach (https://arxiv.org/abs/2306.05498) specifies a nonparametric model for the transformation $g$.
The package SeBR is installed and loaded as follows:
# install.packages("devtools")
# devtools::install_github("drkowal/SeBR")
library(SeBR)
The main functions in SeBR are:
- `sblm()`: Monte Carlo sampling for posterior and predictive inference with the semiparametric Bayesian linear model;
- `sbsm()`: Monte Carlo sampling for posterior and predictive inference with the semiparametric Bayesian spline model, which replaces the linear model with a spline for nonlinear modeling of $x \in \mathbb{R}$;
- `sbqr()`: blocked Gibbs sampling for posterior and predictive inference with the semiparametric Bayesian quantile regression; and
- `sbgp()`: Monte Carlo sampling for predictive inference with the semiparametric Bayesian Gaussian process model.
Each function returns a point estimate of $\theta$ (`coefficients`), point predictions at some specified testing points (`fitted.values`), posterior samples of the transformation $g$ (`post_g`), and posterior predictive samples of $y$ at the testing points (`post_ypred`), as well as other function-specific quantities (e.g., posterior draws of $\theta$, `post_theta`). The calls `coef()` and `fitted()` extract the point estimates and point predictions, respectively.
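A minimal usage sketch follows. It assumes that `sblm()` takes the response, the predictor matrix, and the testing points through arguments named `y`, `X`, and `X_test`, and that `post_ypred` stores one posterior predictive draw per row. Neither detail is stated above, so check the package documentation before relying on them. The simulated data are purely illustrative.

```r
library(SeBR)

# Illustrative data (assumed for this sketch): positive, right-skewed response
set.seed(123)
n <- 200; p <- 3
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- exp(1 + drop(X %*% c(1, -0.5, 0)) + rnorm(n, sd = 0.5))
X_test <- matrix(rnorm(50 * p), nrow = 50, ncol = p)

# Fit the semiparametric Bayesian linear model
# (argument names y, X, X_test are assumed; see ?sblm)
fit <- sblm(y = y, X = X, X_test = X_test)

coef(fit)            # point estimate of the regression coefficients
head(fitted(fit))    # point predictions at the testing points

# 90% posterior predictive intervals at the testing points
# (assumes rows of post_ypred index draws and columns index testing points)
pi_y <- t(apply(fit$post_ypred, 2, quantile, probs = c(0.05, 0.95)))
head(pi_y)
```

The same pattern should carry over to the other main functions, although their arguments differ (for example, a quantile level for `sbqr()`), so consult each help page.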
Note: The package also includes Box-Cox variants of these functions, i.e., restricting $g$ to the Box-Cox parametric family. These variants (`blm_bc()`, etc.) are primarily for benchmarking.
Detailed documentation and examples are available at https://drkowal.github.io/SeBR/.