This R package implements the Gaussian Process (GP) model with mixed-type inputs. It is designed to fit GP models to datasets where the input variables can be quantitative, ordinal, nominal, or any combination thereof.
You can install the development version of MixGP from GitHub as follows:
# install.packages("devtools")
devtools::install_github("denglinsui/MixGP")

Alternatively, you can install from a local copy using one of the options below.
Option 1: if you have downloaded the package file (MixGP.zip) locally, you can install it using the devtools::install_local() function:
# install.packages("devtools")
devtools::install_local("path/to/MixGP.zip")

Option 2 (from source directory): if you have downloaded the package file (MixGP.zip) locally, you can unzip it, open the MixGP.Rproj file, and install the package from the source directory:
setwd("path/to/MixGP")
devtools::install()

Option 3: if you have downloaded the source package file (MixGP_0.1.0.tar.gz) locally, you can install it using base R:
install.packages("MixGP_0.1.0.tar.gz", repos = NULL, type = "source")

We will use the math_example dataset from the LVGP package as an example.
First, load the necessary libraries and the data:
library(LVGP)
library(MixGP)
X_tr <- math_example$X_tr # Training inputs
Y_tr <- math_example$Y_tr # Training outputs
X_te <- math_example$X_te # Testing inputs
Y_te <- math_example$Y_te # Testing outputs

The main arguments of the fitting function are as follows:
- The dataset consists of inputs X_tr and responses Y_tr. We use ind_nomi and ind_ord to specify the column indices of the nominal and ordinal variables, respectively.
- Next, we specify the covariance structure by setting cov_type_quant, cov_type_quali, and coup_type_quali. Here:
  - cov_type_quant is the covariance kernel for quantitative variables and can be either "gaussian" or "exponential".
  - cov_type_quali is the covariance kernel for qualitative (nominal/ordinal) variables and can take the values "gaussian", "exponential", or "linear".
  - coup_type_quali is the coupling method used to combine the kernels for different input variables. It can be "multiplicative" (product of kernels) or "additive" (sum of kernels).
  - dim_z_nomi is the dimension of the latent space for nominal variables. This is an integer, with a default of 2. The function will automatically adjust this value if the specified dimension is too high for the number of levels in a variable.
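To make the difference between the two coupling types concrete, here is a minimal base-R sketch on toy data. It is not the package's internal code: the unit length scale, the made-up latent coordinates in z_nomi, and the helper name gauss_kernel are all assumptions chosen for illustration.

```r
# Toy inputs: one quantitative column and one nominal column with 3 levels.
x_quant <- c(0.1, 0.4, 0.9)
x_nomi  <- c(1L, 2L, 3L)

# Hypothetical 2-D latent coordinates, one row per nominal level
# (this mirrors dim_z_nomi = 2; a fitted model would estimate such
# coordinates, here they are simply made up).
z_nomi <- rbind(c(0, 0), c(1, 0), c(0.5, 0.8))

# Gaussian correlation on squared Euclidean distance.
# Assumption: unit length scale; MixGP's parameterization may differ.
gauss_kernel <- function(Z) exp(-as.matrix(dist(Z))^2)

K_quant <- gauss_kernel(matrix(x_quant, ncol = 1))
K_quali <- gauss_kernel(z_nomi[x_nomi, , drop = FALSE])

K_multi <- K_quant * K_quali  # "multiplicative": elementwise product of kernels
K_add   <- K_quant + K_quali  # "additive": sum of kernels
```

With the product, two points are strongly correlated only if they are close in both the quantitative and the qualitative input; with the sum, closeness in either input is enough to contribute correlation.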
Let's fit two models: one with a multiplicative kernel and one with an additive kernel.
model_Gau_multi <- MixGP_fit(X_tr, Y_tr, ind_nomi = 3, cov_type_quant = "gaussian", cov_type_quali = "gaussian", coup_type_quali = "multiplicative", dim_z_nomi = 2)
model_Gau_add <- MixGP_fit(X_tr, Y_tr, ind_nomi = 3, cov_type_quant = "gaussian", cov_type_quali = "gaussian", coup_type_quali = "additive", dim_z_nomi = 2)

After the models are fitted, the MixGP_predict function can be used to make predictions on new data.
pred_Gau_multi <- MixGP_predict(X_new = X_te, model = model_Gau_multi)
Y_pred_multi <- pred_Gau_multi$Y_hat # Extract predicted values
pred_Gau_add <- MixGP_predict(X_new = X_te, model = model_Gau_add)
Y_pred_add <- pred_Gau_add$Y_hat

Finally, we can compare the predicted values with the true values to evaluate the models' performance.
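A numeric summary can complement the plots that follow. The sketch below defines a small helper, rmse, which is an illustrative name and not part of MixGP, and applies it to made-up vectors so that it runs without the packages.

```r
# Root mean squared error between true and predicted values.
# `rmse` is an illustrative helper, not a MixGP function.
rmse <- function(y_true, y_pred) {
  stopifnot(length(y_true) == length(y_pred))
  sqrt(mean((y_true - y_pred)^2))
}

# Made-up numbers, only to show the call
y_true_toy <- c(1.0, 2.0, 3.0)
y_pred_toy <- c(1.1, 1.9, 3.2)
rmse_toy <- rmse(y_true_toy, y_pred_toy)

# With the objects from this example (assuming Y_te and the
# predictions are numeric vectors of equal length):
# rmse(Y_te, Y_pred_multi)
# rmse(Y_te, Y_pred_add)
```

A lower value indicates predictions that sit closer to the y = x reference line drawn in the plots.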
# Set up the plotting window to display two plots side-by-side
par(mfrow = c(1, 2))
# Multiplicative model results plot
plot(Y_te, Y_pred_multi,
main = "Multiplicative Gaussian Kernel",
xlab = "True Values", ylab = "Predicted Values",
col = "blue", pch = 20)
abline(a = 0, b = 1, col = "red") # y=x reference line
# Additive model results plot
plot(Y_te, Y_pred_add,
main = "Additive Gaussian Kernel",
xlab = "True Values", ylab = "Predicted Values",
col = "green", pch = 20)
abline(a = 0, b = 1, col = "red") # y=x reference line
# Restore default plotting parameters
par(mfrow = c(1, 1))

We also support both model selection and model averaging strategies:
- Model selection chooses the best model according to one of the following criteria:
  - Bayesian information criterion (BIC)
  - leave-one-out cross-validation with L2 loss (LOOCVL2)
  - leave-one-out cross-validation with negative log-likelihood (LOOCVloglik)
- Model averaging combines predictions from multiple models using BIC-based weights, which can give more stable predictions than relying on a single model.
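The idea behind BIC-based weighting can be sketched in a few lines of base R. This assumes the common form in which each weight is proportional to exp(-BIC/2); the exact formula used inside MixGP_model_average is not stated here, so treat this as an illustration only. The BIC values, the predictions, and the helper name bic_weights are made up.

```r
# Turn BIC values into normalized weights.
# Assumption: weight_i is proportional to exp(-0.5 * BIC_i).
bic_weights <- function(bic) {
  delta <- bic - min(bic)  # shift by the minimum for numerical stability
  w <- exp(-0.5 * delta)
  w / sum(w)
}

# Made-up BIC values for two hypothetical models
bic <- c(multi = 120.3, add = 123.1)
w <- bic_weights(bic)

# Made-up predictions: one column per model, one row per test point
preds <- cbind(multi = c(1.0, 2.0), add = c(1.2, 1.8))

# Weighted average of the per-model predictions
y_avg <- as.vector(preds %*% w)
```

The model with the lower BIC receives the larger weight, so the averaged prediction leans toward it without discarding the other model entirely.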
# Collect the fitted models in a list
model.list <- list(model_Gau_multi, model_Gau_add)
# Model selection
Sel_BIC <- MixGP_model_selection(model.list, X_te, criterion = "BIC")
Sel_LOOCVL2 <- MixGP_model_selection(model.list, X_te, criterion = "LOOCVL2")
Sel_LOOCVloglik <- MixGP_model_selection(model.list, X_te, criterion = "LOOCVloglik")
# Model averaging
Res_Average <- MixGP_model_average(model.list, X_te)

For more comprehensive examples, please see:
?MixGP_model_average
?MixGP_model_selection
?MixGP_fit