MixGP

This R package implements the Gaussian Process (GP) model with mixed-type inputs. It is designed to fit GP models to datasets where the input variables can be quantitative, ordinal, nominal, or any combination thereof.

Installation

Online Installation

You can install the development version of MixGP from GitHub as follows:

# install.packages("devtools")
devtools::install_github("denglinsui/MixGP")

Local Installation

Alternatively, you can install MixGP from a local copy using one of the following options.

Option 1 (from the zip archive): if you have downloaded the package file (MixGP.zip) locally, you can install it using the devtools::install_local() function:

# install.packages("devtools")
devtools::install_local("path/to/MixGP.zip")

Option 2 (from the source directory): unzip the downloaded package file (MixGP.zip), open the MixGP.Rproj file or set the working directory to the unzipped folder, and install the package from the source directory:

setwd("path/to/MixGP")
devtools::install()

Option 3 (from the source tarball): if you have downloaded the source package file (MixGP_0.1.0.tar.gz) locally, you can install it using base R:

install.packages("MixGP_0.1.0.tar.gz", repos = NULL, type = "source")
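
Whichever option you use, it can help to confirm that the package is visible to the current R session before moving on. The snippet below is a minimal sketch using only base R; the variable name mixgp_available is chosen here for illustration.

```r
# TRUE if MixGP can be loaded from the current library paths, FALSE otherwise
mixgp_available <- requireNamespace("MixGP", quietly = TRUE)

if (mixgp_available) {
  # Report the installed version
  print(packageVersion("MixGP"))
} else {
  # Show where R looked, which helps diagnose a wrong library path
  message("MixGP was not found in: ", paste(.libPaths(), collapse = ", "))
}
```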

Example for Basic Functions

We will use the math_example dataset from the LVGP package as an example.

First, load the necessary libraries and the data:

library(LVGP)
library(MixGP)
X_tr <- math_example$X_tr  # Training inputs
Y_tr <- math_example$Y_tr  # Training outputs
X_te <- math_example$X_te  # Testing inputs
Y_te <- math_example$Y_te  # Testing outputs

Fitting a model involves two steps:

  1. The dataset consists of inputs X_tr and responses Y_tr. We use ind_nomi and ind_ord to specify the column indices of the nominal and ordinal variables, respectively. In the example below, ind_nomi = 3 declares column 3 as nominal.
  2. Next, we specify the covariance structure by setting cov_type_quant, cov_type_quali, coup_type_quali, and dim_z_nomi. Here:
    1. cov_type_quant is the covariance kernel for quantitative variables and can be either "gaussian" or "exponential".
    2. cov_type_quali is the covariance kernel for qualitative (nominal/ordinal) variables and can be "gaussian", "exponential", or "linear".
    3. coup_type_quali is the coupling method used to combine the kernels of the different input variables. It can be "multiplicative" (product of kernels) or "additive" (sum of kernels).
    4. dim_z_nomi is the dimension of the latent space for nominal variables. It is an integer with a default of 2. The function automatically adjusts this value if the specified dimension is too high for the number of levels in a variable.
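
To make the difference between the two coupling options concrete, here is a small base-R sketch that builds one Gaussian kernel matrix per input and combines them by product and by sum. It is an illustration only and not MixGP's internal code: the helper gauss_kernel, the scale parameter phi, and the numeric coding z_quali of the qualitative variable are all assumptions made for this example.

```r
# Illustration only: not MixGP's internal implementation.
# Gaussian kernel on a single input: k(a, b) = exp(-phi * (a - b)^2)
gauss_kernel <- function(x, phi = 1) {
  exp(-phi * outer(x, x, "-")^2)
}

# Toy inputs for three observations
x_quant <- c(0.1, 0.5, 0.9)  # one quantitative variable
z_quali <- c(0, 1, 0)        # hypothetical latent coding of a two-level factor

K_quant <- gauss_kernel(x_quant)
K_quali <- gauss_kernel(z_quali)

# "multiplicative" coupling: element-wise product of the per-variable kernels
K_multi <- K_quant * K_quali

# "additive" coupling: element-wise sum of the per-variable kernels
K_add <- K_quant + K_quali

print(round(K_multi, 3))
print(round(K_add, 3))
```

With the product, two observations are strongly correlated only if they are close in every input; with the sum, closeness in a single input is enough to contribute correlation.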

Let's fit two models: one with a multiplicative kernel and one with an additive kernel.

model_Gau_multi <- MixGP_fit(X_tr, Y_tr, ind_nomi = 3, cov_type_quant = "gaussian", cov_type_quali = "gaussian", coup_type_quali = "multiplicative", dim_z_nomi = 2)
model_Gau_add <- MixGP_fit(X_tr, Y_tr, ind_nomi = 3, cov_type_quant = "gaussian",  cov_type_quali = "gaussian", coup_type_quali = "additive", dim_z_nomi = 2)

After the models are fitted, the MixGP_predict function can be used to make predictions on new data.

pred_Gau_multi <- MixGP_predict(X_new = X_te, model = model_Gau_multi)
Y_pred_multi <- pred_Gau_multi$Y_hat  # Extract predicted values

pred_Gau_add <- MixGP_predict(X_new = X_te, model = model_Gau_add)
Y_pred_add <- pred_Gau_add$Y_hat

Finally, we can compare the predicted values with the true values to evaluate the models' performance.

# Set up the plotting window to display two plots side-by-side
par(mfrow = c(1, 2))

# Multiplicative model results plot
plot(Y_te, Y_pred_multi, 
     main = "Multiplicative Gaussian Kernel", 
     xlab = "True Values", ylab = "Predicted Values", 
     col = "blue", pch = 20)
abline(a = 0, b = 1, col = "red") # y=x reference line

# Additive model results plot
plot(Y_te, Y_pred_add, 
     main = "Additive Gaussian Kernel", 
     xlab = "True Values", ylab = "Predicted Values", 
     col = "green", pch = 20)
abline(a = 0, b = 1, col = "red") # y=x reference line

# Restore default plotting parameters
par(mfrow = c(1, 1))
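
The plots give a visual comparison; a single number such as the root-mean-square error (RMSE) can complement them. The helper below is a minimal sketch (the function name rmse and the toy vectors are introduced here for illustration and are not part of MixGP).

```r
# Root-mean-square error between observed and predicted responses
rmse <- function(y_true, y_pred) {
  stopifnot(length(y_true) == length(y_pred))
  sqrt(mean((as.numeric(y_true) - as.numeric(y_pred))^2))
}

# Toy check with made-up numbers
y_true_toy <- c(1, 2, 3)
y_pred_toy <- c(1, 2, 5)
rmse_toy <- rmse(y_true_toy, y_pred_toy)
print(rmse_toy)

# With the objects from the example above, one would call:
# rmse(Y_te, Y_pred_multi)
# rmse(Y_te, Y_pred_add)
```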

Model Selection and Model Averaging

The package also supports model selection and model averaging:

  • Model selection chooses the best model among the candidates according to one of the following criteria:
    • Bayesian information criterion (BIC)
    • leave-one-out cross-validation with L2 loss (LOOCVL2)
    • leave-one-out cross-validation with negative log-likelihood (LOOCVloglik)
  • Model averaging combines the predictions of multiple models using BIC-based weights, which gives more stable and reliable results.

# Collect the fitted models in a list
model.list <- list(model_Gau_multi, model_Gau_add)

# Model selection under each criterion
Sel_BIC <- MixGP_model_selection(model.list, X_te, criterion = "BIC")
Sel_LOOCVL2 <- MixGP_model_selection(model.list, X_te, criterion = "LOOCVL2")
Sel_LOOCVloglik <- MixGP_model_selection(model.list, X_te, criterion = "LOOCVloglik")

# Model averaging
Res_Average <- MixGP_model_average(model.list, X_te)
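
The text above says that averaging uses BIC-based weights but does not state the formula. A common convention sets the weight of model i proportional to exp(-0.5 * (BIC_i - min BIC)). The sketch below assumes that convention and may differ from what MixGP_model_average computes internally; the function bic_weights and all numbers are made up for illustration.

```r
# Assumed convention: w_i proportional to exp(-0.5 * (BIC_i - min(BIC)))
bic_weights <- function(bic) {
  delta <- bic - min(bic)   # subtracting the minimum avoids numerical underflow
  w <- exp(-0.5 * delta)
  w / sum(w)                # normalize so the weights sum to 1
}

# Made-up BIC values for two candidate models
bic_toy <- c(multi = 100, add = 102)
w <- bic_weights(bic_toy)
print(round(w, 3))

# Made-up predictions: one column per model, one row per test point
pred_toy <- cbind(multi = c(1.0, 2.0), add = c(1.2, 1.8))

# The averaged prediction is the weighted sum across models
y_avg <- as.numeric(pred_toy %*% w)
print(y_avg)
```

Under this convention the model with the lower BIC receives the larger weight, and a BIC gap of a few units already shifts most of the weight to the better model.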

Additional Information

For more comprehensive examples, please see:

?MixGP_model_average
?MixGP_model_selection
?MixGP_fit
