Skip to content
/ pre Public
forked from RMHogervorst/pre

R package for deriving Prediction Rule Ensembles

Notifications You must be signed in to change notification settings

guhjy/pre

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pre is an R package for deriving prediction rule ensembles for binary and 
continuous outcome variables. Input variables may be numeric, ordinal and 
nominal. The package implements the algorithm for deriving prediction rule 
ensembles as described in Friedman & Popescu (2008), with some improvements
and adjustments. The most important improvements or adjustments are: 

  1) The pre package is completely R based, allowing users better access to the 
results and more control over the parameters used for generating the prediction 
rule ensemble

  2) An unbiased tree induction algorithm is used for deriving prediction rules. 
Friedman & Popescu used the classification and regression tree (CART) algorithm, 
but this suffers from biased variable selection.

  3) The initial ensemble of prediction rules can be generated as a bagged, 
boosted and/or random forest ensemble.

The pre package is developed to provide useRs a completely R based implementation
of the algorithm described by Friedman & Popescu (2008). However, note that pre 
is under development, and much work still needs to be done.

Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. 
The Annals of Applied Statistics, 2(3), pp. 916-954 916-954.

To get a first impression of how pre works, consider the following example of 
prediction rule ensemble, using the airquality dataset: 

library(pre)
airq.ens <- pre(Ozone ~ ., data = airquality[complete.cases(airquality),], verbose = TRUE)
print(airq.ens, penalty.par.val = "lambda.1se")
print(airq.ens, penalty.par.val = "lambda.min")

# Let's take the smallest ensemble yielding a cross-validated error within 1se 
# of the minimum (this is the default of all pre functions): 

# Inspect the prediction rule ensemble:
coef(airq.ens)
importance(airq.ens)
pairplot(airq.ens, varnames = c("Temp", "Wind", "Solar.R"))
singleplot(airq.ens, varname = "Temp")

# Generate predictions:
airq.preds <- predict(airq.ens)
hist(airq.preds)

# Calculate 10-fold cross-validated error:
airq.cv <- cvpre(airq.ens)
aiq.cv$accuracy

# Assess interaction effects of predictor variables:
nullmods <- bsnullinteract(airq.ens)
interact(airq.ens, varnames = c("Temp", "Wind", "Solar.R"), nullmods = nullmods)

About

R package for deriving Prediction Rule Ensembles

Topics

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • R 100.0%