Skip to content

haozhestat/rfinterval

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

rfinterval: Predictive Inference on Random Forests (beta)

Haozhe Zhang

LICENSE 996.icu HitCount

Introduction

rfinterval is a fast implementation of random forest prediction intervals described in the paper "Random Forest Prediction Intervals" by Haozhe Zhang, Joshua Zimmerman, Dan Nettleton, and Daniel J. Nordman. This R package produce out-of-bag prediction intervals, split conformal prediction intervals, and quantile regression forest intervals for random forest predictions.

Installation

To install the R package rfinterval directly from github, run the following in R:

#install.packages("devtools")
#devtools::install_github(repo="haozhestat/rfinterval")
install.packages("rfinterval")
library(rfinterval)
?rfinterval

Usage

Quickstart:

train_data <- sim_data(n = 1000, p = 10)
test_data <- sim_data(n = 1000, p = 10)

output <- rfinterval(y~., train_data = train_data, test_data = test_data,
                     method = c("oob", "split-conformal", "quantreg"),
                     symmetry = TRUE,alpha = 0.1)
                     
### print the marginal coverage of OOB prediction interval
mean(output$oob_interval$lo < test_data$y & output$oob_interval$up > test_data$y)

### print the marginal coverage of Split-conformal prediction interval
mean(output$sc_interval$lo < test_data$y & output$sc_interval$up > test_data$y)

### print the marginal coverage of Quantile regression forest prediction interval
mean(output$quantreg_interval$lo < test_data$y & output$quantreg_interval$up > test_data$y)

Data example:

oob_interval <- rfinterval(pm2.5 ~ .,
                            train_data = BeijingPM25[1:1000, ],
                            test_data = BeijingPM25[1001:2000, ],
                            method = "oob",
                            symmetry = TRUE,
                            alpha = 0.1)
str(oob_interval)

If you find any bugs, or if you experience any crashes, please report to us. If you have any questions just ask, we won't bite.

Please cite our paper if you use rfinterval.

References

  • Zhang, H., Zimmerman, J., Nettleton, D. and Nordman, D. (2019). "Random Forest Prediction Intervals." The American Statistician, doi: 10.1080/00031305.2019.1585288.
  • Zhang, H. (2019) "Topics in Functional Data Analysis and Machine Learning Predictive Inference." Graduate Theses and Dissertations.
  • Lei, J., Max G’Sell, A. R., Tibshirani, R.J., and Wasserman, L. "Distribution-free predictive inference for regression." Journal of the American Statistical Association 113, no. 523 (2018): 1094-1111.
  • Meinshausen, N. "Quantile regression forests." Journal of Machine Learning Research 7, no. Jun (2006): 983-999.
  • Liang, X., Zou, T., Guo, B., Li, S., Zhang, H., Zhang, S., Huang, H. and Chen, S.X., 2015. "Assessing Beijing's PM2. 5 pollution: severity, weather impact, APEC and winter heating". Proc. R. Soc. A, 471(2182), p.20150257.

About

An R package "rfinterval": Predictive Inference on Random Forests

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages