Classes and functions to create and summarize different types of resampling objects
Latest commit 775ac55 Jan 30, 2019


rsample


rsample contains a set of functions that can create different types of resamples and corresponding classes for their analysis. The goal is to have a modular set of methods that can be used across different R packages for:

  • traditional resampling techniques for estimating the sampling distribution of a statistic and
  • estimating model performance using a holdout set

rsample provides the basic building blocks for creating and analyzing resamples of a data set; it does not include code for modeling or calculating statistics. The "Working with Resample Sets" vignette demonstrates how rsample tools can be used.
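
As a minimal sketch of these building blocks, the snippet below creates bootstrap and 4-fold cross-validation resamples of the built-in `mtcars` data and extracts the data behind individual splits. The data set, the seeds, and the variable names are assumptions chosen for illustration. The statistic (a simple mean) is computed with ordinary R code, since rsample itself does not calculate statistics.

```r
library(rsample)

set.seed(123)  # arbitrary seed, only for reproducibility

# Traditional resampling: 25 bootstrap resamples of mtcars
boots <- bootstraps(mtcars, times = 25)

# Approximate the sampling distribution of a statistic (mean mpg);
# the statistic is computed with plain R, not by rsample
boot_means <- vapply(
  boots$splits,
  function(split) mean(analysis(split)$mpg),
  numeric(1)
)

# Holdout-based performance estimation: 4-fold cross-validation
folds <- vfold_cv(mtcars, v = 4)
first_fold <- folds$splits[[1]]
fit_data  <- analysis(first_fold)    # rows intended for model fitting
eval_data <- assessment(first_fold)  # rows held out for evaluation
```

Here `analysis()` returns the rows used for fitting or estimation and `assessment()` returns the held-out rows; any model or metric would be applied to those data frames separately.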

To install it, use:

install.packages("rsample")

## For the development version:
# install.packages("devtools")
devtools::install_github("tidymodels/rsample")

Note that resampled data sets created by rsample are directly accessible in a resampling object but add little memory overhead. Because the original data is not modified, R does not make an automatic copy.
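
One way to see why the overhead stays small is to look at what a single split exposes. The sketch below assumes the built-in `mtcars` data and an arbitrary seed. It uses the `as.integer()` method for splits to retrieve the row numbers that define a resample, and shows that the resampled data frame is only built when `analysis()` is called.

```r
library(rsample)

set.seed(456)  # arbitrary seed, only for reproducibility
boots <- bootstraps(mtcars, times = 5)
first_split <- boots$splits[[1]]

# Row numbers of the original data that make up the analysis set
idx <- as.integer(first_split, data = "analysis")

# The resampled data frame is materialized only on request
resampled <- analysis(first_split)

# The materialized rows match the original data indexed by idx
identical(resampled$mpg, mtcars$mpg[idx])
```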

For example, creating 50 bootstraps of a data set does not create an object that is 50-fold larger in memory:

> library(rsample)
> library(mlbench)
> library(pryr)
> 
> data(LetterRecognition)
> 
> object_size(LetterRecognition)
2.64 MB
> 
> set.seed(35222)
> boots <- bootstraps(LetterRecognition, times = 50)
> 
> object_size(boots)
6.69 MB
> 
> # Object size per resample
> object_size(boots)/nrow(boots)
134 kB
> 
> # Fold increase is <<< 50
> as.numeric(object_size(boots)/object_size(LetterRecognition))
[1] 2.528695

The 50 bootstrap samples use less than 3 times the memory of the original data set.
