A preprocessing engine to generate design matrices

The recipes package is an alternative method for creating and preprocessing design matrices that can be used for modeling or visualization. From Wikipedia:

> In statistics, a design matrix (also known as regressor matrix or model matrix) is a matrix of values of explanatory variables of a set of objects, often denoted by X. Each row represents an individual object, with the successive columns corresponding to the variables and their specific values for that object.

While R already has long-standing methods for creating these matrices (e.g. formulas and model.matrix), there are some limitations to what the existing infrastructure can do.
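
For reference, the formula and `model.matrix` approach mentioned above can be sketched in base R as follows. The data frame `dat` and its columns are hypothetical and exist only for illustration; the point is that a single formula is expanded into an intercept column, the numeric predictor, and indicator columns for the factor.

```r
# Small illustrative data frame (hypothetical, not part of the recipes package)
dat <- data.frame(
  y = c(1.2, 3.4, 2.2, 5.1),
  x = c(10, 20, 30, 40),
  g = factor(c("a", "b", "a", "c"))
)

# The formula names the outcome and the predictors; model.matrix() expands it
# into an intercept, the numeric predictor, and one indicator column for each
# non-reference level of the factor (default treatment contrasts).
X <- model.matrix(y ~ x + g, data = dat)

colnames(X)
dim(X)
```

The encoding is produced in one call, driven entirely by the formula, which is the long-standing approach that recipes offers an alternative to.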

The idea of the recipes package is to define a recipe or blueprint that can be used to sequentially define the encodings and preprocessing of the data (i.e. "feature engineering"). For example, to create a simple recipe containing only an outcome and predictors and have the predictors centered and scaled:

```r
library(recipes)
library(mlbench)
data(Sonar)
sonar_rec <- recipe(Class ~ ., data = Sonar) %>%
  step_center(all_predictors()) %>%
  step_scale(all_predictors())
```
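
The recipe above only declares the steps; nothing has been computed yet. As a minimal sketch of how it might then be used, assuming a version of recipes in which `prep()` estimates the required statistics and `bake()` accepts a `new_data` argument (older releases used a different argument name), the centering and scaling could be estimated and applied like this. The object names `sonar_prepped`, `sonar_processed`, and `predictors` are illustrative.

```r
library(recipes)
library(mlbench)
data(Sonar)

sonar_rec <- recipe(Class ~ ., data = Sonar) %>%
  step_center(all_predictors()) %>%
  step_scale(all_predictors())

# Estimate the means and standard deviations from the training data
sonar_prepped <- prep(sonar_rec, training = Sonar)

# Apply the estimated preprocessing to a data set (here, the training data again)
sonar_processed <- bake(sonar_prepped, new_data = Sonar)

# Every predictor should now have a mean near 0 and a standard deviation near 1
predictors <- sonar_processed[, setdiff(names(sonar_processed), "Class")]
range(colMeans(predictors))
range(apply(predictors, 2, sd))
```

Separating the definition of the steps from their estimation means the same prepared recipe can be applied to other data sets with the statistics learned from the training data.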

To install it, use:

```r
install.packages("recipes")

## for development version:
# install.packages("devtools")  # run first if devtools is not installed
devtools::install_github("tidymodels/recipes")
```