diff --git a/README.Rmd b/README.Rmd
index ef11f2bcb..81178f700 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -13,7 +13,7 @@ knitr::opts_chunk$set(
 )
 ```
-# parsnip
+# parsnip
 [![R build status](https://github.com/tidymodels/parsnip/workflows/R-CMD-check/badge.svg)](https://github.com/tidymodels/parsnip)
diff --git a/README.html b/README.html
deleted file mode 100644
index 211843cbc..000000000
--- a/README.html
+++ /dev/null
@@ -1,722 +0,0 @@
- - - - - - - - - - - - - - - - - - - - -

parsnip

- - -

R build status Coverage status CRAN status Downloads lifecycle

- - -

Introduction

-

The goal of parsnip is to provide a tidy, unified interface to models that can be used to try a range of models without getting bogged down in the syntactical minutiae of the underlying packages.

-

Installation

-
# The easiest way to get parsnip is to install all of tidymodels:
-install.packages("tidymodels")
-
-# Alternatively, install just parsnip:
-install.packages("parsnip")
-
-# Or the development version from GitHub:
-# install.packages("devtools")
-devtools::install_github("tidymodels/parsnip")
-

Getting started

-

One challenge with different modeling functions available in R that do the same thing is that they can have different interfaces and arguments. For example, to fit a random forest regression model, we might have:

-
# From randomForest
-rf_1 <- randomForest(
-  y ~ ., 
-  data = dat, 
-  mtry = 10, 
-  ntree = 2000, 
-  importance = TRUE
-)
-
-# From ranger
-rf_2 <- ranger(
-  y ~ ., 
-  data = dat, 
-  mtry = 10, 
-  num.trees = 2000, 
-  importance = "impurity"
-)
-
-# From sparklyr
-rf_3 <- ml_random_forest(
-  dat, 
-  intercept = FALSE, 
-  response = "y", 
-  features = names(dat)[names(dat) != "y"], 
-  col.sample.rate = 10,
-  num.trees = 2000
-)
-

Note that the model syntax can be very different and that the argument names (and formats) are also different. This is a pain if you switch between implementations.

-

In this example:

- -

The goals of parsnip are to:

- -

Using the example above, the parsnip approach would be:

-
library(parsnip)
-
-rand_forest(mtry = 10, trees = 2000) %>%
-  set_engine("ranger", importance = "impurity") %>%
-  set_mode("regression")
-#> Random Forest Model Specification (regression)
-#> 
-#> Main Arguments:
-#>   mtry = 10
-#>   trees = 2000
-#> 
-#> Engine-Specific Arguments:
-#>   importance = impurity
-#> 
-#> Computational engine: ranger
-
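To check how the harmonized argument names map onto an engine's own names, one option is parsnip's translate(), which prints the call template the specification would use. This is a minimal sketch, assuming parsnip is installed; the name rf_spec is illustrative and does not appear in the original README.

```r
library(parsnip)

# Illustrative name: a specification equivalent to the one above
rf_spec <- rand_forest(mtry = 10, trees = 2000) %>%
  set_engine("ranger", importance = "impurity") %>%
  set_mode("regression")

# Print the engine-level call template, where the unified `trees`
# argument is expressed using ranger's own argument name
translate(rf_spec)
```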

The engine can be changed easily. To use Spark, only the set_engine() call changes:

-
rand_forest(mtry = 10, trees = 2000) %>%
-  set_engine("spark") %>%
-  set_mode("regression")
-#> Random Forest Model Specification (regression)
-#> 
-#> Main Arguments:
-#>   mtry = 10
-#>   trees = 2000
-#> 
-#> Computational engine: spark
-

Either one of these model specifications can be fit in the same way:

-
rand_forest(mtry = 10, trees = 2000) %>%
-  set_engine("ranger", importance = "impurity") %>%
-  set_mode("regression") %>%
-  fit(mpg ~ ., data = mtcars)
-#> parsnip model object
-#> 
-#> Fit time:  75ms 
-#> Ranger result
-#> 
-#> Call:
-#>  ranger::ranger(formula = formula, data = data, mtry = ~10, num.trees = ~2000,      importance = ~"impurity", num.threads = 1, verbose = FALSE,      seed = sample.int(10^5, 1)) 
-#> 
-#> Type:                             Regression 
-#> Number of trees:                  2000 
-#> Sample size:                      32 
-#> Number of independent variables:  10 
-#> Mtry:                             10 
-#> Target node size:                 5 
-#> Variable importance mode:         impurity 
-#> Splitrule:                        variance 
-#> OOB prediction error (MSE):       5.779248 
-#> R squared (OOB):                  0.8408977
-
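Once a specification has been fit, predictions can be requested through the same interface regardless of engine. The sketch below assumes the parsnip and ranger packages are installed; the names rf_fit and preds are illustrative, and the first rows of mtcars stand in for new data.

```r
library(parsnip)

# Illustrative name: the same fit as in the example above
rf_fit <- rand_forest(mtry = 10, trees = 2000) %>%
  set_engine("ranger", importance = "impurity") %>%
  set_mode("regression") %>%
  fit(mpg ~ ., data = mtcars)

# predict() returns a tibble with one row per row of `new_data`;
# for a regression model the prediction column is named `.pred`
preds <- predict(rf_fit, new_data = head(mtcars))
preds
```

Because the random forest is fit with a random seed, the predicted values will differ slightly from run to run.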

A list of all parsnip models across different CRAN packages can be found at tidymodels.org.

-

Data sets previously found in parsnip are now found in the modeldata package.

-

Contributing

-

If you encounter a bug, please file a minimal reproducible example on GitHub. For questions and other discussion, please use community.rstudio.com.

-

Please note that the parsnip project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

- - -
diff --git a/README.md b/README.md
index b588bf93a..b306a1267 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
-# parsnip
+# parsnip
@@ -141,7 +141,7 @@ rand_forest(mtry = 10, trees = 2000) %>%
   fit(mpg ~ ., data = mtcars)
 #> parsnip model object
 #> 
-#> Fit time:  75ms 
+#> Fit time:  69ms 
 #> Ranger result
 #> 
 #> Call:
@@ -155,8 +155,8 @@ rand_forest(mtry = 10, trees = 2000) %>%
 #> Target node size:                 5 
 #> Variable importance mode:         impurity 
 #> Splitrule:                        variance 
-#> OOB prediction error (MSE):       5.779248 
-#> R squared (OOB):                  0.8408977
+#> OOB prediction error (MSE):       5.815633 
+#> R squared (OOB):                  0.839896
 ```
 A list of all `parsnip` models across different CRAN packages can be