version 0.1.0
Max Kuhn authored and cran-robot committed Oct 28, 2020
1 parent 1b28b39 commit 19ea952
Showing 7 changed files with 194 additions and 19 deletions.
10 changes: 5 additions & 5 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: baguette
Title: Efficient Model Functions for Bagging
-Version: 0.0.1
+Version: 0.1.0
Authors@R: c(
person(
given = "Max",
@@ -16,7 +16,7 @@ Description: Tree- and rule-based models can be bagged using
in an efficient format to reduce the model objects size
and speed.
License: MIT + file LICENSE
-Depends: parsnip (>= 0.1.0)
+Depends: parsnip (>= 0.1.3.9000)
Suggests: testthat, AmesHousing, recipes, modeldata, covr, yardstick
Encoding: UTF-8
LazyData: true
@@ -25,11 +25,11 @@ Imports: hardhat, butcher, rpart, C50, withr, rsample, dplyr, purrr,
dials
URL: https://github.com/tidymodels/baguette
BugReports: https://github.com/tidymodels/baguette/issues
-RoxygenNote: 7.1.0.9000
+RoxygenNote: 7.1.1.9000
NeedsCompilation: no
-Packaged: 2020-04-10 14:38:44 UTC; max
+Packaged: 2020-10-27 14:46:20 UTC; max
Author: Max Kuhn [aut, cre] (<https://orcid.org/0000-0003-2402-136X>),
RStudio [cph]
Maintainer: Max Kuhn <max@rstudio.com>
Repository: CRAN
-Date/Publication: 2020-04-14 14:20:04 UTC
+Date/Publication: 2020-10-28 05:20:06 UTC
12 changes: 6 additions & 6 deletions MD5
@@ -1,14 +1,14 @@
-36fb55aa89353875dd18d9b1bbbb4cbf *DESCRIPTION
+47f5a87dd20216b130ecf2a7d70ba47b *DESCRIPTION
5174dfc514f0941d2edd4b0b4c9941dd *LICENSE
3ad8b1e11ce0bfe9e17365f8a4e5f90d *NAMESPACE
-a50c9b3f60e75cabdcf9646cc0d99376 *NEWS.md
+105aece231cfe01430ca176c32f0f8a7 *NEWS.md
17ac1a5ef69ad8a136d188987ac2d3df *R/C5.0.R
ffaa0b886cc855c54e19f8716b4fb31d *R/aaa.R
b09b8dbbb3debc9a8da8df19ce73d824 *R/aaa_validate.R
418cd325c8f21be6dd6df0ba3135cb99 *R/bag_mars.R
-1cce4429e509d983617fd12a4fb8fcc9 *R/bag_mars_data.R
+5935727a62f83f64fdc2e1bc9d9861b2 *R/bag_mars_data.R
1bb4dd74840fe395718e397eff5722ab *R/bag_tree.R
-1c1d5699144bb4e894f0f4cfe9296150 *R/bag_tree_data.R
+6a70b109ecc20dcf1ddc688da05d838f *R/bag_tree_data.R
ab79037f0bec7c0f8a9508592d5cbc2d *R/bagger.R
a91f3bd4f59a8ac9dbe4272d1470898a *R/bridge.R
b92bca75c6411d31c744ee53123bcbe0 *R/cart.R
@@ -22,14 +22,14 @@ ee021c4e755b7d61c738d6a11a73916a *R/misc.R
38117a447acfadb0d706b721bc4aa789 *R/predict.R
e7c0c60ba034124d26f214b069298c3a *R/reexports.R
1f7d28afa7c7d75996e4fc84fbcd81b1 *R/var_imp.R
-c509d8554b69b631f3f14ce48c575d3f *README.md
+8a6a02a5e8ef22762addd74c9895c9f8 *README.md
7b882e7e4146723ccf30f0d15908828e *man/bag_mars.Rd
fd99f46e9a8a8533a2a04064930fb46e *man/bag_tree.Rd
64b8d2fcee4ec7be49801a88ac5afba2 *man/bagger.Rd
d83b1d748fa6f6817c438a07c71817d6 *man/class_cost.Rd
243c2dd2756b5fe0caa7006fcf3e4a1d *man/control_bag.Rd
cc0dfab251071ae0e8a5c9a456a35f53 *man/predict.bagger.Rd
-283a2815b98a024118fd826658bebd06 *man/reexports.Rd
+d13a9eb854e890a5e3ab1f433f5fae8a *man/reexports.Rd
f7c223b80e9f9c5a5483427b6b10c511 *man/var_imp.bagger.Rd
31bdaf0c0940e596ae34edcc246648af *tests/testthat.R
2fb39dd96d736e640da8c1ee51aa20d3 *tests/testthat/test-C5.R
4 changes: 4 additions & 0 deletions NEWS.md
@@ -1,3 +1,7 @@
# baguette 0.1.0

* Added encoding information to work with current version of `parsnip`.

# baguette 0.0.0.9000

Development version
25 changes: 25 additions & 0 deletions R/bag_mars_data.R
@@ -16,6 +16,7 @@ make_bag_mars <- function() {
parsnip::set_model_engine("bag_mars", "classification", "earth")
parsnip::set_model_engine("bag_mars", "regression", "earth")
parsnip::set_dependency("bag_mars", "earth", "earth")
parsnip::set_dependency("bag_mars", "earth", "baguette")

parsnip::set_model_arg(
model = "bag_mars",
@@ -56,6 +57,18 @@ make_bag_mars <- function() {
)
)

parsnip::set_encoding(
model = "bag_mars",
eng = "earth",
mode = "regression",
options = list(
predictor_indicators = "none",
compute_intercept = FALSE,
remove_intercept = FALSE,
allow_sparse_x = FALSE
)
)

parsnip::set_fit(
model = "bag_mars",
eng = "earth",
@@ -68,6 +81,18 @@ make_bag_mars <- function() {
)
)

parsnip::set_encoding(
model = "bag_mars",
eng = "earth",
mode = "classification",
options = list(
predictor_indicators = "none",
compute_intercept = FALSE,
remove_intercept = FALSE,
allow_sparse_x = FALSE
)
)

parsnip::set_pred(
model = "bag_mars",
eng = "earth",
37 changes: 37 additions & 0 deletions R/bag_tree_data.R
@@ -16,6 +16,7 @@ make_bag_tree <- function() {
parsnip::set_model_engine("bag_tree", "classification", "rpart")
parsnip::set_model_engine("bag_tree", "regression", "rpart")
parsnip::set_dependency("bag_tree", "rpart", "rpart")
parsnip::set_dependency("bag_tree", "rpart", "baguette")

parsnip::set_model_arg(
model = "bag_tree",
@@ -64,6 +65,18 @@ make_bag_tree <- function() {
)
)

parsnip::set_encoding(
model = "bag_tree",
eng = "rpart",
mode = "regression",
options = list(
predictor_indicators = "none",
compute_intercept = FALSE,
remove_intercept = FALSE,
allow_sparse_x = FALSE
)
)

parsnip::set_fit(
model = "bag_tree",
eng = "rpart",
@@ -76,6 +89,18 @@ make_bag_tree <- function() {
)
)

parsnip::set_encoding(
model = "bag_tree",
eng = "rpart",
mode = "classification",
options = list(
predictor_indicators = "none",
compute_intercept = FALSE,
remove_intercept = FALSE,
allow_sparse_x = FALSE
)
)

parsnip::set_pred(
model = "bag_tree",
eng = "rpart",
@@ -124,6 +149,7 @@ make_bag_tree <- function() {

parsnip::set_model_engine("bag_tree", "classification", "C5.0")
parsnip::set_dependency("bag_tree", "C5.0", "C50")
parsnip::set_dependency("bag_tree", "C5.0", "baguette")

parsnip::set_fit(
model = "bag_tree",
@@ -137,6 +163,17 @@ make_bag_tree <- function() {
)
)

parsnip::set_encoding(
model = "bag_tree",
eng = "C5.0",
mode = "classification",
options = list(
predictor_indicators = "none",
compute_intercept = FALSE,
remove_intercept = FALSE,
allow_sparse_x = FALSE
)
)

parsnip::set_model_arg(
model = "bag_tree",
123 changes: 116 additions & 7 deletions README.md
@@ -1,20 +1,129 @@

<!-- README.md is generated from README.Rmd. Please edit that file -->

# baguette

<!-- badges: start -->
-[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://www.tidyverse.org/lifecycle/#experimental)
-[![CRAN status](https://www.r-pkg.org/badges/version/baguette)](https://cran.r-project.org/package=baguette)
-[![Codecov test coverage](https://codecov.io/gh/tidymodels/baguette/branch/master/graph/badge.svg)](https://codecov.io/gh/tidymodels/baguette?branch=master)
-[![R build status](https://github.com/tidymodels/baguette/workflows/R-CMD-check/badge.svg)](https://github.com/tidymodels/baguette/actions)

[![Lifecycle:
experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://www.tidyverse.org/lifecycle/#experimental)
[![CRAN
status](https://www.r-pkg.org/badges/version/baguette)](https://cran.r-project.org/package=baguette)
[![Codecov test
coverage](https://codecov.io/gh/tidymodels/baguette/branch/master/graph/badge.svg)](https://codecov.io/gh/tidymodels/baguette?branch=master)
[![R build
status](https://github.com/tidymodels/baguette/workflows/R-CMD-check/badge.svg)](https://github.com/tidymodels/baguette/actions)
[![R-CMD-check](https://github.com/tidymodels/baguette/workflows/R-CMD-check/badge.svg)](https://github.com/tidymodels/baguette/actions)
<!-- badges: end -->

-The goal of baguette is to provide efficient functions that can be used to create ensemble models via bagging.
## Introduction

The goal of baguette is to provide efficient functions for bagging (aka
[bootstrap
aggregating](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C7&q=bagging+predictors+breiman+1996&oq=Bagging+predictors+))
ensemble models.

The model objects produced by baguette are kept smaller than they would
otherwise be through two operations:

- The [butcher](https://tidymodels.github.io/butcher/) package is used
to remove object elements that are not crucial to using the models.
For example, some models contain copies of the training set or model
residuals when created. These are removed to save space.

- For ensembles whose base models use a formula method, there is a
built-in redundancy because each model has an identical terms
object. However, each one of these takes up separate space in memory
and can be quite large when there are many predictors. The baguette
package solves this problem by replacing each terms object with the
object from the first model in the ensemble. Since the other terms
objects are not modified, we get the same functional capabilities
using far less memory to save the ensemble.
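
The second point can be illustrated with a small base R sketch. This is not baguette's internal code: it assumes `lm()` as a stand-in base model, and the names `fits` and `shared_terms` are hypothetical. It only shows that replacing each model's terms object with the one from the first model leaves predictions unchanged.

``` r
# Illustrative sketch only (not baguette internals): share one terms object
# across several formula-based models fit to bootstrap samples.
set.seed(1)
fits <- lapply(1:3, function(i) {
  idx <- sample(nrow(mtcars), replace = TRUE)
  lm(mpg ~ wt + hp, data = mtcars[idx, ])
})

# Predictions before the swap
before <- lapply(fits, predict, newdata = mtcars)

# Replace every model's terms object with the one from the first model
shared_terms <- fits[[1]]$terms
fits <- lapply(fits, function(f) {
  f$terms <- shared_terms
  f
})

# Predictions after the swap are unchanged
after <- lapply(fits, predict, newdata = mtcars)
stopifnot(identical(before, after))
```

How much memory this saves in practice depends on the base model and on how the ensemble is stored, so the sketch checks only that behavior is preserved, not the size reduction.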

## Installation

-For now:
You can install the released version of baguette from
[CRAN](https://CRAN.R-project.org) with:

``` r
install.packages("baguette")
```

Install the development version from GitHub with:

``` r
require("devtools")
install_github("tidymodels/baguette")
```

## Example

Let’s build a bagged decision tree model to predict a continuous
outcome.

``` r
-devtools::install_github("tidymodels/baguette")
library(baguette)
#> Loading required package: parsnip

bag_tree() %>%
set_engine("rpart") # C5.0 is also available here
#> Bagged Decision Tree Model Specification (unknown)
#>
#> Main Arguments:
#> cost_complexity = 0
#> min_n = 2
#>
#> Computational engine: rpart

set.seed(123)
bag_cars <-
bag_tree() %>%
set_engine("rpart", times = 25) %>% # 25 ensemble members
set_mode("regression") %>%
fit(mpg ~ ., data = mtcars)

bag_cars
#> parsnip model object
#>
#> Fit time: 3.6s
#> Bagged CART (regression with 25 members)
#>
#> Variable importance scores include:
#>
#> # A tibble: 10 x 4
#> term value std.error used
#> <chr> <dbl> <dbl> <int>
#> 1 disp 905. 51.9 25
#> 2 wt 889. 56.8 25
#> 3 hp 814. 48.7 25
#> 4 cyl 581. 42.9 25
#> 5 drat 540. 54.1 25
#> 6 qsec 281. 53.2 25
#> 7 vs 150. 51.2 20
#> 8 carb 84.4 30.6 25
#> 9 gear 80.0 35.8 23
#> 10 am 51.5 22.9 18
```

The models also return aggregated variable importance scores.
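
As a hedged sketch of how those scores might be retrieved directly, the snippet below assumes the `bagger()` and `var_imp()` functions documented in the package's `man/bagger.Rd` and `man/var_imp.bagger.Rd`. The object names `bag_fit` and `vi`, and the choice of five ensemble members, are illustrative only.

``` r
library(baguette)

set.seed(123)
# Fit a small bagged CART ensemble with the formula interface
# (5 members is an arbitrary choice to keep the example fast)
bag_fit <- bagger(mpg ~ ., data = mtcars, base_model = "CART", times = 5L)

# Aggregated variable importance scores, one row per predictor
vi <- var_imp(bag_fit)
vi
```

The result should have the same columns as the table printed above (`term`, `value`, `std.error`, `used`), although the values will differ with the seed and the number of members.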

## Contributing

This project is released with a [Contributor Code of
Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html).
By contributing to this project, you agree to abide by its terms.

- For questions and discussions about tidymodels packages, modeling,
and machine learning, please [post on RStudio
Community](https://rstd.io/tidymodels-community).

- If you think you have encountered a bug, please [submit an
issue](https://github.com/tidymodels/baguette/issues).

- Either way, learn how to create and share a
[reprex](https://rstd.io/reprex) (a minimal, reproducible example),
to clearly communicate about your code.

- Check out further details on [contributing guidelines for tidymodels
packages](https://www.tidymodels.org/contribute/) and [how to get
help](https://www.tidymodels.org/help/).
2 changes: 1 addition & 1 deletion man/reexports.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
