Merge pull request #104 from tidymodels/confint-api-test-number-2000

Confint api test number 2000
tidymodels · Jul 12, 2019 · cc4ccc3
2 parents 751bff0 + b8d1fe0
commit cc4ccc3

Showing 3 changed files with 25 additions and 26 deletions.
diff --git a/NEWS.md b/NEWS.md
@@ -1,10 +1,11 @@
-# `rsample` 0.0.4.9000
+# `rsample` 0.0.5
 
 * Added three functions to compute different bootstrap confidence intervals. 
 * A new function (`add_resample_id`) augments a data frame with columns for the resampling identifier. 
 * Updated `initial_split`, `mc_cv`, `vfold_cv`, `bootstraps`, and `group_vfold_cv` to use tidyselect on the stratification variable.
 * Updated `initial_split`, `mc_cv`, `vfold_cv`, `bootstraps` with new `breaks` parameter that specifies the number of bins to stratify by for a numeric stratification variable.
 
+
 # `rsample` 0.0.4
 
 Small maintenence release. 

diff --git a/tests/testthat/test_bootci.R b/tests/testthat/test_bootci.R
@@ -93,28 +93,28 @@ test_that("Wrappers -- selection of multiple variables works", {
   bt_resamples <- bootstraps(attrition, times = 1000, apparent = TRUE) %>%
     mutate(res = map(splits, func))
 
-  iris_tidy <-
+  attrit_tidy <-
     lm(Age ~ HourlyRate + DistanceFromHome, data = attrition) %>%
     tidy(conf.int = TRUE) %>%
     dplyr::arrange(term)
 
   pct_res <-
     int_pctl(bt_resamples, res) %>%
-    inner_join(iris_tidy, by = "term")
+    inner_join(attrit_tidy, by = "term")
   expect_equal(pct_res$conf.low,  pct_res$.lower, tolerance = .01)
   expect_equal(pct_res$conf.high, pct_res$.upper, tolerance = .01)
 
 
   t_res <-
     int_t(bt_resamples, res) %>%
-    inner_join(iris_tidy, by = "term")
+    inner_join(attrit_tidy, by = "term")
   expect_equal(t_res$conf.low,  t_res$.lower, tolerance = .01)
   expect_equal(t_res$conf.high, t_res$.upper, tolerance = .01)
 
 
   bca_res <-
     int_bca(bt_resamples, res, .fn = func) %>%
-    inner_join(iris_tidy, by = "term")
+    inner_join(attrit_tidy, by = "term")
   expect_equal(bca_res$conf.low,  bca_res$.lower, tolerance = .01)
   expect_equal(bca_res$conf.high, bca_res$.upper, tolerance = .01)
 
@@ -135,26 +135,24 @@ test_that('Upper & lower confidence interval does not contain NA', {
   }
 
   set.seed(888)
-  bt_resamples <- bootstraps(data.frame(x = 1:100), times = 1000, apparent = TRUE) %>%    mutate(res = map(splits, bad_stats))
+  bt_resamples <- bootstraps(data.frame(x = 1:100), times = 1000, apparent = TRUE) %>%
+    mutate(res = map(splits, bad_stats))
 
-  expect_warning(
-    expect_error(
-      int_pctl(bt_resamples, res),
-      "missing values"
-    )
+  expect_error(
+    int_pctl(bt_resamples, res),
+    "missing values"
   )
 
-  expect_warning(
-    expect_error(
-      int_t(bt_resamples, res),
-      "missing values"
-    )
+  expect_error(
+    int_t(bt_resamples, res),
+    "missing values"
+  )
+
+  expect_error(
+    int_bca(bt_resamples, res, .fn = bad_stats),
+    "missing values"
   )
 
-  # expect_error(
-  #   int_bca(bt_resamples, res, .fn = bad_stats),
-  #   "missing values"
-  # )
 })
 
 # ------------------------------------------------------------------------------

diff --git a/vignettes/Applications/Intervals.Rmd b/vignettes/Applications/Intervals.Rmd
@@ -16,9 +16,9 @@ library(GGally)
 theme_set(theme_bw())
 ```
 
-The bootstrap was originally intended for estimating confidence intervals for complex statistics whose variance properties are difficult to analytically derive. Davison and Hinkley's [_Bootstrap Methods and Their Applications_](https://www.cambridge.org/core/books/bootstrap-methods-and-their-application/ED2FD043579F27952363566DC09CBD6A) is a great resource for these methods. `rsample` contains a few function to compute the most common types of intervals. 
+The bootstrap was originally intended for estimating confidence intervals for complex statistics whose variance properties are difficult to analytically derive. Davison and Hinkley's [_Bootstrap Methods and Their Application_](https://www.cambridge.org/core/books/bootstrap-methods-and-their-application/ED2FD043579F27952363566DC09CBD6A) is a great resource for these methods. `rsample` contains a few function to compute the most common types of intervals. 
 
-To demonstrate the computations for the different types of intervals, we'll use a nonlinear regression example from [Baty _et al_ (2015)](https://www.jstatsoft.org/article/view/v066i05). The showed data that monitored oxygen uptake in a patient with rest and exercise phases (in the data frame `O2K`). 
+To demonstrate the computations for the different types of intervals, we'll use a nonlinear regression example from [Baty _et al_ (2015)](https://www.jstatsoft.org/article/view/v066i05). They showed data that monitored oxygen uptake in a patient with rest and exercise phases (in the data frame `O2K`). 
 
 ```{r O2K-dat}
 library(tidymodels)
@@ -31,7 +31,7 @@ ggplot(O2K, aes(x = t, y = VO2)) +
   geom_point()
 ```
 
-The authors fit a segmented regression model where the transition point was known (this is the time when exercise commenced).Their model was:
+The authors fit a segmented regression model where the transition point was known (this is the time when exercise commenced). Their model was:
 
 ```{r O2K-fit}
 nonlin_form <-  
@@ -114,7 +114,7 @@ nls_coef %>%
 
 ## Percentile intervals
 
-The most basic type of interval uses _percentiles_ of the resampling distribution. To get the percentile intervals, the `rset` objects is passed as the first argument and the second argument is the list column of tidy results: 
+The most basic type of interval uses _percentiles_ of the resampling distribution. To get the percentile intervals, the `rset` object is passed as the first argument and the second argument is the list column of tidy results: 
 
 ```{r pctl}
 p_ints <- int_pctl(nlin_bt, models)
@@ -166,7 +166,7 @@ nls_coef %>%
 
 ## t-intervals
 
-Bootstrap _t_-intervals are estimated by computing intermediate statistics that are _t_-like in structure. To use these, we require the estimated variance _for each individual resampled estimate_. In our example, this comes along with the fitted model object. We can extract the standard errors of the parameters. Luckily, most `tidy()` provide this in a column names `std.err`. 
+Bootstrap _t_-intervals are estimated by computing intermediate statistics that are _t_-like in structure. To use these, we require the estimated variance _for each individual resampled estimate_. In our example, this comes along with the fitted model object. We can extract the standard errors of the parameters. Luckily, most `tidy()` provide this in a column named `std.error`. 
 
 The arguments for these intervals are the same:
 
@@ -216,7 +216,7 @@ fold_incr <- function(split, ...) {
     term = "fold increase",
     estimate = unname(quants[2]/quants[1]),
     # We don't know the analytical formula for this 
-    std.err = NA_real_
+    std.error = NA_real_
   )
 }
 ```
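
For readers following the vignette hunks above, here is a minimal base-R sketch of the idea behind a percentile interval: resample rows with replacement, recompute the statistic on each resample, and take quantiles of the resampled estimates. It is not `rsample`'s implementation of `int_pctl()`. The helper name `pctl_interval`, its arguments, and the simulated data are assumptions made purely for illustration. The missing-value check only mirrors the behaviour that the updated test expects ("missing values" in the error message).

```r
# Hypothetical helper (not part of rsample): percentile bootstrap interval
# for a scalar statistic computed on the rows of a data frame.
pctl_interval <- function(data, stat_fn, times = 1000, alpha = 0.05) {
  n <- nrow(data)
  # Recompute the statistic on `times` resamples drawn with replacement
  estimates <- vapply(
    seq_len(times),
    function(i) stat_fn(data[sample.int(n, n, replace = TRUE), , drop = FALSE]),
    numeric(1)
  )
  # Refuse to build an interval from incomplete estimates
  if (anyNA(estimates)) {
    stop("missing values in the resampled estimates", call. = FALSE)
  }
  # The interval endpoints are the alpha/2 and 1 - alpha/2 quantiles
  bounds <- quantile(estimates, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
  c(.lower = bounds[1], .estimate = mean(estimates), .upper = bounds[2])
}

set.seed(888)
sim <- data.frame(x = rnorm(100, mean = 10, sd = 2))  # simulated data, assumption
res <- pctl_interval(sim, function(d) mean(d$x))
res
```

A statistic that returns `NA` for any resample makes the sketch stop with an error instead of returning an interval, which is the same failure mode the `bad_stats` test exercises for the real functions.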