Merged
83 commits
8c503f9
Manage "infer" class more systematically.
echasnovski Jan 4, 2019
ce156e9
Update 'RoxygenNote' in 'DESCRIPTION'.
echasnovski Jan 4, 2019
ab63dc0
Merge pull request #219 from echasnovski/infer-class
Jan 4, 2019
d65e0cf
Test if I can push.
echasnovski Jan 5, 2019
ab04e6e
Use {vdiffr} for plot testing. Closes #212.
echasnovski Feb 3, 2019
71b5ee4
Merge pull request #221 from echasnovski/vdiffr
Feb 5, 2019
1124d1b
Implement initial "area under the curve" functionality in `shade_p_va…
echasnovski Mar 8, 2019
b4e56d8
add GoF test
andrewpbray Apr 2, 2019
d449a20
remove unfinished pval part
andrewpbray Apr 2, 2019
0f059bd
refactor to take out parse_variables into a separate function. remove…
andrewpbray Apr 3, 2019
b4ace58
insert in parse_variables, change `data` to `x`.
andrewpbray Apr 3, 2019
edf73d1
add GoF to chisq_test
andrewpbray Apr 3, 2019
4bd38b3
redo chisq_stat to only use chisq_test
andrewpbray Apr 3, 2019
723911c
Update {vdiffr} tests to use "area under the curve" approach in `shad…
echasnovski Apr 7, 2019
4c7f421
Move `shade_p_value()` tests to separate file.
echasnovski Apr 7, 2019
dd38a69
Update `shade_p_value()` documentation.
echasnovski Apr 7, 2019
689a63c
Update files for a clean R CMD CHECK.
echasnovski Apr 7, 2019
d160351
Add `shade_p_value()` tests for handling `direction` synonyms.
echasnovski Apr 7, 2019
7f22dfc
Mention "area under the curve" approach in 'NEWS.md' and documentation.
echasnovski Apr 7, 2019
97bdbcc
Attempt to solve test reproducibility problem on r-devel.
echasnovski Apr 7, 2019
15ea3c7
Merge pull request #229 from tidymodels/shade-pval-area
andrewpbray Apr 8, 2019
78a8be4
simplify args
andrewpbray Apr 8, 2019
3a35105
add get_expr
andrewpbray Apr 11, 2019
4e77df2
fix namespace bug
andrewpbray Apr 11, 2019
7fb6ea0
add default args
andrewpbray Apr 11, 2019
7b76feb
fix t_test to use parse_variables
andrewpbray Apr 11, 2019
9beebde
Update README.md
Apr 13, 2019
1da3092
Update `shade_ci()` to draw vertical lines from 0 (not from `-Inf`).
echasnovski Apr 28, 2019
85faa38
Rename {vdiffr} expectations for `shade_p_value()` accepting `directi…
echasnovski Apr 28, 2019
2510aeb
Move code and tests for `shade_confidence_interval()` to separate files.
echasnovski Apr 28, 2019
4ee21cb
Update 'NEWS.md'.
echasnovski Apr 28, 2019
bb75b3a
Merge pull request #234 from tidymodels/shade_ci-update
andrewpbray May 1, 2019
735de6b
fuss w NSE
andrewpbray May 1, 2019
3f41b57
extend for nse and documentation
andrewpbray May 1, 2019
5765078
adapt chisq tests
andrewpbray May 17, 2019
9a00357
solve enquo problem (hopefully()
andrewpbray May 23, 2019
ebc3559
tnker w tests
andrewpbray May 23, 2019
0ca55a2
add more chisq tests
andrewpbray May 23, 2019
3f333a3
redo-manual
andrewpbray May 23, 2019
f6f99d4
merge in develop
andrewpbray May 23, 2019
25b918d
recompile by check
andrewpbray May 23, 2019
de98563
replace the name of the arugment `x` with `data` to conform with base…
andrewpbray May 24, 2019
f515d6c
fix test
andrewpbray May 24, 2019
26f0496
flesh out two sample t test
andrewpbray May 24, 2019
f1b7af1
roll back arg name to x
andrewpbray May 24, 2019
443b285
add tests
andrewpbray May 24, 2019
43c727d
add check args for chisq
andrewpbray May 24, 2019
4df0b10
revert change
andrewpbray May 25, 2019
782bac4
fix test failures
andrewpbray May 25, 2019
924ba9e
fix typos
andrewpbray May 25, 2019
b850092
drag t-test into t-stat
andrewpbray May 25, 2019
7bf8386
add more t tests
andrewpbray May 25, 2019
f85c644
fix pull()
andrewpbray Aug 12, 2019
ae4a4c3
add arg documentation
andrewpbray Aug 12, 2019
fc35749
update docs
andrewpbray Aug 12, 2019
e80e8b6
update for vdiffr
andrewpbray Aug 12, 2019
43927e9
update after sims changed
andrewpbray Aug 12, 2019
4c7d550
add import
andrewpbray Aug 12, 2019
bcf4bef
make null hypothesis params explicit
richierocks Aug 13, 2019
48b5fc6
fix silly bugs
richierocks Aug 13, 2019
2a07da7
remove dupe check on null arg
richierocks Aug 13, 2019
3624cc9
typo
richierocks Aug 13, 2019
de704f7
tests for bad calls to hypothesize()
richierocks Aug 13, 2019
4d250d7
try adding variable to fix vdiffr failures
andrewpbray Aug 18, 2019
8738090
Merge pull request #241 from tidymodels/goodness-of-fit
andrewpbray Aug 18, 2019
00a4ace
rechecked, vdiffr cases managed
andrewpbray Aug 28, 2019
cf40bc3
fix typo
andrewpbray Sep 17, 2019
90cd5b0
change to lower case
andrewpbray Sep 17, 2019
b5d6641
fix typos in roxygen
andrewpbray Sep 17, 2019
b5d82ce
repair test
andrewpbray Sep 17, 2019
67d9a67
Merge pull request #246 from andrewpbray/develop
andrewpbray Sep 17, 2019
a9e9a02
resolve merge conflict
andrewpbray Sep 17, 2019
63ce0e2
add news for v 0.5.0
andrewpbray Sep 17, 2019
4725409
switch evgeni to aut
andrewpbray Sep 17, 2019
bfe840e
update vdiffr
andrewpbray Sep 17, 2019
fe76479
suppress warning from chisq.test() in chisq_stat()
andrewpbray Sep 17, 2019
55ee98b
suppress warning from chisq.test() when doing a permutation GoF
andrewpbray Sep 17, 2019
2e45f95
fix typo
andrewpbray Sep 17, 2019
c3f7d17
Merge pull request #247 from tidymodels/supress-chisq
andrewpbray Sep 17, 2019
02647da
add tolerance to exact tests
andrewpbray Sep 20, 2019
8c00c65
temporarily remove tests that are failing noLD builds
andrewpbray Sep 27, 2019
9c7c5a7
prepare for release
andrewpbray Sep 27, 2019
1b7f4aa
Merge branch 'master' into develop
andrewpbray Oct 1, 2019
1 change: 0 additions & 1 deletion .Rbuildignore
@@ -10,7 +10,6 @@
^docs*
^CONDUCT\.md$
^README\.md$
^NEWS\.md$
^cran-comments\.md$
^_build\.sh$
^appveyor\.yml$
1 change: 1 addition & 0 deletions .travis.yml
@@ -11,6 +11,7 @@ latex: false
env:
global:
- CRAN: http://cran.rstudio.com
- VDIFFR_RUN_TESTS: false

notifications:
email:
8 changes: 5 additions & 3 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: infer
Type: Package
Title: Tidy Statistical Inference
Version: 0.4.1
Version: 0.5.0
Authors@R: c(
person("Andrew", "Bray", email = "abray@reed.edu", role = c("aut", "cre")),
person("Chester", "Ismay", email = "chester.ismay@gmail.com", role = "aut"),
@@ -28,7 +28,8 @@ Imports:
ggplot2,
magrittr,
glue (>= 1.3.0),
grDevices
grDevices,
purrr
Depends:
R (>= 3.1.2)
Suggests:
@@ -39,7 +40,8 @@ Suggests:
nycflights13,
stringr,
testthat,
covr
covr,
vdiffr
URL: https://github.com/tidymodels/infer
BugReports: https://github.com/tidymodels/infer/issues
Roxygen: list(markdown = TRUE)
3 changes: 3 additions & 0 deletions NAMESPACE
@@ -46,14 +46,17 @@ importFrom(ggplot2,ylab)
importFrom(glue,glue_collapse)
importFrom(magrittr,"%>%")
importFrom(methods,hasArg)
importFrom(purrr,compact)
importFrom(rlang,"!!")
importFrom(rlang,":=")
importFrom(rlang,enquo)
importFrom(rlang,eval_tidy)
importFrom(rlang,f_lhs)
importFrom(rlang,f_rhs)
importFrom(rlang,get_expr)
importFrom(rlang,quo)
importFrom(rlang,sym)
importFrom(stats,as.formula)
importFrom(stats,dchisq)
importFrom(stats,df)
importFrom(stats,dnorm)
15 changes: 15 additions & 0 deletions NEWS.md
@@ -1,3 +1,18 @@
# infer 0.5.0

## Breaking changes

- `shade_confidence_interval()` now plots vertical lines starting from zero (previously, from the bottom of the plot) (#234).
- `shade_p_value()` now uses an "area under the curve" approach to shading (#229).

## Other

- Updated `chisq_test()` to take arguments in a response/explanatory format, perform goodness of fit tests, and default to the approximation approach (#241).
- Updated `chisq_stat()` to do goodness of fit (#241).
- Made the interface to `hypothesize()` clearer by adding the options for the point null parameters to the function signature (#242).
- Managed the `infer` class more systematically (#219).
- Switched to `vdiffr` for plot testing (#221).

# infer 0.4.1

- Added Evgeni Chasnovski as author for his incredible work on refactoring the package and providing excellent support.
6 changes: 3 additions & 3 deletions R/calculate.R
@@ -108,7 +108,7 @@ calculate <- function(x,
)
}
# else {
# class(result) <- append("infer", class(result))
# result <- append_infer_class(result)
# }

result <- copy_attrs(to = result, from = x)
@@ -232,12 +232,12 @@ calc_impl.Chisq <- function(type, x, order, ...) {
p_levels <- get_par_levels(x)
x %>%
dplyr::summarize(
stat = stats::chisq.test(
stat = suppressWarnings(stats::chisq.test(
# Ensure correct ordering of parameters
table(!!(attr(x, "response")))[p_levels],
p = attr(x, "params")
)$stat
)
))
} else {
# Straight from `specify()`
stop_glue(
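The `suppressWarnings()` wrapper in the hunk above matters when expected counts are small, because `stats::chisq.test()` then warns that the chi-squared approximation may be incorrect. A minimal sketch of that situation, using made-up data (`counts` and `null_props` are illustrative names, not objects from the package):

```r
# Illustrative data only: five observations spread over three levels, so the
# expected counts under the null are all below 5 and chisq.test() would warn.
counts <- table(factor(c("a", "a", "a", "b", "c"), levels = c("a", "b", "c")))
null_props <- c(a = 0.5, b = 0.25, c = 0.25)

# Index the table by the parameter names so observed counts and null
# proportions line up, then silence the small-sample warning.
stat <- suppressWarnings(
  stats::chisq.test(counts[names(null_props)], p = null_props)$statistic
)
stat
```

The statistic itself is unchanged by the wrapper; only the warning is hidden, which is reasonable when the statistic feeds a simulation-based null distribution rather than the chi-squared approximation.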
14 changes: 5 additions & 9 deletions R/generate.R
@@ -145,9 +145,7 @@ bootstrap <- function(x, reps = 1, ...) {
result <- rep_sample_n(x, size = nrow(x), replace = TRUE, reps = reps)
result <- copy_attrs(to = result, from = x)

class(result) <- append("infer", class(result))

result
append_infer_class(result)
}

#' @importFrom dplyr bind_rows group_by
@@ -159,9 +157,7 @@ permute <- function(x, reps = 1, ...) {

df_out <- copy_attrs(to = df_out, from = x)

class(df_out) <- append("infer", class(df_out))

df_out
append_infer_class(df_out)
}

permute_once <- function(x, ...) {
@@ -195,7 +191,7 @@ simulate <- function(x, reps = 1, ...) {

rep_tbl <- copy_attrs(to = rep_tbl, from = x)

class(rep_tbl) <- append("infer", class(rep_tbl))

dplyr::group_by(rep_tbl, replicate)
rep_tbl <- dplyr::group_by(rep_tbl, replicate)
append_infer_class(rep_tbl)
}
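The definition of `append_infer_class()` is not part of the hunks shown here. One plausible reason to replace the repeated `class(x) <- append("infer", class(x))` pattern is that a helper can avoid adding the class twice. The sketch below is an assumption about what such a helper might look like; `append_infer_class_sketch()` and `naive_append()` are hypothetical names, not the package's code.

```r
# Hypothetical stand-in for the helper used in the diff above.
append_infer_class_sketch <- function(x) {
  # Only prepend the class if the object does not already carry it
  if (!inherits(x, "infer")) {
    class(x) <- c("infer", class(x))
  }
  x
}

# The pattern removed by the diff, wrapped in a function for comparison
naive_append <- function(x) {
  class(x) <- append("infer", class(x))
  x
}

df <- data.frame(a = 1:3)

# Applying the sketch twice leaves a single "infer" entry
twice_sketch <- append_infer_class_sketch(append_infer_class_sketch(df))
class(twice_sketch)

# Applying the naive version twice duplicates the class
twice_naive <- naive_append(naive_append(df))
class(twice_naive)
```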
97 changes: 38 additions & 59 deletions R/hypothesize.R
@@ -3,7 +3,14 @@
#' @param x A data frame that can be coerced into a [tibble][tibble::tibble].
#' @param null The null hypothesis. Options include `"independence"` and
#' `"point"`.
#' @param ... Arguments passed to downstream functions.
#' @param p The true proportion of successes (a number between 0 and 1). To be used with point null hypotheses when the specified response
#' variable is categorical.
#' @param mu The true mean (any numerical value). To be used with point null
#' hypotheses when the specified response variable is continuous.
#' @param med The true median (any numerical value). To be used with point null
#' hypotheses when the specified response variable is continuous.
#' @param sigma The true standard deviation (any numerical value). To be used with
#' point null hypotheses.
#'
#' @return A tibble containing the response (and explanatory, if specified)
#' variable data with parameter information stored as well.
@@ -17,71 +24,43 @@
#' generate(reps = 100, type = "permute") %>%
#' calculate(stat = "F")
#'
#' @importFrom purrr compact
#' @export
hypothesize <- function(x, null, ...) {
hypothesize_checks(x, null)
hypothesize <- function(x, null, p = NULL, mu = NULL, med = NULL, sigma = NULL) {

# Custom logic, because using match.arg() would give a default value when
# the user didn't specify anything.
null <- match_null_hypothesis(null)
attr(x, "null") <- null

dots <- list(...)

if ((null == "point") && (length(dots) == 0)) {
stop_glue(
"Provide a parameter and a value to check such as `mu = 30` for the ",
"point hypothesis."
)
}

if ((null == "independence") && (length(dots) > 0)) {
warning_glue(
"Parameter values are not specified when testing that two variables are ",
"independent."
)
}

if ((length(dots) > 0) && (null == "point")) {
params <- parse_params(dots, x)
attr(x, "params") <- params

if (any(grepl("p.", attr(attr(x, "params"), "names")))) {
# simulate instead of bootstrap based on the value of `p` provided
attr(x, "type") <- "simulate"
} else {
attr(x, "type") <- "bootstrap"
}
hypothesize_checks(x, null)

}
dots <- compact(list(p = p, mu = mu, med = med, sigma = sigma))
Collaborator:
Snazzy!

@andrewpbray (Collaborator, Author), Oct 1, 2019:
That's all @richierocks

Collaborator:
Thanks, @richierocks!

Collaborator, Author:
Thanks for flagging that bit about plot labels. We'll try to tack that on to the next wave of updates, led by @simonpcouch, focused on documentation.

Collaborator:
Awesome! Looking forward to it. Happy to provide a quick review. Just tag me and I'll let you know a timeline for when I'll be able to get it done.

Collaborator:
I'll move it over to an issue and try to assign it to Simon.

Collaborator, Author:
Great. I'm not sure why the travis build is taking so long...normally it's like a 5 min job.


if (!is.null(null) && (null == "independence")) {
attr(x, "type") <- "permute"
}
switch(
null,
independence = {
params <- sanitize_hypothesis_params_independence(dots)
attr(x, "type") <- "permute"
},
point = {
params <- sanitize_hypothesis_params_point(dots, x)
attr(x, "params") <- unlist(params)

# Check one proportion test set up correctly
if (null == "point") {
if (is.factor(response_variable(x))) {
if (!any(grepl("p", attr(attr(x, "params"), "names")))) {
stop_glue(
'Testing one categorical variable requires `p` to be used as a ',
'parameter.'
)
if (!is.null(params$p)) {
# simulate instead of bootstrap based on the value of `p` provided
attr(x, "type") <- "simulate"
} else {
# Check one proportion test set up correctly
if (is.factor(response_variable(x))) {
stop_glue(
'Testing one categorical variable requires `p` to be used as a ',
'parameter.'
)
}
attr(x, "type") <- "bootstrap"
}
}
}

# Check one numeric test set up correctly
## Not currently able to reach in testing as other checks
## already produce errors
# if (null == "point") {
# if (
# !is.factor(response_variable(x))
# & !any(grepl("mu|med|sigma", attr(attr(x, "params"), "names")))
# ) {
# stop_glue(
# 'Testing one numerical variable requires one of ',
# '`mu`, `med`, or `sd` to be used as a parameter.'
# )
# }
# }

tibble::as_tibble(x)
)
append_infer_class(tibble::as_tibble(x))
}
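The new `hypothesize()` signature replaces `...` with explicit `NULL`-defaulted parameters and then keeps only the ones the caller supplied. A minimal sketch of that idea follows; `collect_null_params()` is a hypothetical helper written for illustration, and it assumes the purrr package is installed. The sanitizing helpers called in the diff (`sanitize_hypothesis_params_point()` and friends) are not reproduced here.

```r
# Hypothetical helper: gather the explicit point-null arguments and keep
# only those that were actually supplied.
collect_null_params <- function(p = NULL, mu = NULL, med = NULL, sigma = NULL) {
  # purrr::compact() drops NULL (and other zero-length) elements from a list
  purrr::compact(list(p = p, mu = mu, med = med, sigma = sigma))
}

supplied <- collect_null_params(mu = 30)
names(supplied)

# Mirror the branch in the `point` case above: a supplied `p` means
# "simulate", anything else falls back to "bootstrap".
type <- if (!is.null(supplied$p)) "simulate" else "bootstrap"
type
```

Compared with inspecting `list(...)`, named parameters show up in the function's documentation and autocompletion, and a misspelled parameter name fails immediately as an unused argument.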
2 changes: 1 addition & 1 deletion R/infer.R
@@ -17,7 +17,7 @@ NULL
if (getRversion() >= "2.15.1") {
utils::globalVariables(
c(
"prop", "stat", "value", "x", "..density..", "statistic", ".",
"prop", "stat", "value", "x", "y", "..density..", "statistic", ".",
"parameter", "p.value", "xmin", "x_min", "xmax", "x_max", "density",
"denom", "diff_prop", "group_num", "n1", "n2", "num_suc", "p_hat",
"total_suc", "explan", "probs", "conf.low", "conf.high"
76 changes: 76 additions & 0 deletions R/shade_confidence_interval.R
@@ -0,0 +1,76 @@
#' Add information about confidence interval
#'
#' `shade_confidence_interval()` plots a confidence interval region on top of the
#' [visualize()] output. It should be used as a \\{ggplot2\\} layer function (see
#' examples). `shade_ci()` is its alias.
#'
#' @param endpoints A 2 element vector or a 1 x 2 data frame containing the
#' lower and upper values to be plotted. Most useful for visualizing
#' confidence intervals.
#' @param color A character or hex string specifying the color of the
#' endpoints as vertical lines on the plot.
#' @param fill A character or hex string specifying the color to shade the
#' confidence interval. If `NULL` then no shading is actually done.
#' @param ... Other arguments passed along to \\{ggplot2\\} functions.
#' @return A list of \\{ggplot2\\} objects to be added to the `visualize()`
#' output.
#'
#' @seealso [shade_p_value()] to add information about p-value region.
#'
#' @examples
#' viz_plot <- mtcars %>%
#' dplyr::mutate(am = factor(am)) %>%
#' specify(mpg ~ am) %>% # alt: response = mpg, explanatory = am
#' hypothesize(null = "independence") %>%
#' generate(reps = 100, type = "permute") %>%
#' calculate(stat = "t", order = c("1", "0")) %>%
#' visualize(method = "both")
#'
#' viz_plot + shade_confidence_interval(c(-1.5, 1.5))
#' viz_plot + shade_confidence_interval(c(-1.5, 1.5), fill = NULL)
#'
#' @name shade_confidence_interval
NULL

#' @rdname shade_confidence_interval
#' @export
shade_confidence_interval <- function(endpoints, color = "mediumaquamarine",
fill = "turquoise", ...) {
endpoints <- impute_endpoints(endpoints)
check_shade_confidence_interval_args(color, fill)

res <- list()
if (is.null(endpoints)) {
return(res)
}

if (!is.null(fill)) {
res <- c(
res, list(
ggplot2::geom_rect(
data = data.frame(endpoints[1]),
fill = fill, alpha = 0.6,
aes(xmin = endpoints[1], xmax = endpoints[2], ymin = 0, ymax = Inf),
inherit.aes = FALSE,
...
)
)
)
}

c(
res,
list(
ggplot2::geom_segment(
data = data.frame(x = endpoints),
aes(x = x, xend = x, y = 0, yend = Inf),
colour = color, size = 2,
inherit.aes = FALSE
)
)
)
}

#' @rdname shade_confidence_interval
#' @export
shade_ci <- shade_confidence_interval
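`shade_confidence_interval()` returns a plain list of layers, which works because ggplot2 accepts a list on the right-hand side of `+`. The sketch below shows the same pattern in isolation, assuming ggplot2 is installed. `shade_interval_sketch()` is a hypothetical, simplified stand-in that uses `annotate()` instead of the `geom_rect()` and `geom_segment()` calls in the file above.

```r
library(ggplot2)

# Hypothetical, simplified stand-in for the function added in this file
shade_interval_sketch <- function(endpoints, color = "mediumaquamarine",
                                  fill = "turquoise") {
  layers <- list()

  # Optional shaded band between the two endpoints, starting at y = 0
  if (!is.null(fill)) {
    layers <- c(layers, list(
      annotate(
        "rect",
        xmin = endpoints[1], xmax = endpoints[2], ymin = 0, ymax = Inf,
        fill = fill, alpha = 0.6
      )
    ))
  }

  # Vertical segments at both endpoints, drawn upward from y = 0
  c(layers, list(
    annotate(
      "segment",
      x = endpoints, xend = endpoints, y = 0, yend = Inf,
      colour = color
    )
  ))
}

# A list of layers can be added to a plot in one step
p <- ggplot(mtcars, aes(mpg)) +
  geom_histogram(bins = 10) +
  shade_interval_sketch(c(15, 25))
```

Returning a list lets the function add one or two layers depending on `fill` while the caller still writes a single `+`.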