Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adjust conditions for bootstrap "type" warning #425

Merged
merged 6 commits into from May 12, 2022
Merged

Conversation

simonpcouch
Copy link
Collaborator

In response to #423—not necessarily a fix, as is, but a place to jump off from. Here's the current logic:

infer/R/generate.R

Lines 127 to 135 in 8ff4a38

# Ensure that the supplied type matches what would be assumed from input
compare_type_vs_auto_type <- function(type, auto_type) {
if(is.null(auto_type)) {
return(type)
}
if (auto_type != type &&
(any(!c(auto_type, type) %in% c("draw", "simulate")))) {
warning_glue(
"You have given `type = \"{type}\"`, but `type` is expected",

Any mismatch between the given and "expected" type raises the warning (where "draw" and "simulate" are exchangeable).

Most of the logic for determining the expected type is here in specify()'s internals—I think this is all fine, and #423 is regarding a case when it's fine to ignore the expected type. Are all usages of type = "bootstrap" okay? Are there other cases when we need not raise the warning even when the given and expected types don't match?

Tests will likely fail with this draft; will adjust once we've solidified what this logic should be. :-)

@mine-cetinkaya-rundel
Copy link
Collaborator

What should be happening is

  • warn when using bootstrap for a proportion (or two proportions) and it's in the context of hypothesis testing
  • don't warn in the context of a confidence interval

So I think the logic should somehow take into account the existence of a hypothesize() step

@simonpcouch simonpcouch marked this pull request as ready for review January 5, 2022 00:33
@simonpcouch
Copy link
Collaborator Author

Okay, the most recent commit should reflect that logic! One error that is currently still raised:

library(infer)
library(tidyverse)

mtcars %>%
  tibble() %>%
  specify(mpg ~ am) %>%
  generate(reps = 100, type = "permute")

#> Error: Permuting should be done only when doing independence hypothesis test. See `hypothesize()`.

Created on 2022-01-04 by the reprex package (v2.0.0)

Do we want this error to persist? See here and here for the history on this one.

@mine-cetinkaya-rundel
Copy link
Collaborator

Indeed that error should still persist. The pipeline in this example is for a confidence interval, therefore one shouldn't be permuting.

Copy link
Collaborator

@mine-cetinkaya-rundel mine-cetinkaya-rundel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're good here but see the one question I left. Would be good to reply before merging.

R/generate.R Outdated
(any(!c(auto_type, type) %in% c("draw", "simulate")))) {
(any(!c(auto_type, type) %in% c("draw", "simulate"))) &&
is_hypothesized(x) ||
(type == "bootstrap" && auto_type == "draw" && is_hypothesized(x))) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify the logic here? It looks right but I want to make sure I'm following it correctly.

This comment was marked as outdated.

@simonpcouch
Copy link
Collaborator Author

Will give you a chance to look over again, if you so please, before merging.🦦

@mine-cetinkaya-rundel
Copy link
Collaborator

@simonpcouch This looks good, I just have one question. Why doesn't the following generate a warning?

# expected bootstrap, given draw (no hypothesis)
gss %>%
  specify(response = hours) %>%
  # hypothesize(null = "point", mu = 40) %>%
  generate(reps = 200, type = "draw")

@simonpcouch
Copy link
Collaborator Author

Okay, finally getting back to this.

The current state of the logic for this warning on main is:

infer/R/generate.R

Lines 133 to 134 in b163874

if (auto_type != type &&
(any(!c(auto_type, type) %in% c("draw", "simulate")))) {

As it stands on this PR, it is:

infer/R/generate.R

Lines 133 to 135 in aa59b0b

if (auto_type != type &&
(any(!c(auto_type, type) %in% c("draw", "simulate"))) &&
is_hypothesized(x)) {

Current draft doesn't capture examples a la #441, though:

# no hypothesis. auto_type is permute, given is draw. this raises the 
# warning on `main`, but does not do so with the current draft of this PR
gss %>%
  specify(college ~ sex, success = "degree") %>%
  generate(reps = 1000, type = "draw")

# correctly specified hypothesis—raises the warning on both `main`
# and the PR draft
gss %>%
  specify(college ~ sex, success = "degree") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "draw")

# "incorrectly" specified hypothesis—hypothesize ought to be the step that's 
# picking up the issue here, as `auto_type` looks to the hypothesize args and picks 
# "draw" as well. so... neither options warn here—not fine, but a separate issue.
gss %>%
  specify(college ~ sex, success = "degree") %>%
  hypothesize(null = "point", p = .40) %>%
  generate(reps = 1000, type = "draw")

My proposal for the criteria, I think:

  • If type = "bootstrap", only raise the warning if
    • testing a proportion
    • the call follows hypothesize()
  • Otherwise, only raise the warning if
    • type and auto_type are different
    • the difference isn't just "draw" vs "simulate" (aliases)

(Note to self: the first set of conditions for raising the warning are only true if a p parameter is present.)

Big picture, this would mean that we supply a warning on type = "bootstrap" less often than we do on main, but are just as strict as we have been on main when supplied other types.

Thoughts @mine-cetinkaya-rundel @andrewpbray? If this sounds good, will update the PR and add more extensive unit tests.

@simonpcouch simonpcouch merged commit 71a0e02 into main May 12, 2022
@simonpcouch simonpcouch deleted the type-warning branch May 12, 2022 16:50
@github-actions
Copy link

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators May 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants