# tidymodels/infer

# Have any summary function work in calculate #50

Closed
opened this Issue Oct 19, 2017 · 8 comments

Contributor

### rudeboybert commented Oct 19, 2017

I really like using The Lady Tasting Tea to motivate hypothesis testing. The following code does the trick:

```
library(tidyverse)
library(infer)

lady_tasting_tea <- data_frame(
  first = as.factor(c(rep("milk",4), rep("tea",4)))
)

lady_tasting_tea %>%
  specify(response = first) %>% # alt: am ~ NULL (or am ~ 1)
  hypothesize(null = "point", p = c("milk" = .5, "tea" = .5)) %>%
  generate(reps = 1000, type = "simulate") %>%
  calculate(stat = "prop")
```

However, it would be great if the final stat could be "sum". That way there is one less layer of abstraction between the experiment and the null distribution (students can read off counts directly, instead of reading off proportions).

In a more general setting, any many-to-one summary function would be great, for example all those that work with `dplyr::summarize()`.

Collaborator

### ismayc commented Aug 8, 2018

 @echasnovski I have a feeling `calculate(sum)` might be one case that would work for generalizing here? Do you think that's possible?
Collaborator

### echasnovski commented Aug 8, 2018

I don't think this is actually generalizing; it just adds another acceptable string to `calculate()`. As I understand it, the implementation of `"sum"` should be similar to the `"mean"`, `"median"`, and `"sd"` cases. In the example, however, the response is a factor, and `sum()` doesn't work with factors. Since this example currently doesn't work anyway (`success` must be specified for a `"prop"` statistic), I think the appropriate approach is to allow `calculate("count")`, matching what the issue describes in words. It would behave exactly like `"prop"` but with `sum()` instead of `mean()`.
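
To make the `mean()` versus `sum()` distinction concrete, here is a minimal base R sketch. The helper names `prop_stat` and `count_stat` are hypothetical and are not part of infer; they only illustrate the idea that both statistics reduce the same logical vector of "successes".

```r
# Hypothetical helpers (not infer's implementation): both statistics
# summarize the logical vector `x == success`, one with mean(), one with sum().
prop_stat <- function(x, success) mean(x == success)
count_stat <- function(x, success) sum(x == success)

# The eight cups from the Lady Tasting Tea example, stored as a factor.
cups <- factor(c(rep("milk", 4), rep("tea", 4)))

prop_stat(cups, success = "milk")   # proportion of "milk" cups: 0.5
count_stat(cups, success = "milk")  # number of "milk" cups: 4

# Calling sum() on the factor itself raises an error, which is why a plain
# "sum" statistic does not cover a categorical response.
sum_on_factor_fails <- inherits(try(sum(cups), silent = TRUE), "try-error")
sum_on_factor_fails
```

The count is just the proportion multiplied by the sample size, so the two statistics carry the same information on different scales.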
Collaborator

### ismayc commented Aug 8, 2018

 I think you are right. This might have to be another special case with string input. The `success` argument was added later on in development.
Collaborator

### echasnovski commented Aug 8, 2018

 So is it `calculate("count")` with `success` argument, or `calculate("sum")` for numerical response, or both? After #173 this should be very straightforward to add.
Collaborator

### ismayc commented Aug 8, 2018

 I think it’s both.
Contributor

### mine-cetinkaya-rundel commented Aug 8, 2018

 Is the suggestion to not call the statistic "prop" anymore? The reason for that term was so that it matches the name of the parameter on which we do inference.
Collaborator

### echasnovski commented Aug 8, 2018

As I understand it, all current functionality is preserved, including `calculate("prop")`. A new option will be added: `calculate("count")`, which returns the number of "successes" within a single bootstrap resample.
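
As a rough sketch of how the new option might be used with the original example, the pipeline below assumes an infer version that includes `calculate(stat = "count")` and that `success` is passed to `specify()`, as noted earlier in the thread. The `type = "simulate"` value is taken from the original post; newer releases may use a different name for this generation type.

```r
library(infer)

# Eight cups, four of each kind, as in the original post.
lady_tasting_tea <- data.frame(
  first = factor(c(rep("milk", 4), rep("tea", 4)))
)

# Assumption: this infer version supports stat = "count".
null_counts <- lady_tasting_tea %>%
  specify(response = first, success = "milk") %>%
  hypothesize(null = "point", p = 0.5) %>%
  generate(reps = 1000, type = "simulate") %>%
  calculate(stat = "count")

# One row per replicate; `stat` holds the number of "milk" cups drawn.
head(null_counts)
```

With eight cups per replicate, each value of `stat` falls between 0 and 8, so students can read counts directly off the null distribution.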

Merged

### ismayc added a commit that referenced this issue Aug 20, 2018

```
Merge pull request #179 from echasnovski/new-calc-stat
Add new options for `calculate()` (issue #50)
82a8301
```

Collaborator

### ismayc commented Aug 20, 2018

 Now implemented by @echasnovski on the `develop` branch via #179