Skip to content

version 1.0.0 cummean() function is providing weird results #5287

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
nsingh-phd opened this issue May 31, 2020 · 2 comments
Closed

version 1.0.0 cummean() function is providing weird results #5287

nsingh-phd opened this issue May 31, 2020 · 2 comments
Labels
bug an unexpected problem or unintended behavior funs 😆 wip work in progress

Comments

@nsingh-phd
Copy link

nsingh-phd commented May 31, 2020

For testing, I have this small vector x where the long(er) way is giving me the results as expected version 1.0.0 cummean() is not.

library(tidyverse)
x <- 1:5

# long(er) way
cumsum(x) / seq_along(x)
#> [1] 1.0 1.5 2.0 2.5 3.0

# dplyr 0.8.5 cummean()
cummean(x)
#> [1] 1.0 1.5 2.0 2.5 3.0

# dplyr 1.0.0 cummean()
cummean(x)
#> [1] 1.000000 1.000000 1.333333 1.750000 2.200000

Created on 2020-05-31 by the reprex package (v0.3.0)

@brianrice2
Copy link
Contributor

It looks like the indexing is off by one for dplyr_cummean in /src/funs.cpp, causing the first index to be repeated twice (and the last index to be dropped). I'll submit a pull request with a slight change which I think makes it work as intended.

@nsingh-phd
Copy link
Author

nsingh-phd commented Jun 1, 2020

Thanks @brianrice2 for requesting a quick fix.

@hadley hadley added bug an unexpected problem or unintended behavior funs 😆 labels Jun 5, 2020
@hadley hadley added the wip work in progress label Jun 5, 2020
romainfrancois added a commit that referenced this issue Jun 9, 2020
* Update indexing for dplyr_cummean (#5287)

* Add unit test for cummean

Check that cummean is consistent with the alternate method of
cummean(x) = cumsum(x) / seq_along(x).

Make sure future changes don't break the relationship above.

See also: #5287

* Update unit test for cummean

Add inline comments with exact values we expect to see,
so in the future it is easier to compare expected vs actual results.

* Update unit test for cummean

Add another check that we get expected values, independent of
cumsum and seq_along (in case of other bugs). That is,
cummean(1:5) = [1, 1.5, 2, 2.5, 3].

* Not assuming there is at least one element, +test

* + NEWS

Co-authored-by: Romain Francois <romain@rstudio.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior funs 😆 wip work in progress
Projects
None yet
Development

No branches or pull requests

4 participants