Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ceiling_date behavior (at boundary) is undocumented #262

Closed
ctbrown opened this issue Sep 25, 2014 · 7 comments
Closed

ceiling_date behavior (at boundary) is undocumented #262

ctbrown opened this issue Sep 25, 2014 · 7 comments

Comments

@ctbrown
Copy link
Contributor

ctbrown commented Sep 25, 2014

ceiling_date seems to give erroneous results. Consider the following example:

library(lubridate)
packageVersion( 'lubridate')
# [1] 1.3.3

 Sys.setenv( TZ="UTC" )
 dt <- ymd( "20120930", "20121001", "20121002", "20121101" )
 floor_date( dt, "month" )
 # [1] "2012-09-01 UTC" "2012-10-01 UTC" "2012-10-01 UTC" "2012-11-01 UTC"
 ceiling_date( dt, "month" )
 # [1] "2012-10-01 UTC" "2012-10-01 UTC" "2012-11-01 UTC" "2012-11-01 UTC"

I expect the second result to be "2012-11-01 UTC", i.e. like the third. This is consistently wrong as seen by the fourth argument.

The problem appears to be with the following expression in ceiling_date:
y <- floor_date(x - eseconds(1), unit)

I don't believe that there is a reason to subtract a second. Changing this to:
y <- floor_date( x, unit )

fixes the issue.

Also, I realize that the ship has probably sailed on this, but shouldn't the ceiling for a month be the last second in the month, e.g. 2012-10-31 12:59:59 rather than the first second of the following month?

@ctbrown
Copy link
Contributor Author

ctbrown commented Sep 25, 2014

This appears to be an undocumented behavior so that it behaves like ceiling. Date/Times on a boundary are not rounded up. From the inst/tests/test-round.R, the change suggested above fails on this check:

 > test_that("ceiling_date does not round up dates that are already on a boundary",{
 +   expect_equal(ceiling_date(as.Date("2012-09-27"), 'day'), as.Date("2012-09-27"))
 + })
 Error: Test failed: 'ceiling_date does not round up dates that are already on a boundary'
 Not expected: ceiling_date(as.Date("2012-09-27"), "day") not equal to as.Date("2012-09-27")
 Mean relative difference: 6.40615e-05.

This probably should be added to the documentation.

@ctbrown ctbrown changed the title ceiling_date provides erroneous and inconsistent results ceiling_date behavior is undocumented Sep 25, 2014
@ctbrown ctbrown changed the title ceiling_date behavior is undocumented ceiling_date behavior (at boundary) is undocumented Sep 25, 2014
@vspinu vspinu closed this as completed in ef982df Dec 14, 2014
@vspinu
Copy link
Member

vspinu commented Dec 14, 2014

I have just documented this.

Also, I realize that the ship has probably sailed on this, but shouldn't the ceiling for a month be the last second in the month, e.g. 2012-10-31 12:59:59 rather than the first second of the following month?

You would like to have the following consistency ceiling_date(ymd("2000-02-26"), "month") == floor_date(ymd("2000-03-3"), "month"). So if you pick the last second of the month, you will end up with the non-intuitive behavior for floor_date.

@shrektan
Copy link
Contributor

Hi, @vspinu @imanuelcostigan, related to:

By convention the boundary for a month is the first second of the month. Thus floor_date(ymd("2000-03-01"), "month") gives "2000-03-01 UTC".

Can you give me some hint so that I can have a better understand the meaning of this convention?

According to my current understanding, for each date in March, no matter it's the first day or the last day, with unit = "month", the ceilling should always be April the 1st, while the floor should always be March the 1st.

@vspinu
Copy link
Member

vspinu commented Mar 10, 2016

Your understanding is correct. Except that second 0 is on the boundary, so ceiling will give you exact the same date for ymd("200-03-01"). It's the same idea as for ceiling(2)==2. You must have a borderline somewhere. The most natural is second 0.

@shrektan
Copy link
Contributor

@vspinu Thanks for the quick response.

I get your point, but I would insist on my opinion that at least for the pure date object, i.e., ceiling_date(as.Date("2010-03-01"), unit = "month") should return as.Date("2010-04-01"), which is the same result as got from ceiling_date(as.Date("2010-03-02"), unit = "month"). And I believe that's where intuition fits.

I've been frustrated 3 years ago when I started using lubridate... My colleagues are frustrated, too. So we have to build a wrapper function like function(x) lubridate::ceilling_date(x + 0.01, unit = "month") to overcome the unwanted behavior at the 1st date of the month. However, this failed to work after your updating on CRAN in Dec, 2015.

@vspinu
Copy link
Member

vspinu commented Mar 10, 2016

Hm. That you workaround fails looks like a bug.

that at least for the pure date object, i.e., ceiling_date(as.Date("2010-03-01"), unit = "month") s

This appeals to the intuition indeed. The problem is that it's difficult to reconcile this with the fact that we align dates to corresponding 00 second POSIX times. I doubt that this convention is enforced anywhere except truncation. Let's revisit it. Please open a new issue. I will think a bit more about a coherent semantics and will try to fix this asap.

@shrektan
Copy link
Contributor

Thanks, I'll file a new issue later 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants