New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ceiling_date behavior (at boundary) is undocumented #262

Closed
ctbrown opened this Issue Sep 25, 2014 · 7 comments

Comments

Projects
None yet
3 participants
@ctbrown
Contributor

ctbrown commented Sep 25, 2014

ceiling_date seems to give erroneous results. Consider the following example:

library(lubridate)
packageVersion( 'lubridate')
# [1] 1.3.3

 Sys.setenv( TZ="UTC" )
 dt <- ymd( "20120930", "20121001", "20121002", "20121101" )
 floor_date( dt, "month" )
 # [1] "2012-09-01 UTC" "2012-10-01 UTC" "2012-10-01 UTC" "2012-11-01 UTC"
 ceiling_date( dt, "month" )
 # [1] "2012-10-01 UTC" "2012-10-01 UTC" "2012-11-01 UTC" "2012-11-01 UTC"

I expect the second result to be "2012-11-01 UTC", i.e. like the third. This is consistently wrong as seen by the fourth argument.

The problem appears to be with the following expression in ceiling_date:
y <- floor_date(x - eseconds(1), unit)

I don't believe that there is a reason to subtract a second. Changing this to:
y <- floor_date( x, unit )

fixes the issue.

Also, I realize that the ship has probably sailed on this, but shouldn't the ceiling for a month be the last second in the month, e.g. 2012-10-31 12:59:59 rather than the first second of the following month?

@ctbrown

This comment has been minimized.

Show comment
Hide comment
@ctbrown

ctbrown Sep 25, 2014

Contributor

This appears to be an undocumented behavior so that it behaves like ceiling. Date/Times on a boundary are not rounded up. From the inst/tests/test-round.R, the change suggested above fails on this check:

 > test_that("ceiling_date does not round up dates that are already on a boundary",{
 +   expect_equal(ceiling_date(as.Date("2012-09-27"), 'day'), as.Date("2012-09-27"))
 + })
 Error: Test failed: 'ceiling_date does not round up dates that are already on a boundary'
 Not expected: ceiling_date(as.Date("2012-09-27"), "day") not equal to as.Date("2012-09-27")
 Mean relative difference: 6.40615e-05.

This probably should be added to the documentation.

Contributor

ctbrown commented Sep 25, 2014

This appears to be an undocumented behavior so that it behaves like ceiling. Date/Times on a boundary are not rounded up. From the inst/tests/test-round.R, the change suggested above fails on this check:

 > test_that("ceiling_date does not round up dates that are already on a boundary",{
 +   expect_equal(ceiling_date(as.Date("2012-09-27"), 'day'), as.Date("2012-09-27"))
 + })
 Error: Test failed: 'ceiling_date does not round up dates that are already on a boundary'
 Not expected: ceiling_date(as.Date("2012-09-27"), "day") not equal to as.Date("2012-09-27")
 Mean relative difference: 6.40615e-05.

This probably should be added to the documentation.

@ctbrown ctbrown changed the title from ceiling_date provides erroneous and inconsistent results to ceiling_date behavior is undocumented Sep 25, 2014

@ctbrown ctbrown changed the title from ceiling_date behavior is undocumented to ceiling_date behavior (at boundary) is undocumented Sep 25, 2014

@vspinu vspinu closed this in ef982df Dec 14, 2014

@vspinu

This comment has been minimized.

Show comment
Hide comment
@vspinu

vspinu Dec 14, 2014

Member

I have just documented this.

Also, I realize that the ship has probably sailed on this, but shouldn't the ceiling for a month be the last second in the month, e.g. 2012-10-31 12:59:59 rather than the first second of the following month?

You would like to have the following consistency ceiling_date(ymd("2000-02-26"), "month") == floor_date(ymd("2000-03-3"), "month"). So if you pick the last second of the month, you will end up with the non-intuitive behavior for floor_date.

Member

vspinu commented Dec 14, 2014

I have just documented this.

Also, I realize that the ship has probably sailed on this, but shouldn't the ceiling for a month be the last second in the month, e.g. 2012-10-31 12:59:59 rather than the first second of the following month?

You would like to have the following consistency ceiling_date(ymd("2000-02-26"), "month") == floor_date(ymd("2000-03-3"), "month"). So if you pick the last second of the month, you will end up with the non-intuitive behavior for floor_date.

@shrektan

This comment has been minimized.

Show comment
Hide comment
@shrektan

shrektan Mar 10, 2016

Contributor

Hi, @vspinu @imanuelcostigan, related to:

By convention the boundary for a month is the first second of the month. Thus floor_date(ymd("2000-03-01"), "month") gives "2000-03-01 UTC".

Can you give me some hint so that I can have a better understand the meaning of this convention?

According to my current understanding, for each date in March, no matter it's the first day or the last day, with unit = "month", the ceilling should always be April the 1st, while the floor should always be March the 1st.

Contributor

shrektan commented Mar 10, 2016

Hi, @vspinu @imanuelcostigan, related to:

By convention the boundary for a month is the first second of the month. Thus floor_date(ymd("2000-03-01"), "month") gives "2000-03-01 UTC".

Can you give me some hint so that I can have a better understand the meaning of this convention?

According to my current understanding, for each date in March, no matter it's the first day or the last day, with unit = "month", the ceilling should always be April the 1st, while the floor should always be March the 1st.

@vspinu

This comment has been minimized.

Show comment
Hide comment
@vspinu

vspinu Mar 10, 2016

Member

Your understanding is correct. Except that second 0 is on the boundary, so ceiling will give you exact the same date for ymd("200-03-01"). It's the same idea as for ceiling(2)==2. You must have a borderline somewhere. The most natural is second 0.

Member

vspinu commented Mar 10, 2016

Your understanding is correct. Except that second 0 is on the boundary, so ceiling will give you exact the same date for ymd("200-03-01"). It's the same idea as for ceiling(2)==2. You must have a borderline somewhere. The most natural is second 0.

@shrektan

This comment has been minimized.

Show comment
Hide comment
@shrektan

shrektan Mar 10, 2016

Contributor

@vspinu Thanks for the quick response.

I get your point, but I would insist on my opinion that at least for the pure date object, i.e., ceiling_date(as.Date("2010-03-01"), unit = "month") should return as.Date("2010-04-01"), which is the same result as got from ceiling_date(as.Date("2010-03-02"), unit = "month"). And I believe that's where intuition fits.

I've been frustrated 3 years ago when I started using lubridate... My colleagues are frustrated, too. So we have to build a wrapper function like function(x) lubridate::ceilling_date(x + 0.01, unit = "month") to overcome the unwanted behavior at the 1st date of the month. However, this failed to work after your updating on CRAN in Dec, 2015.

Contributor

shrektan commented Mar 10, 2016

@vspinu Thanks for the quick response.

I get your point, but I would insist on my opinion that at least for the pure date object, i.e., ceiling_date(as.Date("2010-03-01"), unit = "month") should return as.Date("2010-04-01"), which is the same result as got from ceiling_date(as.Date("2010-03-02"), unit = "month"). And I believe that's where intuition fits.

I've been frustrated 3 years ago when I started using lubridate... My colleagues are frustrated, too. So we have to build a wrapper function like function(x) lubridate::ceilling_date(x + 0.01, unit = "month") to overcome the unwanted behavior at the 1st date of the month. However, this failed to work after your updating on CRAN in Dec, 2015.

@vspinu

This comment has been minimized.

Show comment
Hide comment
@vspinu

vspinu Mar 10, 2016

Member

Hm. That you workaround fails looks like a bug.

that at least for the pure date object, i.e., ceiling_date(as.Date("2010-03-01"), unit = "month") s

This appeals to the intuition indeed. The problem is that it's difficult to reconcile this with the fact that we align dates to corresponding 00 second POSIX times. I doubt that this convention is enforced anywhere except truncation. Let's revisit it. Please open a new issue. I will think a bit more about a coherent semantics and will try to fix this asap.

Member

vspinu commented Mar 10, 2016

Hm. That you workaround fails looks like a bug.

that at least for the pure date object, i.e., ceiling_date(as.Date("2010-03-01"), unit = "month") s

This appeals to the intuition indeed. The problem is that it's difficult to reconcile this with the fact that we align dates to corresponding 00 second POSIX times. I doubt that this convention is enforced anywhere except truncation. Let's revisit it. Please open a new issue. I will think a bit more about a coherent semantics and will try to fix this asap.

@shrektan

This comment has been minimized.

Show comment
Hide comment
@shrektan

shrektan Mar 11, 2016

Contributor

Thanks, I'll file a new issue later 😄

Contributor

shrektan commented Mar 11, 2016

Thanks, I'll file a new issue later 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment