Allow date_group(bound = "upper") or similar #232
Comments
This is a good question. It has to do with how the underlying types in clock work. The TLDR is that
As you mentioned, you can't meaningfully generate equally spaced buckets to round up to month/year, since they are variable.
library(clock)
x <- date_build(1970, 1:3, 15:17)
x
#> [1] "1970-01-15" "1970-02-16" "1970-03-17"
# A type of time point
nt <- as_naive_time(x)
nt
#> <time_point<naive><day>[3]>
#> [1] "1970-01-15" "1970-02-16" "1970-03-17"
# Stored as an integer number of days since 1970-01-01
unclass(nt)
#> $ticks
#> [1] 14 46 75
#>
#> attr(,"precision")
#> [1] 4
#> attr(,"clock")
#> [1] 1
# Mathematical floor on integer ticks that generates buckets of
# [0, 5), [5, 10), [10, 15), ...
# and chooses the lower bound
time_point_floor(nt, "day", n = 5)
#> <time_point<naive><day>[3]>
#> [1] "1970-01-11" "1970-02-15" "1970-03-17"
The other option that you can use is
library(clock)
x <- date_build(1970, 1:3, 15:17)
x
#> [1] "1970-01-15" "1970-02-16" "1970-03-17"
# Group by 2 months
date_group(x, "month", n = 2)
#> [1] "1970-01-01" "1970-01-01" "1970-03-01"
ymd <- as_year_month_day(x)
ymd
#> <year_month_day<day>[3]>
#> [1] "1970-01-15" "1970-02-16" "1970-03-17"
# Notice that this stores the components separately
unclass(ymd)
#> $year
#> [1] 1970 1970 1970
#>
#> $month
#> [1] 1 2 3
#>
#> $day
#> [1] 15 16 17
#>
#> attr(,"precision")
#> [1] 4
# Drops day completely, then looks only at the month component
# and generates buckets of:
# [1970-1, 1970-2], [1970-3, 1970-4], ... [1970-11, 1970-12]
# and chooses LHS
calendar_group(ymd, "month", n = 2L)
#> <year_month_day<month>[3]>
#> [1] "1970-01" "1970-01" "1970-03"
# Since we have to convert back to Date, this widens to day precision,
# using the first day of the month
calendar_widen(calendar_group(ymd, "month", n = 2L), "day")
#> <year_month_day<day>[3]>
#> [1] "1970-01-01" "1970-01-01" "1970-03-01"
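For contrast with the floor example above, here is a minimal sketch of what an "upper bound" already looks like for the day-based buckets. It assumes date_ceiling() is called with "day" precision, which is one of the precisions it accepts for Date; the variable name up is only for illustration.

```r
library(clock)

x <- date_build(1970, 1:3, 15:17)

# Same 5-day buckets counted from 1970-01-01 as in the floor example,
# but choosing the upper boundary of each bucket instead of the lower one
up <- date_ceiling(x, "day", n = 5)
up
#> [1] "1970-01-16" "1970-02-20" "1970-03-17"
```

Note that a value already sitting on a boundary (1970-03-17 here) is returned unchanged, so this is a ceiling on the tick count rather than "the last day inside the bucket".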
I see. In this case it would seem useful to add an additional parameter to
zoo::as.yearmon("2020-5")
# > [1] "May 2020"
These types then have their own definition of
zoo::as.Date(zoo::as.yearmon("2020-5")) # default frac = 0
# > [1] "2020-05-01"
zoo::as.Date(zoo::as.yearmon("2020-5"), frac = 1)
# > [1] "2020-05-31"
zoo::as.Date(zoo::as.yearmon("2020-5"), frac = 0.5)
# > [1] "2020-05-16"
It just seems unintuitive (if understandable) that the user can't easily get the date at the end of the month (or year, for that matter). As described, there is effectively a
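For comparison, a rough clock counterpart of the frac = 0 and frac = 1 cases might look like the sketch below. It is only an illustration built from existing calendar functions (year_month_day(), set_day(), as_date()); the names ym, first and last are made up here, and there is no claim that this is the intended idiom.

```r
library(clock)

# A month-precision value, similar in spirit to zoo::as.yearmon("2020-5")
ym <- year_month_day(2020, 5)

# First day of the month (like frac = 0)
first <- as_date(set_day(ym, 1))

# Last day of the month (like frac = 1); "last" is resolved per month
last <- as_date(set_day(ym, "last"))

first
#> [1] "2020-05-01"
last
#> [1] "2020-05-31"
```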
As I was writing out the second example, I did think that it might be possible for
x <- date_build(1970, 1:3, 15:17)
x
#> [1] "1970-01-15" "1970-02-16" "1970-03-17"
# Group by 2 months
date_group(x, "month", n = 2, bound = "upper")
#> [1] "1970-02-28" "1970-02-28" "1970-04-30"
Also keep in mind that with something like
library(clock)
x <- date_parse(c("2019-02-04", "2019-08-02", "2019-11-01"))
# [1, 5], [6, 10], [11, 12]
date_group(x, "month", n = 5)
#> [1] "2019-01-01" "2019-06-01" "2019-11-01"
date_group(x, "month", n = 5, bound = "upper")
#> [1] "2019-05-31" "2019-10-31" "2019-12-31"
I'm not really sure when this would be all that useful though. It would require
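Since bound = "upper" is only a proposal in this thread and not an existing argument, the outputs sketched above can be approximated today with existing functions. The helper below, group_upper(), is hypothetical and not part of clock; it assumes month grouping restarts every year, so the last group is clipped at December.

```r
library(clock)

# Hypothetical helper, not part of clock: upper bound of each n-month group
group_upper <- function(x, n) {
  # First day of the first month in each group (existing date_group() behavior)
  lower <- date_group(x, "month", n = n)
  # Last month of the group, clipped at 12 because groups restart each year
  last_month <- pmin(get_month(lower) + as.integer(n) - 1L, 12L)
  # Move to that month (day 1 is always valid), then to its last day
  set_day(set_month(lower, last_month), "last")
}

x <- date_parse(c("2019-02-04", "2019-08-02", "2019-11-01"))
group_upper(x, 5)
#> [1] "2019-05-31" "2019-10-31" "2019-12-31"
```

The same helper applied to date_build(1970, 1:3, 15:17) with n = 2 gives "1970-02-28" "1970-02-28" "1970-04-30", matching the first sketch.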
Well, it'd be useful in the same circumstances that
I agree. Generating end-of-month dates and then grouping by them would be messy. But that seems more an issue with the design choice of leaving this to
date_round/floor/ceiling allow "month" precision?
Is there a fundamental reason for date_round and its siblings forbidding precision = "month"?
This is likely my lack of imagination, but I can't see when rounding a date would be ambiguous. This is especially true for Date types, which don't have to worry about entering/leaving summer time or what have you.
I'd absolutely understand if it's a matter of implementation, given months' variable durations, leap years, etc.