Week rounding #747

Open
geneorama opened this issue Jul 31, 2014 · 4 comments

Comments

@geneorama

This is probably a feature request.

I noticed that when I'm rounding IDates by week there's a problem where the end of one year and the beginning of the next year will get rounded into two different weeks.

The issue is that the week rounding considers the first day of the year to be the day to which all weeks in that year are rounded. I don't know if this is in the data.table code or somewhere else. Since week rounding doesn't exist for POSIX dates, I'm guessing that it's another smart data.table innovation.
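To make the behaviour described above concrete, here is a minimal base-R sketch of week rounding anchored on January 1 of each year. The helper name `year_anchored_week` is hypothetical and written only for illustration; it is an assumption about the logic, not code taken from data.table.

```r
# Hypothetical illustration (not data.table's code): round each date down to
# a 7-day block counted from January 1 of that date's own year.
year_anchored_week <- function(x) {
  jan1 <- as.Date(format(x, "%Y-01-01"))   # first day of each date's year
  days_in <- as.integer(x - jan1)          # whole days elapsed since Jan 1
  jan1 + 7L * (days_in %/% 7L)             # start of the 7-day block
}

dd <- as.Date(c("2013-12-30", "2013-12-31", "2014-01-01", "2014-01-07"))
year_anchored_week(dd)
# "2013-12-24" "2013-12-31" "2014-01-01" "2014-01-01"
```

Under this assumed logic, 2013-12-31 and 2014-01-01 land in different blocks even though they are adjacent days, and the anchor weekday changes from one year to the next.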

Please note: I'm suggesting that calculating the week number, e.g. week(dd), is a different operation from round(dd, 'weeks') and could follow different logic. I think the current round works like week, but it should be different.

I would expect and argue that we should round to the same weekday regardless of whether the week overlaps across two years. Personally, I don't care about EU / US standards, and any weekday is fine. It would be trivial to add an offset that would shift which day was the rounding day.
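As a rough sketch of what rounding to a fixed weekday with an offset could look like, the following uses only integer arithmetic on the day count. The function `floor_week` and its `week_start` parameter are hypothetical names chosen for this example; it is written against base R `Date` objects and is not an existing data.table function.

```r
# Hypothetical helper: floor each date to the most recent occurrence of a
# chosen weekday. week_start: 0 = Sunday, 1 = Monday, ..., 6 = Saturday.
floor_week <- function(x, week_start = 0L) {
  days <- as.integer(x)                    # days since 1970-01-01 (a Thursday)
  dow <- (days + 4L) %% 7L                 # 0 = Sunday, ..., 6 = Saturday
  x - (dow - as.integer(week_start)) %% 7L # step back to the chosen weekday
}

dd <- as.Date(c("2013-12-31", "2014-01-01", "2014-01-05"))
floor_week(dd)                   # "2013-12-29" "2013-12-29" "2014-01-05"
floor_week(dd, week_start = 1L)  # "2013-12-30" "2013-12-30" "2013-12-30"
```

Because the calculation never looks at the year, dates on either side of New Year fall into the same week, and the result does not depend on locale-specific weekday names.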

Here's my implementation of a workaround and an example of the problem:
EDIT: I realized that the code below contains a mistake. Please see the corrected version in my next comment.

library(data.table)

dd <- seq(as.IDate("2013-12-20"), as.IDate("2014-01-20"), 1)

dt <- data.table(day = dd)
dt[ , i := 1:nrow(dt)]
dt[ , weekday := weekdays(day)]
dt[ , day_rounded := round(day, "weeks")]
dt[ , weekday_rounded := weekdays(day_rounded)]
offset <- data.table(weekday_rounded = c('Sunday', 'Monday', 
                                         'Tuesday', 'Wednesday', 
                                         'Thursday', 'Friday', 'Saturday'),
                     offset = -(0:6),
                     key = "weekday_rounded")
dt <- merge(dt, offset, by="weekday_rounded")
dt[ , day_rounded_adj := day_rounded + offset]
dt[ , weekday_rounded_adj := weekdays(day_rounded_adj)]
setkey(dt, i)

dt
dt[,day_rounded_adj]

Thank you,

Gene

PS: I know I've posted two things today, but it's just a coincidence. Please don't worry about me creating tons of posts.

@geneorama
Author

Sorry, I realized that I made a mistake in my first logic... It's been a long day.

I meant to post this logic:

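library(data.table)

dd <- seq(as.IDate("2013-12-20"), as.IDate("2014-01-20"), 1)

# One row per day; note that weekdays() returns locale-dependent names
dt <- data.table(i = seq_along(dd),
                 day = dd,
                 weekday = weekdays(dd))
# Number of days to step back to reach the preceding Sunday
offset <- data.table(weekday = c('Sunday', 'Monday', 'Tuesday', 'Wednesday',
                                 'Thursday', 'Friday', 'Saturday'),
                     offset = -(0:6))
dt <- merge(dt, offset, by="weekday")
dt[ , day_adj := day + offset]
dt[ , weekday_adj := weekdays(day_adj)]
setkey(dt, i)  # restore chronological order after the merge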

dt

Which results in this outcome:

> dt

      weekday  i        day offset    day_adj weekday_adj
 1:    Friday  1 2013-12-20     -5 2013-12-15      Sunday
 2:  Saturday  2 2013-12-21     -6 2013-12-15      Sunday
 3:    Sunday  3 2013-12-22      0 2013-12-22      Sunday
 4:    Monday  4 2013-12-23     -1 2013-12-22      Sunday
 5:   Tuesday  5 2013-12-24     -2 2013-12-22      Sunday
 6: Wednesday  6 2013-12-25     -3 2013-12-22      Sunday
 7:  Thursday  7 2013-12-26     -4 2013-12-22      Sunday
 8:    Friday  8 2013-12-27     -5 2013-12-22      Sunday
 9:  Saturday  9 2013-12-28     -6 2013-12-22      Sunday
10:    Sunday 10 2013-12-29      0 2013-12-29      Sunday
11:    Monday 11 2013-12-30     -1 2013-12-29      Sunday
12:   Tuesday 12 2013-12-31     -2 2013-12-29      Sunday
13: Wednesday 13 2014-01-01     -3 2013-12-29      Sunday
14:  Thursday 14 2014-01-02     -4 2013-12-29      Sunday
15:    Friday 15 2014-01-03     -5 2013-12-29      Sunday
16:  Saturday 16 2014-01-04     -6 2013-12-29      Sunday
17:    Sunday 17 2014-01-05      0 2014-01-05      Sunday
18:    Monday 18 2014-01-06     -1 2014-01-05      Sunday
19:   Tuesday 19 2014-01-07     -2 2014-01-05      Sunday
20: Wednesday 20 2014-01-08     -3 2014-01-05      Sunday
21:  Thursday 21 2014-01-09     -4 2014-01-05      Sunday
22:    Friday 22 2014-01-10     -5 2014-01-05      Sunday
23:  Saturday 23 2014-01-11     -6 2014-01-05      Sunday
24:    Sunday 24 2014-01-12      0 2014-01-12      Sunday
25:    Monday 25 2014-01-13     -1 2014-01-12      Sunday
26:   Tuesday 26 2014-01-14     -2 2014-01-12      Sunday
27: Wednesday 27 2014-01-15     -3 2014-01-12      Sunday
28:  Thursday 28 2014-01-16     -4 2014-01-12      Sunday
29:    Friday 29 2014-01-17     -5 2014-01-12      Sunday
30:  Saturday 30 2014-01-18     -6 2014-01-12      Sunday
31:    Sunday 31 2014-01-19      0 2014-01-19      Sunday
32:    Monday 32 2014-01-20     -1 2014-01-19      Sunday
      weekday  i        day offset    day_adj weekday_adj

@arunsrinivasan
Member

Didn't go through the post in detail yet, but it seems like it may be related to http://stackoverflow.com/a/22440145/559784. Could you verify?

@geneorama
Author

I think they're talking about the function week, which I decided to ignore because I am not sure of the "best" way to do a week calculation. I focused on format, but if the underlying functionality relies on week (and it seems like it might), then they are related.
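To illustrate why a week number and a week-start date can behave differently at a year boundary, here is a small base-R sketch. It uses `format(x, "%U")` (week of the year with Sunday as the first day) as a stand-in for a week-number calculation; this is an assumption for the example and may not match what data.table's week does.

```r
# Week number vs. week-start date around New Year (base R only).
dd <- as.Date(c("2013-12-29", "2013-12-31", "2014-01-01", "2014-01-04"))

week_no <- format(dd, "%U")                     # resets at the start of each year
week_start <- dd - (as.integer(dd) + 4L) %% 7L  # preceding Sunday, year-agnostic

data.frame(day = dd, week_no = week_no, week_start = week_start)
```

All four days belong to the same Sunday-to-Saturday week, so `week_start` is 2013-12-29 for each, while `week_no` changes from "52" to "00" once the year rolls over.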

@geneorama
Author

BTW, I know that lubridate probably provides a solution for week rounding. However, I would prefer to avoid using it. I just don't like to introduce more packages and conventions if I don't need to. I have enough to read with data.table!
