---
title: "Motivations for clock"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Motivations for clock}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options:
  chunk_output_type: console
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
```{r setup, warning=FALSE, message=FALSE}
library(clock)
library(lubridate)
library(magrittr)
library(rlang)
```
# Motivations
R has always had strong date and date-time capabilities. The Date, POSIXct, and POSIXlt classes provide a powerful foundation to build on, and the R community has done just that. [lubridate](https://lubridate.tidyverse.org/), [zoo](https://CRAN.R-project.org/package=zoo), and [tsibble](https://CRAN.R-project.org/package=tsibble) are just a few of the popular packages that have extended R's native date-time capabilities.
So, why clock?
The following are a few of clock's goals:
- Explicitly handle invalid dates
- Explicitly handle daylight saving time issues
- Expose naive types for representing date-times without a time zone
- Provide calendar types for representing calendar "dates" in alternative ways
- Provide variable precision date-time types
Dealing with dates and date-times is difficult enough as it is. clock's overarching mission is to accomplish its goals in the most explicit and unambiguous way possible. It does this through a combination of:
- Erroring early and verbosely
- Requiring named optional arguments everywhere
- Providing an optional "strict" mode for production systems that forces you to explicitly type out your assumptions for invalid dates and daylight saving time issues
You might also be wondering, why not add these features to lubridate instead? While clock's high-level API and lubridate have similar overall scopes and feel, they are powered by two very different systems. The way that you solve problems with clock is often very different from what you might do with lubridate. Merging this new system into lubridate would be a very difficult task, and would involve many backwards incompatible changes. Additionally, clock's low-level API provides new types and capabilities that are out of scope even for lubridate, which focuses on R's native date-time types.
The rest of this vignette investigates each of clock's goals in more detail. Where relevant, a comparison against lubridate is made to demonstrate some of the improvements and consistency that come with clock.
Parts of this document are extremely detailed, and can get very technical. They are included to provide examples of tough date-time manipulation problems that I feel are under-discussed, but feel free to skip around.
## Invalid dates
### Dates
What is 1 month after January 31st, 2020? Let's ask lubridate:
```{r}
x <- add_days(as.Date("2020-01-31"), c(-2, -1, 0, 1))
x
x + months(1)
```
In this example, we've asked lubridate to create two *invalid dates*, `2020-02-30` and `2020-02-31`. Invalid dates often arise with month-based arithmetic, but they also arise when forcibly setting the month to a specific value. Because these dates don't exist, lubridate returns `NA`. When working with time series, it is often more useful to return *something* back, but what? We could:
- Return `NA` as lubridate does
- Return the previous valid moment in time
- Return the next valid moment in time
- Overflow the invalid date into March by the number of days it landed past the true end of February
Only the first is available to you with `+ months(1)`. To supplement this, lubridate added `%m+%` and `add_with_rollback()`.
```{r}
# Equivalent to `add_with_rollback(x, months(1))`
x %m+% months(1)
add_with_rollback(x, months(1), roll_to_first = TRUE)
```
With Dates, the hardest part about `%m+%` is remembering to use it. It is a common bug to forget to use this helper until *after* you have been bitten by an invalid date issue with an unexpected `NA`.
In clock, invalid dates must be handled explicitly as they occur. The default behavior raises an error.
```{r, error=TRUE}
add_months(x, 1)
```
You can handle these issues with an *invalid date resolution strategy*:
```{r}
# Previous moment in time
add_months(x, 1, invalid = "previous")
# Next moment in time
add_months(x, 1, invalid = "next")
# Overflow by 1 day for 30th, 2 days for 31st
add_months(x, 1, invalid = "overflow")
# Like lubridate
add_months(x, 1, invalid = "NA")
```
I recommend using either `"previous"` or `"next"` at all times. Overflowing can be useful in certain scenarios, but it can also result in losing the relative ordering of your input. In the previous example, note that `x` is no longer sorted after adding 1 month when using `"overflow"`. If your analysis requires a (weakly) increasing sequence of dates, as many time series analyses do, then you should use `"previous"` or `"next"`, which are guaranteed to maintain relative ordering.
```{r}
order(x)
order(add_months(x, 1, invalid = "overflow"))
```
### Date-times
With date-times, when you hit invalid dates you also have to consider what to do with the time of day.
```{r}
x <- as.POSIXct("2020-01-31", "America/New_York") %>%
  add_days(c(-2, -1, 0, 1)) %>%
  add_hours(4:1)
x
```
`add_with_rollback()` defaults to preserving the time of day. Invalid dates are rolled back to the previous valid *day*, but the time of day of `x` is kept:
```{r}
add_with_rollback(x, months(1))
```
I think this is dangerous for the same reason I advised not using `"overflow"` in the Date section above. You can easily lose the relative ordering within `x`.
```{r}
order(x)
order(add_with_rollback(x, months(1)))
```
`add_with_rollback()` has a `preserve_hms` argument, but setting it to `FALSE` can't help here either: it just sets the time of day to midnight, and the ordering is still lost.
```{r}
add_with_rollback(x, months(1), preserve_hms = FALSE)
order(add_with_rollback(x, months(1), preserve_hms = FALSE))
```
With clock, using `"previous"` and `"next"` will use the previous and next valid *moment in time*. With POSIXct (which clock assumes to have second precision), this chooses the previous and next valid second. This is the only way to preserve relative ordering in `x`, which is another reason I advocate for using these options.
```{r}
add_months(x, 1, invalid = "previous")
add_months(x, 1, invalid = "next")
order(add_months(x, 1, invalid = "previous"))
order(add_months(x, 1, invalid = "next"))
```
If you really want lubridate's `add_with_rollback()` behavior of retaining the time of day, you can use `"previous-day"` and `"next-day"`, which only adjust the day component, leaving the time of day intact. Again, I don't recommend this.
```{r}
add_months(x, 1, invalid = "previous-day")
add_months(x, 1, invalid = "next-day")
```
## Daylight saving time
In general, there are two types of daylight saving time issues that might come up when dealing with date-times:
- Nonexistent times due to daylight saving time gaps
- Ambiguous times due to daylight saving time fallbacks
### Nonexistent time
On March 8th, 2020 in New York (the America/New_York time zone), the clocks jumped forward from `01:59:59` to `03:00:00`. This created a daylight saving time gap, resulting in a *nonexistent* 2 o'clock hour. These daylight saving time gaps happen once a year in a large number of time zones around the world. Luckily, they usually happen at obscure times, like 2 AM, so they often don't interfere with your analyses. That said, it is entirely possible that you could find yourself landing in one of these gaps, so it is important to know how to deal with them.
```{r}
# 1 day before the gap
x <- date_time_parse(
  "March 7, 2020 02:30:00", "America/New_York",
  format = "%B %d, %Y %H:%M:%S"
)
x <- add_minutes(x, c(-45, 0, 45))
x
```
Let's try adding 1 day to these times with lubridate. As you might expect, you get an `NA`.
```{r}
x + days(1)
```
Unlike with invalid dates, there aren't any tools like `add_with_rollback()` to help you out here. You're stuck with the `NA`. But what could you do? Possibly:
- Roll forward to the next valid moment in time
- Roll backward to the previous valid moment in time
- Shift forward by the size of the gap
- Shift backward by the size of the gap
- Return `NA`
clock allows you to do all of these things. By default, nonexistent times result in an error.
```{r, error=TRUE}
add_days(x, 1)
```
Solve these issues with a *nonexistent time resolution strategy*:
```{r}
# 02:30:00 -> 03:00:00
add_days(x, 1, nonexistent = "roll-forward")
# 02:30:00 -> 01:59:59
add_days(x, 1, nonexistent = "roll-backward")
# 02:30:00 + gap of 1 hr = 03:30:00
add_days(x, 1, nonexistent = "shift-forward")
# 02:30:00 - gap of 1 hr = 01:30:00
add_days(x, 1, nonexistent = "shift-backward")
# lubridate behavior
add_days(x, 1, nonexistent = "NA")
```
I recommend using `"roll-forward"` and `"roll-backward"`, as they are guaranteed to retain the relative ordering of `x`. Shifting, while sometimes useful, will not always retain this:
```{r}
order(x)
order(add_days(x, 1, nonexistent = "shift-forward"))
```
When doing arithmetic with clock's high-level API, you'll only encounter these nonexistent time issues when adding "large" units of time, like days, months, or years. Adding hours, minutes, or seconds will first convert to UTC, add the units of time there, and then convert back to your original time zone, which never creates any problematic times. This might sound a bit strange, but it is generally what you want when working with these smaller units of time.
```{r}
# 1 second before the gap
x <- date_time_parse(
  "March 8, 2020 01:59:59", "America/New_York",
  format = "%B %d, %Y %H:%M:%S"
)
add_seconds(x, 1)
```
This is essentially equivalent to using `dseconds()` from lubridate. Note that adding `seconds()` would "land" you in the gap, but `dseconds()` jumps right over it.
```{r}
x + seconds(1)
x + dseconds(1)
```
If for any reason you want the `+ seconds()` behavior, you can manually take control by converting to naive-time, adding the units of time there, and then converting back to POSIXct.
```{r, error=TRUE}
x %>%
  as_naive_time() %>%
  add_seconds(1) %>%
  as.POSIXct(date_time_zone(x))

x %>%
  as_naive_time() %>%
  add_seconds(1) %>%
  as.POSIXct(date_time_zone(x), nonexistent = "roll-forward")
```
### Ambiguous time
On November 1st, 2020 in New York (the America/New_York time zone), the clocks fell backwards from `01:59:59 EDT` to `01:00:00 EST`. This created a daylight saving time fallback, resulting in two 1 o'clock hours. Without prior knowledge about whether the daylight time or standard time 1 o'clock should be used, times that fall in the two 1 o'clock hours are considered *ambiguous*.
Like with nonexistent times, adding hours, minutes, or seconds with the high-level API won't generate ambiguous time issues, as it always adds a fixed duration of seconds to the underlying numeric representation of your date-time. This gives us an easy way to show the two 1 o'clocks:
```{r}
x <- date_time_parse("2020-11-01 00:59:59", "America/New_York")
# The first 1 o'clock
add_seconds(x, 1)
# The second 1 o'clock
add_seconds(x, 3601)
```
If we instead worked with units of months, it would be possible to generate an ambiguous time:
```{r, error=TRUE}
# 1 month before the two ambiguous times
x <- date_time_parse("2020-10-01 01:30:00", "America/New_York")
# Which of the two 1 o'clocks are we talking about? It is ambiguous!
add_months(x, 1)
```
You can handle these by setting an *ambiguous time resolution strategy*:
```{r}
# The 1 o'clock that is earliest in the timeline
add_months(x, 1, ambiguous = "earliest")
# The 1 o'clock that is latest in the timeline
add_months(x, 1, ambiguous = "latest")
# Ambiguous times should return NA
add_months(x, 1, ambiguous = "NA")
```
The lubridate behavior here is a bit tricky to explain. In some examples, it might seem to give a more obvious answer:
```{r}
# 1 month after the fallback
x_after <- x + months(2)
# Start before the fallback, add time, return the earliest ambiguous time
x
x + months(1)
# Start after the fallback, subtract time, return the latest ambiguous time
x_after
x_after - months(1)
```
But with other examples, you can end up with confusing results:
```{r}
x_year_before <- x - years(1) + months(1)
x_year_after <- x + years(1) + months(1)
# Start before the fallback, add time, return the earliest ambiguous time
x_year_before
x_year_before + years(1)
# Start after the fallback, subtract time, return the...earliest ambiguous time?
x_year_after
x_year_after - years(1)
```
When lubridate hits an ambiguous time, the UTC offset of the result always matches the offset of the original input. Sometimes this makes sense, but often when adding larger periods of time or updating multiple components of a date-time, this can be confusing.
This isn't to say that the lubridate behavior is wrong. I would argue that it is just a bit too lenient. In clock, a slightly modified form of this behavior exists that is a bit stricter. Consider what would happen if you tried to floor these dates to the nearest hour:
```{r}
x <- as.POSIXct("2020-11-01 00:30:00", "America/New_York")
x <- add_hours(x, 0:5)
x
```
Notice that if we converted two of these dates to naive-time, and then tried to convert them back to POSIXct, they would have been considered ambiguous.
```{r, error=TRUE}
x %>%
  as_naive_time() %>%
  as.POSIXct(date_time_zone(x))
```
`date_floor()` works by flooring while in naive-time, but somehow this still produces the results you probably expected, with the two 1 o'clock hours being kept in separate groups:
```{r}
date_floor(x, "hour")
```
If you look carefully at the documentation for the POSIXct method of `date_floor()`, you'll notice that `ambiguous` actually defaults to `x`. When ambiguous times occur and your `ambiguous` argument is set to a date-time, clock will attempt to use information from that date-time to determine how to resolve the ambiguity. To resolve ambiguity using a date-time `ambiguous` value, the following two conditions must be met:
- If `ambiguous` was converted to naive-time, then back to POSIXct, it must be considered ambiguous.
- The transition point of the ambiguous naive-time implied by `ambiguous` and the naive-time that results from the operation you are performing must be the same.
In the above example, both of these bullet points are true (the transition point of interest is `2020-11-01 01:00:00-05:00`, the first valid moment in time after the fallback), which allows `date_floor()` to resolve the ambiguity "automatically" without any user intervention.
As a final example, let's change the `origin` that we floor relative to so that it is 1am EDT, and then floor by two hours. This should generate hour groups of `[1, 3), [3, 5), ...` .
```{r, error=TRUE}
origin <- date_floor(x[2], "hour")
origin
date_floor(x, "hour", n = 2, origin = origin)
```
It seems that we have an ambiguous time issue at location 4, the `02:30:00` time. Flooring this to every two hours with an origin of 1 o'clock places it in the `01:00:00` group, but trying to convert that back to POSIXct fails. Unlike when flooring by a single hour, here we have an original time that would *not* have been considered ambiguous if converted to naive-time and back (bullet point 1 above). This implies that we cannot use information from `ambiguous` to resolve this new ambiguity.
To solve this, ideally we'd be able to manually fix the ambiguity that `02:30:00` introduced while maintaining the automatic ambiguity resolution behavior for the `01:30:00 EDT` and `01:30:00 EST` date-times. To handle both of these issues at once, you can supply a list for `ambiguous` where the first element is `x` and the second is an ambiguous time resolution strategy to use when `x` isn't enough to solve the ambiguity.
```{r}
# [23, 1 EDT), [1 EDT, 1 EST), [1 EST, 3), [3, 5)
date_floor(x, "hour", n = 2, origin = origin, ambiguous = list(x, "latest"))
```
You end up with a grouping that contains only a single hour of information (1 EDT), but this is probably the best that we can do since it maintains the rest of the expected grouping structure before and after the fallback.
## Production
This new invalid date and daylight saving time behavior might sound great to you, but you might be wondering about using clock in production. What happens if `add_months()` worked in interactive development, but then you put your analysis into production, gathered new data, and all of a sudden it started failing?
```{r, error=TRUE}
x <- as.Date(c("2020-03-30", "2020-03-31"))
x
# All good! Ship it!
add_months(x[1], 1)
# Got new production data. Oh no!
add_months(x[1:2], 1)
```
To balance the usefulness of clock in interactive development with the strict requirements of production, you can set the `clock.strict` global option to `TRUE` to turn `invalid`, `nonexistent`, and `ambiguous` from optional arguments into required ones.
```{r, error=TRUE}
with_options(clock.strict = TRUE, .expr = {
  add_months(x[1], 1)
})
```
Forcing yourself to specify these arguments up front during interactive development is a great way to explicitly document your assumptions about these possible issues, while also guarding against future problems in production.
## Naive-time
The previous section was entirely focused on issues that might arise when working with POSIXct and time zones. If you don't require the complexity of time zones, you can use a naive-time. Naive-times are a type of time point that have a yet-to-be-specified time zone. The other type of time point in clock is a sys-time, which is interpreted as having a UTC time zone attached to it. If you aren't using time zones at all, naive-times and sys-times are equivalent, and even use the same date algorithms under the hood.
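For example, parsing the same timestamp string as either type produces time points with the same underlying count, differing only in interpretation (a small sketch using `naive_time_parse()` and its sys-time counterpart `sys_time_parse()`):

```{r}
# The same clock time, interpreted as "zone to be specified later" (naive)
# or as UTC (sys)
naive_time_parse("2019-01-01T12:00:00")
sys_time_parse("2019-01-01T12:00:00")
```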
Unlike our ambiguous time examples from before, since naive-times don't have any implied time zone, there is no repeated 1 o'clock hour.
```{r}
x <- naive_time_parse("2020-11-01T00:30:00")
x <- add_hours(x, 0:5)
x
```
This makes arithmetic and flooring much simpler to reason about:
```{r}
time_point_floor(x, "hour")
origin <- time_point_cast(x[2], "hour")
time_point_floor(x, "hour", n = 2, origin = origin)
```
You'll notice that the result of flooring by hour is a time point with *hour* precision. Flooring drops all information at finer precisions, and the result is represented in a more compact form.
The difference between naive-time and sys-time is only important when you are working with time zones and zoned-times / POSIXct. Since naive-time has a yet-to-be-specified time zone, converting to a zoned-time simply specifies that zone. If possible, it keeps the printed time the same, while changing the underlying duration.
```{r}
x[1]
as_zoned_time(x[1], "America/New_York")
```
Sys-time is interpreted as being UTC. Converting it to a zoned-time will keep the underlying duration, but will change the printed time.
```{r}
x_sys <- as_sys_time(x[1])
x_sys
# Subtract 4 hours to print the America/New_York time
as_zoned_time(x_sys, "America/New_York")
```
Converting from sys-time to zoned-time won't ever have any issues with nonexistent or ambiguous times, but converting from naive-time might. Remember that there were two 1 o'clock hours on this day in the America/New_York time zone.
```{r, error=TRUE}
x[1:2]
as_zoned_time(x[1:2], "America/New_York")
as_zoned_time(x[1:2], "America/New_York", ambiguous = "earliest")
```
In the low-level API, converting from naive-time to zoned-time is actually the *only* place where the `nonexistent` and `ambiguous` arguments are used. This limits how often you have to think about daylight saving time, ideally helping you avoid as many subtle bugs as possible. In the high-level API for POSIXct, these arguments are forced to bubble up in more places. This trades the user benefit of being able to work directly on R's native date-time types for the side-effect of potentially having to deal with these issues in more places.
## Calendar types
clock provides multiple new *calendar* types, which allow you to represent calendar "dates" in various different ways.
The year-month-day calendar is the most common; it represents the Gregorian calendar through a set of *components* like year, month, and day.
```{r}
year_month_day(2019, 1, 1:5)
year_month_day(2020, 1:12, "last")
```
There are also calendars for:
- year-month-weekday: An alternative way to specify a Gregorian date as year, month, and indexed day of the week (like the 3rd Sunday in March). Great for specifying holidays.
- year-day: An alternative way to specify a Gregorian date as year and day of the year. Great for extracting the day of the year, or grouping by the day of the year.
- year-quarter-day: Great for quarterly financial data, or if you have a fiscal year that starts in a different month from January.
- year-week-day: Allows you to specify a date using the year, week number, day of the week, and a `start` value representing the day that represents the start of the week.
- iso-year-week-day: Follows the ISO week standard. Great if you need to represent ISO weeks like `2020-W02`, or if you need to extract out the ISO year which might differ from the Gregorian year. Identical to year-week-day with `start` set to Monday.
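As a quick sketch of that last pair of bullets (assuming `start` accepts a day-of-week constant as described above), a year-week-day calendar whose weeks start on Monday lines up with the ISO week calendar:

```{r}
# Week 2, day 2 of 2020, with weeks starting on Monday
year_week_day(2020, 2, 2, start = clock_weekdays$monday)
# The same date in the ISO week calendar
iso_year_week_day(2020, 2, 2)
```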
Calendar types are great for instant retrieval of their underlying components (year, month, day, etc), and for irregular calendrical arithmetic with units such as years or months.
```{r}
year_month_day(2019, 1, 2) + duration_months(1:3)
```
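Component retrieval is just as direct. Each component can be pulled out with its `get_*()` accessor:

```{r}
x <- year_month_day(2019, 1, 2, 3, 4, 5)
get_year(x)
get_month(x)
get_day(x)
get_hour(x)
```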
The structure of calendar types also enables a feature that is unique among date-time libraries: directly representing invalid dates. While adding 1 month to an R Date of 2019-03-31 requires setting an invalid date resolution strategy, the year-month-day type can actually handle this directly.
```{r}
ymd <- year_month_day(2019, 3, c(30, 31)) + duration_months(1)
ymd
```
There is a family of functions for working with invalid dates: `invalid_count()`, `invalid_any()`, `invalid_detect()`, and `invalid_resolve()`.
```{r}
invalid_count(ymd)
invalid_detect(ymd)
invalid_resolve(ymd, invalid = "previous")
invalid_resolve(ymd, invalid = "next")
```
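The one member of that family not shown above is `invalid_any()`, which simply reports whether any invalid dates are present at all:

```{r}
# `ymd` contains invalid dates, so this returns TRUE
invalid_any(ymd)
```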
The *only* time in the entire low-level API that invalid dates must be resolved is before converting to another type:
```{r, error=TRUE}
as.Date(ymd)
as_naive_time(ymd)
as_year_month_weekday(ymd)
```
In the high-level API for R's native date and date-time types, the `invalid` argument is forced to bubble up in more places. This trades the user benefit of being able to work directly on R's native date-time types for the side-effect of potentially having to deal with this issue in more places.
For another example of invalid dates, imagine that your office is planning on having a group meeting every 1st, 3rd, and 5th Friday of every month in 2019. What days are those?
```{r}
grid <- expand.grid(
  index = c(1L, 3L, 5L),
  day = clock_weekdays$friday,
  month = 1:12,
  year = 2019
)
meetings <- year_month_weekday(grid$year, grid$month, grid$day, grid$index)
meetings
meetings[invalid_detect(meetings)]
```
The year-month-weekday calendar allows us to quickly generate these dates. We can see that we've generated 8 invalid dates, since some months don't have a 5th Friday.
One option for handling these is to simply remove them. Fewer group meetings!
```{r}
meetings[!invalid_detect(meetings)]
```
Or, you could resolve them by selecting the previous valid day:
```{r}
meetings_valid <- invalid_resolve(meetings, invalid = "previous")
meetings_valid
```
If you didn't want to meet on Saturday or Sunday, you could convert to a naive-time and *shift* any Saturdays or Sundays to the previous Friday.
```{r}
meetings_naive <- as_naive_time(meetings_valid)
meetings_wday <- as_weekday(meetings_naive)
meetings_wday
friday <- weekday(clock_weekdays$friday)
saturday <- weekday(clock_weekdays$saturday)
sunday <- weekday(clock_weekdays$sunday)
on_weekend <- meetings_wday == saturday | meetings_wday == sunday
meetings_naive[on_weekend] <- time_point_shift(
  meetings_naive[on_weekend],
  target = friday,
  which = "previous"
)
as_year_month_weekday(meetings_naive)
```
## Variable precision
### Calendars
clock's new types all have variable precision. Calendars, for example, range in precision from year to nanosecond.
```{r}
year_month_day(2019)
year_month_day(2019, 01, 01, 02, 30, 50, 123, subsecond_precision = "nanosecond")
```
A nice thing about this is that you get year-month and year-quarter types for free.
```{r}
year_month_day(2019, 1:5)
year_quarter_day(2020, 1:4)
```
clock takes the worldview that a calendar at a specific precision represents a *range* of values. Rather than assuming that a month precision year-month-day would convert directly to a day precision naive-time at the first of the month, you have to be explicit about exactly what day you are talking about by first promoting that month precision calendar up to day precision.
```{r, error=TRUE}
ym <- year_month_day(2019, 1:3)
as_naive_time(ym)
# You might want the first of the month
ym %>%
  set_day(1) %>%
  as_naive_time()
# Or the last
ym %>%
  set_day("last") %>%
  as_naive_time()
```
This range assumption appears throughout clock. For example, grouping a day precision iso-year-week-day by weeks returns a week precision calendar.
```{r}
x <- iso_year_week_day(2019, 1:10, 2)
calendar_group(x, "week", n = 2)
```
### Time points
Time points, such as naive-time and sys-time, can vary in precision from day to nanosecond.
```{r}
as_naive_time(year_month_day(2019, 1, 2))
as_sys_time(year_month_day(
  2019, 2, 3, 1, 30, 35, 123,
  subsecond_precision = "millisecond"
))
```
Unlike calendars, a time point at, say, second precision is considered equivalent to a time point at millisecond precision with the millisecond information set to 0.
```{r}
sec <- as_sys_time(year_month_day(
  2019, 2, 3, 1, 30, 35
))
milli <- as_sys_time(year_month_day(
  2019, 2, 3, 1, 30, 35, 0,
  subsecond_precision = "millisecond"
))
sec == milli
```
This allows time points to naturally promote to a finer precision when doing arithmetic with them:
```{r}
as_naive_time(year_month_day(2019, 1, 2)) + duration_nanoseconds(25)
```