<font size="6"><b>WORKING WITH DATE/TIME OBJECTS</b></font>

In [None]:
library(tidyverse)
library(data.table)

In [None]:
options(repr.matrix.max.rows=20, repr.matrix.max.cols=15) # for limiting the number of top and bottom rows of tables printed 

![xkcd](../imagesbb/iso_8601.png)

(https://xkcd.com/1179)

Working with datetime objects is a complicated task: date and time can take many forms depending on the level of detail (year, quarter, day, seconds) and the timezone, and differences across a series of datetime objects can be irregular (as with the different number of days in months).

In Unix systems, datetime is kept as the number of seconds since 1970-01-01 midnight in UTC time.

This is called "Unix time" or "epoch time".

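Note that UTC should not be confused with GMT as it is used in everyday life. UTC is a time standard: it is not tied to a geographic region and has no daylight saving shifts. GMT is a timezone, and regions that use GMT in winter may switch to daylight saving time in summer (the United Kingdom moves to BST, for example), so local clocks there do not always match UTC.

Unix time is a plain count of seconds since the Unix epoch, measured against UTC, so it is not altered by daylight saving practices or timezones. (In R, the timezone name "GMT" itself has no daylight saving rules and behaves like "UTC".)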
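As a minimal sketch of the epoch idea (base R only; the two timestamps are illustrative values, not part of the original notebook), we can check that the epoch itself maps to zero and that one day later maps to 86400 seconds:

```r
# Two fixed instants, both expressed in UTC
epoch <- as.POSIXct("1970-01-01 00:00:00", tz = "UTC")
one_day_later <- as.POSIXct("1970-01-02 00:00:00", tz = "UTC")

as.numeric(epoch)          # 0 seconds since the epoch
as.numeric(one_day_later)  # 86400 = 24 * 60 * 60 seconds
```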

# Base R operations with datetime objects

## POSIXct objects

Let's get the current time:

In [None]:
current_time <- Sys.time()

In [None]:
current_time

In [None]:
current_time %>% str

This is a POSIXct object

In [None]:
current_time

We did not define a timezone attribute, so the value is printed using the system timezone.

Get the timezone of the system that we are working in:

In [None]:
tzs <- Sys.timezone()
tzs

In [None]:
attributes(current_time)

Let's create a copy:

In [None]:
current_time2 <- current_time

And change its tzone attribute:

In [None]:
attributes(current_time2)$tzone <- "GMT"

Now let's print again:

In [None]:
current_time2

In [None]:
current_time == current_time2

The values are the same

And we can also get the epoch seconds:

In [None]:
as.numeric(current_time)
as.numeric(current_time2)

They are also the same

However the objects are not identical due to timezone differences:

In [None]:
identical(current_time, current_time2)

All timezone options can be retrieved:

In [None]:
timezones <- OlsonNames()

In [None]:
timezones

Arithmetic operations are possible: adding a number adds that many seconds. This is not very convenient, though, since it is hard to get the exact time three months later, etc.:

In [None]:
current_time + 1
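A short sketch of what this looks like in practice, using a fixed illustrative timestamp `t0` (an assumption, chosen so the results are reproducible). Fixed steps are expressed in seconds, while for calendar steps base R's `seq()` accepts a `by` string such as `"3 months"`:

```r
t0 <- as.POSIXct("2024-01-15 12:00:00", tz = "UTC")

# Fixed-length steps are plain second counts
one_hour_later <- t0 + 60 * 60
one_week_later <- t0 + 7 * 24 * 60 * 60

# Calendar-aware step: second element of a two-element monthly sequence
three_months_later <- seq(t0, by = "3 months", length.out = 2)[2]

one_hour_later
one_week_later
three_months_later
```

The lubridate section later in this notebook offers more convenient tools for calendar arithmetic.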

Let's assign the numeric value of seconds since epoch into an object:

In [None]:
current_time_n <- as.numeric(current_time)
current_time_n

We can convert it back to a POSIXct object; the system timezone is used when it is printed. (Omitting `origin` requires R 4.3.0 or later; on older versions add `origin = "1970-01-01"`.)

In [None]:
as.POSIXct(current_time_n)

as.POSIXct(current_time_n) %>% as.numeric

Or provide another timezone

In [None]:
as.POSIXct(current_time_n, tz = "GMT")

as.POSIXct(current_time_n, tz = "GMT") %>% as.numeric

See that the timezone changes how the date and time are presented, but the underlying numeric value (seconds since the Unix epoch) is the same.
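To make this concrete, here is a small sketch with a fixed epoch value (an illustrative number, not taken from the notebook's output) rendered in two timezones. The explicit `origin` keeps the call working on older R versions as well:

```r
# One instant, stored as seconds since the epoch
instant <- as.POSIXct(1708942340, origin = "1970-01-01", tz = "UTC")

# Same instant, two presentations
in_utc      <- format(instant, "%Y-%m-%d %H:%M:%S", tz = "UTC")
in_istanbul <- format(instant, "%Y-%m-%d %H:%M:%S", tz = "Europe/Istanbul")

in_utc        # "2024-02-26 10:12:20"
in_istanbul   # "2024-02-26 13:12:20"
as.numeric(instant)  # unchanged by the choice of display timezone
```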

We can also convert to a character object with any format:

The default one:

In [None]:
format(current_time)

In [None]:
format(current_time, format = "%Y/%m/%d")

In [None]:
format(current_time, format = "%H_%M_%S")

All available formats can be found here:

https://stat.ethz.ch/R-manual/R-devel/library/base/html/strptime.html

%a
Abbreviated weekday name in the current locale on this platform. (Also matches full name on input: in some locales there are no abbreviations of names.)

%A
Full weekday name in the current locale. (Also matches abbreviated name on input.)

%b
Abbreviated month name in the current locale on this platform. (Also matches full name on input: in some locales there are no abbreviations of names.)

%B
Full month name in the current locale. (Also matches abbreviated name on input.)

%c
Date and time. Locale-specific on output, "%a %b %e %H:%M:%S %Y" on input.

%C
Century (00–99): the integer part of the year divided by 100.

%d
Day of the month as decimal number (01–31).

%D
Date format such as %m/%d/%y: the C99 standard says it should be that exact format (but not all OSes comply).

%e
Day of the month as decimal number (1–31), with a leading space for a single-digit number.

%F
Equivalent to %Y-%m-%d (the ISO 8601 date format).

%g
The last two digits of the week-based year (see %V). (Accepted but ignored on input.)

%G
The week-based year (see %V) as a decimal number. (Accepted but ignored on input.)

%h
Equivalent to %b.

%H
Hours as decimal number (00–23). As a special exception strings such as ‘24:00:00’ are accepted for input, since ISO 8601 allows these.

%I
Hours as decimal number (01–12).

%j
Day of year as decimal number (001–366): For input, 366 is only valid in a leap year.

%m
Month as decimal number (01–12).

%M
Minute as decimal number (00–59).

%n
Newline on output, arbitrary whitespace on input.

%p
AM/PM indicator in the locale. Used in conjunction with %I and not with %H. An empty string in some locales (for example on some OSes, non-English European locales including Russia). The behaviour is undefined if used for input in such a locale.

Some platforms accept %P for output, which uses a lower-case version (%p may also use lower case): others will output P.

%r
For output, the 12-hour clock time (using the locale's AM or PM): only defined in some locales, and on some OSes misleading in locales which do not define an AM/PM indicator. For input, equivalent to %I:%M:%S %p.

%R
Equivalent to %H:%M.

%S
Second as integer (00–61), allowing for up to two leap-seconds (but POSIX-compliant implementations will ignore leap seconds).

%t
Tab on output, arbitrary whitespace on input.

%T
Equivalent to %H:%M:%S.

%u
Weekday as a decimal number (1–7, Monday is 1).

%U
Week of the year as decimal number (00–53) using Sunday as the first day 1 of the week (and typically with the first Sunday of the year as day 1 of week 1). The US convention.

%V
Week of the year as decimal number (01–53) as defined in ISO 8601. If the week (starting on Monday) containing 1 January has four or more days in the new year, then it is considered week 1. Otherwise, it is the last week of the previous year, and the next week is week 1. See %G (%g) for the year corresponding to the week given by %V. (Accepted but ignored on input.)

%w
Weekday as decimal number (0–6, Sunday is 0).

%W
Week of the year as decimal number (00–53) using Monday as the first day of week (and typically with the first Monday of the year as day 1 of week 1). The UK convention.

%x
Date. Locale-specific on output, "%y/%m/%d" on input.

%X
Time. Locale-specific on output, "%H:%M:%S" on input.

%y
Year without century (00–99). On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19 – that is the behaviour specified by the 2018 POSIX standard, but it does also say ‘it is expected that in a future version the default century inferred from a 2-digit year will change’.

%Y
Year with century. Note that whereas there was no zero in the original Gregorian calendar, ISO 8601:2004 defines it to be valid (interpreted as 1BC): see https://en.wikipedia.org/wiki/0_(year). However, the standards also say that years before 1582 in its calendar should only be used with agreement of the parties involved.

For input, only years 0:9999 are accepted.

%z
Signed offset in hours and minutes from UTC, so -0800 is 8 hours behind UTC. (Standard only for output. For input R currently supports it on all platforms – values from -1400 to +1400 are accepted.)

%Z
(Output only.) Time zone abbreviation as a character string (empty if not available). This may not be reliable when a time zone has changed abbreviations over the years.
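A few of the locale-independent codes from the list above, applied to a fixed illustrative datetime (the value of `x` is an assumption made for reproducibility):

```r
x <- as.POSIXct("2024-02-26 10:12:20", tz = "UTC")

format(x, "%F")   # ISO 8601 date: "2024-02-26"
format(x, "%T")   # time: "10:12:20"
format(x, "%j")   # day of year: "057"
format(x, "%u")   # weekday, Monday is 1: "1"
format(x, "%V")   # ISO 8601 week: "09"

# Codes can be combined freely with literal characters
stamp <- format(x, "%Y%m%d_%H%M%S")
stamp
```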

Let's format the current time as a character value using all date and time fields:

In [None]:
completex <- format(current_time, format = "%Y/%m/%d %H_%M_%S")

In [None]:
completex

And also save the timezone into an object:

In [None]:
tzx <- format(current_time, format = "%Z")

In [None]:
tzx

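Create the POSIXct object from the character value representing the datetime, the format needed to parse that character, and the timezone. Note that the abbreviation stored in `tzx` (for example `+03` or `CEST`) is not always a valid timezone name, in which case R may treat the time as UTC (possibly with a warning), so we pass the full name stored in `tzs`:

In [None]:
current_time3 <- as.POSIXct(completex, format = "%Y/%m/%d %H_%M_%S", tz = tzs)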

In [None]:
current_time3

Let's get the attributes:

In [None]:
attributes(current_time3)

And let's see the numeric value: Seconds since epoch

In [None]:
as.numeric(current_time3)

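We can also parse a character representation of a datetime with the `strptime` function, which returns a POSIXlt object (use `format` or `strftime` for the opposite direction):

In [None]:
strptime(format(current_time, "%Y-%m-%d %H:%M:%S"), format = "%Y-%m-%d %H:%M:%S", tz = "")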

### Difference in datetime objects

We can calculate the difference between two datetime objects:

In [None]:
difftime(as.POSIXct("2024-02-26 10:12:20", tz = "GMT"), as.POSIXct("2024-02-26 10:12:20", tz = "Europe/Istanbul"))

In [None]:
difftime(as.POSIXct("2024-02-26 10:12:20", tz = "Europe/Istanbul"), as.POSIXct("2023-07-10 23:05:52", tz = "GMT"))
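By default `difftime` picks a unit automatically. As a brief sketch, the `units` argument fixes the unit, and `as.numeric` with `units` converts the result to a plain number (the object names `a`, `b` and `gap` are ours):

```r
# Same wall-clock time in two timezones: Istanbul is 3 hours ahead of GMT
a <- as.POSIXct("2024-02-26 10:12:20", tz = "GMT")
b <- as.POSIXct("2024-02-26 10:12:20", tz = "Europe/Istanbul")

gap <- difftime(a, b, units = "hours")
gap

as.numeric(gap)                   # 3 (hours)
as.numeric(gap, units = "mins")   # 180 (minutes)
```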

## Date objects

We can extract dates from POSIXct objects:

In [None]:
current_time

In [None]:
current_date1 <- as.Date(current_time)

In [None]:
current_date1

In [None]:
current_time2

In [None]:
current_date2 <- as.Date(current_time2)

In [None]:
current_date2

Note that when converting a POSIXct object to a date with `as.Date`, the timezone attribute does not change the result: by default the date in UTC is returned. Pass the `tz` argument to get the date in another timezone.
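A small sketch of this behaviour with an illustrative late-evening timestamp in New York (an assumed example value). At that moment it is already the next day in UTC:

```r
late_evening <- as.POSIXct("2024-02-26 23:30:00", tz = "America/New_York")

# Default: the date as of UTC (04:30 on the 27th)
utc_date <- as.Date(late_evening)

# Explicit tz: the date as seen on the local calendar
local_date <- as.Date(late_evening, tz = "America/New_York")

utc_date     # "2024-02-27"
local_date   # "2024-02-26"
```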

Now we can get the system date:

In [None]:
current_date3 <- Sys.Date()

In [None]:
current_date3

In [None]:
current_date3 %>% str

The system date is tied to the timezone of the system that we are using.
So if the local time is past midnight while the UTC time is not (or vice versa), the two dates will differ.

Convert date object to numeric:

In [None]:
current_date_n <- as.integer(current_date1)
current_date_n

This number is the number of days since the start of the Unix epoch, 1 January 1970 (omitting `origin` in `as.Date` requires R 4.3.0 or later):

In [None]:
as.Date(0)

We can convert the numeric date back to a date object

In [None]:
as.Date(current_date_n)

Or by providing the origin explicitly:

In [None]:
as.Date(current_date_n, origin = "1970-01-01")
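The round trip can be checked on a fixed date. This is a minimal sketch with an illustrative value; `origin` is passed explicitly so that it also runs on older R versions:

```r
d <- as.Date("2024-02-26")

n <- as.integer(d)   # days since 1970-01-01
n                    # 19779

# Back to a Date, and the same count obtained by subtraction
restored <- as.Date(n, origin = "1970-01-01")
days_elapsed <- as.numeric(d - as.Date("1970-01-01"))

restored
days_elapsed
```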

We can also extract the date as a formatted character:

In [None]:
datechar1 <- format(current_date1)
datechar1

Or in another format:

In [None]:
datechar2 <- format(current_date1, "%Y%m%d")
datechar2

In [None]:
datechar3 <- format(current_date1, "%Y/%m/%d")
datechar3

For unambiguous formats, the character can be converted back to a date easily:

In [None]:
as.Date(datechar1)

In [None]:
as.Date(datechar3)

But for a format from which the date cannot be determined unambiguously, we have to pass the format explicitly:

In [None]:
as.Date(datechar2, "%Y%m%d")

If the wrong format is provided, the date cannot be retrieved and the result is NA. Compare the correct format with a wrong one:

In [None]:
as.Date(datechar3, "%Y/%m/%d")

In [None]:
as.Date(datechar3, "%Y%m%d")
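The failure modes differ, which is worth seeing side by side. In this sketch (illustrative strings, our own object names), a wrong explicit format yields `NA`, a missing format on a non-standard string raises an error, and the `tryFormats` argument of `as.Date` lets us list several candidate formats:

```r
# Wrong explicit format: silently NA
wrong <- as.Date("2024/02/26", format = "%Y%m%d")
wrong

# No format and a non-standard string: an error, captured here as text
msg <- tryCatch(as.character(as.Date("20240226")),
                error = function(e) conditionMessage(e))
msg

# Several candidate formats, tried in order
flexible <- as.Date("20240226", tryFormats = c("%Y-%m-%d", "%Y%m%d"))
flexible
```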

In [None]:
completex
tzx

### Sequence of dates

We can create a sequence of dates:

In [None]:
seq.Date(as.Date("2024-02-26"), to = as.Date("2024-12-31"), by = 7)
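Besides a number of days, `by` also accepts unit strings such as `"week"` or `"month"`, and `length.out` can replace the end date. A short sketch with illustrative dates:

```r
# Every Monday between two dates (2024-02-26 is a Monday)
mondays <- seq(as.Date("2024-02-26"), as.Date("2024-03-25"), by = "week")
mondays

# First day of four consecutive months
month_starts <- seq(as.Date("2024-01-01"), by = "month", length.out = 4)
month_starts
```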

# `readr` for parsing dates

Similar to base as.Date, we can parse a date from a character representation:

In [None]:
datechar1

In [None]:
parse_date(datechar1)

In [None]:
datechar3

In [None]:
parse_date(datechar3)

But again, the date must be unambiguously determinable from the character representation:

In [None]:
datechar2

In [None]:
parse_date(datechar2)
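As with `as.Date`, the remedy is to state the format. A minimal sketch using the `format` argument of `readr::parse_date` on illustrative strings:

```r
library(readr)

# Compact form without separators
compact <- parse_date("20240226", format = "%Y%m%d")

# Day-first form with dots
dotted <- parse_date("26.02.2024", format = "%d.%m.%Y")

compact
dotted
```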

# `lubridate` operations

## datetime operations

We can also parse a character representation of a datetime, together with a timezone, in lubridate:

In [None]:
completex

In [None]:
tzs

In [None]:
datetime_l1 <- parse_date_time(completex, orders = "%Y/%m/%d %H_%M_%S", tz = tzs)

In [None]:
datetime_l1

It is again a POSIXct object

In [None]:
datetime_l1 %>% str

We can also use wrapper functions for easily recognized formats:

In [None]:
ymd_hms(completex, tz = Sys.timezone())

### Period objects

Suppose we are only interested in the time part of a POSIXct object: just the hours, minutes and seconds. This can convey information on, for example, an athletic performance, a flight or any other time-sensitive event:

In [None]:
current_time

We should first format the datetime to reveal only the time part:

In [None]:
time1 <- format(current_time, "%H:%M:%S")
time1

And convert it to a period object

In [None]:
hms1 <- time1 %>% hms
hms1

In [None]:
hms1 %>% str

In [None]:
current_time2

In [None]:
time2 <- format(current_time2, "%H:%M:%S")
time2

In [None]:
hms2 <- time2 %>% hms
hms2

It would also be convenient to convert that period to seconds for specific purposes:

In [None]:
hms2 %>% period_to_seconds
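For instance, two finishing times can be compared once they are converted to seconds. This is a sketch with made-up race times (`run_a` and `run_b` are hypothetical values):

```r
library(lubridate)

run_a <- hms("00:58:12")
run_b <- hms("01:02:45")

secs_a <- period_to_seconds(run_a)   # 58 * 60 + 12 = 3492
secs_b <- period_to_seconds(run_b)   # 3600 + 2 * 60 + 45 = 3765

# How many seconds slower the second run was
secs_b - secs_a   # 273
```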

### Accessing/changing units of a datetime separately

We can get any unit of a datetime object:

In [None]:
current_time

In [None]:
second(current_time)

In [None]:
hour(current_time)

In [None]:
minute(current_time)

In [None]:
year(current_time)

In [None]:
day(current_time)

In [None]:
month(current_time)

In [None]:
week(current_time)

Or change any unit of the date time object:

In [None]:
current_time4 <- current_time

In [None]:
day(current_time4) <- day(current_time4) + 1

In [None]:
current_time4
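The setters can be combined, for example to move a datetime to the start of its month. The sketch below uses a fixed illustrative value and also shows lubridate's `update()`, which changes several units in one call:

```r
library(lubridate)

x <- ymd_hms("2024-02-26 10:12:20", tz = "UTC")

# Set units one by one: first day of the month at midnight
day(x) <- 1
hour(x) <- 0
minute(x) <- 0
second(x) <- 0
x   # 2024-02-01 00:00:00 UTC

# Change several units at once; the original object is not modified
y <- update(x, year = 2025, month = 6)
y   # 2025-06-01 00:00:00 UTC
```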

### Duration

We can express a time span given in any units as a duration, which is stored as an exact number of seconds:

In [None]:
duration(24, "hour")

In [None]:
duration(hour = 24, minute = 30)
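Since a duration is just a number of seconds, it can be converted to a plain number in any unit. A brief sketch (the object name `d` is ours):

```r
library(lubridate)

d <- duration(hour = 24, minute = 30)

as.numeric(d)            # 88200 seconds
as.numeric(d, "hours")   # 24.5 hours

# 24 hours expressed as a duration is exactly 86400 seconds
as.numeric(duration(24, "hour"))
```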

## Date operations

We can create a date object from year, month and day values:

In [None]:
datel1 <- make_date(2024, 02, 25)

In [None]:
datel1

In [None]:
datel1 %>% str

The conversion from an ambiguously formatted character to a date can also be done with utility functions named after the order of the parts of the date:

In [None]:
ymd("20240225")

In [None]:
mdy("02252024")

In [None]:
dmy("25022024")

### Accessing units

We can get the day of the week, the day of the month and the day of the year:

In [None]:
lubridate::wday(current_date1, week_start = 1)

In [None]:
lubridate::wday(current_date1, label = TRUE)

In [None]:
mday(current_date1)

In [None]:
yday(current_date1)

### Rounding dates

We can round up or down the date to some defined date unit

Get the first day of the month containing a given date:

In [None]:
floor_date(current_date1, "month")

Get the first day of the month following that of a given date:

In [None]:
ceiling_date(current_date1, "month")

And we can easily get the last day of the month of a certain date

In [None]:
(ceiling_date(current_date1, "month") - 1)
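These functions are vectorised, so month boundaries for many dates can be computed in one step. A sketch with illustrative dates, including a February in a leap year and a date that already falls on the first of a month:

```r
library(lubridate)

dates <- as.Date(c("2023-02-10", "2024-02-10", "2024-03-01"))

month_start <- floor_date(dates, "month")
month_end   <- ceiling_date(dates, "month") - 1

month_start   # "2023-02-01" "2024-02-01" "2024-03-01"
month_end     # "2023-02-28" "2024-02-29" "2024-03-31"
```

For Date objects, a value on the boundary is rounded up to the next boundary by `ceiling_date`, which is why the last element gives the end of March.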

### Monthwise operations

We can add months to a date to create a regular sequence of dates that keeps the same position within the month.

That is, we can get the same day n months later, or the nth day of every month.

For example, get the 31st of each month from January to next year's March.

Note that when a month is shorter than 31 days, the date is rolled back to the last day of that month:

In [None]:
as.Date("2024-01-31") %m+% months(0:14)
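The difference from ordinary addition is easiest to see on a single date. In this sketch, plain `+` with a period returns `NA` when the target day does not exist, whereas `%m+%` rolls back to the end of the month:

```r
library(lubridate)

jan31 <- as.Date("2024-01-31")

plain  <- jan31 + months(1)      # 31 February does not exist: NA
rolled <- jan31 %m+% months(1)   # rolled back to 2024-02-29

plain
rolled

# Month ends for the next three months
next_three <- jan31 %m+% months(1:3)
next_three
```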

### Interval object

We can also create an interval object to check whether a date is within that interval:

In [None]:
last_day <- ceiling_date(current_date1, "year") - 1
last_day

In [None]:
first_day <- floor_date(current_date1, "year")
first_day

In [None]:
interval1 <- interval(first_day, last_day)

In [None]:
interval1

In [None]:
interval1 %>% str

In [None]:
current_date1 %within% interval1

In [None]:
current_date1 %m+% months(0:12)

In [None]:
current_date1 %m+% months(0:12) %within% interval1
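Because `%within%` returns a logical vector, it can be used to count or filter dates. A reproducible sketch with fixed illustrative dates (the names `year_2024`, `monthly` and `inside` are ours):

```r
library(lubridate)

year_2024 <- interval(ymd("2024-01-01"), ymd("2024-12-31"))

# The 15th of each month from June 2024 to June 2025
monthly <- as.Date("2024-06-15") %m+% months(0:12)
inside  <- monthly %within% year_2024

sum(inside)       # 7 dates fall in 2024 (June to December)
monthly[inside]   # keep only those dates

# The end points of an interval are included
ymd("2024-12-31") %within% year_2024   # TRUE
```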