## Arithmetic with Dates and Times
Getting datetimes into R is just the first step. Now that you know how to parse datetimes, you need to learn how to do calculations with them. In this chapter, you'll learn the different ways of representing spans of time with lubridate and how to leverage them to do arithmetic on datetimes. By the end of the chapter, you'll have calculated how long it's been since the first man stepped on the moon, generated sequences of dates to help schedule reminders, calculated when an eclipse occurs, and explored the reigns of the monarchs of England (and which ones might have seen Halley's comet!).

### How long has it been?
To get finer control over the difference between two datetimes, use the base function difftime().
For example, instead of time1 - time2, you use difftime(time1, time2).
difftime() takes an argument units, which specifies the units for the difference. Your options are "secs", "mins",
"hours", "days", or "weeks".
To practice, you'll find the time since the first man stepped on the moon.
You'll also see the lubridate functions today() and now(), which, when called with
no arguments, return the current date and the current datetime in your system's timezone.

In [3]:
library(lubridate)

# Apollo 11 landed on July 20, 1969. Use difftime() to find the number of days between today() and date_landing.

# The date of landing and moment of step
date_landing <- mdy("July 20, 1969")
moment_step <- mdy_hms("July 20, 1969, 02:56:15", tz = "UTC")

# How many days since the first man on the moon?
difftime(today(), date_landing, units = 'days')

# How many seconds since the first man stepped on the moon?
difftime(now(), moment_step, units = 'secs')

Time difference of 18787 days

Time difference of 1623265239 secs
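
The results above depend on the day the cell is run. As a reproducible check on the units argument, here is a minimal sketch that uses two fixed dates (chosen purely for illustration, they are not part of the exercise) and expresses the same gap in three different units.

```r
library(lubridate)

# Two fixed dates, chosen only for illustration
start_date <- ymd("2020-01-01")
end_date <- ymd("2020-01-15")

# The same 14-day gap expressed in different units
gap_days <- difftime(end_date, start_date, units = "days")
gap_weeks <- difftime(end_date, start_date, units = "weeks")
gap_hours <- difftime(end_date, start_date, units = "hours")

# as.numeric() drops the units label and keeps the number
as.numeric(gap_days)   # 14
as.numeric(gap_weeks)  # 2
as.numeric(gap_hours)  # 336
```

Wrapping the result in as.numeric() is handy when the difference feeds into further arithmetic and the units label is no longer needed.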

### How many seconds are in a day?
How many seconds are in a day? There are 24 hours in a day, 60 minutes in an hour, and 60 seconds in a minute, 
so there should be 24*60*60 = 86400 seconds, right?
Not always! In this exercise you'll see a counterexample. Can you figure out what is going on?
We've put code to define three times in your script - noon on March 11th, March 12th, and March 13th in 2017 in the US Pacific timezone. Find the difference in time between mar_13 and mar_12 in seconds. This should match your intuition.

In [2]:
library(lubridate)

# Three dates
mar_11 <- ymd_hms("2017-03-11 12:00:00", 
  tz = "America/Los_Angeles")
mar_12 <- ymd_hms("2017-03-12 12:00:00", 
  tz = "America/Los_Angeles")
mar_13 <- ymd_hms("2017-03-13 12:00:00", 
  tz = "America/Los_Angeles")

# Difference between mar_13 and mar_12 in seconds
difftime(mar_13, mar_12, units = 'secs')

# Now, find the difference in time between mar_12 and mar_11 in seconds. Surprised?

# Difference between mar_12 and mar_11 in seconds
difftime(mar_12, mar_11, units = 'secs')

# Why would a day have only 82800 seconds? At 2am on March 12th, 2017, Daylight Saving Time started in the Pacific timezone.
# That means a whole hour gets skipped between noon on the 11th and noon on the 12th.


Attaching package: 'lubridate'

The following object is masked from 'package:base':

    date



Time difference of 86400 secs

Time difference of 82800 secs

### Adding or subtracting a time span to a datetime
A common use of time spans is to add them to or subtract them from a moment in time. For example,
to calculate the time one day in the future from mar_11 (from the previous exercise), you could do either of:

mar_11 + days(1)
mar_11 + ddays(1)

Try them in the console: you get different results! But which one is the right one? It depends on your intent.
If you want to account for the fact that time units, in this case days, can have different lengths (e.g. due to daylight saving time),
you want a period, days(). If you want the time 86400 seconds in the future, you use a duration, ddays().
In this exercise you'll add and subtract timespans from dates and datetimes.

In [4]:
# Add a period of one week to mon_2pm
mon_2pm <- dmy_hm("27 Aug 2018 14:00")
mon_2pm + (weeks(1))

# Add a duration of 81 hours to tue_9am
tue_9am <- dmy_hm("28 Aug 2018 9:00")
tue_9am + dhours(81)

# What were you doing five years ago?
# Subtract a period of five years from today()
today() - years(5)

# Will a duration give a different result?
# Subtract a duration of five years from today()
today() - dyears(5)

# Why can the two answers differ? A period of five years steps back five calendar years
# and lands on the same month and day, however many leap days fall in between.
# A duration is a fixed number of seconds: dyears(1) is 365.25 days in recent lubridate
# releases (365 days in older ones), so five of them rarely line up with five calendar years.

[1] "2017-03-12 12:00:00 PDT"

[1] "2017-03-12 13:00:00 PDT"

[1] "2018-09-03 14:00:00 UTC"

[1] "2018-08-31 18:00:00 UTC"
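
To see the period/duration distinction in isolation, the sketch below re-creates mar_11 from the earlier exercise and compares the two additions across the 2017 daylight saving change. The names plus_period and plus_duration are made up for illustration.

```r
library(lubridate)

mar_11 <- ymd_hms("2017-03-11 12:00:00", tz = "America/Los_Angeles")

plus_period <- mar_11 + days(1)     # same clock time on the next calendar day
plus_duration <- mar_11 + ddays(1)  # exactly 86400 seconds later

hour(plus_period)    # 12
hour(plus_duration)  # 13

# Elapsed time in each case
as.numeric(difftime(plus_period, mar_11, units = "hours"))    # 23
as.numeric(difftime(plus_duration, mar_11, units = "hours"))  # 24
```

The period keeps the clock reading at noon and absorbs the skipped hour, while the duration keeps the elapsed time at 24 hours and lets the clock reading shift.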

### Arithmetic with timespans
You can add and subtract timespans to create different length timespans, and even multiply them by numbers. 
For example, to create a duration of three days and three hours you could do: ddays(3) + dhours(3), or 3*ddays(1) + 3*dhours(1) 
or even 3*(ddays(1) + dhours(1)).
There was an eclipse over North America on 2017-08-21 at 18:26:40. It's possible to predict the next eclipse 
with similar geometry by calculating the time and date one Saros in the future. A Saros is a length of time 
that corresponds to 223 Synodic months, a Synodic month being the period of the Moon's phases, 
a duration of 29 days, 12 hours, 44 minutes and 3 seconds.
Do just that in this exercise!

In [5]:
# Time of North American Eclipse 2017
eclipse_2017 <- ymd_hms("2017-08-21 18:26:40")

# Duration of 29 days, 12 hours, 44 mins and 3 secs
synodic <- ddays(29) + dhours(12) + dminutes(44) + dseconds(3)

# 223 synodic months
saros <- 223 * synodic

# Add saros to eclipse_2017
next_eclipse <- eclipse_2017 + saros
print(next_eclipse)

[1] "2035-09-02 02:09:49 UTC"
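
As a quick sanity check on the arithmetic, the following sketch confirms that the three ways of writing "three days and three hours" agree, and shows how long one saros is in seconds and days. The variable names a, b and c3 are placeholders introduced here.

```r
library(lubridate)

# Three ways to write three days and three hours
a <- ddays(3) + dhours(3)
b <- 3 * ddays(1) + 3 * dhours(1)
c3 <- 3 * (ddays(1) + dhours(1))

# A duration is stored as a number of seconds
as.numeric(a)  # 270000
a == b
b == c3

# The saros from the exercise, in seconds and in days
synodic <- ddays(29) + dhours(12) + dminutes(44) + dseconds(3)
saros <- 223 * synodic
as.numeric(saros)          # 568971789
as.numeric(saros) / 86400  # about 6585.32 days
```

Because durations are plain second counts underneath, multiplication distributes over addition exactly as it does for ordinary numbers.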


### Generating sequences of datetimes
By combining addition and multiplication with sequences you can generate sequences of datetimes. 
For example, you can generate a sequence of periods from 1 day up to 10 days with:

1:10 * days(1)

Then, by adding this sequence to a specific datetime, you can construct a sequence of datetimes
from 1 day up to 10 days into the future:

today() + 1:10 * days(1)

You had a meeting this morning at 8am and you'd like to have that meeting at the same time and day 
every two weeks for a year. Generate the meeting times in this exercise.

In [7]:
# Add a period of 8 hours to today
today_8am <- today() + hours(8)

# Sequence of two weeks from 1 to 26
every_two_weeks <- 1:26 * weeks(2)

# Create datetime for every two weeks for a year
today_8am + every_two_weeks

 [1] "2021-01-10 08:00:00 UTC" "2021-01-24 08:00:00 UTC"
 [3] "2021-02-07 08:00:00 UTC" "2021-02-21 08:00:00 UTC"
 [5] "2021-03-07 08:00:00 UTC" "2021-03-21 08:00:00 UTC"
 [7] "2021-04-04 08:00:00 UTC" "2021-04-18 08:00:00 UTC"
 [9] "2021-05-02 08:00:00 UTC" "2021-05-16 08:00:00 UTC"
[11] "2021-05-30 08:00:00 UTC" "2021-06-13 08:00:00 UTC"
[13] "2021-06-27 08:00:00 UTC" "2021-07-11 08:00:00 UTC"
[15] "2021-07-25 08:00:00 UTC" "2021-08-08 08:00:00 UTC"
[17] "2021-08-22 08:00:00 UTC" "2021-09-05 08:00:00 UTC"
[19] "2021-09-19 08:00:00 UTC" "2021-10-03 08:00:00 UTC"
[21] "2021-10-17 08:00:00 UTC" "2021-10-31 08:00:00 UTC"
[23] "2021-11-14 08:00:00 UTC" "2021-11-28 08:00:00 UTC"
[25] "2021-12-12 08:00:00 UTC" "2021-12-26 08:00:00 UTC"
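
The output above depends on the day the cell was run. A small reproducible variant, assuming a fixed first meeting on 2021-01-04 at 08:00 UTC (a date picked only for illustration), could look like this:

```r
library(lubridate)

# A fixed start, chosen only for illustration
first_meeting <- ymd_hm("2021-01-04 08:00")

# Offsets of 0, 2, 4 and 6 weeks
offsets <- 0:3 * weeks(2)

# Adding a vector of periods to one datetime gives a vector of datetimes
meetings <- first_meeting + offsets
format(meetings, "%Y-%m-%d %H:%M")
# "2021-01-04 08:00" "2021-01-18 08:00" "2021-02-01 08:00" "2021-02-15 08:00"
```

Starting the sequence at 0 keeps the first meeting itself in the result; starting at 1, as in the exercise, begins with the next occurrence.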

### The tricky thing about months
What should ymd("2020-01-31") + months(1) return? Should it be 29, 30 or 31 days in the future?
Try it. In general lubridate returns the same day of the month in the next month, but since the 31st of February
doesn't exist, lubridate returns a missing value, NA.
There are alternative addition and subtraction operators: %m+% and %m-% that have different behavior. 
Rather than returning an NA for a non-existent date, they roll back to the last existing date.
You'll explore their behavior by trying to generate a sequence for the last day in every month this year.

In [8]:
jan_31 <- ymd("2020-01-31")

# Start by creating a sequence of 1 to 12 periods of 1 month.
month_seq <- 1:12 * months(1)
# Add month_seq to jan_31. Notice what happens to any month where the 31st doesn't exist
jan_31 + month_seq
# Now add month_seq to jan_31 using the %m+% operator.
jan_31 %m+% month_seq
# Try subtracting month_seq from jan_31 using the %m-% operator.
jan_31 %m-% month_seq
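
Since no output is shown for this cell, here is a shortened sketch of the same comparison using only the first three months. The names plain and rolled are made up for illustration. Note the parentheses around the right-hand side: %m+% binds more tightly than *, so they are needed when the sequence is built inline.

```r
library(lubridate)

jan_31 <- ymd("2020-01-31")

# Plain addition: NA wherever the 31st does not exist
plain <- jan_31 + 1:3 * months(1)
plain
# NA "2020-03-31" NA

# %m+% rolls back to the last day of the month instead
rolled <- jan_31 %m+% (1:3 * months(1))
rolled
# "2020-02-29" "2020-03-31" "2020-04-30"
```

2020 is a leap year, which is why the rolled-back February date is the 29th.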

### Examining intervals. Reigns of kings and queens
You can create an interval by using the operator %--% with two datetimes.
For example, ymd("2001-01-01") %--% ymd("2001-12-31") creates an interval for the year 2001.
Once you have an interval, you can find out properties like its start, end and length with
int_start(), int_end() and int_length() respectively.
Practice by exploring the reigns of kings and queens of Britain.

In [16]:
# import libraries
library(anytime)
library(dplyr)
# Use read.csv()
monarch <- read.csv('monarchs.csv')
monarch
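# Parse the mixed-format from/to strings and convert them to dates.
# Note: the dates printed below land one day before the strings in the file, probably
# because anytime() parses in the session's local timezone and as.Date() then converts
# via UTC. anytime::anydate() parses straight to a Date and may avoid the shift.
monarch_mod <- monarch %>% 
  mutate(
    from_date = as.Date(anytime(from)),
    to_date  = as.Date(anytime(to)))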
# check
print(monarch_mod)

# Create an interval for reign
monarchs <- monarch_mod %>%
  mutate(reign = from_date %--% to_date) 

# Find the length of reign, and arrange
monarchs %>%
  mutate(length = int_length(reign)) %>% 
  arrange(desc(length)) %>%
  select(name, length)

name,from,to
Elizabeth II,2/6/1952,12/26/2020
Victoria,1837-06-20,1/22/1901
George V,5/6/1910,1/20/1936
George III,1801-01-01,1820-01-29
George VI,12/11/1936,2/6/1952
George IV,1820-01-29,1830-06-26
Edward VII,1/22/1901,5/6/1910
William IV,1830-06-26,1837-06-20
Edward VIII,1/20/1936,12/11/1936
George III,1760-10-25,1801-01-01


           name       from         to  from_date    to_date
1  Elizabeth II   2/6/1952 12/26/2020 1952-02-05 2020-12-25
2      Victoria 1837-06-20  1/22/1901 1837-06-19 1901-01-21
3      George V   5/6/1910  1/20/1936 1910-05-05 1936-01-19
4    George III 1801-01-01 1820-01-29 1800-12-31 1820-01-28
5     George VI 12/11/1936   2/6/1952 1936-12-10 1952-02-05
6     George IV 1820-01-29 1830-06-26 1820-01-28 1830-06-25
7    Edward VII  1/22/1901   5/6/1910 1901-01-21 1910-05-05
8    William IV 1830-06-26 1837-06-20 1830-06-25 1837-06-19
9   Edward VIII  1/20/1936 12/11/1936 1936-01-19 1936-12-10
10   George III 1760-10-25 1801-01-01 1760-10-24 1800-12-31


name,length
Elizabeth II,2173910400
Victoria,2006726400
George III,1268092800
George V,811296000
George III,601948800
George VI,478224000
George IV,328406400
Edward VII,292982400
William IV,220406400
Edward VIII,28166400
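
The large numbers in the length column are seconds, since that is what int_length() returns. A minimal sketch on the y2001 interval from the text shows the three accessors and one way to turn the length into days:

```r
library(lubridate)

y2001 <- ymd("2001-01-01") %--% ymd("2001-12-31")

int_start(y2001)   # 2001-01-01 UTC
int_end(y2001)     # 2001-12-31 UTC
int_length(y2001)  # 31449600 seconds

# Convert seconds to days: Jan 1 to Dec 31 is 364 days apart
int_length(y2001) / 86400
```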


### Comparing intervals and datetimes
A common task with intervals is to ask if a certain time is inside the interval or whether it overlaps with another interval.
The operator %within% tests if the datetime (or interval) on the left hand side is within the interval of the right hand side. 
For example, if y2001 is the interval covering the year 2001,

y2001 <- ymd("2001-01-01") %--% ymd("2001-12-31")

Then 

ymd("2001-03-30") %within% y2001 

will return TRUE and 

ymd("2002-03-30") %within% y2001 

will return FALSE.

int_overlaps() performs a similar test, but will return TRUE if two intervals overlap at all.

In [18]:
# One date and one interval:
hendrix_at_woodstock <- mdy("August 17 1969")
luis_xiv <- dmy("01 October 1620") %--% dmy("16 September 1701")

# Monarch in power on hendrix at woodstock 
monarchs %>% 
  filter(hendrix_at_woodstock %within% reign) %>%
  select(name, from_date, to_date)

# Monarchs whose reign overlaps luis_xiv
monarchs %>% 
  filter(int_overlaps(luis_xiv, reign)) %>%
  select(name, from_date, to_date)

name,from_date,to_date
Elizabeth II,1952-02-05,2020-12-25


name,from_date,to_date
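
To contrast %within% with int_overlaps() without the monarchs data, the sketch below tests a date and two hypothetical intervals (spring_2001 and mid_2001_to_mid_2002, both invented for this example) against y2001.

```r
library(lubridate)

y2001 <- ymd("2001-01-01") %--% ymd("2001-12-31")

# Two illustrative intervals: one fully inside 2001, one spilling into 2002
spring_2001 <- ymd("2001-03-01") %--% ymd("2001-05-31")
mid_2001_to_mid_2002 <- ymd("2001-06-01") %--% ymd("2002-06-01")

ymd("2001-03-30") %within% y2001           # TRUE
spring_2001 %within% y2001                 # TRUE
mid_2001_to_mid_2002 %within% y2001        # FALSE: it ends after 2001
int_overlaps(mid_2001_to_mid_2002, y2001)  # TRUE: they share part of 2001
```

%within% requires the left-hand side to sit entirely inside the interval, whereas int_overlaps() only needs the two intervals to share some stretch of time.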


### Converting to durations and periods
Intervals are the most specific way to represent a span of time since they retain information 
about the exact start and end moments. They can be converted to periods and durations exactly: 
it's possible to calculate both the exact number of seconds elapsed between the start and end date, 
as well as the perceived change in clock time.
To do so, you use the as.period() and as.duration() functions, passing in an interval as the only argument.
Try them out to get better representations of the length of the monarchs' reigns.

In [19]:
# New columns for duration and period
monarchs <- monarchs %>%
  mutate(
    duration = as.duration(reign),
    period = as.period(reign)) 
    
# Examine results    
monarchs %>%
  select(name, duration, period)

name,duration,period
Elizabeth II,2173910400s (~68.89 years),68y 10m 20d 0H 0M 0S
Victoria,2006726400s (~63.59 years),63y 7m 2d 0H 0M 0S
George V,811296000s (~25.71 years),25y 8m 14d 0H 0M 0S
George III,601948800s (~19.07 years),19y 0m 28d 0H 0M 0S
George VI,478224000s (~15.15 years),15y 1m 26d 0H 0M 0S
George IV,328406400s (~10.41 years),10y 4m 28d 0H 0M 0S
Edward VII,292982400s (~9.28 years),9y 3m 14d 0H 0M 0S
William IV,220406400s (~6.98 years),6y 11m 25d 0H 0M 0S
Edward VIII,28166400s (~46.57 weeks),10m 21d 0H 0M 0S
George III,1268092800s (~40.18 years),40y 2m 7d 0H 0M 0S
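
To make the difference between the two conversions concrete, here is a small sketch on a single interval spanning the leap year 2020. The interval and the names leap_year_2020, as_dur and as_per are chosen here for illustration.

```r
library(lubridate)

leap_year_2020 <- ymd("2020-01-01") %--% ymd("2021-01-01")

# Exact elapsed time: 366 days' worth of seconds
as_dur <- as.duration(leap_year_2020)
as.numeric(as_dur)          # 31622400
as.numeric(as_dur) / 86400  # 366

# Perceived change in calendar time: exactly one year
as_per <- as.period(leap_year_2020)
as_per
```

The duration records the extra leap day as 86400 additional seconds, while the period reports one calendar year regardless of how many days that year contained.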
