# Lecture 9.2: Dates
<div style="border: 1px double black; padding: 10px; margin: 10px">

**After today's lecture you will:**
* Know how to work with dates and times in R
</div>

This corresponds to Chapter 16 of your book.



    





## Dates
Most of us have a pretty firm grasp on dates, but they can be more complicated than you might think. For instance, does every year have 365 days?

As before, we'll rely on the `lubridate` package to work with dates:

In [3]:
library(lubridate)
library(tidyverse)
library(nycflights13)


There are three different date classes in R:

* A date, printed in a tibble as `<date>`, represents a full day on the calendar.
* A time within a day, printed as `<time>`, represents a specific time within an (unspecified) day.
* A date-time is a date plus a time (tibble: `<dttm>`). A date-time uniquely identifies an instant in time (up to a given precision, usually one second).

We've already seen examples of date-times in the `flights` tibble:
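As a quick reminder, and assuming the `nycflights13` version of `flights`, the `time_hour` column is one such date-time:

```r
library(tidyverse)
library(nycflights13)

# time_hour is printed with the <dttm> tag because it is a date-time
flights %>%
  select(year, month, day, hour, time_hour) %>%
  head(3)
```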

You should tend to favor working with dates over date-times if possible. The latter are more complicated because of the need to handle time zones.



We can get the current date and date-time using the `today()` and `now()` commands:
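A minimal sketch of both calls (the variable names `d` and `dt` are only for illustration):

```r
library(lubridate)

d  <- today()  # current date, an object of class "Date"
dt <- now()    # current date-time, an object of class "POSIXct"

class(d)
class(dt)
```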

### Converting strings to dates and times
Frequently you will encounter date and/or time data stored as text. You will need to convert these data into the native R date classes in order to use date functions on them. The `mdy/ymd/dmy` functions accomplish this.
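For example, the following sketch parses the same day written in several ways. The input strings are made up for illustration; the function name tells `lubridate` the order of the year, month, and day:

```r
library(lubridate)

d1 <- ymd("2017-01-31")   # year, month, day
d2 <- mdy("01/31/2017")   # month, day, year
d3 <- dmy("31.01.2017")   # day, month, year
d4 <- ymd(20170131)       # unquoted numbers also work

# All four refer to the same calendar day
c(d1, d2, d3, d4)
```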

### Other languages
You might find yourself needing to parse dates written in other languages. Parsing will fail if the dates are in a language that differs from your system's language:

Fix this by specifying the *locale* option:
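A hedged sketch of the idea follows. It assumes a French locale named `fr_FR.UTF-8` is installed, which depends on your operating system; the `tryCatch` wrapper is only there so the code still runs (returning `NA`) on machines without that locale:

```r
library(lubridate)

french_date <- "1 janvier 2010"

# Assumption: the locale "fr_FR.UTF-8" exists on this system.
# If it does not, the warning or error is caught and NA is returned.
parsed <- tryCatch(
  dmy(french_date, locale = "fr_FR.UTF-8"),
  warning = function(w) as.Date(NA),
  error   = function(e) as.Date(NA)
)
parsed
```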

### Date-time parsers
There are also equivalent functions for parsing date-times:
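These add `_h`, `_hm`, or `_hms` to the date parser's name. A short sketch with made-up inputs:

```r
library(lubridate)

dt1 <- ymd_hms("2017-01-31 20:11:59")  # date plus hours, minutes, seconds
dt2 <- mdy_hm("01/31/2017 08:01")      # date plus hours and minutes

dt1
dt2
```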

### Making a date-time from components
We saw in the `flights` table that date information can be spread across multiple columns. The `make_date` and `make_datetime` functions can make dates from these:
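As a sketch, the scheduled departure can be assembled from the `year`, `month`, `day`, `hour`, and `minute` columns of `flights` (the new column names `sched_date` and `sched_dep` are chosen here for illustration):

```r
library(tidyverse)
library(lubridate)
library(nycflights13)

flights %>%
  select(year, month, day, hour, minute) %>%
  mutate(
    sched_date = make_date(year, month, day),                  # <date>
    sched_dep  = make_datetime(year, month, day, hour, minute) # <dttm>
  ) %>%
  head(3)
```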

#### Example
The `flights` table has scheduled as well as actual arrival and departure times. Let's create a date-time variable from the actual departure time:
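One way to do this, sketched below, relies on `dep_time` being stored as an integer in `HHMM` format (so 517 means 5:17). Integer division and the remainder split it into hours and minutes. The name `flights_dt` is an assumption made for this example:

```r
library(tidyverse)
library(lubridate)
library(nycflights13)

flights_dt <- flights %>%
  filter(!is.na(dep_time)) %>%   # cancelled flights have no departure time
  mutate(
    dep_time = make_datetime(
      year, month, day,
      dep_time %/% 100,          # hours:   517 %/% 100 is 5
      dep_time %% 100            # minutes: 517 %%  100 is 17
    )
  )

flights_dt %>% select(origin, dest, dep_time) %>% head(3)
```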

Now we can use built-in R commands to query and plot these data based on actual departure time:
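A self-contained sketch of such a query: it rebuilds the date-time version of `dep_time`, keeps the flights that left on January 1, 2013, and plots their distribution in 10-minute bins. The object name `p` is only for illustration:

```r
library(tidyverse)
library(lubridate)
library(nycflights13)

p <- flights %>%
  filter(!is.na(dep_time)) %>%
  mutate(dep_time = make_datetime(year, month, day,
                                  dep_time %/% 100, dep_time %% 100)) %>%
  # keep departures before midnight on January 2nd
  filter(dep_time < ymd("2013-01-02", tz = "UTC")) %>%
  ggplot(aes(dep_time)) +
  geom_freqpoly(binwidth = 600)  # 600 seconds = 10 minutes

p
```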

Notice that this command and the resulting graph look much nicer than if we did not use the date classes:
* The filter on `dep_time` looks more natural than `filter(day<2, month==1, year==2013)`.
* We can easily express the bin-width of 600 seconds in `geom_freqpoly(binwidth = 600)`. When you use date-times in a numeric context (like in a histogram), 1 means 1 second; for dates, 1 means 1 day.
* The plot x axis has nice readable labels.

### The epoch
UNIX systems sometimes represent time as "the number of seconds which have elapsed since midnight UTC on January 1, 1970." This date is known as "the epoch". So you may occasionally come across date-times that are just large integers, such as 1600000000:

To convert these to date format you can use `as_datetime`:
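For instance, with two illustrative values:

```r
library(lubridate)

# Zero seconds after the epoch is the epoch itself
as_datetime(0)

# 1.6 billion seconds after the epoch falls in September 2020
stamp <- as_datetime(1600000000)
stamp
```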

### Exercise
Alice was born on August 1, 1999. How many seconds old is Alice? (To the nearest million, say).

### Date-time components
The functions `year()`, `month()`, `mday()` (day of the month), `yday()` (day of the year), `wday()` (day of the week), `hour()`, `minute()`, and `second()` can extract components from dates and times:
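A short sketch on a made-up date-time (Friday, July 8, 2016). By default `wday()` counts from Sunday = 1:

```r
library(lubridate)

datetime <- ymd_hms("2016-07-08 12:34:56")

year(datetime)    # 2016
month(datetime)   # 7
mday(datetime)    # 8
yday(datetime)    # 190
wday(datetime)    # 6, a Friday when weeks start on Sunday
hour(datetime)    # 12
minute(datetime)  # 34
second(datetime)  # 56
```

Both `month()` and `wday()` also accept `label = TRUE` to return the name rather than the number.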

By extracting the minute component of the actual departure time, we uncover a surprising pattern:

On the other hand, when grouped by scheduled departure time the delays seem to be random:

As explained by the book, there is a bias in scheduled departure times towards nice round numbers:
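The summaries behind these comparisons can be sketched as follows: average delay by minute of actual departure, average delay by minute of scheduled departure, and the number of flights per scheduled minute. The helper `hhmm_to_datetime()` and the object names are hypothetical, introduced only for this example:

```r
library(tidyverse)
library(lubridate)
library(nycflights13)

# Hypothetical helper: convert an HHMM integer into a date-time
hhmm_to_datetime <- function(year, month, day, time) {
  make_datetime(year, month, day, time %/% 100, time %% 100)
}

flights_dt <- flights %>%
  filter(!is.na(dep_time), !is.na(dep_delay)) %>%
  mutate(
    dep_time       = hhmm_to_datetime(year, month, day, dep_time),
    sched_dep_time = hhmm_to_datetime(year, month, day, sched_dep_time)
  )

# Average delay by minute of the ACTUAL departure time
by_actual <- flights_dt %>%
  group_by(minute = minute(dep_time)) %>%
  summarise(avg_delay = mean(dep_delay), n = n())

# Average delay (and flight count) by minute of the SCHEDULED departure time
by_sched <- flights_dt %>%
  group_by(minute = minute(sched_dep_time)) %>%
  summarise(avg_delay = mean(dep_delay), n = n())

ggplot(by_actual, aes(minute, avg_delay)) + geom_line()
ggplot(by_sched,  aes(minute, avg_delay)) + geom_line()
ggplot(by_sched,  aes(minute, n))         + geom_line()
```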

#### As accessors
The component functions also work as accessors, meaning they can be used on the left-hand side of an assignment:
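For example, starting from a made-up date-time and changing it piece by piece:

```r
library(lubridate)

datetime <- ymd_hms("2016-07-08 12:34:56")

year(datetime)  <- 2020                 # now 2020-07-08 12:34:56
month(datetime) <- 1                    # now 2020-01-08 12:34:56
hour(datetime)  <- hour(datetime) + 1   # now 2020-01-08 13:34:56

datetime
```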

To return a new (date)time rather than modifying in place, you can use the `update` command:
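A sketch showing that the original object is left untouched:

```r
library(lubridate)

datetime <- ymd_hms("2016-07-08 12:34:56")

# Set several components at once; the result is a new date-time
changed <- update(datetime, year = 2020, month = 2, mday = 2, hour = 2)

changed   # 2020-02-02 02:34:56 UTC
datetime  # still 2016-07-08 12:34:56 UTC
```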

### Time spans
Time spans are the difference between two time points. These are represented in R by the `difftime` class:
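Subtracting two dates produces one. The dates below are arbitrary:

```r
library(lubridate)

span <- ymd("2021-03-01") - ymd("2021-01-01")

span         # Time difference of 59 days
class(span)  # "difftime"
```

A `difftime` picks its own units (seconds, minutes, hours, days, or weeks), which is what makes it awkward to compute with.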

Because it is usually simpler to reason about time differences in terms of a single number, `lubridate` also provides a `duration` class which is stored in terms of seconds:
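A brief sketch: `as.duration()` converts a `difftime`, and constructors such as `dseconds()`, `dminutes()`, `dhours()`, and `ddays()` build durations directly.

```r
library(lubridate)

span <- ymd("2021-03-01") - ymd("2021-01-01")  # a difftime of 59 days

dur <- as.duration(span)
dur                      # 59 days expressed as 5097600 seconds

# Constructors always store a number of seconds
as.numeric(dminutes(10)) # 600
as.numeric(ddays(1))     # 86400
```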

Arithmetic with durations works as you would expect:
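A few illustrative cases:

```r
library(lubridate)

# Durations can be multiplied by numbers and added to each other
two_days <- 2 * ddays(1)
one_day  <- dhours(12) + dhours(12)

two_days == ddays(2)  # TRUE
one_day  == ddays(1)  # TRUE

# They can also be added to dates
tomorrow <- today() + ddays(1)
tomorrow
```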

You have to be careful when adding durations to date-times. What is 1 day after October 31st at 1pm?
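The sketch below assumes the year 2020 and the `America/New_York` time zone; neither is stated above. In that year, daylight saving time ended overnight between October 31 and November 1, so exactly 86,400 seconds later the clock reads noon, not 1pm:

```r
library(lubridate)

# Assumption: year 2020, US Eastern time
one_pm <- ymd_hms("2020-10-31 13:00:00", tz = "America/New_York")

# A duration of one day is always exactly 86400 seconds
next_day <- one_pm + ddays(1)

one_pm    # 2020-10-31 13:00:00 EDT
next_day  # 2020-11-01 12:00:00 EST
```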

To prevent this sort of thing from happening, `lubridate` also offers objects called "periods":
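Periods work in "human" units such as days and months rather than a fixed number of seconds. Using the same illustrative date-time (assuming year 2020 and US Eastern time), adding a period of one day keeps the clock time at 1pm:

```r
library(lubridate)

# Assumption: year 2020, US Eastern time
one_pm <- ymd_hms("2020-10-31 13:00:00", tz = "America/New_York")

# days(1) is a period: "the same clock time on the next calendar day"
next_day <- one_pm + days(1)

next_day  # 2020-11-01 13:00:00 EST
```

Other period constructors include `seconds()`, `minutes()`, `hours()`, `weeks()`, `months()`, and `years()`.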

Similarly, periods have the expected behaviour if you add one year to a date in a leap year:
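A sketch using 2016, which was a leap year with 366 days. Adding a 365-day duration stops one day short of the next January 1st, while adding a period of one year lands on it:

```r
library(lubridate)

start <- ymd_hms("2016-01-01 00:00:00")

with_duration <- start + ddays(365)  # a fixed number of seconds
with_period   <- start + years(1)    # one calendar year

with_duration  # 2016-12-31 UTC
with_period    # 2017-01-01 UTC
```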

### Exercise
Jack is 20,000 days old today. What is Jack's birthday?

#### Example
Earlier in the semester we saw how some flights seem to have arrived before they departed:

This is because these are overnight flights. To fix this, we can now simply add one day to `arr_time`:
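One way to sketch this fix: build date-time versions of `dep_time` and `arr_time`, flag the flights whose arrival precedes their departure, and add a period of one day to those arrivals only. The helper `hhmm_to_datetime()` and the names `flights_dt`, `flights_fixed`, and `overnight` are assumptions made for this example:

```r
library(tidyverse)
library(lubridate)
library(nycflights13)

# Hypothetical helper: convert an HHMM integer into a date-time
hhmm_to_datetime <- function(year, month, day, time) {
  make_datetime(year, month, day, time %/% 100, time %% 100)
}

flights_dt <- flights %>%
  filter(!is.na(dep_time), !is.na(arr_time)) %>%
  mutate(
    dep_time = hhmm_to_datetime(year, month, day, dep_time),
    arr_time = hhmm_to_datetime(year, month, day, arr_time)
  )

# Flights that appear to arrive before they depart
n_before <- sum(flights_dt$arr_time < flights_dt$dep_time)

flights_fixed <- flights_dt %>%
  mutate(
    overnight = arr_time < dep_time,
    # days(1) for overnight flights, days(0) for all others
    arr_time  = arr_time + days(as.integer(overnight))
  )

n_after <- sum(flights_fixed$arr_time < flights_fixed$dep_time)

c(before = n_before, after = n_after)
```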

### Time zones
When we create a date-time, the default time zone is "UTC":
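The `tz()` function reports the time zone attached to a date-time, as in this small sketch:

```r
library(lubridate)

x <- ymd_hms("2020-10-31 13:00:00")

x      # printed with the "UTC" suffix
tz(x)  # "UTC"
```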

This is a standard time zone which is, for historical reasons, roughly equal to Greenwich Mean Time, the time in Greenwich, England (without daylight saving).

If your times are coming from a different time zone, you must specify it using the `tz=` option:
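For example, the same clock reading in two zones refers to two different instants. The zone `America/Detroit` is chosen here purely for illustration:

```r
library(lubridate)

x_utc <- ymd_hms("2020-10-31 13:00:00")                        # UTC by default
x_det <- ymd_hms("2020-10-31 13:00:00", tz = "America/Detroit")

tz(x_det)  # "America/Detroit"

# 1pm in Detroit happens four hours after 1pm UTC on this date
gap_hours <- as.numeric(difftime(x_det, x_utc, units = "hours"))
gap_hours
```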

The command `OlsonNames()` will list all the possible time zones:
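A short sketch of how the list can be inspected. The exact number of zones depends on your system's time zone database:

```r
zones <- OlsonNames()  # character vector of time zone names (base R)

length(zones)          # several hundred entries
head(zones)

# Check that a particular zone name is valid before using it
"America/Detroit" %in% zones

# The time zone R believes your computer is in
Sys.timezone()
```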