# Parsing Dates
These are the solutions to selected code blocks. You will have to run the code blocks to see what they do.

In [1]:
# Load libraries
library(lubridate)  # lubridate
library(tidyverse)  # tidyverse


Attaching package: ‘lubridate’

The following object is masked from ‘package:base’:

    date

── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.2.0     ✔ purrr   0.2.5
✔ tibble  2.1.3     ✔ dplyr   0.8.3
✔ tidyr   0.8.1     ✔ stringr 1.3.1
✔ readr   1.1.1     ✔ forcats 0.3.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ lubridate::as.difftime() masks base::as.difftime()
✖ lubridate::date()        masks base::date()
✖ dplyr::filter()          masks stats::filter()
✖ lubridate::intersect()   masks base::intersect()
✖ dplyr::lag()             mask

In [2]:
# Use library(help = "package")
# to see the functions
library(help = "lubridate")

## today()
The current date. 
### Usage
today(tzone = "")

In [3]:
# Execute today()
today()

## now()
The current time. 
### Usage
now(tzone = "")

In [4]:
# Execute now()
now()

[1] "2019-12-04 12:47:38 PST"
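
Both functions take an optional tzone argument, as the usage lines show. The following is a minimal sketch, assuming the time zone names "UTC" and "Asia/Tokyo" are available on your system; the variable names are only for illustration.

```r
library(lubridate)

# today() returns a Date; now() returns a POSIXct date-time
current_date <- today(tzone = "UTC")
current_time <- now(tzone = "UTC")

class(current_date)
class(current_time)

# The same instant can fall on a different calendar date in another zone
today(tzone = "Asia/Tokyo")
```

Leaving tzone at its default of "" uses the time zone of the machine running the code, which is why the output above is reported in PST.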

In [11]:
x <- c("09-01-01", "09-01-02", "09-01-03")
ymd(x)
x <- c("2009-01-01", "2009-01-02", "2009-01-03")
ymd(x)

x <- "2009/Nov"
ymd(x)

“All formats failed to parse. No formats found.”

## ymd()
Parse dates according to the order in which the year, month, and day elements appear in the input vector. These functions transform dates stored in character and numeric vectors to Date or POSIXct objects (see the tz argument). They recognize arbitrary non-digit separators as well as no separator. As long as the order of the elements is correct, they parse dates correctly even when the input vector contains differently formatted dates.
### Usage
* ymd(..., quiet = FALSE, tz = NULL, locale = Sys.getlocale("LC_TIME"),
  truncated = 0)

* ydm(..., quiet = FALSE, tz = NULL, locale = Sys.getlocale("LC_TIME"),
  truncated = 0)

* mdy(..., quiet = FALSE, tz = NULL, locale = Sys.getlocale("LC_TIME"),
  truncated = 0)

* myd(..., quiet = FALSE, tz = NULL, locale = Sys.getlocale("LC_TIME"),
  truncated = 0)

* dmy(..., quiet = FALSE, tz = NULL, locale = Sys.getlocale("LC_TIME"),
  truncated = 0)

* dym(..., quiet = FALSE, tz = NULL, locale = Sys.getlocale("LC_TIME"),
  truncated = 0)

* yq(..., quiet = FALSE, tz = NULL, locale = Sys.getlocale("LC_TIME"))
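
To make the description concrete, here is a small sketch of three behaviors mentioned above: mixed separators, the tz argument, and the quiet argument. The input strings are made up for illustration.

```r
library(lubridate)

# Same year-month-day order, three different separator styles
x <- c("2009-01-01", "2009/01/02", "20090103")
dates <- ymd(x)
dates
class(dates)

# Supplying tz returns a POSIXct date-time instead of a Date
dt <- ymd("2009-01-01", tz = "UTC")
class(dt)

# quiet = TRUE suppresses the parse-failure warning; the result is still NA
bad <- ymd("not a date", quiet = TRUE)
is.na(bad)
```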

In [5]:
# Create some dates

# Typical ANSI standard date
yyyy_mm_dd <- "2008-12-22"

# Text short month and underscore delimiter
yyyy_MM_dd <- "2011_Jun_09"

# Text full month and day with comma separation
MMM_DD_yyyy <- "November 16th, 1970"  

# Mix of - and _
dd_mm_yyyy <- "30-07_1971" 

# This is a number not a string
yyyymmdd <- 19701116  

# Create your own custom date string to try
my_custom_date <- "2008/02/29"

# Create data frame df_date
# Column birthdate_str with all the dates above
df_date <- tibble(birthdate_str = c(yyyy_mm_dd, yyyy_MM_dd,
                                    MMM_DD_yyyy, dd_mm_yyyy,
                                    yyyymmdd, my_custom_date))

# Glimpse result
glimpse(df_date)

Observations: 6
Variables: 1
$ birthdate_str <chr> "2008-12-22", "2011_Jun_09", "November 16th, 1970", "30…


In [6]:
# Create birthdate columns in df
# birthdate_ymd using ymd()
# birthdate_mdy using mdy()
# birthdate_dmy using dmy()
df_date <- df_date %>% mutate(birthdate_ymd = ymd(birthdate_str),
                   birthdate_mdy = mdy(birthdate_str),
                   birthdate_dmy = dmy(birthdate_str))

# Display result
df_date


“ 4 failed to parse.”

| birthdate_str | birthdate_ymd | birthdate_mdy | birthdate_dmy |
|---|---|---|---|
| <chr> | <date> | <date> | <date> |
| 2008-12-22 | 2008-12-22 | NA | NA |
| 2011_Jun_09 | 2011-06-09 | NA | 2009-11-20 |
| November 16th, 1970 | NA | 1970-11-16 | NA |
| 30-07_1971 | NA | NA | 1971-07-30 |
| 19701116 | 1970-11-16 | NA | NA |
| 2008/02/29 | 2008-02-29 | NA | NA |


Were all the dates able to be parsed by at least one method?

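* Delimiters such as /, -, and _, or a mix of them, are supported
* Months and days can be numerical or text
* The most important requirement is that the order of year, month, and day is consistent and known ahead of time
* A string can also parse under the wrong order without a warning: in the table above, dmy() turned "2011_Jun_09" into 2009-11-20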
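
When a column really does mix orders, one possible approach is to try each parser in turn and keep the first result that is not NA. This is only a sketch using dplyr's coalesce(), not part of the exercise; the strings are taken from the cell above.

```r
library(lubridate)
library(dplyr)

# Strings from the exercise above, each in a different element order
x <- c("2008-12-22", "November 16th, 1970", "30-07_1971")

# Try each parser in turn and keep the first non-NA result per element.
# quiet = TRUE silences the expected "failed to parse" warnings.
parsed <- coalesce(
  ymd(x, quiet = TRUE),
  mdy(x, quiet = TRUE),
  dmy(x, quiet = TRUE)
)
parsed
```

The order of the parsers matters here: an ambiguous string is claimed by the first parser that accepts it, so this fallback is only safe when you know which order is most likely.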

In [7]:
# Create some timestamps

# ymd hms
ymd_hms <- "2017-10-04 15:22:06"

# mdy hms with timezone
mdy_hms_timezone <- "10/04/2017 15:22:06 PDT"

# ymd hms fraction of seconds
ymd_hms_ms <- "2017-10-04 15:22:06.123456789"

# ymd hms am pm
ymd_hms_pm <- "2017-10-04 3:22:06 PM"

# hms
hms <- "15:22:06"

# ymd_hm
ymd_hm <- "2017-10-04 15:22"

# Create your own custom time string to try
my_custom_time <- "2017-10-04T15:22:06"

# Create data frame df_time
# Column timestamp_str with all the times above
df_time <- tibble(timestamp_str = c(ymd_hms, mdy_hms_timezone,
                                    ymd_hms_ms, ymd_hms_pm, hms,
                                    ymd_hm, my_custom_time))

# Glimpse result
glimpse(df_time)

Observations: 7
Variables: 1
$ timestamp_str <chr> "2017-10-04 15:22:06", "10/04/2017 15:22:06 PDT", "2017…


In [8]:
# Create timestamp columns in df_time
# timestamp_ymd_hms using ymd_hms()
# timestamp_mdy_hms using mdy_hms()
# timestamp_ymd_hm using ymd_hm()
# timestamp_hms using hms()
df_time <- df_time %>% mutate(timestamp_ymd_hms = ymd_hms(timestamp_str),
                   timestamp_mdy_hms = mdy_hms(timestamp_str),
                   timestamp_ymd_hm = ymd_hm(timestamp_str),
                   timestamp_hms = hms(timestamp_str))

# Display result
df_time

“Some strings failed to parse, or all strings are NAs”

| timestamp_str | timestamp_ymd_hms | timestamp_mdy_hms | timestamp_ymd_hm | timestamp_hms |
|---|---|---|---|---|
| <chr> | <dttm> | <dttm> | <dttm> | <Period> |
| 2017-10-04 15:22:06 | 2017-10-04 15:22:06 | NA | NA | NA |
| 10/04/2017 15:22:06 PDT | NA | 2017-10-04 15:22:06 | NA | NA |
| 2017-10-04 15:22:06.123456789 | 2017-10-04 15:22:06 | NA | NA | NA |
| 2017-10-04 3:22:06 PM | 2017-10-04 15:22:06 | NA | NA | NA |
| 15:22:06 | NA | NA | NA | 15H 22M 6S |
| 2017-10-04 15:22 | NA | NA | 2017-10-04 15:22:00 | NA |
| 2017-10-04T15:22:06 | 2017-10-04 15:22:06 | NA | NA | NA |


Were all the times able to be parsed by at least one method?

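* A trailing time zone abbreviation such as PDT does not stop the string from parsing; use the tz argument to set the time zone of the result
* Handles AM / PM and 24-hour times
* The most important requirement is that the order of year, month, day, hours, minutes, and seconds is consistent and known ahead of time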
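
The points above can be sketched as follows. The zone name "America/Los_Angeles" is an assumption chosen to match the PDT example; with_tz() is lubridate's function for viewing the same instant in another zone.

```r
library(lubridate)

# Tell the parser which zone the wall-clock time belongs to
local_time <- ymd_hms("2017-10-04 15:22:06", tz = "America/Los_Angeles")
tz(local_time)

# View the same instant in UTC; Pacific Daylight Time is 7 hours behind UTC
utc_time <- with_tz(local_time, tzone = "UTC")
hour(utc_time)

# AM/PM input and 24-hour input describe the same instant
ymd_hms("2017-10-04 3:22:06 PM") == ymd_hms("2017-10-04 15:22:06")
```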

## year(), month(), day()
Get or set the year, month, or day component of a date-time.
### Usage
year(x)

month(x, label = FALSE, abbr = TRUE)

day(x)

mday(x)

wday(x, label = FALSE, abbr = TRUE)

qday(x)

yday(x)

In [9]:
# Create df_date_components from df_date
# Create columns for year, month, and day
# from df_date$birthdate_ymd
# birthdate_year use year()
# birthdate_month use month()
# birthdate_day use day()
df_date_components <- df_date %>% 
   mutate(birthdate_year = year(birthdate_ymd),
         birthdate_month = month(birthdate_ymd),
         birthdate_day = day(birthdate_ymd))

# Display df_date_components
df_date_components


| birthdate_str | birthdate_ymd | birthdate_mdy | birthdate_dmy | birthdate_year | birthdate_month | birthdate_day |
|---|---|---|---|---|---|---|
| <chr> | <date> | <date> | <date> | <dbl> | <dbl> | <int> |
| 2008-12-22 | 2008-12-22 | NA | NA | 2008 | 12 | 22 |
| 2011_Jun_09 | 2011-06-09 | NA | 2009-11-20 | 2011 | 6 | 9 |
| November 16th, 1970 | NA | 1970-11-16 | NA | NA | NA | NA |
| 30-07_1971 | NA | NA | 1971-07-30 | NA | NA | NA |
| 19701116 | 1970-11-16 | NA | NA | 1970 | 11 | 16 |
| 2008/02/29 | 2008-02-29 | NA | NA | 2008 | 2 | 29 |


Use year(), month(), and day() to split a date into its components.
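
The other accessors in the usage list work the same way. A brief sketch on a single date, including the label argument and the setter form (the variable names d and d2 are only for illustration):

```r
library(lubridate)

d <- ymd("2008-12-22")

# Numeric components
year(d)
month(d)
day(d)
yday(d)   # day of the year
wday(d)   # day of the week as a number

# label = TRUE returns an ordered factor of names instead of numbers
month(d, label = TRUE)
wday(d, label = TRUE, abbr = FALSE)

# The same functions can set a component
d2 <- d
year(d2) <- 2010
d2
```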

## make_date()
Efficient creation of date-times from numeric representations.
### Usage
make_datetime(year = 1970L, month = 1L, day = 1L, hour = 0L, min = 0L,
  sec = 0, tz = "UTC")

make_date(year = 1970L, month = 1L, day = 1L)

In [10]:
# Create date from separate year, month, day numeric columns
# Use df_date_components and create df_date_comp2
# birthdate_year, birthdate_month, birthdate_day
# Create new column, birthdate_comp from above columns
df_date_comp2 <- df_date_components %>%
   mutate(birthdate_comp = make_date(birthdate_year, birthdate_month, birthdate_day))
          
# Display df_date_comp2
df_date_comp2


| birthdate_str | birthdate_ymd | birthdate_mdy | birthdate_dmy | birthdate_year | birthdate_month | birthdate_day | birthdate_comp |
|---|---|---|---|---|---|---|---|
| <chr> | <date> | <date> | <date> | <dbl> | <dbl> | <int> | <date> |
| 2008-12-22 | 2008-12-22 | NA | NA | 2008 | 12 | 22 | 2008-12-22 |
| 2011_Jun_09 | 2011-06-09 | NA | 2009-11-20 | 2011 | 6 | 9 | 2011-06-09 |
| November 16th, 1970 | NA | 1970-11-16 | NA | NA | NA | NA | NA |
| 30-07_1971 | NA | NA | 1971-07-30 | NA | NA | NA | NA |
| 19701116 | 1970-11-16 | NA | NA | 1970 | 11 | 16 | 1970-11-16 |
| 2008/02/29 | 2008-02-29 | NA | NA | 2008 | 2 | 29 | 2008-02-29 |


Do the dates in birthdate_comp match the dates in birthdate_ymd?
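
One way to answer the question in code rather than by eye is to compare the rebuilt column with the original. The sketch below uses a small stand-in tibble (df, rebuilt, and same are hypothetical names) instead of df_date_comp2 so that it runs on its own.

```r
library(lubridate)
library(dplyr)

df <- tibble(birthdate = ymd(c("2008-12-22", "1970-11-16", "2008-02-29")))

# Split each date into components, rebuild it, and compare with the original
round_trip <- df %>%
  mutate(
    rebuilt = make_date(year(birthdate), month(birthdate), day(birthdate)),
    same = rebuilt == birthdate
  )
round_trip

# make_datetime() also takes hour, min, and sec; its default tz is "UTC"
make_datetime(2017, 10, 4, 15, 22, 6)
```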

## as_date(), as_datetime()
Convert an object to a date or date-time. 
### Usage
as_date(x, ...)

as_datetime(x, ...)

In [11]:
# Convert from datetime to date
# Create df_time_date from df_time
# Add column birthdate
# from column timestamp_ymd_hms
# Hint: use as_date()
df_time_date <- df_time %>% 
   mutate(birthdate = as_date(timestamp_ymd_hms))

# Glimpse df_time_date
glimpse(df_time_date)

Observations: 7
Variables: 6
$ timestamp_str     <chr> "2017-10-04 15:22:06", "10/04/2017 15:22:06 PDT", "…
$ timestamp_ymd_hms <dttm> 2017-10-04 15:22:06, NA, 2017-10-04 15:22:06, 2017…
$ timestamp_mdy_hms <dttm> NA, 2017-10-04 15:22:06, NA, NA, NA, NA, NA
$ timestamp_ymd_hm  <dttm> NA, NA, NA, NA, NA, 2017-10-04 15:22:00, NA
$ timestamp_hms     <Period> NA, NA, NA, NA, 15H 22M 6S, NA, NA
$ birthdate         <date> 2017-10-04, NA, 2017-10-04, 2017-10-04, NA, NA, 20…


In [12]:
# Convert from date to datetime
# Create df_date_time from df_date
# Add column birthdatetime
# from column birthdate_ymd
# Hint: Use as_datetime()
df_date_time <- df_date %>% 
   mutate(birthdatetime = as_datetime(birthdate_ymd))

# Glimpse df_date_time
glimpse(df_date_time)

Observations: 6
Variables: 5
$ birthdate_str <chr> "2008-12-22", "2011_Jun_09", "November 16th, 1970", "30…
$ birthdate_ymd <date> 2008-12-22, 2011-06-09, NA, NA, 1970-11-16, 2008-02-29
$ birthdate_mdy <date> NA, NA, 1970-11-16, NA, NA, NA
$ birthdate_dmy <date> NA, 2009-11-20, NA, 1971-07-30, NA, NA
$ birthdatetime <dttm> 2008-12-22, 2011-06-09, NA, NA, 1970-11-16, 2008-02-29


* Notice the data types from glimpse().
* The conversion to date produces a <date> column.
* The conversion to date-time produces a <dttm> column.
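
Outside a data frame, the same type change can be checked with class(). A minimal sketch on single values (the variable names are hypothetical):

```r
library(lubridate)

# Date-time to date: the time of day is dropped
stamp_time <- ymd_hms("2017-10-04 15:22:06")
only_date <- as_date(stamp_time)
class(only_date)

# Date to date-time: the time of day is filled in as midnight UTC
birth_date <- ymd("2008-12-22")
birth_datetime <- as_datetime(birth_date)
class(birth_datetime)
hour(birth_datetime)
```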

# Summary
lubridate includes several powerful parsing functions that convert date and time strings into date or date-time data types.