# Date Manipulation

This notebook shows how to convert dates stored as text into date objects, and then how to extract components such as the day of the week.

#### Step 1: Import a data set using the readr package.

In [2]:
library(readr)

health <- read_csv("data/health-app-data-subset.csv")

Parsed with column specification:
cols(
  Start = col_character(),
  Finish = col_character(),
  `Active Calories (kcal)` = col_integer(),
  `Body Fat Percentage (%)` = col_integer(),
  `Body Mass Index (count)` = col_integer(),
  `Dietary Calories (cal)` = col_integer(),
  `Distance (mi)` = col_double(),
  `Flights Climbed (count)` = col_integer(),
  `Heart Rate (count/min)` = col_integer(),
  `Steps (count)` = col_double()
)


In [3]:
summary(health)

    Start              Finish          Active Calories (kcal)
 Length:1110        Length:1110        Min.   :0             
 Class :character   Class :character   1st Qu.:0             
 Mode  :character   Mode  :character   Median :0             
                                       Mean   :0             
                                       3rd Qu.:0             
                                       Max.   :0             
                                       NA's   :1081          
 Body Fat Percentage (%) Body Mass Index (count) Dietary Calories (cal)
 Min.   :0               Min.   :0               Min.   :0             
 1st Qu.:0               1st Qu.:0               1st Qu.:0             
 Median :0               Median :0               Median :0             
 Mean   :0               Mean   :0               Mean   :0             
 3rd Qu.:0               3rd Qu.:0               3rd Qu.:0             
 Max.   :0               Max.   :0               Max.   :0             


The `summary()` output shows that the Start and Finish columns, which hold dates, were read as character strings. Text cannot be used for date calculations, so we first need to convert these columns to a date type. Once converted, we can derive other useful values from them, such as the day of the week.
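
A quick way to confirm this without reading the full summary is to ask R for the class of each column. The sketch below uses a small hypothetical data frame (`toy`, with made-up values in an assumed timestamp layout) rather than the real file, whose exact date layout is not shown here.

```r
# Hypothetical stand-in for the imported data; values and layout are assumptions.
toy <- data.frame(
  Start  = c("2017-03-14 08:00", "2017-03-14 09:00"),
  Finish = c("2017-03-14 09:00", "2017-03-14 10:00"),
  Steps  = c(120, 450),
  stringsAsFactors = FALSE
)

# One class per column: text columns report "character", not "Date" or "POSIXct".
col_classes <- sapply(toy, class)
print(col_classes)

# TRUE only once the column has been converted to a date or date-time type.
is_date_like <- inherits(toy$Start, c("Date", "POSIXct"))
print(is_date_like)
```

The same `sapply(health, class)` call should work on the imported data frame.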

#### Step 2: Use the parsedate package to convert the character strings into dates that can be worked on further.

In [4]:
install.packages("parsedate")
library(parsedate)

# Call parsedate's version explicitly, because it masks readr::parse_date.
health$date <- parsedate::parse_date(health$Start)

summary(health)

Updating HTML index of packages in '.Library'
Making 'packages.html' ... done

Attaching package: ‘parsedate’

The following object is masked from ‘package:readr’:

    parse_date



ERROR: Error in seq_len(sum(positive)): argument must be coercible to non-negative integer


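The `summary(health)` call is meant to show the format of the new column, `date`. In the run recorded above, however, `parse_date()` stopped with an error, so `date` was never added. Base R's `seq_len()` reports this message when it is given `NA`, so missing values in `Start` are one possible cause.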
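
Before parsing, it can help to count the entries a parser cannot use. The sketch below is only an illustration: it swaps in base R's `as.POSIXct()` for parsedate, and it runs on a hypothetical vector `start_raw` in an assumed `YYYY-MM-DD HH:MM` layout, which may not match the real file.

```r
# Hypothetical raw timestamps; the layout and the gaps are assumptions.
start_raw <- c("2017-03-14 08:00", NA, "", "2017-03-15 21:30")

# Count entries a parser cannot use: missing values and empty strings.
n_missing <- sum(is.na(start_raw) | !nzchar(start_raw))
print(n_missing)

# Parse with an explicit format; entries that do not match become NA
# instead of stopping the whole call with an error.
parsed <- as.POSIXct(start_raw, format = "%Y-%m-%d %H:%M", tz = "UTC")
print(parsed)

# Positions that still need attention after parsing.
print(which(is.na(parsed)))
```

With an explicit format, unusable entries surface as `NA` values that can be inspected or filtered out, rather than as a failure of the entire column.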

#### Step 3: Transform dates into forms like day of week, month, day, and year.

The cell below loads the lubridate package, although the conversions themselves use base R's `as.Date()` and `format()`. We can then use these pieces of data to create graphs and look for patterns based on time period.

In [5]:
install.packages("lubridate")
library(lubridate)

health$newdate <- as.Date(health$date)
health$dayofweek <- format(health$newdate, "%A")
health$month <- format(health$newdate, "%B")
health$day <- format(health$newdate, "%d")
health$year <- format(health$newdate, "%Y")

summary(health)

Updating HTML index of packages in '.Library'
Making 'packages.html' ... done

Attaching package: ‘lubridate’

The following object is masked from ‘package:base’:

    date

“Unknown or uninitialised column: 'date'.”

ERROR: Error in as.Date.default(health$date): do not know how to convert 'health$date' to class “Date”
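
Because `health$date` does not exist in this run, the cell above cannot show its results. To illustrate what each `format()` code returns, here is a minimal sketch on a hypothetical `Date` vector `d`. The names produced by `%A` and `%B` depend on the session locale.

```r
# Hypothetical dates standing in for health$newdate.
d <- as.Date(c("2017-03-14", "2017-12-25"))

parts <- data.frame(
  dayofweek = format(d, "%A"),  # full weekday name (locale-dependent)
  month     = format(d, "%B"),  # full month name (locale-dependent)
  day       = format(d, "%d"),  # zero-padded day of month, as text
  year      = format(d, "%Y"),  # four-digit year, as text
  stringsAsFactors = FALSE
)
print(parts)

# format() always returns character; convert when a number is needed.
day_num <- as.integer(parts$day)
print(day_num)
```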


In [None]:
head(health)

By viewing the data frame, the new columns at the end show what we did in this exercise: transform text into a date format, and then split that date into different periods of time, like days or months, for further analysis.
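
lubridate also provides accessor functions for the same components. The sketch below applies them to a hypothetical `Date` vector `d`; it is one possible alternative to the `format()` calls, not the method used in the cells above. Unlike `format()`, these accessors return numbers or ordered factors, which tend to sort and group more naturally in graphs.

```r
library(lubridate)

# Hypothetical dates standing in for health$newdate.
d <- as.Date(c("2017-03-14", "2017-12-25"))

dow <- wday(d, label = TRUE, abbr = FALSE)   # ordered factor of weekday names
mon <- month(d, label = TRUE, abbr = FALSE)  # ordered factor of month names
dd  <- day(d)                                # day of month as a number
yy  <- year(d)                               # year as a number

print(data.frame(dow, mon, dd, yy))
```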