Padding of missing records in time series
Latest commit a499e80 Jun 26, 2018

padr

padr is an R package that helps prepare time series data for analysis. It provides two main functions. When observations are recorded at a finer interval than the analysis needs (for example, timestamps to the second when daily totals are wanted), thicken adds a column of a higher interval to the data frame, which the user can then group by and aggregate. When records are missing for time points at which nothing was observed, pad inserts these records. A number of fill_ functions then help fill the resulting missing values.
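As a minimal sketch of the two functions in isolation, the snippet below runs thicken and pad on a small made-up data frame. The data, the column names ts and value, and the explicit colname are assumptions chosen for illustration. The interval is passed to pad explicitly because, with only two hourly rows two hours apart, letting pad infer the interval might not give an hourly result.

```r
library(padr)

# Hypothetical data: three observations spread over about two hours.
obs <- data.frame(
  ts = as.POSIXct(
    c("2020-01-01 08:10:00", "2020-01-01 08:50:00", "2020-01-01 10:05:00"),
    tz = "UTC"
  ),
  value = c(1, 2, 3)
)

# thicken() adds a column at the coarser interval (rounding down by default).
# The column name is set explicitly here to avoid relying on the default.
thickened <- thicken(obs, interval = "hour", colname = "ts_hour")

# Aggregate to one row per hour with base R.
hourly <- aggregate(value ~ ts_hour, data = thickened, FUN = sum)

# pad() inserts a row for the hour without observations (09:00);
# its value is NA until it is filled.
padded <- pad(hourly, interval = "hour")
print(padded)
```

The padded row carries NA in value, which is where the fill_ functions come in.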

Usage

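library(padr)
library(tidyverse)  # for %>% and the dplyr verbs used below

# Four coffee purchases; nothing was bought on 2016-07-08
coffee <- data.frame(
  time_stamp = as.POSIXct(c(
    '2016-07-07 09:11:21', '2016-07-07 09:46:48',
    '2016-07-09 13:25:17',
    '2016-07-10 10:45:11'
  )),
  amount = c(3.14, 2.98, 4.11, 3.14)
)

coffee %>%
  thicken('day') %>%                              # adds the column time_stamp_day
  dplyr::group_by(time_stamp_day) %>%
  dplyr::summarise(day_amount = sum(amount)) %>%  # total amount per day
  pad() %>%                                       # inserts the missing 2016-07-08
  fill_by_value(day_amount, value = 0)            # replaces the padded NA with 0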
## # A tibble: 4 × 2
##   time_stamp_day day_amount
##           <date>      <dbl>
## 1     2016-07-07       6.12
## 2     2016-07-08       0.00
## 3     2016-07-09       4.11
## 4     2016-07-10       3.14
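Filling with a constant is not the only option. The sketch below, on a made-up daily data frame that mirrors the aggregated result above, compares fill_by_value with fill_by_function, which replaces missing values with a function of the non-missing ones. The data frame and its column names are assumptions for illustration.

```r
library(padr)

# Hypothetical daily series with one missing day (2016-07-08).
daily <- data.frame(
  day = as.Date(c("2016-07-07", "2016-07-09", "2016-07-10")),
  day_amount = c(6.12, 4.11, 3.14)
)

# Insert the missing day; day_amount is NA for the new row.
padded <- pad(daily, interval = "day")

# Option 1: fill with a constant.
filled_zero <- fill_by_value(padded, day_amount, value = 0)

# Option 2: fill with the mean of the observed (non-missing) values.
filled_mean <- fill_by_function(padded, day_amount, fun = mean)

print(filled_zero)
print(filled_mean)
```

Which fill is appropriate depends on what a missing record means: 0 suits counts or amounts where no record means nothing happened, while a summary value may suit measurements that simply were not taken.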

More information

See the general introduction vignette for more examples. The implementation details vignette describes how padr handles different time zones and daylight saving time.