Daff: diff, patch and merge for data.frames
daff is an R package that finds differences in values between
data.frames, stores those differences, renders them, and applies them to patch a
data.frame. It can also merge two versions of a
data.frame that share a common parent.
It wraps the daff.js library using the V8 package.
The diff format is described in http://dataprotocols.org/tabular-diff-format.
- write/read diff: write_diff, read_diff
- render to HTML: render_diff
- merge two tables that derive from the same version: merge_data
To do:
- add htmlwidgets
- implement extra parameters for
- make column type changes explicit (is now internally available)
- see if daff can be implemented in C++, using the Haxe C++ target of daff: this would remove the V8/jsonlite dependency
Install from CRAN
The latest version of
daff can be installed with
Calculate the difference between a reference and a changed data.frame:
    library(daff)
    y <- iris[1:3,]
    x <- y

    x <- head(x,2)      # remove a row
    x[1,1] <- 10        # change a value
    x$hello <- "world"  # add a column
    x$Species <- NULL   # remove a column

    patch <- diff_data(y, x)

    # write a patch to disk
    write_diff(patch, "patch.csv")
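As a sanity check on what the diff should report, the same edits can be cross-checked with base R alone. This is only a rough sketch and not how daff computes its diff: it assumes rows can be aligned by row name, and the variable names (added_cols, removed_cols, removed_rows, changed) are made up for illustration.

```r
# Base R only: rebuild the reference (y) and the changed table (x).
y <- iris[1:3, ]
x <- head(y, 2)       # remove a row
x[1, 1] <- 10         # change a value
x$hello <- "world"    # add a column
x$Species <- NULL     # remove a column

# Structural changes, found by comparing column and row names.
added_cols   <- setdiff(names(x), names(y))
removed_cols <- setdiff(names(y), names(x))
removed_rows <- setdiff(rownames(y), rownames(x))

# Cell-level changes in the columns and rows both tables still share.
# Assumption: row names identify the same record in both tables.
shared  <- intersect(names(x), names(y))
changed <- which(x[, shared] != y[rownames(x), shared], arr.ind = TRUE)

added_cols
removed_cols
removed_rows
changed   # one (row, col) pair per modified cell
```

The rendered diff for the example above should show the same four edits: one added column, one removed column, one removed row and one modified cell.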
render_diff(patch) renders the diff as an HTML page.
Patch a data.frame using a diff generated with diff_data:
    # read a diff from disk
    patch <- read_diff("patch.csv")

    # apply patch
    y_patched <- patch_data(y, patch)
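Writing the diff to disk is optional. The following minimal sketch, which uses only diff_data and patch_data, applies a diff directly from memory and then prints both tables for comparison. It makes no assumption that column types survive the round trip, so it is worth inspecting them.

```r
library(daff)

y <- iris[1:3, ]
x <- y
x[1, 1] <- 10                      # change a single value

patch <- diff_data(y, x)           # the diff stays in memory, no file is written
y_patched <- patch_data(y, patch)  # apply it to the reference table

# Compare the patched table with the target, including column types.
str(y_patched)
str(x)
```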
Merge two data.frames that have diverged from a common parent:
    parent <- a <- b <- iris[1:3,]
    a[1,1] <- 10
    b[2,1] <- 11
    # successful merge
    merge_data(parent, a, b)

    parent <- a <- b <- iris[1:3,]
    a[1,1] <- 10
    b[1,1] <- 11
    # conflicting merge (both a and b change same cell)
    merged <- merge_data(parent, a, b)
    merged # note the conflict

    # find out which rows contain a conflict
    which_conflicts(merged)
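The idea behind a three-way merge can be sketched in base R for a single column. This is an illustration of the general rule, not daff's implementation: merge_cells is a hypothetical helper, it assumes equal-length vectors without NA values, and it arbitrarily keeps a's value where both sides changed the same cell differently.

```r
# Hypothetical helper (not part of daff): three-way merge of one column.
# A cell changed on only one side takes that side's value; a cell changed
# differently on both sides is flagged as a conflict.
merge_cells <- function(parent, a, b) {
  stopifnot(length(parent) == length(a), length(parent) == length(b))
  changed_a <- a != parent
  changed_b <- b != parent
  conflict  <- changed_a & changed_b & (a != b)

  merged <- parent
  merged[changed_a] <- a[changed_a]    # take a's edits (also in conflicts)
  only_b <- changed_b & !changed_a
  merged[only_b] <- b[only_b]          # take edits made only in b
  list(merged = merged, conflict = conflict)
}

parent <- c(5.1, 4.9, 4.7)
a <- parent; a[1] <- 10
b <- parent; b[2] <- 11
clean <- merge_cells(parent, a, b)     # edits touch different cells
clean$merged

b2 <- parent; b2[1] <- 11
clash <- merge_cells(parent, a, b2)    # both sides edit the first cell
which(clash$conflict)
```

The two calls mirror the successful and conflicting merges above: the first combines both edits, the second flags the first cell.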