Diff, patch and merge for data.frames, see http://paulfitz.github.io/daff/

Daff: diff, patch and merge for data.frames

daff is an R package that finds differences in values between data.frames, stores these differences, renders them, and applies them to patch a data.frame. It can also merge two versions of a data.frame that share a common parent. It wraps the daff.js library using the V8 package.

The diff format is described in http://dataprotocols.org/tabular-diff-format.


Working:

  • diff: diff_data
  • patch: patch_data
  • write/read diff: read_diff and write_diff
  • render to html: render_diff
  • merge two tables derived from a common parent version: merge_data

TODO:

  • add htmlwidgets
  • implement extra parameters for diff_data: ids, ignore etc.
  • make column type changes explicit (currently only available internally)
  • see if daff can be implemented in C++, using the Haxe C++ target of daff: this would remove the V8/jsonlite dependency

Install

Install from CRAN

install.packages('daff')

The latest development version of daff can be installed with devtools:

devtools::install_github("edwindj/daff")

Usage

diff_data

Calculate the difference between a reference data.frame and a changed data.frame.

library(daff)
y <- iris[1:3,]
x <- y

x <- head(x,2) # remove a row
x[1,1] <- 10 # change a value
x$hello <- "world"  # add a column
x$Species <- NULL # remove a column

patch <- diff_data(y, x)

# write a patch to disk
write_diff(patch, "patch.csv")
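
Since write_diff stores the diff as CSV, the result can be inspected with base R. The following is a minimal sketch rather than part of the package documentation; the names patch_file and raw_diff are illustrative, and a temporary file is used instead of "patch.csv" so the working directory stays clean.

```r
library(daff)

# Rebuild the reference and changed data.frames from the example above
y <- iris[1:3, ]
x <- head(y, 2)        # remove a row
x[1, 1] <- 10          # change a value
x$hello <- "world"     # add a column
x$Species <- NULL      # remove a column

patch <- diff_data(y, x)

# Write the diff to a temporary file (illustrative path)
patch_file <- tempfile(fileext = ".csv")
write_diff(patch, patch_file)

# Read the stored diff back as plain text fields, without treating
# the first row as a header, to see the raw tabular diff format
raw_diff <- read.csv(patch_file, header = FALSE, stringsAsFactors = FALSE)
print(raw_diff)
```

The first column of the printed table holds the action markers described in the tabular diff format linked above.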

render_diff(patch) generates an HTML page showing the diff.

[Figure: HTML output of render_diff]
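
To keep the rendered page instead of only viewing it, the HTML can be written to a file. This sketch assumes that render_diff accepts `file` and `view` arguments, which is not stated in this README; check ?render_diff for the installed version. The name html_file is illustrative.

```r
library(daff)

# A small diff to render: one changed value
y <- iris[1:3, ]
x <- y
x[1, 1] <- 10
patch <- diff_data(y, x)

# Assumption: `file` sets the output path and `view = FALSE`
# suppresses opening the page in a browser or viewer
html_file <- tempfile(fileext = ".html")
render_diff(patch, file = html_file, view = FALSE)

html_file  # path of the generated HTML page
```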

patch_data

Patch a data.frame using a diff generated with diff_data.

# read a diff from disk
patch <- read_diff("patch.csv")

# apply patch
y_patched <- patch_data(y, patch)
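
One way to check a patch is to apply it to the reference data.frame and compare the result with the changed data.frame it was computed from. The sketch below is illustrative only: it builds the patch in memory rather than reading it from disk, and the name `check` is not part of the package.

```r
library(daff)

# Reference and changed data.frames
y <- iris[1:3, ]
x <- head(y, 2)   # remove a row
x[1, 1] <- 10     # change a value

# Compute the diff and apply it to the reference
patch <- diff_data(y, x)
y_patched <- patch_data(y, patch)

# Inspect both side by side
print(y_patched)
print(x)

# Diff the patched result against the target; any remaining
# differences would show up when this object is printed or rendered
check <- diff_data(y_patched, x)
print(check)
```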

merge_data

Merge two data.frames that have diverged from a common parent data.frame.

parent <- a <- b <- iris[1:3,]
a[1,1] <- 10
b[2,1] <- 11
# successful merge
merge_data(parent, a, b)

parent <- a <- b <- iris[1:3,]
a[1,1] <- 10
b[1,1] <- 11
# conflicting merge (both a and b change same cell)
merged <- merge_data(parent, a, b)
merged # note the conflict

# find out which rows contain a conflict
which_conflicts(merged)
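
The result of which_conflicts can be used to pull out the conflicting rows for manual review. This is a minimal sketch that assumes which_conflicts returns row indices into the merged data.frame; the name conflict_rows is illustrative.

```r
library(daff)

# Both a and b change the same cell of the parent
parent <- a <- b <- iris[1:3, ]
a[1, 1] <- 10
b[1, 1] <- 11

merged <- merge_data(parent, a, b)

# Assumption: which_conflicts returns the row indices with conflicts
conflict_rows <- which_conflicts(merged)

if (length(conflict_rows) > 0) {
  # Show only the rows that need manual resolution
  print(merged[conflict_rows, ])
} else {
  message("Merge completed without conflicts")
}
```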