removed dependency on 'settings'
markvanderloo committed Nov 13, 2015
1 parent f482cd4 commit 93c5a6c
Showing 3 changed files with 26 additions and 6 deletions.
2 changes: 1 addition & 1 deletion build/DESCRIPTION
@@ -15,6 +15,6 @@ Depends: R (>= 3.2.0)
URL: https://github.com/data-cleaning/validate.modify
BugReports: https://github.com/data-cleaning/validate.modify/issues
Date: 2015-09-10
Imports: methods, settings, yaml, validate
Imports: methods, yaml, validate
Suggests: testthat, knitr
VignetteBuilder: knitr
1 change: 0 additions & 1 deletion pkg/R/modify.R
@@ -27,7 +27,6 @@
#' @name validate.modify
#' @aliases package-validate.modify validate.modify
#' @import methods
#' @import settings
#' @import yaml
#' @import validate
#'
29 changes: 25 additions & 4 deletions pkg/vignettes/introduction.Rmd
@@ -12,7 +12,7 @@ vignette: >
### A first statement

In the iris dataset, replace `Sepal.Width` with the value 4 if it exceeds 4.
```{r}
```{r,eval=FALSE}
library(validate.modify)
library(magrittr)
iris %<>% modify_so( if(Sepal.Width > 4 ) Sepal.Width <- 4 )
@@ -45,6 +45,7 @@ head(retailers[-(1:2)],3)

First we define a set of modifying rules, using `modifier`.
```{r}
library(validate.modify)
m <- modifier(
if (other.rev < 0) other.rev <- -1 * other.rev
, if ( is.na(staff.costs) ) staff.costs <- mean(staff.costs, na.rm=TRUE)
@@ -84,19 +85,39 @@ in the first rule of `m` above evaluates to `NA` in the first record of the `ret

### Performance, and a glimpse under the hood.

You, the user, can assume that the rules are evaluated record-by-record. In reality, the package is smart enough to analyse the rules a little bit and to make sure they can be evaluated in a vectorized manner.
You, the user, can assume that the rules are evaluated record-by-record. In
reality, the package is smart enough to analyse the rules a little bit and to
make sure they can be evaluated in a vectorized manner. This way, explicit (and slow)
R loops are avoided as much as possible.

In short, when you call `modify`, or `modify_so`, the following steps are performed.

1. The
1. The rules are transformed to statements that can be executed in a vectorized manner by R.
2. If any macros are present, they are inserted into the statements.
3. For each assignment, the conditions under which it should be executed are collected.
4. The conditions are evaluated and the assignments are executed on a selection of the data.
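
The steps above can be illustrated with a minimal sketch in base R. This is
not the package's internal code: the data frame `dat` and the variable names
are made up for the example, and skipping records where the condition is `NA`
is an assumption of this sketch.

```r
# Hypothetical illustration in base R; not the package's internal code.
dat <- data.frame(other.rev = c(-5, 10, NA, -2))

# Record-by-record reading of the rule
#   if (other.rev < 0) other.rev <- -1 * other.rev
loop_result <- dat
for (i in seq_len(nrow(loop_result))) {
  x <- loop_result$other.rev[i]
  # Assumption: a record whose condition is NA is left untouched.
  if (!is.na(x) && x < 0) loop_result$other.rev[i] <- -1 * x
}

# Vectorized equivalent: evaluate the condition once for all records,
# then execute the assignment on the selected records only.
vec_result <- dat
condition <- vec_result$other.rev < 0
selected <- which(condition)  # which() drops records where the condition is NA
vec_result$other.rev[selected] <- -1 * vec_result$other.rev[selected]

# Both approaches give the same data frame.
stopifnot(identical(loop_result, vec_result))
```

The vectorized form touches each column once instead of once per record, which
is why it avoids the cost of an explicit loop.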


### Difference with dplyr::mutate

The functionality of this package resembles `dplyr::mutate`, since it also
allows one to specify data mutations on data frames (or other tabular data
objects). The dplyr package is especially useful for interactive use, and for
use in programming through the 'underscored' functions such as `mutate_`.

The `validate.modify` package has been developed with a production line in
mind, where similar data sets are processed frequently. By taking the modifying
rules out of the software, R programmers can build an application that allows
users who are less knowledgeable about programming to specify their modification
rules. Since the package allows for logging of modifications, the effect of each
modifying rule can be monitored, and reverted if necessary.
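
To make the idea of keeping rules outside the software concrete, here is a
minimal sketch in base R. The helper `apply_rule` and the layout of its log are
hypothetical and are not part of the `validate.modify` API; the sketch only
handles rules of the form `if (condition) variable <- value`.

```r
# Hypothetical helper; illustrates the idea only, not the package's API.
apply_rule <- function(dat, rule_text) {
  rule <- parse(text = rule_text)[[1]]   # call of the form `if`(condition, assignment)
  condition <- rule[[2]]
  assignment <- rule[[3]]                # call of the form `<-`(target, value)
  target <- as.character(assignment[[2]])

  # Evaluate the condition for all records; NA conditions are skipped.
  rows <- which(eval(condition, dat))
  # Evaluate the right-hand side and recycle it to one value per record.
  value <- rep(eval(assignment[[3]], dat), length.out = nrow(dat))

  # Log the old and new values so the change can be inspected or undone.
  log <- data.frame(
    row = rows,
    variable = rep(target, length(rows)),
    old = dat[[target]][rows],
    new = value[rows]
  )
  dat[[target]][rows] <- value[rows]
  list(data = dat, log = log)
}

# The rule is plain text, so it could come from a file written by a non-programmer.
rule_text <- "if (Sepal.Width > 4) Sepal.Width <- 4"
result <- apply_rule(iris, rule_text)

# Revert the modification using the log.
reverted <- result$data
reverted$Sepal.Width[result$log$row] <- result$log$old
stopifnot(identical(reverted, iris))
```

Because the rule is data rather than code, the application that applies it does
not have to change when the rule does.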







### Difference with dplyr::mutate


