README.Rmd

---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-"
)
```
# dpurifyr

[![Build Status](https://travis-ci.org/teramonagi/dpurifyr.svg?branch=master)](https://travis-ci.org/teramonagi/dpurifyr)
[![CRAN\_Status\_Badge](http://www.r-pkg.org/badges/version/dpurifyr)](http://cran.r-project.org/package=dpurifyr) 

## Overview
dpurifyr package gives you a practical way for `data preprocessing` providing a consistent set of verbs that help you solve the most common data preprocessing challenges.

## Installation
You can install from CRAN with:
```{r gh-installation, eval = FALSE}
# Not yet
# install.packages("dpurifyr")

# Or the development version from GitHub:
# install.packages("devtools")
devtools::install_github("teramonagi/dpurifyr")
```

## Simple Example
The following example uses dpurifyr to solve a fairly realistic problem: apply different types of data preprocessing (`standard_scale` and `scale_minmax`) to columns (`dplyr::starts_with("Sepal")` and `Petal.Width`) selected by the way which is consistent with other tidyverse packages.

```{r example-1, message=FALSE}
library("dpurifyr")
df <- head(iris)
# Create Data Pre-Processing chain while data preproessing for df is done.
pp <- dpurifyr::scale_standard(df, dplyr::starts_with("Sepal")) %>% 
  dpurifyr::scale_minmax(Petal.Width) 

# PreProcessing object(pp) behave like data.frame object
head(pp)
```

Once you get `preprocessing` object, `preprocessing` object can be applied to the other data.
This means that you can apply the same `preprocessing` with fixed parameter to other data. 
```{r example-2, message=FALSE}
# You can apply the same preprocessing to different data.frame
pp <- dpurifyr::scale_standard(head(iris), Sepal.Width, Petal.Length) %>% 
  dpurifyr::scale_standard(Sepal.Length) 
# Using the same parameters( `use_param=TRUE`) estimated in pp object.
pp2 <- dpurifyr::apply(head(iris, 10), pp, TRUE) 
pp2
```

You can find more information in [vignettes](https://github.com/teramonagi/dpurifyr/blob/master/vignettes/introduction-to-dpurifyr.Rmd).

## Contribution
- If you encounter a clear bug, please file a minimal reproducible example on [github](https://github.com/teramonagi/dupurifyr/issues)


## Citation

To cite package `dpurifyr` in publications use:

```
Nagi Teramo, Shinichi Takayanagi (2017). dpurifyr: A grammar of data preprocessing. R package version 0.1.0. https://github.com/teramonagi/dpurifyr
```

A BibTeX entry for LaTeX users is
```
@Manual{,
  title = {dpurifyr: A grammar of data preprocessing},
  author = {Nagi Teramo, Shinichi Takayanagi},
  year = {2017}, 
  note = {R package version 0.1.0},
  url = {https://github.com/teramonagi/dpurifyr},
}
```