Curried functional interfaces atop `dplyr`

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md
currr

currr provides a deferred-evaluation interface for dplyr operations. currr functions create curried versions of dplyr verbs; the user provides all arguments except for the data frame. This allows data-manipulation operations to be composed as functions, without the need to provide data up front.

Installation

devtools::install_github("mikedecr/currr")

Example

The following example demonstrates currr by contrasting it with dplyr.

library(currr)
library(dplyr)

How to filter with dplyr: we need the data (mtcars) and the arguments (mpg == max(mpg)) when we call filter.

# filter mpg == max(mpg)
# with dplyr, data and function are combined
mtcars |> filter(mpg == max(mpg))

How to filter with currr: filtering only requires the arguments (mpg == max(mpg)). This creates a curried filter function that fixes the arguments. We can then call the curried function any time later by passing the data (mtcars).

# with currr, function is separated from the data
flt_max_mpg = filtering(mpg == max(mpg))
flt_max_mpg(mtcars)
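
The idea behind a curried verb can be sketched in a few lines of plain R. The helper below, `my_filtering`, is a hypothetical stand-in written for illustration; it is not currr's actual implementation, and it assumes dplyr and rlang are installed. It captures the filter conditions first and returns a function that waits for the data.

```r
library(dplyr)

# Hypothetical stand-in for a curried filter (an assumption, not currr's code).
# Capture the filter conditions unevaluated and return a function of the data.
my_filtering <- function(...) {
  conditions <- rlang::enquos(...)
  function(data) {
    # Splice the stored conditions into an ordinary dplyr::filter() call
    dplyr::filter(data, !!!conditions)
  }
}

# Fix the arguments now, supply the data later
flt_max_mpg_sketch <- my_filtering(mpg == max(mpg))
top_car <- flt_max_mpg_sketch(mtcars)
```

Because the conditions are stored unevaluated, `max(mpg)` is only computed once a data frame is supplied, so the same function can be applied to any data frame that has an `mpg` column.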

The benefit of currr becomes more apparent when functions are composed.

With dplyr, if I want to apply the same filter step on a grouped data frame, I have to type all of my filter code again.

# filter max mpg, by am
# with dplyr: must rewrite the filter step
mtcars |> group_by(am) |> filter(mpg == max(mpg))

But with currr, I already wrote a filter function. I can recycle that function by composing it with another function (grouping).

# with currr, pre-defined functions are reusable and composable
by_am = grouping(am)
(by_am %;% flt_max_mpg)(mtcars)
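
Judging from the example above, `%;%` appears to apply the left-hand function first and the right-hand function second. A minimal sketch of that kind of forward composition is shown below. The operator name `%then%` and the two helper closures are hypothetical names chosen for illustration, and dplyr is assumed to be installed; this is not how currr itself defines `%;%`.

```r
library(dplyr)

# Hypothetical forward-composition operator: (f %then% g)(x) is g(f(x)).
`%then%` <- function(f, g) {
  function(x) g(f(x))
}

# Plain closures standing in for the results of grouping() and filtering()
by_am_sketch <- function(data) group_by(data, am)
max_mpg_sketch <- function(data) filter(data, mpg == max(mpg))

# Compose once, then apply to the data: one best-mpg row per value of am
best_per_am <- (by_am_sketch %then% max_mpg_sketch)(mtcars)
```

The composed function is itself a function of a data frame, so it can be composed again with further steps.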

As my data analysis code grows more complex, pre-defining small functions pays off more and more.

currr functions produce memoized functions of data frames, which cache values according to their inputs. If a curried function has already computed its result on a data frame, passing the same data frame returns the cached value instead of recomputing it. This gives us the same efficiency as storing "intermediate data frames" without actually needing to manage those intermediate objects.
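
The caching behaviour described above can be illustrated with a small base-R sketch. The README does not say how currr implements its cache, so `memoize_df` below is a hypothetical helper that simply remembers earlier inputs and compares new ones with `identical()`; a real implementation would more likely key the cache on a hash of the input.

```r
# Hypothetical memoizer for functions of a data frame (an assumption,
# not currr's implementation). Inputs seen so far are stored with results.
memoize_df <- function(f) {
  inputs <- list()
  outputs <- list()
  function(data) {
    # Return the stored result if this exact data frame was seen before
    for (i in seq_along(inputs)) {
      if (identical(inputs[[i]], data)) return(outputs[[i]])
    }
    result <- f(data)
    inputs <<- c(inputs, list(data))
    outputs <<- c(outputs, list(result))
    result
  }
}

# Count how often the underlying function actually runs
calls <- 0
col_means <- function(data) {
  calls <<- calls + 1
  colMeans(data)
}

cached_means <- memoize_df(col_means)
first <- cached_means(mtcars)   # computed
second <- cached_means(mtcars)  # served from the cache
```

After both calls, `calls` is 1, because the second call found `mtcars` in the cache and skipped the computation.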
