One issue that distplyr hasn't considered yet is that of alternative parameterizations of a distribution.
For example, dst_norm() accepts mean and variance as its parameters, but other parameterizations would be useful, such as specifying the 0.05- and 0.95-quantiles (use case: "I'm pretty sure this expense will be between $100 and $200"; just set these as the lower and upper quantiles). Even mean and standard deviation would be useful.
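To make the quantile use case concrete, here is a rough base-R sketch of the conversion it implies for a Normal distribution. The variable names are only illustrative, and it assumes the $100 and $200 figures are the 0.05- and 0.95-quantiles.

```r
# Assumed elicitation: the 0.05- and 0.95-quantiles are $100 and $200.
p <- c(0.05, 0.95)
q <- c(100, 200)

# For a Normal distribution, q = mu + sigma * qnorm(p),
# so two quantiles pin down both parameters.
z <- qnorm(p)
sigma <- (q[2] - q[1]) / (z[2] - z[1])
mu <- q[1] - sigma * z[1]
variance <- sigma^2

# Check: the recovered parameters reproduce the stated quantiles.
qnorm(p, mean = mu, sd = sigma)
```

Because the two probabilities are symmetric about 0.5, the mean lands at the midpoint, 150, with a standard deviation of roughly 30.4.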
I have two ideas for achieving this, but they shouldn't be implemented until after the package is submitted to CRAN.
A parametric distribution is still only specified using its "canonical" parameters (mean and variance for Normal; alpha and beta for Beta; etc.), but the parameters can be changed downstream.
This would use rlang's tidy evaluation machinery, leaving unevaluated parameters as unknown variables (quosures, probably) until they are evaluated downstream. Bonus: this would also allow for scenarios like dst_unif(a, a + 1), where a is unknown.
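A minimal sketch of how deferred parameters might look with rlang quosures. Both dst_unif_lazy() and resolve_params() are hypothetical names made up for illustration, not distplyr functions; only enquo() and eval_tidy() come from rlang.

```r
library(rlang)

# Hypothetical constructor: captures its arguments as quosures
# instead of evaluating them, so `a` need not exist yet.
dst_unif_lazy <- function(min, max) {
  list(min = enquo(min), max = enquo(max))
}

# Hypothetical helper: evaluates the stored quosures once values
# for the unknown variables are supplied.
resolve_params <- function(d, ...) {
  values <- list(...)
  lapply(d, eval_tidy, data = values)
}

d <- dst_unif_lazy(a, a + 1)        # `a` is still unknown here
params <- resolve_params(d, a = 2)  # min = 2, max = 3
```

Whether the distribution object should store quosures directly or some other representation is an open design question; this only shows that the deferred evaluation itself is straightforward.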
Specify parameters by name in the dst_ call.
I'm thinking along the lines of ggplot2's aes() function. The aes() function "parameterizes" a plot according to some aesthetics (see ?geom_linerange for an example of being able to parameterize in more than one way). However, a special function like aes() might not be needed here, because the parameters can be specified right in the dst_*() call.
So, instead of:
dst_norm <- function(mean, variance) { ... }
we would have something like:
dst_norm <- function(mean, variance, ...)
akin to aes <- function(x, y, ...).
Calls would look something like:
dst_norm(0, 1) # For mean and variance
dst_norm(0, stdev = 5) # For mean and standard deviation
dst_norm(0.05 ~ 100, 0.95 ~ 200) # For the 0.05- and 0.95-quantiles.
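One way the three calls above could be handled is sketched below in base R. dst_norm_sketch() is a hypothetical stand-in, not the actual dst_norm(); it assumes each formula reads `probability ~ quantile`, that stdev is always passed by name, and it returns a plain list of canonical parameters rather than a distribution object.

```r
# Hypothetical sketch of name- and formula-based parameterization.
dst_norm_sketch <- function(mean = NULL, variance = NULL, ..., stdev = NULL) {
  # Positional formulas land in `mean`/`variance`, so gather everything first.
  args <- Filter(Negate(is.null), c(list(mean, variance), list(...)))
  is_q <- vapply(args, inherits, logical(1), what = "formula")
  if (sum(is_q) == 2) {
    # Quantile parameterization: solve q = mean + stdev * qnorm(p).
    f <- args[is_q]
    p <- vapply(f, function(x) eval(x[[2]], environment(x)), numeric(1))
    q <- vapply(f, function(x) eval(x[[3]], environment(x)), numeric(1))
    z <- qnorm(p)
    stdev <- (q[2] - q[1]) / (z[2] - z[1])
    mean <- q[1] - stdev * z[1]
    variance <- stdev^2
  } else if (!is.null(stdev)) {
    # Mean and standard deviation parameterization.
    variance <- stdev^2
  }
  stopifnot(is.numeric(mean), is.numeric(variance), variance > 0)
  list(mean = mean, variance = variance)
}

dst_norm_sketch(0, 1)                       # mean and variance
dst_norm_sketch(0, stdev = 5)               # mean and standard deviation
dst_norm_sketch(0.05 ~ 100, 0.95 ~ 200)     # 0.05- and 0.95-quantiles
```

Every branch reduces to the canonical mean and variance, so the rest of the package would not need to know which parameterization the user chose.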