Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iic feature #115

Merged
merged 15 commits into from
Aug 13, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
3 changes: 3 additions & 0 deletions NAMESPACE
Expand Up @@ -22,6 +22,7 @@ S3method(gain_capture,data.frame)
S3method(gain_curve,data.frame)
S3method(huber_loss,data.frame)
S3method(huber_loss_pseudo,data.frame)
S3method(iic,data.frame)
S3method(j_index,data.frame)
S3method(j_index,matrix)
S3method(j_index,table)
Expand Down Expand Up @@ -92,6 +93,8 @@ export(huber_loss)
export(huber_loss_pseudo)
export(huber_loss_pseudo_vec)
export(huber_loss_vec)
export(iic)
export(iic_vec)
export(j_index)
export(j_index_vec)
export(kap)
Expand Down
2 changes: 2 additions & 0 deletions NEWS.md
Expand Up @@ -2,6 +2,8 @@

## New metrics and functionality

* `iic()` is a numeric metric for computing the index of ideality of correlation. It can be seen as a potential alternative to the traditional correlation coefficient, and has been used in QSAR models (@jyuu, #115).

* `average_precision()` is a probability metric that can be used as an alternative to `pr_auc()`. It has the benefit of avoiding any issues of ambiguity in the case where `recall == 0` and the current number of false positives is `0`.

## Other improvements
Expand Down
69 changes: 69 additions & 0 deletions R/conditions.R
@@ -0,0 +1,69 @@
try_cor <- function(truth, estimate) {
handler <- make_cor_handler(truth, estimate)

withCallingHandlers(
expr = cor(truth, estimate),
simpleWarning = handler
)
}

# Below, `!is.null(findRestart("muffleWarning"))` is to ensure that something
# else has not already signaled the warning under a different protocol (like stop()).
# This checks that a "restart" is actually on the stack before trying to muffle
# and restart

make_cor_handler <- function(truth, estimate) {
handle_zero_variance <- function(cnd) {
if (cnd$message != "the standard deviation is zero") {
return(invisible())
}

n_unique_truth <- length(unique(truth))
n_unique_estimate <- length(unique(estimate))

if (n_unique_truth == 1L && !is.null(findRestart("muffleWarning"))) {
warn_correlation_undefined_constant_truth(truth)
rlang::cnd_muffle(cnd)
}

if (n_unique_estimate == 1L && !is.null(findRestart("muffleWarning"))) {
warn_correlation_undefined_constant_estimate(estimate)
rlang::cnd_muffle(cnd)
}

invisible()
}

handle_zero_variance
}

warn_correlation_undefined_constant_truth <- function(truth) {
warn_correlation_undefined(
what = "truth",
truth = truth,
.subclass = "yardstick_warning_correlation_undefined_constant_truth"
)
}

warn_correlation_undefined_constant_estimate <- function(estimate) {
warn_correlation_undefined(
what = "estimate",
estimate = estimate,
.subclass = "yardstick_warning_correlation_undefined_constant_estimate"
)
}

warn_correlation_undefined <- function(what, ..., .subclass = character()) {
message <- paste0(
"A correlation computation is required, but `", what, "` is constant ",
"and has 0 standard deviation, resulting in a divide by 0 error. ",
"`NA` will be returned."
)

rlang::warn(
message = message,
.subclass = c(.subclass, "yardstick_warning_correlation_undefined"),
...
)
}

88 changes: 88 additions & 0 deletions R/num-iic.R
@@ -0,0 +1,88 @@
#' Index of ideality of correlation
#'
#' @description
#'
#' Calculate the index of ideality of correlation. This metric has been
#' studied in QSPR/QSAR models as a good criterion for the predictive
#' potential of these models. It is highly dependent on the correlation
#' coefficient as well as the mean absolute error.
#'
#' Note the application of IIC is useless under two conditions:
#'
#' * When the negative mean absolute error and positive mean absolute
#' error are both zero.
#'
#' * When the outliers are symmetric. Since outliers are context
#' dependent, please use your own checks to validate whether this
#' restriction holds and whether the resulting IIC has
#' interpretative value.
#'
#' The IIC is seen as an alternative to the traditional correlation
#' coefficient and is in the same units as the original data.
#'
#' @family numeric metrics
#' @family accuracy metrics
#' @templateVar metric_fn iic
#' @template return
#'
#' @inheritParams rmse
#'
#' @references Toropova, A. and Toropov, A. (2017). "The index of ideality
#' of correlation. A criterion of predictability of QSAR models for skin
#' permeability?" _Science of the Total Environment_. 586: 466-472.
#'
#' @author Joyce Cahoon
#'
#' @template examples-numeric
#'
#' @export
iic <- function(data, ...) {
UseMethod("iic")
}

class(iic) <- c("numeric_metric", "function")

#' @rdname iic
#' @export
iic.data.frame <- function(data, truth, estimate, na_rm = TRUE, ...) {

metric_summarizer(
metric_nm = "iic",
metric_fn = iic_vec,
data = data,
truth = !!enquo(truth),
estimate = !!enquo(estimate),
na_rm = na_rm,
... = ...
)

}

#' @export
#' @rdname iic
iic_vec <- function(truth, estimate, na_rm = TRUE, ...) {

iic_impl <- function(truth, estimate) {
deltas <- truth - estimate

delta_neg <- deltas[deltas < 0]
delta_pos <- deltas[deltas >= 0]

mae_neg <- mean(abs(delta_neg))
mae_pos <- mean(abs(delta_pos))

adjustment <- min(mae_neg, mae_pos) / max(mae_neg, mae_pos)

try_cor(truth, estimate) * adjustment
}

metric_vec_template(
metric_impl = iic_impl,
truth = truth,
estimate = estimate,
na_rm = na_rm,
cls = "numeric",
...
)

}
18 changes: 10 additions & 8 deletions man/ccc.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

18 changes: 10 additions & 8 deletions man/huber_loss.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

18 changes: 10 additions & 8 deletions man/huber_loss_pseudo.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

118 changes: 118 additions & 0 deletions man/iic.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.