Skip to content

Commit

Permalink
add note
Browse files Browse the repository at this point in the history
add note to warn against resubstitution bias
  • Loading branch information
mdbrown committed Jul 17, 2018
1 parent b2dd119 commit 5a0cf6c
Show file tree
Hide file tree
Showing 4 changed files with 11 additions and 7 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Package: rmda
Type: Package
Title: Risk Model Decision Analysis
Version: 1.6
Version: 1.7
Date: 2018-05-30
Author: Marshall Brown
Maintainer: Marshall Brown <mdbrown@fredhutch.org>
Expand Down
2 changes: 1 addition & 1 deletion NEWS
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ rmda v 1.6
========================
* update default y-axis limits for sNB plots to show positive values only.
* print a message warning of bias due to overfitting when the same data is used to fit and evaluate a model using decision_curve.
* fix a bug where CI's for the policy of treating 'None' were not plotted when policy = "opt-out".
* fix a bug where CI's for the policy of treating 'None' were not plotted or printed using summary when policy = "opt-out".

rmda v 1.5
========================
Expand Down
12 changes: 8 additions & 4 deletions R/decision_curve.R
Original file line number Diff line number Diff line change
Expand Up @@ -131,6 +131,11 @@ decision_curve <- function(formula,

#retreive outcome and check
outcome <- data[[all.vars(formula[[2]])]];
#extract the model name from formula
predictors <- c(Reduce(paste, deparse(formula[[3]])), 'All', 'None')
predictor.names <- c(Reduce(paste, deparse(formula)), 'All', 'None')


if(length(unique(outcome)) != 2) stop('outcome variable is not binary (it does not take two unique values).')
stopifnot(is.numeric(outcome))
if(min(outcome) != 0 | max(outcome) != 1) stop('outcome variable must be binary taking on values 0 for control and 1 for case.')
Expand All @@ -145,7 +150,9 @@ decision_curve <- function(formula,
provided.risks <- data[[Reduce(paste, deparse(formula[[3]]))]] #get the name of the fitted risk variable from formula.
if(min(provided.risks) < 0 | max(provided.risks) > 1) stop('When fitted.risks = TRUE, all risks provided must be between 0 and 1.')

}else{
}else if(length(strsplit(predictors[[1]], "+", fixed = TRUE)[[1]]) > 1) {

message("Note: The data provided is used to both fit a prediction model and to estimate the respective decision curve. This may cause bias in decision curve estimates leading to over-confidence in model performance. ")
#print a message about potential bias due to overfitting when the same data is used to fit/evaluate a model.
}
#########
Expand All @@ -155,9 +162,6 @@ decision_curve <- function(formula,
#calculate curves
#first we fit the model

#extract the model name from formula
predictors <- c(Reduce(paste, deparse(formula[[3]])), 'All', 'None')
predictor.names <- c(Reduce(paste, deparse(formula)), 'All', 'None')

#indicate whether we are fitting a model with a formula or not
#the last two are FALSE since they correspond to 'all' and 'none'
Expand Down
2 changes: 1 addition & 1 deletion inst/notes/tutorial.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,7 @@ First we use the function `decision_curve` to create a decision curve object for
```{r, message=FALSE, warning = FALSE}
set.seed(123)
#first use rmda with the default settings (set bootstraps = 50 here to reduce computation time).
baseline.model <- decision_curve(Cancer~Age + Female + Smokes, #fitting a logistic model
baseline.model <- decision_curve( Cancer~Age + Female + Smokes, #fitting a logistic model
data = dcaData,
study.design = "cohort",
policy = "opt-in", #default
Expand Down

0 comments on commit 5a0cf6c

Please sign in to comment.