diff --git a/NEWS.md b/NEWS.md
index 8c4ad86d..ec5b4471 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,5 +1,8 @@
 # hardhat (development version)
 
+* Added more documentation about importance and frequency weights in
+  `?importance_weights()` and `?frequency_weights()` (#214).
+
 # hardhat 1.2.0
 
 * We have reverted the change made in hardhat 1.0.0 that caused recipe
diff --git a/R/case-weights.R b/R/case-weights.R
index 3b8f1652..762c73ac 100644
--- a/R/case-weights.R
+++ b/R/case-weights.R
@@ -8,6 +8,15 @@
 #' are supplied as a non-negative double vector, where fractional values are
 #' allowed.
 #'
+#' @details
+#' Importance weights focus on how much each row of the data set should
+#' influence model estimation. These can be based on data or arbitrarily set to
+#' achieve some goal.
+#'
+#' In tidymodels, importance weights only affect the model estimation and
+#' _supervised_ recipes steps. They are not used with yardstick functions for
+#' calculating measures of model performance.
+#'
 #' @param x A double vector.
 #'
 #' @return A new importance weights vector.
@@ -117,6 +126,14 @@ vec_ptype_abbr.hardhat_importance_weights <- function(x, ...) {
 #' are supplied as a non-negative integer vector, where only whole numbers are
 #' allowed.
 #'
+#' @details
+#' Frequency weights are integers that denote how many times a particular row of
+#' the data has been observed. They help compress redundant rows into a single
+#' entry.
+#'
+#' In tidymodels, frequency weights are used for all parts of the preprocessing,
+#' model fitting, and performance estimation operations.
+#'
 #' @param x An integer vector.
 #'
 #' @return A new frequency weights vector.
diff --git a/man/frequency_weights.Rd b/man/frequency_weights.Rd
index a830251b..6bbd1f1c 100644
--- a/man/frequency_weights.Rd
+++ b/man/frequency_weights.Rd
@@ -20,6 +20,14 @@ to compactly repeat an observation a set number of times. Frequency weights
 are supplied as a non-negative integer vector, where only whole numbers are
 allowed.
 }
+\details{
+Frequency weights are integers that denote how many times a particular row of
+the data has been observed. They help compress redundant rows into a single
+entry.
+
+In tidymodels, frequency weights are used for all parts of the preprocessing,
+model fitting, and performance estimation operations.
+}
 \examples{
 # Record that the first observation has 10 replicates, the second has 12
 # replicates, and so on
diff --git a/man/importance_weights.Rd b/man/importance_weights.Rd
index 56030b40..1a71613e 100644
--- a/man/importance_weights.Rd
+++ b/man/importance_weights.Rd
@@ -20,6 +20,15 @@ to apply a context dependent weight to your observations. Importance weights
 are supplied as a non-negative double vector, where fractional values are
 allowed.
 }
+\details{
+Importance weights focus on how much each row of the data set should
+influence model estimation. These can be based on data or arbitrarily set to
+achieve some goal.
+
+In tidymodels, importance weights only affect the model estimation and
+\emph{supervised} recipes steps. They are not used with yardstick functions for
+calculating measures of model performance.
+}
 \examples{
 importance_weights(c(1.5, 2.3, 10))
 }
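
Reviewer note: the new `@details` text says frequency weights "help compress redundant rows into a single entry". A minimal R sketch of that idea follows; it is not part of the patch. The names `df_long`, `df_short`, `n`, `w_freq`, and `w_imp` are hypothetical, and the `lm()` comparison is a base R illustration of the compression idea rather than a description of how tidymodels applies the weights internally.

```r
# Illustrative sketch only; all object names here are hypothetical.
library(hardhat)

# Six observed rows, several of them exact duplicates.
df_long <- data.frame(
  x = c(1, 1, 1, 2, 2, 3),
  y = c(2, 2, 2, 3, 3, 5)
)

# The same information with one row per distinct observation plus a count.
df_short <- data.frame(
  x = c(1, 2, 3),
  y = c(2, 3, 5),
  n = c(3L, 2L, 1L)
)

# Wrap the counts so they are tagged as frequency weights (whole numbers only).
w_freq <- frequency_weights(df_short$n)
is_frequency_weights(w_freq)

# Importance weights, by contrast, may be fractional.
w_imp <- importance_weights(c(1.5, 2.3, 10))
is_importance_weights(w_imp)

# Base R check of the compression idea: a fit on the short data weighted by
# the counts recovers the same coefficients as an unweighted fit on the long
# data.
fit_long <- lm(y ~ x, data = df_long)
fit_short <- lm(y ~ x, data = df_short, weights = n)
all.equal(coef(fit_long), coef(fit_short))
```

The plain integer column `n` is passed to `lm()` directly, so the sketch does not depend on how the hardhat weight classes convert back to base types.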