Clean up docs #107

Merged
merged 4 commits into from
Apr 27, 2024
Changes from 2 commits
28 changes: 6 additions & 22 deletions R/linear_pool.R
@@ -3,23 +3,7 @@
#' each combination of model task, output type, and output type id. Supported
#' output types include `mean`, `quantile`, `cdf`, and `pmf`.
#'
#' @param model_outputs an object of class `model_output_df` with component
#' model outputs (e.g., predictions).
#' @param weights an optional `data.frame` with component model weights. If
#' provided, it should have a column named `model_id` and a column containing
#' model weights. Optionally, it may contain additional columns corresponding
#' to task id variables, `output_type`, or `output_type_id`, if weights are
#' specific to values of those variables. The default is `NULL`, in which case
#' an equally-weighted ensemble is calculated.
#' @param weights_col_name `character` string naming the column in `weights`
#' with model weights. Defaults to `"weight"`
#' @param model_id `character` string with the identifier to use for the
#' ensemble model.
#' @param task_id_cols `character` vector with names of columns in
#' `model_outputs` that specify modeling tasks. Defaults to `NULL`, in which
#' case all columns in `model_outputs` other than `"model_id"`, the specified
#' `output_type_col` and `output_type_id_col`, and `"value"` are used as task
#' ids.
#' @inheritParams simple_ensemble
#' @param n_samples `numeric` that specifies the number of samples to use when
#' calculating quantiles from an estimated quantile function. Defaults to `1e4`.
#' @param ... parameters that are passed to `distfromq::make_q_fn`, specifying
@@ -37,14 +21,14 @@
#' in three steps:
#' 1. Interpolate and extrapolate from the provided quantiles for each component
#' model to obtain an estimate of the cdf of that distribution.
#' 2. Draw samples from the distribution for each component model. To reduce Monte
#' Carlo variability, we use pseudo-random samples corresponding to quantiles
#' of the estimated distribution.
#' 2. Draw samples from the distribution for each component model. To reduce
#' Monte Carlo variability, we use pseudo-random samples corresponding to
#' quantiles of the estimated distribution.
#' 3. Collect the samples from all component models and extract the desired quantiles.
#' Steps 1 and 2 in this process are performed by `distfromq::make_q_fn`.
#'
#' @return a `model_out_tbl` object of ensemble predictions. Note that any additional
#' columns in the input `model_outputs` are dropped.
#' @return a `model_out_tbl` object of ensemble predictions. Note that any
#' additional columns in the input `model_outputs` are dropped.
#'
#' @export
#'
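The three steps described in the `@details` block above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the hypothetical helper `approx_q_fn` stands in for `distfromq::make_q_fn` using plain linear interpolation with flat extrapolation, the evenly spaced probability grid is an assumed way of choosing the pseudo-random samples, and only the equally weighted case is handled.

```r
# Hypothetical helper: approximate quantile function built from a few
# (quantile level, quantile value) pairs by linear interpolation.
# ASSUMPTION: this replaces distfromq::make_q_fn for illustration only;
# rule = 2 holds the end values constant instead of modelling the tails.
approx_q_fn <- function(levels, values) {
  stats::approxfun(x = levels, y = values, rule = 2)
}

# Hypothetical helper: equally weighted linear pool of quantile forecasts.
# `components` is a list with one element per model, each a list holding
# the numeric vectors `levels` and `values`.
pool_quantiles <- function(components, out_levels, n_samples = 1e4) {
  # ASSUMPTION: evenly spaced probability levels act as the deterministic
  # "pseudo-random" samples, so repeated runs give identical results.
  p_grid <- (seq_len(n_samples) - 0.5) / n_samples
  samples <- unlist(lapply(components, function(m) {
    q_fn <- approx_q_fn(m$levels, m$values)  # step 1: estimate distribution
    q_fn(p_grid)                             # step 2: sample at quantiles
  }))
  # Step 3: pool the samples from all models and extract the quantiles.
  stats::quantile(samples, probs = out_levels, names = FALSE)
}

# Toy inputs: two component models with made-up quantile forecasts.
levels <- c(0.1, 0.25, 0.5, 0.75, 0.9)
components <- list(
  list(levels = levels, values = qnorm(levels, mean = 0, sd = 1)),
  list(levels = levels, values = qnorm(levels, mean = 4, sd = 1))
)
pooled <- pool_quantiles(components, out_levels = levels)
```

Because the samples are taken on a fixed grid rather than drawn at random, the pooled quantiles are reproducible without setting a seed, which matches the stated goal of reducing Monte Carlo variability.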
34 changes: 2 additions & 32 deletions R/linear_pool_quantile.R
@@ -2,39 +2,9 @@
#' (distributional mixture) of component model outputs for the `quantile`
#' output type.
#'
#' @param model_outputs an object of class `model_output_df` with component
#' model outputs (e.g., predictions) with only a `quantile` output type.
#' Should be pre-validated.
#' @param weights an optional `data.frame` with component model weights. If
#' provided, it should have a column named `model_id` and a column containing
#' model weights. Optionally, it may contain additional columns corresponding
#' to task id variables, `output_type`, or `output_type_id`, if weights are
#' specific to values of those variables. The default is `NULL`, in which case
#' an equally-weighted ensemble is calculated. Should be pre-validated.
#' @param weights_col_name `character` string naming the column in `weights`
#' with model weights. Defaults to `"weight"`.
#' @param model_id `character` string with the identifier to use for the
#' ensemble model.
#' @param task_id_cols `character` vector with names of columns in
#' `model_outputs` that specify modeling tasks. Defaults to `NULL`, in which
#' case all columns in `model_outputs` other than `"model_id"`, the specified
#' `output_type_col` and `output_type_id_col`, and `"value"` are used as task
#' ids. Should be pre-validated.
#' @param n_samples `numeric` that specifies the number of samples to use when
#' calculating quantiles from an estimated quantile function. Defaults to `1e4`.
#' @param ... parameters that are passed to `distfromq::make_q_fun`, specifying
#' details of how to estimate a quantile function from provided quantile levels
#' and quantile values for `output_type` `"quantile"`.
#' @inherit linear_pool params details
#' @noRd
#' @details The underlying mechanism for the computations to obtain the quantiles
#' of a linear pool in three steps is as follows:
#' 1. Interpolate and extrapolate from the provided quantiles for each component
#' model to obtain an estimate of the cdf of that distribution.
#' 2. Draw samples from the distribution for each component model. To reduce Monte
#' Carlo variability, we use pseudo-random samples corresponding to quantiles
#' of the estimated distribution.
#' 3. Collect the samples from all component models and extract the desired quantiles.
#' Steps 1 and 2 in this process are performed by `distfromq::make_q_fun`.
#'
#' @return a `model_out_tbl` object of ensemble predictions for the `quantile` output type.
#' @importFrom rlang .data

7 changes: 3 additions & 4 deletions R/simple_ensemble.R
@@ -9,7 +9,7 @@
#' model weights. Optionally, it may contain additional columns corresponding
#' to task id variables, `output_type`, or `output_type_id`, if weights are
#' specific to values of those variables. The default is `NULL`, in which case
#' an equally-weighted ensemble is calculated.
#' an equally-weighted ensemble is calculated. Should be pre-validated.
#' @param weights_col_name `character` string naming the column in `weights`
#' with model weights. Defaults to `"weight"`
#' @param agg_fun a function or character string name of a function to use for
@@ -21,9 +21,8 @@
#' ensemble model.
#' @param task_id_cols `character` vector with names of columns in
#' `model_outputs` that specify modeling tasks. Defaults to `NULL`, in which
#' case all columns in `model_outputs` other than `"model_id"`, the specified
#' `output_type_col` and `output_type_id_col`, and `"value"` are used as task
#' ids.
#' case all columns in `model_outputs` other than `"model_id"`, `"output_type"`,
#' `"output_type_id"`, and `"value"` are used as task ids.
#'
#' @details The default for `agg_fun` is `"mean"`, in which case the ensemble's
#' output is the average of the component model outputs within each group
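For readers of the updated `task_id_cols` and `weights` wording, the documented defaults can be illustrated with a small base R sketch. The toy data, the column `location`, and the grouping code are hypothetical and are not taken from the package source; the sketch only mirrors what the parameter documentation describes.

```r
# Toy stand-in for `model_outputs` (hypothetical values and task column).
model_outputs <- data.frame(
  model_id = c("a", "b", "a", "b"),
  location = c("US", "US", "CA", "CA"),
  output_type = "mean",
  output_type_id = NA,
  value = c(10, 20, 30, 50)
)

# Default task id columns as documented: every column other than
# "model_id", "output_type", "output_type_id", and "value".
task_id_cols <- setdiff(
  names(model_outputs),
  c("model_id", "output_type", "output_type_id", "value")
)

# A `weights` data frame in the documented shape: a `model_id` column plus
# a weight column (named "weight" by default).
weights <- data.frame(model_id = c("a", "b"), weight = c(0.25, 0.75))

# Illustration only: weighted mean of `value` within each task group.
merged <- merge(model_outputs, weights, by = "model_id")
groups <- split(merged, merged[task_id_cols])
ensemble <- sapply(groups, function(g) stats::weighted.mean(g$value, g$weight))
```

Here `task_id_cols` resolves to `"location"`, and `ensemble` holds one weighted average per location.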
19 changes: 9 additions & 10 deletions man/linear_pool.Rd

Generated file; diff not rendered.

7 changes: 3 additions & 4 deletions man/simple_ensemble.Rd

Generated file; diff not rendered.