Update tidy_distribution_summary_tbl() function #211
Function:
#' Tidy Distribution Summary Statistics Tibble
#'
#' @family Summary Statistics
#' @family Table Data
#'
#' @author Steven P. Sanderson II, MPH
#'
#' @details This function takes in a `tidy_` distribution table and
#' will return a tibble of the following information:
#' - `sim_number`
#' - `mean_val`
#' - `median_val`
#' - `std_val`
#' - `min_val`
#' - `max_val`
#' - `skewness`
#' - `kurtosis`
#' - `range`
#' - `iqr`
#' - `variance`
#' - `ci_low`
#' - `ci_high`
#'
#' The kurtosis and skewness come from the package `healthyR.ai`
#'
#' @description This function returns a summary statistics tibble. It will use the
#' y column from the `tidy_` distribution function.
#'
#' @param .data The data that is going to be passed from a `tidy_` distribution
#' function.
#' @param ... This is the grouping variable that gets passed to [dplyr::group_by()]
#' and [dplyr::select()].
#'
#' @examples
#' library(dplyr)
#'
#' tn <- tidy_normal(.num_sims = 5)
#' tb <- tidy_beta(.num_sims = 5)
#'
#' tidy_distribution_summary_tbl(tn)
#' tidy_distribution_summary_tbl(tn, sim_number)
#'
#' data_tbl <- tidy_combine_distributions(tn, tb)
#'
#' tidy_distribution_summary_tbl(data_tbl)
#' tidy_distribution_summary_tbl(data_tbl, dist_type)
#'
#' @return
#' A summary stats tibble
#'
#' @export
#'
tidy_distribution_summary_tbl <- function(.data, ...) {
  # Get the data attributes
  atb <- attributes(.data)

  # Abort unless the input carries a `tibble_type` attribute
  # (either directly or under `all`)
  if (!"tibble_type" %in% names(atb) && !"tibble_type" %in% names(atb$all)) {
    rlang::abort("The data passed must come from a `tidy_` distribution function.")
  }

  data_tbl <- dplyr::as_tibble(.data)

  # Summarise the y column, optionally by the grouping variables in `...`
  summary_tbl <- data_tbl %>%
    dplyr::group_by(...) %>%
    dplyr::select(..., y) %>%
    dplyr::summarise(
      mean_val = mean(y, na.rm = TRUE),
      median_val = stats::median(y, na.rm = TRUE),
      std_val = stats::sd(y, na.rm = TRUE),
      min_val = min(y),
      max_val = max(y),
      skewness = tidy_skewness_vec(y),
      kurtosis = tidy_kurtosis_vec(y),
      range = tidy_range_statistic(y),
      iqr = stats::IQR(y),
      variance = stats::var(y),
      ci_low = ci_lo(y),
      ci_high = ci_hi(y)
    ) %>%
    dplyr::ungroup()

  return(summary_tbl)
}

Example:
> library(dplyr)
>
> tn <- tidy_normal(.num_sims = 5)
> tb <- tidy_beta(.num_sims = 5)
>
> tidy_distribution_summary_tbl(tn)
# A tibble: 1 × 12
mean_val median_val std_val min_val max_val skewness kurtosis range iqr
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 0.0210 0.0807 0.990 -2.68 2.22 -0.0478 2.78 4.90 1.22
# … with 3 more variables: variance <dbl>, ci_low <dbl>, ci_high <dbl>
> tidy_distribution_summary_tbl(tn, sim_number)
# A tibble: 5 × 13
sim_number mean_val median_val std_val min_val max_val skewness kurtosis range
<fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 0.122 0.0777 0.895 -1.45 2.12 0.480 2.74 3.57
2 2 0.0456 0.153 0.945 -1.79 2.02 0.0614 2.51 3.81
3 3 -0.0754 -0.107 1.19 -2.68 1.99 -0.142 2.34 4.67
4 4 -0.167 0.0313 0.887 -2.17 1.95 -0.195 2.54 4.12
5 5 0.180 0.231 1.00 -2.12 2.22 -0.189 3.07 4.34
# … with 4 more variables: iqr <dbl>, variance <dbl>, ci_low <dbl>, ci_high <dbl>
>
> data_tbl <- tidy_combine_distributions(tn, tb)
>
> tidy_distribution_summary_tbl(data_tbl)
# A tibble: 1 × 12
mean_val median_val std_val min_val max_val skewness kurtosis range iqr
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 0.254 0.338 0.764 -2.68 2.22 -0.755 4.52 4.90 0.671
# … with 3 more variables: variance <dbl>, ci_low <dbl>, ci_high <dbl>
> tidy_distribution_summary_tbl(data_tbl, dist_type)
# A tibble: 2 × 13
dist_type mean_val median_val std_val min_val max_val skewness kurtosis range
<fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Gaussian c(… 0.0210 0.0807 0.990 -2.68 2.22 -0.0478 2.78 4.90
2 Beta c(1, 1… 0.486 0.476 0.282 0.00133 0.997 0.113 1.88 0.995
# … with 4 more variables: iqr <dbl>, variance <dbl>, ci_low <dbl>, ci_high <dbl>
Add `ci_lo` and `ci_hi` to `tidy_distribution_summary_tbl()`
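The issue does not show how `ci_lo()` and `ci_hi()` are defined, so the following is only a minimal sketch of what such helpers might look like. It assumes they return the 2.5% and 97.5% sample quantiles of a numeric vector; the function names `ci_lo_sketch()` and `ci_hi_sketch()`, the `na_rm` argument, and the 95% level are all hypothetical and not taken from the package.

```r
# Hypothetical stand-ins for ci_lo() / ci_hi(). The names, the na_rm
# argument, and the 95% level are assumptions, not taken from the issue.
ci_lo_sketch <- function(x, na_rm = FALSE) {
  # Lower bound: 2.5% sample quantile, with the "2.5%" name dropped
  unname(stats::quantile(x, probs = 0.025, na.rm = na_rm))
}

ci_hi_sketch <- function(x, na_rm = FALSE) {
  # Upper bound: 97.5% sample quantile, with the "97.5%" name dropped
  unname(stats::quantile(x, probs = 0.975, na.rm = na_rm))
}

y <- 0:100
ci_lo_sketch(y) # 2.5
ci_hi_sketch(y) # 97.5
```

Helpers of this shape return a single unnamed number, which is what `dplyr::summarise()` needs to fill one `ci_low` and one `ci_high` value per group.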