create_summary_report doesn't support more than 5 populations or models to compare #70

filbert42 · 2022-07-21T08:30:36Z

When I try to compare more than 5 populations or models with create_summary_report, it throws an error:

<simpleError in data.frame(value = lvls_str, label = labels_values, stringsAsFactors = FALSE): arguments imply differing number of rows: 6, 10>.

Is it a bug or is it an intended behavior? :)
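
For context, base R raises this kind of error whenever the columns passed to data.frame() have lengths that cannot be recycled to a common number of rows. A minimal sketch of that mechanism, using made-up vectors rather than rtichoke's internal lvls_str and labels_values (whose actual contents are not shown in this thread):

```r
# Hypothetical stand-ins for the two columns named in the error message:
# 6 values on one side, 10 labels on the other.
values <- paste0("population_", 1:6)
labels <- paste0("label_", 1:10)

# Capture the error message instead of stopping the script.
result <- tryCatch(
  data.frame(value = values, label = labels, stringsAsFactors = FALSE),
  error = function(e) conditionMessage(e)
)
print(result)
```

The lengths 6 and 10 are taken from the error message above; what they correspond to inside the package is not something this sketch tries to establish.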


uriahf · 2022-08-10T15:57:03Z

@filbert42 Do you know of a good, relevant reproducible example?

I'll try to use the fairness::compas dataset, but an example of ROC curves with more than 5 subpopulations might be more intuitive than a specific metric in a table.

uriahf · 2022-08-14T14:24:23Z

Support filtering for more than 5 populations when calling render_performance_table() 06b41ce
Support filtering for more than 5 populations when calling create_table_for_prevalence() eb0f222
Support filtering for more than 5 populations when calling create_table_for_auc() de596ac
Provide colors for more than 5 populations when calling render_performance_table() 30417c3
New Color Palette:
c("#1b9e77", "#d95f02", "#7570b3", "#e7298a", "#07004D",
"#E6AB02", "#FE5F55", "#54494B", "#006E90", "#BC96E6",
"#52050A", "#1F271B", "#BE7C4D", "#63768D", "#08A045",
"#320A28", "#82FF9E", "#2176FF", "#D1603D", "#585123")
Change color defaults for all create_*_curve() and plot_*_curve() functions 06b41ce 0421ac9 202bfff
Update crosstalk checkboxes colors in the summary report template cf9e90a
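
One way a palette like this can be mapped onto an arbitrary number of populations is sketched below. This is an illustration only, not rtichoke's internal code: assign_population_colors is a hypothetical helper, and the population names are made up.

```r
# The 20-color palette listed above.
new_palette <- c(
  "#1b9e77", "#d95f02", "#7570b3", "#e7298a", "#07004D",
  "#E6AB02", "#FE5F55", "#54494B", "#006E90", "#BC96E6",
  "#52050A", "#1F271B", "#BE7C4D", "#63768D", "#08A045",
  "#320A28", "#82FF9E", "#2176FF", "#D1603D", "#585123"
)

# Hypothetical helper: give each population its own color, failing early
# when there are more populations than colors.
assign_population_colors <- function(population_names, colors = new_palette) {
  n <- length(population_names)
  if (n > length(colors)) {
    stop("Need ", n, " colors but the palette has only ", length(colors))
  }
  stats::setNames(colors[seq_len(n)], population_names)
}

# Six made-up populations, one more than the previous limit of five.
population_colors <- assign_population_colors(paste0("population_", 1:6))
print(population_colors)
```

Failing early with an explicit message, rather than recycling colors, is a design choice of this sketch; it avoids two populations silently sharing a color.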

uriahf · 2022-08-15T14:12:18Z

Should be fine now:

4756e3d
b3ba132

Reproducible Example:

library(purrr)
library(fairness)
library(dplyr)
library(rtichoke)

#collapse-show
# extract data

compas <- fairness::compas
df     <- compas[, !(colnames(compas) %in% c('probability', 'predicted'))]

# partitioning params
set.seed(77)
val_percent <- 0.3
val_idx     <- sample(1:nrow(df))[1:round(nrow(df) * val_percent)]

# partition the data
df_train <- df[-val_idx, ]
df_valid <- df[ val_idx, ]

# fit logit models
model1 <- glm(Two_yr_Recidivism ~ .,            
              data   = df_train, 
              family = binomial(link = 'logit'))

df_valid$prob_1 <- predict(model1, df_valid, type = 'response')
df_valid$Two_yr_Recidivism_01 <- ifelse(df_valid$Two_yr_Recidivism == 'yes', 1, 0)

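# Split a data frame by the grouping columns and name each piece
# after its group key (here, the ethnicity value)
named_group_split <- function(.tbl, ...) {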
  grouped <- group_by(.tbl, ...)
  names <- rlang::inject(paste(!!!group_keys(grouped), sep = " / "))
  
  grouped %>% 
    group_split(.keep = FALSE) %>% 
    rlang::set_names(names)
}

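# Build named lists of predicted probabilities and observed outcomes, one
# element per ethnicity. Note: the pipe also passes the selected data frame
# as the first, unnamed list element; only $probs and $reals are used below.
df_valid_for_rtichoke <- df_valid %>%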
  select(prob_1, Two_yr_Recidivism_01, ethnicity) %>% 
  list(
    probs = select(., ethnicity, prob_1) %>% 
      named_group_split(ethnicity) %>% 
      map(~ .x %>%
            pull(prob_1)),
    reals = select(., ethnicity, Two_yr_Recidivism_01) %>% 
      named_group_split(ethnicity)  %>% 
      map(~ .x %>%
            pull(Two_yr_Recidivism_01))
  )

create_summary_report(
  probs = df_valid_for_rtichoke$probs,
  reals = df_valid_for_rtichoke$reals
)

uriahf added the bug, Feature, Difficulty: intermediate, and Priority: high labels on Jul 21, 2022

uriahf self-assigned this Jul 21, 2022

uriahf closed this as completed Aug 15, 2022
