Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weighted frequencies #136

Closed
andresimi opened this issue Sep 13, 2020 · 3 comments
Closed

Weighted frequencies #136

andresimi opened this issue Sep 13, 2020 · 3 comments

Comments

@andresimi
Copy link

Hi, is it possible to apply a weight to have weighted frequencies in modelsummary?
Or to work with survey data from survey package?
thanx

@vincentarelbundock
Copy link
Owner

vincentarelbundock commented Sep 13, 2020 via email

@andresimi
Copy link
Author

andresimi commented Sep 15, 2020

Thanx a lot.
You can dowload the data from here.

The table I am trying to reproduce is something like this one:
image

And the code I was working until now is the following:

library(rio); library(tidyverse); library(sjmisc); library(sjlabelled); library(modelsummary)

data <- readRDS("table_example.rds") %>% 
  mutate(across(starts_with("dc"), as_label))
data %>% names()
#data %>% select(starts_with("dc")) %>% names %>% paste(collapse = " + ")

# Table one
f <- dcany + dcanyanx + dcgena + dcpanic + dcagor + dcsoph + dcspph + dcptsd + dcocd + dcsepa + dcotanx + 
  dcanydep + dcmadep + dcotdep + dcanyhk + dcadhdi + dcadhdh + dcadhdc + dcadhdo + dcanycd + dcodd + 
  dccd + dcothcd + dcanyother + dcmania + dcpsych + dceat + dcpdd + dctic ~ redcap_event_name*selection*age_group*DropEmpty()*(1+Percent(denom = "col"))

t <- data %>% 
  datasummary(f, data = ., output = "data.frame", title = "title") %>% 
  as.tibble() %>% 
  set_na(1, na="") %>% 
  fill(1) %>%
  filter(`  `=="Yes") %>% 
  select(` `, 
         starts_with("Wave0 Random 5-9"), starts_with("Wave0 Random 10-14"), starts_with("Wave0 High Risk 5-9"), starts_with("Wave0 High Risk 10-14"),
         starts_with("Wave1 Random 9-12"), starts_with("Wave1 Random 13-17"), starts_with("Wave1 High Risk 9-12"), starts_with("Wave1 High Risk 13-17"),
         starts_with("Wave2 Random 12-17"), starts_with("Wave2 Random 18-21"), starts_with("Wave2 High Risk 12-17"), starts_with("Wave2 High Risk 19-21")) %>% 
  as_hux(add_colnames=T) %>% 
  set_bottom_border(row=1, col = everywhere)
t


# Weighted table
f <- dcany + dcanyanx + dcgena + dcpanic + dcagor + dcsoph + dcspph + dcptsd + dcocd + dcsepa + dcotanx + 
  dcanydep + dcmadep + dcotdep + dcanyhk + dcadhdi + dcadhdh + dcadhdc + dcadhdo + dcanycd + dcodd + 
  dccd + dcothcd + dcanyother + dcmania + dcpsych + dceat + dcpdd + dctic ~ redcap_event_name*age_group*DropEmpty()*(1+Percent(denom = "col"))

tw <- data %>%
  as_survey_design(weights = weights) %>% 
  datasummary(f, data = ., output = "data.frame", title = "title") %>% 
  as.tibble() %>% 
  set_na(1, na="") %>% 
  fill(1) %>%
  filter(`  `=="Yes") %>% 
  select(` `, 
         starts_with("Wave0 5-9"), starts_with("Wave0 10-14"), starts_with("Wave0 5-9"), starts_with("Wave0 10-14"),
         starts_with("Wave1 9-12"), starts_with("Wave1 13-17"), starts_with("Wave1 9-12"), starts_with("Wave1 13-17"),
         starts_with("Wave2 12-17"), starts_with("Wave2 18-21"), starts_with("Wave2 12-17"), starts_with("Wave2 19-21")) %>% 
  as_hux(add_colnames=T) %>% 
  set_bottom_border(row=1, col = everywhere)
tw

@vincentarelbundock
Copy link
Owner

vincentarelbundock commented Sep 15, 2020

The easiest way to do this would probably be to divide your weights by the sum of weights in the subgroup which you want to use as margins (i.e., in the subgroup where the sum of frequencies should equal 100%). Then, you just take the sum of these new weights.

Does something like this work for you?

library(tidyverse)
library(modelsummary) 

data <- readRDS("table_example.rds") %>%
        group_by(redcap_event_name, age_group, selection) %>%
        mutate(weights = weights / sum(weights) * 100)

f <- Factor(dcany) + Factor(dcanyanx) + 1 ~ 
     sum * weights * redcap_event_name * age_group * selection * DropEmpty()
datasummary(f, data)

Screen Shot 2020-09-15 at 15 10 13

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants