plotEffects does not show model estimates in order of magnitude #1
Thanks for your question. First of all, I am currently in the process of releasing the package on CRAN and it will ship with a comprehensive vignette. For now you can obtain the main manual vignette by setting:
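(The exact command did not survive in this copy of the thread; presumably it was an install call with vignette building enabled, roughly like the following, where the devtools call and the "lukasklima/quid" repository path are assumptions:)

library(devtools)
# build_vignettes = TRUE builds the package vignettes during installation
install_github("lukasklima/quid", build_vignettes = TRUE)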
And then:
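(Also missing here; presumably a call to open the installed vignette, for example:)

browseVignettes("quid")   # package name assumed; open the manual vignette from the list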
To your question:
For instance, the individual with subject ID 70 will be the leftmost dot on both plots. I think the issue is that ordering by observed effect size does not work here because a lot of individuals in your data set have exactly the same observed effect size. In future releases I will add functionality for ordering individuals either by observed effects or by model estimates. For now, you can get the model estimates like this:
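(The code block itself is missing from this copy; judging from the call used in the reply below, it was presumably:)

# .raw = TRUE returns the underlying tibble of observed effects and model estimates
# instead of drawing the plot; "model" is the fitted constraintBF object
plotEffects(model, .raw = TRUE)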
And you can build your own plot from that. I hope that helps!
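For example, a minimal sketch of such a plot, using only the columns of the .raw output shown further down in this thread (plain dplyr/ggplot code, not a package function):

library(dplyr)
library(ggplot2)

plotEffects(model, .raw = TRUE) %>%
  filter(effect == "Model estimates") %>%                        # keep only the model estimates
  ggplot(aes(x = reorder(individual, estimate), y = estimate)) + # order individuals by estimate
  geom_point() +
  labs(x = "Individual (ordered by model estimate)", y = "Model estimate") +
  theme(axis.text.x = element_blank())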
Thank you so much @lukasklima for your helpful comment, and for your amazing vignette. I read the vignette and it was super clear. I will open another issue later with a couple of small suggestions regarding the vignette, as an interested reader of your work with a narrow knowledge of modeling.

Regarding your answer to my problem with the ordering issue: I am a bit confused about why the model estimates are so different from the observed data. Consider the first participant. The shrinkage on its observed data is really strong, and this shrinkage is not consistent across participants, which is why we get the weird pattern in the model-estimate plot.

plotEffects(accuracy_model, .raw = TRUE) %>%
  filter(individual == "adfhwht0ur") # subject 1
#   individual effect          differences estimate ordering
#   <fct>      <fct>           <fct>          <dbl>    <int>
# 1 adfhwht0ur Observed effect 2>1           -1            1
# 2 adfhwht0ur Model estimates 2>1            0.177        1

Just to make sure I have not made a silly mistake, I ran the model with the old quid package, extracted both the raw and the estimated data, and plotted them on top of each other:

accuracy_model_quid <- quid(id = aggregated_data$subject, condition = aggregated_data$cond,
                            rt = aggregated_data$accuracy, prior = c(.118/1.75, .275/1.75)) # c(effect/noise, sd/noise)
library(tidyverse)   # dplyr, tidyr, tibble, forcats, ggplot2
library(tidybayes)   # mean_hdi()

# raw data
aggregated_data %>%
group_by(subject, conflict) %>%
summarise(mean_effect = mean(accuracy)) %>%
ungroup()%>%
pivot_wider(names_from = conflict, values_from = mean_effect) %>%
mutate(effect = conflict - nonconflict,
source = "Observed Data") %>%
rowid_to_column("subj_id") %>%
mutate(subj_id = factor(subj_id)) %>%
select(-conflict, -nonconflict, -subject) %>%
# Now the model estimates
rbind(
accuracy_model_quid$theta %>%
as_tibble() %>%
gather() %>%
group_by(key) %>%
mean_hdi(value) %>%
ungroup() %>%
rename(effect = value) %>%
rowid_to_column("subj_id") %>%
mutate(subj_id = factor(subj_id)) %>%
mutate(source = "Model Estimates") %>%
select(subj_id, effect, source)
) %>%
mutate(subj_id = factor(subj_id)) %>%
# plotting
ggplot(aes(x = fct_reorder(subj_id, effect), y = effect, color = source, shape = source)) +
geom_hline(yintercept = 0, linetype = "dashed") +
geom_point(size = 2) +
theme_classic() +
theme(legend.title = element_blank(),
      axis.text.x = element_blank())

And the plot makes more sense. I am not sure, but I guess the discrepancy comes from the way the quid function saves the theta values. It just saves them as follows, so if quid has already rearranged them, that could be why they align with the raw data, since I simply bind the model estimates to the raw data assuming that quid's theta output is in the same order as my data frame.

accuracy_model_quid$theta
# beta_209 beta_210 beta_211 beta_212 beta_213 beta_214 beta_215 beta_216
# [1,] 0.04053379343 0.17811431479 0.42796485672 0.17638261097 0.17311516775 0.147790279872 0.26140632396 0.197217833041
# [2,] 0.16078008593 0.16175056554 0.13935805023 0.05951553760 0.28413290548 0.235406515488 0.26334917960 0.256220405783
# [3,] 0.18931454554 -0.01112141561 0.23011657969 0.05727937337 0.03295848137 -0.061864080823 0.33462365491 0.218882215163
# [4,] -0.05681328330 0.05460159034 0.17433153785 0.06193325861 0.10613123885 0.010305496672 0.17662742460 0.094221618851

And finally, to compare these two outputs, I ran a full model with brms:

library(brms)
library(here)   # for the file path below

accuracy_brms <-
brm(data = aggregated_data,
family = bernoulli(link = "logit"),
accuracy ~ conflict +
(conflict | subject) +
(conflict | content),
prior=c(prior(normal(0, 2), class = Intercept),
prior(normal(0, 2), class = b),
prior(cauchy(0, 1), class = sd),
prior(lkj(2), class = cor)),
sample_prior = TRUE,
cores = 4, warmup = 1000, iter = 5000, chains = 4, seed = 32,
save_all_pars=TRUE,
file = here("Output","brms_models","previous_exp_brms_logit")
)

So, my question is: do you know where this discrepancy is coming from? Thank you again for your help.
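By the way, to eventually compare the two sets of estimates I would also need per-subject effects from the brms fit; a rough sketch of how I might pull them (not something I have run yet, and the exact coefficient name depends on the factor coding) is:

# per-subject coefficient summaries from the brms fit; inspect the dimnames first
# to find the name of the conflict coefficient
subj_coefs <- coef(accuracy_brms)$subject   # 3-d array: subject x summary statistic x coefficient
dimnames(subj_coefs)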
Thanks again for the reply! At the moment I am clueless as to where the discrepancy is coming from. Any chance you can provide me with the data, or a sample of it? Then I can check what is going on in the background.
Yes, sure, that would be great. Here is the data:

And here is the code that I use, in case you want to reproduce the results:

library(tidyverse)   # read_csv(), mutate(), factor handling

data <- read_csv("./data.csv") %>%
  mutate(subject = factor(subject),
         cond = factor(cond))

model <- constraintBF(formula = accuracy ~ subject*cond,
                      data = data,
                      whichRandom = "subject",
                      ID = "subject",
                      whichConstraint = c("cond" = "2 > 1"),
                      rscaleEffects = c("subject" = 1, "cond" = .118/1.75, "subject:cond" = .275/1.75))

Thanks again.
Sorry for the late reply. Your outcome variable "accuracy" is dichotomous. At this point my package was only developed with continuous outcome variables in mind, so I do not recommend using it for this kind of analysis. I have also forwarded the issue to my supervisor (the author of the original quid function), who can provide you with more information. I'll get back to you soon.
Thanks Lukas, you are right. Using a Gaussian likelihood might not be the best option for analyzing accuracy data. I am trying to implement your procedure in brms, but I would like to hear from you and your supervisor about possible ways to run such an analysis with quid. Thank you again for your time, and looking forward to hearing from you :)
Dear @lukasklima,
I am using your amazing package to test for the existence of qualitative individual differences. I have two issues and I would appreciate your comments. To reproduce the problems, I will copy my code here. I used this model:
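(The code block was lost in this copy of the thread; it is presumably the same constraintBF() call that appears in the reproduction code posted elsewhere in this issue:)

model <- constraintBF(formula = accuracy ~ subject*cond,
                      data = data,
                      whichRandom = "subject",
                      ID = "subject",
                      whichConstraint = c("cond" = "2 > 1"),
                      rscaleEffects = c("subject" = 1, "cond" = .118/1.75, "subject:cond" = .275/1.75))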
Which gives me the following plot:
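(The plot image is not preserved here; it was presumably produced with the package's plotting function, along the lines of:)

plotEffects(model)   # observed effects and model estimates per individual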
Is there any way to arrange the model estimates based on their magnitude? I guess the problem is that ggplot uses the ordering of the observed data, since we plot that first. But then, why is the model estimate for a subject so different from its data? Super interesting!
Before finding your package, I was using quid (https://github.com/jstbcs/play/), and instead of plotting both estimates on a single plot, I plotted them separately and then attached the two plots. But now I do not know, for example, where subject 1's model estimate is on the model plot.
Now, my second question: is there a way to extract the model estimates for each subject? The previous quid function gives theta values in its output, which are basically the model estimates, but the problem is that the function changes the subject IDs. For example, instead of subj_143 it gives beta_143, and I am not sure whether that beta value really belongs to that subject. Can we get the model estimates per subject? I want to use them in a correlational analysis later.
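To illustrate, one way I could summarise the old quid theta output per column, keeping the beta_* labels precisely because the column-to-subject mapping is what I am unsure about (a rough sketch of my own, assuming the quid() fit is stored in quid_fit):

# posterior mean per theta column; labels stay as beta_* since the subject mapping is unclear
theta_means <- data.frame(
  parameter = colnames(quid_fit$theta),
  estimate  = colMeans(quid_fit$theta)
)
head(theta_means)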
Thanks.