-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Emmeans no longer recognizes stratified factors in Survival #473
Comments
Terry's response is correct. You can't expect him to solve an issue in another package.
That said, I am not sure how easily I can fix it. But I'll look at it. I do remember some issue with survival not long ago, and I wonder if this is related. I do not have much experience with survival analysis in my own work, and I will be traveling soon; so please be patient.
|
There was a change in emmeans version 1.8.8 in relation to issue #429. It seems to be exactly the reverse of what you are reporting. Can you review that issue please, and let me know how yours differs? NoteI edited and deleted from your initial submission my instructions for reporting issues. The idea is for you to use the same sectioning (the prompts that start with |
PS -- I cannot reproduce your issue because I do not have the dataset |
I don't see a re-open button, so maybe the comments section will do. Here is a reproducible example (cox model borrowed from Therneau: "A package for survival analysis in R". My copy is dated Aug 5, 2022.):
Specifying strata(inst) instead does not help:
Other variables in the model produce the expected results:
Replacing strata(inst) with inst in the model allows emmeans to be generated (but presumably violates the proportional hazards assumption)
My own case requires an interaction between a treatment and a stratified factor. Emmeans are not available for either the stratified factor or the interacting treatment. For example,
Error in emmeans(cfit222dd, ~sex) :
Below is the full output from sessionInfo()
Matrix products: default locale: time zone: America/Winnipeg attached base packages: other attached packages: loaded via a namespace (and not attached): |
Hi Dr. Lenth. I cannot find a way to re-open the issue, and discussion on
Stack Exchange seems to indicate that I won't be able to beause I am not a
collaborator. I have added reproducible examples to the comments section of
the closed issue, and I am copying them below.
Michael Peirson
I don't see a re-open button, so maybe the comments section will do.
Here is a reproducible example (cox model borrowed from Therneau: "A
package for survival analysis in R". My copy is dated Aug 5, 2022.):
require(survival)
Loading required package: survival
require(emmeans)
Loading required package: emmeans
cfit2 <- coxph(Surv(time, status) ~ age + sex + wt.loss + strata(inst),
data=lung)
emm <- emmeans(cfit2, ~ inst)
Error in emmeans(cfit2, ~inst) :
No variable named inst in the reference grid
Specifying strata(inst) instead does not help:
emm <- emmeans(cfit2, ~ strata(inst))
Error in emmeans(cfit2, ~strata(inst)) :
No variable named inst in the reference grid
Other variables in the model produce the expected results:
emm <- emmeans(cfit2, ~ wt.loss)
emm
wt.loss emmean SE df asymp.LCL asymp.UCL
9.78 0.679 0.736 Inf -0.763 2.12
Replacing strata(inst) with inst in the model allows emmeans to be
generated (but presumably violates the proportional hazards assumption)
cfit222 <- coxph(Surv(time, status) ~ age + sex + wt.loss + inst,
data=lung)
emm <- emmeans(cfit222, ~ inst)
emm
inst emmean SE df asymp.LCL asymp.UCL
11 0.41 0.679 Inf -0.92 1.74
My own case requires an interaction between a treatment and a stratified
factor. Emmeans are not available for either the stratified factor or the
interacting treatment. For example,
cfit222dd <- coxph(Surv(time, status) ~ strata(sex)*age, data=lung)
emm <- emmeans(cfit222dd, ~ sex)
Error in emmeans(cfit222dd, ~sex) :
No variable named sex in the reference grid
In addition: Warning message:
In model.matrix.default(trms, m, contrasts.arg = object$contrasts) :
variable 'strata(sex)' is absent, its contrast will be ignored
emm <- emmeans(cfit222dd, ~ age)
Warning message:
In model.matrix.default(trms, m, contrasts.arg = object$contrasts) :
variable 'strata(sex)' is absent, its contrast will be ignored
emm
Error in X[ii, ii, drop = FALSE] %*% y[ii] : non-conformable arguments
Below is the full output from sessionInfo()
sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_CA.UTF-8 LC_COLLATE=en_CA.UTF-8
[5] LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8
[7] LC_PAPER=en_CA.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
time zone: America/Winnipeg
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] emmeans_1.10.0 survival_3.5-7
loaded via a namespace (and not attached):
[1] compiler_4.3.3 Matrix_1.6-5 estimability_1.4.1 tools_4.3.3
[5] coda_0.19-4.1 mvtnorm_1.2-4 splines_4.3.3 grid_4.3.3
[9] xtable_1.8-4 lattice_0.22-5
…On Sun, Mar 31, 2024 at 9:18 PM Russell V. Lenth ***@***.***> wrote:
PS -- I cannot reproduce your issue because I do not have the dataset
newBPHsurv. My issue guidelines clearly say that I need a reproducible
example. So I will close this issue until you re-open it and either provide
the dataset or prepare a similar example using data I can access, e.g. a
dataset in the *survey* package. (Please re-open *this* issue; do not
create a new issue.)
—
Reply to this email directly, view it on GitHub
<#473 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/BHN62H4SCJJK6LUZCQTZPE3Y3C7YZAVCNFSM6AAAAABFQPIYEKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMRZGA2DEOJWGY>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
I agree, issue 429 looks like a related problem except using survreg
instead of coxph.
On July 10th, in that thread you asked about interactions with stratified
factors.
Here's a question for you... What does it mean if I *interact* one of thse
strata() or cluster() or frailty()) terms with other predictors?
I can't say I fully understand what's going on under the hood (sorry,
biologist not statistician by training), but I'll take a stab at this. I
think the cox model calculates a baseline survival curve for the whole
dataset based on all factors being at their first level; the estimates for
the other levels of all factors assume that the risk in any given period
is proportional to the baseline.
When a factor is stratified, the risk of death is not assumed to be
proportional among levels of that factor over time. Separate baseline
survival curves are calculated for each level of the stratified factor. The
effect of that factor *is not estimated *because hazards are not assumed to
be proportional over time among the levels of the factor. However,
estimates for the other factors still apply; to get the correct survival
probability at any time you multiply the appropriate baseline risk curve by
the effect estimate for the treatment of interest (multiply might be the
wrong word; there's definitely an exponent involved somehow but I can't
remember the math). The interaction of a treatment with a stratified factor
simply allows the effect estimate of the treatment to be different at
different levels of the stratified factor; to get the actual survival
probability you have to combine the effect estimate with the correct
baseline risk curve.
As I write this, it seems to me that it is correct for (using the models
from my message sent earlier today)
cfit2 <- coxph(Surv(time, status) ~ age + sex + wt.loss + strata(inst),
data=lung)
emm <- emmeans(cfit2, ~ inst)
not to produce a result, because there is no effect estimate for the
stratified variable inst.
However, after
cfit222dd <- coxph(Surv(time, status) ~ strata(sex)*age, data=lung)
emm <- emmeans(cfit222dd, ~ age)
and
emm <- emmeans(cfit222dd, ~ age | sex)
should both work.
I can't remember what survreg does, so not sure how this fits together with
the previous issue.
Michael Peirson
…On Sun, Mar 31, 2024 at 8:58 PM Russell V. Lenth ***@***.***> wrote:
There was a change in *emmeans* version 1.8.8 in relation to issue #429
<#429>. It seems to be exactly
the reverse of what you are reporting. Can youreview that issue please, and
let me know how yours differs?
—
Reply to this email directly, view it on GitHub
<#473 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/BHN62H62RANRBCEOD6NUU6TY3C5MTAVCNFSM6AAAAABFQPIYEKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMRZGAZDMOBQG4>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
An issue can be reopened via the "Reopen with comment" button just next to the "Comment" button below the comment compose window. Similarly, the issue can be closed with the "Close with comment" button. I think that since you are listed as a "participant" on this issue, I think that you can reopen it; and I think I have experienced this happening with past issue submitters. But maybe I'm wrong. I asked you to look at issue #429, but you offer no comment on that. To tell the truth, I am surprised that your example worked in the past, due to that very bug. I suppose it has to do with the interaction. And if so (see below), some coeffficients are not available, and it's possible that what seemed to "work" in fact didn't, so that the results obtained are nonsense. This seems to be a no-win situation. I fixed #429 on July 10, 2023 by disallowing strata factors. Restoring strata factors will possibly fix the problem here, but will revive the bug in #429. Meanwhile, I've looked at various articles on the web about strata in Cox regression, and several of them say that you cannot do inference about the stratification factors. So that, along with other considerations, suggests to me that it is correct to not allow stratification factors in the reference grid. My intuition suggests that strata are something akin to random block effects in a linear model. This seems to be borne-out by the coefficient of
For the other model, we get
This summary does include coefficients for the interaction, but not a main effect of Adding further confusion is that
The model The bottom lineI cannot allow stratification factors as factors of inferential interest in Cox models, because even if it's appropriate to do so in the thinking of some users, we don't have regression coefficients, nor variances and covariances for them. I do not know why it seemed to work in earlier versions, but I am guessing that those results are suspect. |
OK, here's the deal. I don't know what's right -- even after going through the user manual for survival and looking at all 136 cases where the word "strata" is used (many, to my irritation, being grammatical errors where the author meant "stratum"). So I decided to play along with it. That means allowing any factors in The changes I made also side-step the bug observed in #429. Just keep in mind that I'm not keeping you from doing anything that might be nonsensical -- it's on you to do appropriate analyses. ## What the hell to do about `strata()`?
library(survival)
library(emmeans)
#### Additive model
moda <- coxph(Surv(time, status) ~ wt.loss + age + strata(sex), data = lung)
#### Interaction model
modi <- coxph(Surv(time, status) ~ wt.loss + age*strata(sex), data = lung)
### survreg additive model
sra <- survreg(Surv(time, status) ~ wt.loss + age + strata(sex), data = lung)
### survreg interactive model
sri <- survreg(Surv(time, status) ~ wt.loss + age * strata(sex), data = lung)
at = list(age = c(20,30))
emmeans(moda, ~ age*sex, at = at)
## age sex emmean SE df asymp.LCL asymp.UCL
## 20 1 0.386 0.203 Inf -0.01142 0.783
## 30 1 0.578 0.296 Inf -0.00178 1.158
## 20 2 0.386 0.203 Inf -0.01142 0.783
## 30 2 0.578 0.296 Inf -0.00178 1.158
##
## Results are given on the log (not the response) scale.
## Confidence level used: 0.95
emmeans(modi, ~ age*sex, at = at)
## age sex emmean SE df asymp.LCL asymp.UCL
## 20 1 0.459 0.247 Inf -0.0255 0.943
## 30 1 0.688 0.364 Inf -0.0259 1.402
## 20 2 0.247 0.332 Inf -0.4042 0.897
## 30 2 0.369 0.492 Inf -0.5956 1.335
##
## Results are given on the log (not the response) scale.
## Confidence level used: 0.95
emmeans(sra, ~ age*sex, at = at)
## age sex emmean SE df lower.CL upper.CL
## 20 1 6.70 0.311 209 6.09 7.32
## 30 1 6.56 0.242 209 6.08 7.04
## 20 2 6.70 0.311 209 6.09 7.32
## 30 2 6.56 0.242 209 6.08 7.04
##
## Results are given on the log (not the response) scale.
## Confidence level used: 0.95
emmeans(sri, ~ age*sex, at = at)
## age sex emmean SE df lower.CL upper.CL
## 20 1 6.58 0.310 208 5.97 7.20
## 30 1 6.43 0.243 208 5.95 6.91
## 20 2 6.70 0.308 208 6.09 7.31
## 30 2 6.61 0.241 208 6.13 7.08
##
## Results are given on the log (not the response) scale.
## Confidence level used: 0.95 Created on 2024-04-04 with reprex v2.0.2 |
Hi, Dr. Lenth. Following your revisions, I have run the script from my previous work and it returns the results that it used to, which is correct according to my understanding. That is, it gives estimates for the non-stratified factor at each level of the stratified factor, and they are the same estimates as previously. Thank you for your help. |
I sent the below to Terry Therneau (maintainer of the survival package) and he gave an unhelpful response:
"Why are you asking me about a failure in another package? Ask the author of emmeans."
As an ordinary user, all I know is that script which used to work doesn't work now. Could you take a look?
Survival models with a stratified factor appear to have stopped cooperating with emmeans.
For example, the following model, which was fully functional on R for Windows last year (and was used in my published work), currently produces errors when run in R for Ubuntu (I don't currently have R on a Windows machine, so can't check that):
cphfr <- coxph(Surv(tstart, tstop, death) ~ strata(Region) * Patties + Fume, newBHPsurv, cluster=ColonyGroup)
emm <- emmeans(cphfr, ~ Patties | Region)
seem <- summary(pairs(emm, reverse = T), by = NULL, type="response", adjust = "none", infer = c(T, T))
names(seem)[1] <- "Patties effect"
pander(seem)
Error in emmeans(cphfr, ~Patties | Region) :
No variable named Region in the reference grid
In addition: Warning message:
In model.matrix.default(trms, m, contrasts.arg = object$contrasts) :
variable 'strata(Region)' is absent, its contrast will be ignored
The above definitely worked in:
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
survival_3.4-0 emmeans_1.8.2
And is not working in:
R version 4.3.3 (2024-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS
[1] survival_3.5-7 emmeans_1.10.0
The text was updated successfully, but these errors were encountered: