Problem in cph, when there is missing value and cluster #46

tamas-ferenci · 2017-07-27T12:53:57Z

Here is a minimal reproducible example:

library(rms)

set.seed(1)

RawData <- data.frame(
  time  = runif( 100, 1, 5 ),
  event = sample( 0:1, 100, replace = TRUE ),
  x1    = c( NA, rnorm( 99 ) ),                            # one missing value
  x2    = as.factor( sample( 0:1, 100, replace = TRUE ) ),
  ID    = c( 1:70, 1:30 )                                  # cluster identifier
)

dd <- datadist( RawData )
options( datadist = "dd" )

cph( Surv( time, event ) ~ x1 + x2 + cluster( ID ), data = RawData )

This produces only a warning, not an error, but the warning should not appear. After some experimentation, I think it is triggered by the combination of a missing value and cluster().

The problem seems to occur at the very end of rms:::Design. I think jz indexes every right-hand-side term (including the cluster term), whereas fname contains only the explanatory variables (without the cluster term). Thus
names(nmiss)[jz] <- fname[asm != 9]
tries to assign a length-2 vector to a length-3 target.
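
The mismatch described above can be reproduced in base R without rms. This is only a minimal sketch, not the actual code of rms:::Design: nmiss_demo, jz_demo and fname_demo are hypothetical stand-ins, chosen to mimic a length-3 index receiving a length-2 replacement.

```r
# Hypothetical stand-ins for the objects inside rms:::Design (assumed shapes).
nmiss_demo <- c(t1 = 1, t2 = 0, t3 = 0)  # three named counters, one per term
jz_demo    <- c(TRUE, TRUE, TRUE)        # index that selects all three names
fname_demo <- c("x1", "x2")              # only two replacement names

# Capture the condition object instead of letting the warning print.
caught <- tryCatch(
  {
    names(nmiss_demo)[jz_demo] <- fname_demo
    NULL
  },
  warning = function(w) w
)

print(inherits(caught, "warning"))
print(conditionMessage(caught))
```

Outside tryCatch, R recycles the shorter vector and the assignment still completes, which is consistent with cph raising a warning rather than an error.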

harrelfe · 2017-09-08T22:30:33Z

The rms package prefers that you handle clustering after the fit, using robcov or bootcov.

tamas-ferenci · 2017-10-05T16:36:26Z

It works with
fit <- cph( Surv( time, event ) ~ x1 + x2, data = RawData )
fit <- robcov( fit, cluster = RawData$ID )

Thank you very much!

tamas-ferenci closed this as completed Oct 5, 2017
