support competing events in prep-data-long-surv #42

jburos · 2017-01-07T22:27:18Z

For competing & semi-competing risk models (related to #36), we often have multiple events (& separate covariates for each event). The first part of supporting competing & semi-competing risk models is to allow for multiple events in prep_data_long_surv.

Some design decisions:

input data: We will assume for now that the data for each event are stored in separate dummy variables, so we want to handle the scenario where the user provides a list of event_col names rather than a single event_col name. The case where the user has data stored with numerical indices is one we can support at a later date, by first transforming the two-outcome data to a factor.

event type: there are several ways to handle multiple events, depending on their type:

- binary event, recurring: consider an event that occurs at t=3, and assume we have integer timepoints numbered 0, 1, 2, etc. The event data for this time series will look like: [0, 0, 0, 1, 0, 0, 0, ...]. In other words, the event occurred at t=3, but did not occur before that time or at the timepoints following.
- binary event, terminating: consider the same setup as above, with an event at t=3. In this case, our event data should look like: [0, 0, 0, 1] for t = [0, 1, 2, 3]. After t=3, there should be no further records in the dataset. (This is the default type.)
- binary event, one-time: with a similar setup to the previous examples, this event type with an event at t=3 will produce data with 0 values before the event, and 1 values at and following the event: [0, 0, 0, 1, 1, 1, ...]. The most natural use case for this type of event in a clinical setting is entry into an intermediate state, like "recurrence" or "hospitalization".

The current plan is to support each of these, to varying degrees.
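To make the three encodings concrete, here is a minimal sketch in plain Python. The helper name `encode_event` and its arguments are hypothetical, chosen for illustration only; this is not how prep_data_long_surv is implemented.

```python
def encode_event(event_time, n_timepoints, event_type="terminating"):
    """Return 0/1 event values for integer timepoints 0..n_timepoints-1.

    Hypothetical helper for illustration; not part of prep_data_long_surv.
    """
    if not 0 <= event_time < n_timepoints:
        raise ValueError("event_time must fall within the observed timepoints")
    if event_type == "recurring":
        # 1 only at the timepoint of the event; records continue afterwards
        return [int(t == event_time) for t in range(n_timepoints)]
    if event_type == "terminating":
        # records stop at the event, so the series ends with the 1
        return [int(t == event_time) for t in range(event_time + 1)]
    if event_type == "one-time":
        # 0 before the event, 1 at and after it
        return [int(t >= event_time) for t in range(n_timepoints)]
    raise ValueError("unknown event_type: %r" % (event_type,))


recurring = encode_event(3, 7, "recurring")
terminating = encode_event(3, 7, "terminating")
one_time = encode_event(3, 7, "one-time")
```

With an event at t=3 and seven timepoints, the three calls reproduce the vectors listed above: the recurring and one-time series keep all seven records, while the terminating series stops at t=3.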

jburos · 2017-01-07T23:12:13Z

The assumption will be that the data for multiple events are provided in a "tidy" form. This means that:

- All event data are provided in a single data frame.
- Each "observation" of an event state is provided in a separate record.
- Each record contains:
  - subject identifier
  - time value
  - event label
  - event value (1: yes / 0: no)

In other words, the data will look something like the following:

subject_id	time	event_name	event_value
1	0.22	new_lesion	1
1	0.44	new_lesion	1
1	1.2	death	0
2	5.5	death	1
3	5.5	death	0

The first subject had two recurring events (type 1 above), then a censoring event at t=1.2 (censored because the value = 0). Subjects 2 and 3 had no new-lesion events; subject 2 had a terminating event (death) at t=5.5, while subject 3 was censored at t=5.5 (again because the value = 0).
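As a rough sketch of how such records might be read, the snippet below holds the example table as plain Python tuples and summarizes each subject. The `Record` type, the `summarize` helper, and the choice of "death" as the terminal event are assumptions made for illustration; a death record with event_value 0 is treated as censoring, as the text does for subject 1.

```python
from collections import namedtuple

# Hypothetical record type mirroring the four columns of the example table.
Record = namedtuple("Record", ["subject_id", "time", "event_name", "event_value"])

records = [
    Record(1, 0.22, "new_lesion", 1),
    Record(1, 0.44, "new_lesion", 1),
    Record(1, 1.2, "death", 0),
    Record(2, 5.5, "death", 1),
    Record(3, 5.5, "death", 0),
]


def summarize(records, terminal_event="death"):
    """Count non-terminal events and report terminal status per subject."""
    summary = {}
    for r in sorted(records, key=lambda r: (r.subject_id, r.time)):
        s = summary.setdefault(
            r.subject_id, {"n_events": {}, "last_time": None, "status": None}
        )
        if r.event_name == terminal_event:
            # value 1 means the terminal event occurred, 0 means censored
            s["status"] = "event" if r.event_value == 1 else "censored"
        elif r.event_value == 1:
            s["n_events"][r.event_name] = s["n_events"].get(r.event_name, 0) + 1
        # records are visited in time order, so this ends as the latest time
        s["last_time"] = r.time
    return summary


summary = summarize(records)
```

Here `summary[1]` reports two new_lesion events and a censored status at t=1.2, while subjects 2 and 3 have no new_lesion events and differ only in the value of their death record.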

There are two assumptions which are often made in processing these data, which are helpful to make explicit:

1. Assume that the event state is "0" between observed timepoints for each subject, and that no records will be created following the event unless otherwise specified. This is equivalent to the second case above, the binary, terminating event type.
2. Also assume that each subject is censored at a single time, which applies uniformly to all event types. Here, the censor time is assumed to be the time of last observation for each subject_id.

In other words, in the example above, we are assuming that no "new_lesion" events occurred between the event at t=0.44 and the censoring event at t=1.2.

It also means that we will infer, for subjects 2 and 3 above, that no new-lesion events occurred, since the censor status for the 'death' event is tied to that for the 'new-lesion' event. Depending on your data, this may or may not be accurate. (In other words, knowing that there were no new-lesion events is very different from not knowing whether there was a new lesion.)
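The single-censor-time assumption can be sketched as follows: each subject's last observed time is taken as the end of follow-up for every event type, including event types with no records for that subject. The function name `per_event_status` and the output layout are hypothetical, not the library's behavior.

```python
rows = [
    # (subject_id, time, event_name, event_value)
    (1, 0.22, "new_lesion", 1),
    (1, 0.44, "new_lesion", 1),
    (1, 1.2, "death", 0),
    (2, 5.5, "death", 1),
    (3, 5.5, "death", 0),
]


def per_event_status(rows):
    """Expand to one entry per (subject, event type) with a shared censor time."""
    event_names = sorted({name for _, _, name, _ in rows})
    censor_time = {}  # subject -> time of last observation
    event_times = {}  # (subject, event_name) -> times with event_value == 1
    for subject, time, name, value in rows:
        censor_time[subject] = max(time, censor_time.get(subject, time))
        if value == 1:
            event_times.setdefault((subject, name), []).append(time)
    status = {}
    for subject, last in censor_time.items():
        for name in event_names:
            status[(subject, name)] = {
                # an empty list is the inferred "no events" case
                "event_times": sorted(event_times.get((subject, name), [])),
                "followed_until": last,
            }
    return status


status = per_event_status(rows)
```

Note that `status[(2, "new_lesion")]` comes back with no events and follow-up to t=5.5 even though the input holds no new_lesion record for subject 2. That is exactly the inference described above, and it is only as good as the assumption behind it.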

In order to relax assumption 2, we would need to know the time of censor for new-lesion events, which would require that the user create an additional record (at t=1.2 for subject 1) censoring the new-lesion event type. For now, we don't have a need to support this.
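For reference, a sketch of what per-event censoring could look like if it were ever needed: an explicit record with event_value 0 sets the end of follow-up for that event type, and the subject's last observation is the fallback. The helper `followup_end` is hypothetical, and the subject 2 record at t=3.0 is invented purely to show a case where the two times differ.

```python
def followup_end(rows):
    """Return the end of follow-up for each (subject, event type) pair."""
    last_seen = {}  # subject -> time of last observation
    explicit = {}   # (subject, event_name) -> latest time with event_value == 0
    names = set()
    for subject, time, name, value in rows:
        names.add(name)
        last_seen[subject] = max(time, last_seen.get(subject, time))
        if value == 0:
            key = (subject, name)
            explicit[key] = max(time, explicit.get(key, time))
    # fall back to the subject's last observation when no explicit record exists
    return {
        (subject, name): explicit.get((subject, name), last_seen[subject])
        for subject in last_seen
        for name in names
    }


rows = [
    (1, 0.22, "new_lesion", 1),
    (1, 0.44, "new_lesion", 1),
    (1, 1.2, "death", 0),
    (1, 1.2, "new_lesion", 0),  # the additional record described in the text
    (2, 3.0, "new_lesion", 0),  # hypothetical: lesion follow-up ends early
    (2, 5.5, "death", 1),
    (3, 5.5, "death", 0),
]
ends = followup_end(rows)
```

In this sketch subject 2 is followed for new lesions only until t=3.0 but for death until t=5.5, whereas subject 3, with no explicit new_lesion record, falls back to the shared time of 5.5.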

jburos self-assigned this Jan 7, 2017

jburos added the enhancement label Jan 7, 2017

jburos added a commit that referenced this issue Jan 9, 2017

implement data-sim for joint model; test code for prep_data_long_surv…

e7a7657

… with multiple events (#41 & #42, #36)

jburos closed this as completed in 9e6d62d Jan 11, 2017

jburos mentioned this issue Jan 9, 2019

prep_data_long_surv fails on multi-event data #78

Closed
