- History:
- (Ex) Physicist
- (Ex) System Administrator
- (Ex) Software Developer
- (Ex) FOSS Community Manager
- Data scientist (~1 year)
- Caveat Emptor!
- Contact:
15/05/2019
A study about cancer recurrence
Time is recurrence-free time in days
An “event” is the cancer recurring (triangle)
Otherwise the patient gets a “censored” marker (circle) when they leave the study
Also applies to any other time-to-event data
print(small_data)
##   rowid time cens start
## 1   127  357    1     0
## 2   482 1922    0     0
## 3   393  867    1     0
## 4   115  857    0     0
## 5   644 1692    0     0
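# Naive mean 1: treat every time as an observed recurrence (ignores censoring)
mean(c(357, 1922, 867, 857, 1692))
## [1] 1139
# Naive mean 2: drop the censored patients entirely
mean(c(357, NA, 867, NA, NA), na.rm = TRUE)
## [1] 612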
Theory
\[ S(t) = 1 - F(t) = P(T > t) \]
Interpretation
Probability that duration is greater than \(t\)
Median: the median duration is the time \(t\) at which \(S(t) = 0.5\).
Proportion at time \(t\): \(100 \cdot \hat S(t)\) percent of durations are longer than \(t\).
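Both readings can be pulled from a fitted curve. The following is a minimal sketch, assuming the `survival` package is installed and rebuilding the five printed `small_data` rows by hand; the evaluation time of 1000 days and the names `s_1000` and `med` are chosen here for illustration only.

```r
library(survival)  # Surv(), survfit()

# Five rows rebuilt from the printed small_data
# (time in days, cens = 1 means the cancer recurred)
small_data <- data.frame(
  time = c(357, 1922, 867, 857, 1692),
  cens = c(1, 0, 1, 0, 0)
)

km <- survfit(Surv(time, cens) ~ 1, data = small_data)

# Proportion at time t: estimated S(t) at t = 1000 days
s_1000 <- summary(km, times = 1000)$surv

# Median: first time at which the estimated curve drops to 0.5 or below
med <- unname(quantile(km, probs = 0.5, conf.int = FALSE))

print(s_1000)
print(med)
```

On these five rows the curve only falls to about 0.53, so roughly 53 percent of durations are estimated to be longer than 1000 days, and the median is not reached (it comes back as `NA`).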
library(survival)   # Surv(), survfit()
library(survminer)  # ggsurvplot()
km <- survfit(Surv(time, cens) ~ 1, data = small_data)
ggsurvplot(km, data = small_data, conf.int = FALSE, censor.shape = 4, censor.size = 9,
           risk.table = 'nrisk_cumevents', legend = 'none')
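Formal Definition
\[ \hat S(t) = \prod_{i: t_i \le t} \frac{n_i - d_i}{n_i} \]
where the \(t_i\) are the distinct event times, \(n_i\) is the number of patients still at risk just before \(t_i\), and \(d_i\) is the number of events at \(t_i\).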
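The product can be checked by hand on the five printed rows. This is a sketch in base R, not the internals of `survfit()`; the variable names (`event_times`, `n_i`, `d_i`, `s_hat`) are introduced here for illustration. It evaluates the estimate just after each event time.

```r
# Five rows from the printed small_data (cens = 1 means recurrence)
time <- c(357, 1922, 867, 857, 1692)
cens <- c(1, 0, 1, 0, 0)

# Distinct times at which a recurrence was observed
event_times <- sort(unique(time[cens == 1]))

# n_i: patients still at risk just before each event time
n_i <- sapply(event_times, function(t) sum(time >= t))

# d_i: recurrences observed at each event time
d_i <- sapply(event_times, function(t) sum(time == t & cens == 1))

# Running product of (n_i - d_i) / n_i gives the estimate after each event
s_hat <- cumprod((n_i - d_i) / n_i)

print(data.frame(t_i = event_times, n_i = n_i, d_i = d_i, s_hat = s_hat))
```

The patient censored at 857 days contributes to the risk set at 357 days but has left it by 867 days, which is why the second factor is 2/3 rather than 3/4.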
data("GBSG2", package = "TH.data")  # German Breast Cancer Study Group 2 data
km <- survfit(Surv(time, cens) ~ 1, data = GBSG2)
ggsurvplot(km, data = GBSG2, censor = FALSE, conf.int = TRUE,
           surv.median.line = 'hv', legend = 'none')
km <- survfit(Surv(time, cens) ~ 1, data = GBSG2)
km <- survfit(Surv(time, cens) ~ horTh, data = GBSG2)
ggsurvplot(km,
           data = GBSG2,
           surv.median.line = "hv",
           legend.title = "Hormone Therapy",
           legend = 'right',
           pval = TRUE,
           conf.int = TRUE)
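# survreg() fits a Weibull model by default
wb <- survreg(Surv(time, cens) ~ horTh, data = GBSG2)
# Time by which 10% of patients on hormone therapy are predicted to have recurred
# (that is, 90% are still recurrence-free)
predict(wb, type = "quantile", p = 1 - 0.9, newdata = data.frame(horTh = 'yes'))
##        1
## 475.1155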
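Where the quantile comes from: a sketch assuming the default Weibull fit and the log-linear (accelerated failure time) form that `survreg()` uses. The symbols \(\mu(x)\) (the linear predictor for covariates \(x\)) and \(\sigma\) (the fitted scale) are notation introduced here, not taken from the slides.
\[ S(t \mid x) = \exp\!\left( -\left( \frac{t}{e^{\mu(x)}} \right)^{1/\sigma} \right), \qquad t_p(x) = e^{\mu(x)} \, \bigl( -\log(1 - p) \bigr)^{\sigma} \]
Substituting \(t_p(x)\) into \(S\) gives \(S(t_p \mid x) = 1 - p\), so with \(p = 1 - 0.9 = 0.1\) the predicted value is the time at which 90 percent of patients with that covariate value are still recurrence-free.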
Good for CI and comparative plots across predictors