Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Joss review] Need tests for Kaplan Meier and other non-parametrics #18

Closed
CamDavidsonPilon opened this issue Aug 5, 2021 · 9 comments

Comments

@CamDavidsonPilon
Copy link

The KM is by far the most widely used model, and most important non-parametric model. I'd like to see sufficient tests for it included.

openjournals/joss-reviews#3484

@derrynknife
Copy link
Owner

I've created tests for the Nelson-Aalen, Kaplan-Meier, and Fleming-Harrington, and was working on the Turnbull, and my first step was to plagiarise lifelines.

I did some looking and I can't convince myself that the NPMLE implementation in lifelines is correct. Am I missing something?

If we take the midpoint for the test example in lifelines and treat it like a non-intervally censored example we get substantialy different results:

import surpyval as surv
import numpy as np

left = [1, 8, 8, 7, 7, 17, 37, 46, 46, 45]
right = [7, 8, 10, 16, 14, np.inf, 44, np.inf, np.inf, np.inf]

x = np.vstack([left, right])

c = (~np.isfinite(x[1, :])).astype(int)
x = np.where(np.isfinite(x.mean(axis=0)), x.mean(axis=0), x[0, :])

print(surpyval.KaplanMeier.fit(x=x, c=c).R)
"""
[0.9   0.8   0.7   0.6   0.5   0.5   0.375 0.375 0.375]
"""

The lifelines test compares this to:

[0.16667016, 0.33332984, 0.125, 0.375]

And it's inverse is:

[0.83332984, 0.66667016, 0.875     , 0.625     ]

The solution that lifelines compares to doesn't align to the Kaplan-Meier solution if the midpoints are used. The lifelines npmle seems to have the correct value of the survival function as the CDF for the last value. Am I getting something wrong?

@CamDavidsonPilon
Copy link
Author

Can you share the lifelines code used?

@derrynknife
Copy link
Owner

This is the test in lifelines:

def test_npmle():
    left, right = [1, 8, 8, 7, 7, 17, 37, 46, 46, 45], [7, 8, 10, 16, 14, np.inf, 44, np.inf, np.inf, np.inf]
    npt.assert_allclose(npmle(left, right, verbose=True)[0], np.array([0.16667016, 0.33332984, 0.125, 0.375]), rtol=1e-4)

@CamDavidsonPilon
Copy link
Author

So npmle is a low-level function, and it needs to be paired with reconstruct_survival_function:

from lifelines.fitters.npmle import *
results = npmle(left, right)
sf = reconstruct_survival_function(*results)

This is equivalent to the (high-level API) KaplanMeierFitter().fit_interval_censoring(left, right)

@derrynknife
Copy link
Owner

Thanks, that makes more sense, and sorry to be a pest, but I've implemented surpyval differently to avoid the following scenario.

alpha = 20
beta = 15
x = np.random.weibull(beta, 100) * alpha
n, xx = np.histogram(x, bins=20)
x = np.vstack([xx[0:-1], xx[1::]])

kmf.fit_interval_censoring(x[0, :]+1e-12, x[1, :]).plot()
ax = plt.gca()
kmf.fit_interval_censoring(x[0, :], x[1, :]).plot(color='red', ax=ax)

This results in catastrophic differences in the estimates. It seems that the left edge of some intervals are considered to be failing in the previous interval. This is behaviour that I wouldn't expect and therefore comparisons between lifelines and surpyval are different.

Are you sure the Turnbull intervals in lifelines are correct?

@derrynknife
Copy link
Owner

Tests for Kaplan-Meier, Nelson-Aalen, and Fleming-Harrington included in f3a264a.

Still working on Turnbull tests as per the thread above.

@CamDavidsonPilon
Copy link
Author

Are you sure the Turnbull intervals in lifelines are correct?

No lol - I only know they are correct up-to the unit tests in lifelines.

The data you presented is an interesting case I'll have to explore more. I would suggest comparing results to icenreg in R if you can. (Seeing your written tests now - I suggest comparing vs R's implementations as well as lifelines)

@derrynknife
Copy link
Owner

Tests for Turnbull complete agains R's icenReg. Complete's nonparametric tests. See 4086c12

@derrynknife
Copy link
Owner

No lol - I only know they are correct up-to the unit tests in lifelines.

No probs. SurPyval doesn't actually find the intervals, it just creates an interval for between every observed value and every truncated value and brute forces the solution. This implementation will result in a few additional calculations but it makes the code more readable and was easier to develop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants