Model evaluation metric #79

Closed
peiyaoli opened this issue Aug 22, 2019 · 1 comment

Comments

@peiyaoli

Hi, @sebp

I have several questions regarding the evaluation of survival prediction models, which extend my earlier question #75.
In my project, I would like to build a survival model using gradient boosting. Since gradient boosting in scikit-survival is slow, I chose XGBoost to implement the model. Two metrics are used to evaluate and compare the models: the C-index and time-dependent ROC, as you suggested in the tutorial.

  1. For XGBoost, an answer on Stack Overflow suggested I could use
xgb_model.predict(x_test, output_margin=True)

to get results comparable with scikit-survival's predictions. Then I could use your implementation of the C-index to compare the two models. However, I am not sure whether this works.
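One point that may help here: the C-index depends only on the ordering of the predictions, so any strictly increasing transform of the scores leaves it unchanged. The sketch below illustrates this with a plain-Python concordance function; the data, the `margin` values, and the helper name `concordance` are made up for illustration, and the function assumes that a higher score means higher risk.

```python
import math

def concordance(time, event, score):
    """Plain-Python Harrell's C; assumes higher score = higher risk."""
    credit = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            # A pair is comparable if j had an event and i was observed longer.
            if event[j] and time[i] > time[j]:
                comparable += 1
                if score[j] > score[i]:
                    credit += 1.0
                elif score[j] == score[i]:
                    credit += 0.5
    return credit / comparable

# Hypothetical toy data, not taken from the issue.
time = [5.0, 8.0, 3.0, 10.0, 6.0]
event = [True, False, True, True, False]
margin = [0.2, 0.4, 1.5, -0.3, 0.1]             # made-up log-hazard style scores
hazard_ratio = [math.exp(m) for m in margin]    # strictly increasing transform

c_margin = concordance(time, event, margin)
c_hr = concordance(time, event, hazard_ratio)
```

Both calls rank the samples identically, so `c_margin` and `c_hr` agree. Under that reading, it should not matter for the C-index whether margins or exponentiated margins are passed, as long as the direction (higher = riskier) matches what the metric expects.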

  2. In SHAP's official tutorial notebook, the C-index is implemented as below:
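def c_statistic_harrell(pred, labels):
    # labels appear to be signed times: positive = event, negative = censored
    total = 0
    matches = 0
    for i in range(len(labels)):
        for j in range(len(labels)):
            # pair is comparable if j had an event and i was observed longer
            if labels[j] > 0 and abs(labels[i]) > labels[j]:
                total += 1
                # concordant if the earlier event has the strictly higher score
                if pred[j] > pred[i]:
                    matches += 1
    return matches/total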

So what is the difference between this and yours? I tried both, and the results differ somewhat.
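When running both implementations on the same data, the label formats also need to line up. The function above appears to take signed times (negative meaning censored), whereas `concordance_index_censored` takes a boolean event indicator and the observed times as separate arguments. A minimal sketch of the conversion, assuming that sign convention holds for the data; the helper name `split_signed_labels` is hypothetical:

```python
def split_signed_labels(labels):
    """Split signed survival labels into (event, time) lists.

    Assumed convention: a positive label is an observed event time,
    a negative label is a censoring time stored with its sign flipped.
    """
    event = [y > 0 for y in labels]
    time = [abs(y) for y in labels]
    return event, time

# Hypothetical toy labels: samples 1 and 4 are censored.
labels = [5.0, -8.0, 3.0, 10.0, -6.0]
event, time = split_signed_labels(labels)

# With scikit-survival installed, these could then be passed as
# concordance_index_censored(event, time, risk_scores),
# after converting the lists to NumPy arrays.
```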

Thanks for your answer

Best
Peiyao

@sebp
Owner

sebp commented Aug 22, 2019

AFAICT, the code you posted differs from concordance_index_censored in two respects.

  1. It does not consider tied risk scores.
  2. Assuming labels is the time of the event, then two scores pred[i] and pred[j] are concordant if the i-th patient survived longer and has a lower predicted score, whereas for concordance_index_censored:

If the estimated risk is larger for the sample with a higher time of event/censoring, the predictions of that pair are said to be concordant.
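To make the first point concrete, here is a rough sketch comparing the posted pair logic with a variant that gives half credit to comparable pairs whose scores are tied, which is a common way to handle ties. The toy data and the function names are made up, and the exact equality test is a simplification (a real implementation would likely compare within a tolerance).

```python
def c_index_no_ties(pred, labels):
    """Same pair logic as the posted function: tied scores earn no credit."""
    total = matches = 0
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[j] > 0 and abs(labels[i]) > labels[j]:
                total += 1
                if pred[j] > pred[i]:
                    matches += 1
    return matches / total

def c_index_half_ties(pred, labels):
    """Variant that credits 0.5 for a comparable pair with equal scores."""
    total = 0
    credit = 0.0
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[j] > 0 and abs(labels[i]) > labels[j]:
                total += 1
                if pred[j] > pred[i]:
                    credit += 1.0
                elif pred[j] == pred[i]:
                    credit += 0.5
    return credit / total

# Hypothetical signed labels (negative = censored) and scores.
labels = [5.0, -8.0, 3.0, 10.0, -6.0]
pred = [1.0, 1.0, 2.0, 0.0, 0.5]   # samples 0 and 1 share a score

strict = c_index_no_ties(pred, labels)      # 6 of 7 pairs counted
lenient = c_index_half_ties(pred, labels)   # 6.5 of 7 pairs counted
```

With continuous model outputs ties are rare, so the two versions usually agree closely; the gap grows when many samples share a score, for example with shallow trees.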

If you get a c-index smaller than 0.5, you might need to flip the sign of the predictions to obtain the correct order.
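A quick way to see the effect of flipping the sign: when tied scores get half credit, negating the predictions turns a c-index of c into 1 - c. The sketch below uses made-up data and a plain-Python helper (`c_index` is not a library function), with scores deliberately oriented so that higher means longer survival.

```python
def c_index(pred, labels):
    """Harrell's C on signed labels (negative = censored); tied scores count 0.5."""
    total = 0
    credit = 0.0
    n = len(labels)
    for i in range(n):
        for j in range(n):
            # j had an event and i was observed longer
            if labels[j] > 0 and abs(labels[i]) > labels[j]:
                total += 1
                if pred[j] > pred[i]:
                    credit += 1.0
                elif pred[j] == pred[i]:
                    credit += 0.5
    return credit / total

# Hypothetical data: scores where higher means LONGER survival.
labels = [5.0, -8.0, 3.0, 10.0, -6.0]
survival_score = [0.3, 0.2, 0.1, 1.2, 0.4]

c_raw = c_index(survival_score, labels)                      # below 0.5
c_flipped = c_index([-s for s in survival_score], labels)    # above 0.5
```

So a value well below 0.5 usually signals a direction mismatch between the scores and the metric, not a model that is worse than random.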

@sebp sebp closed this as completed Sep 1, 2019