Hello! On my dataset, SAGE values depend quite a lot on the train-test split. Would it be correct to average the SAGE value means and standard deviations across cross-validation folds?
Hi there, that's an interesting situation. When you try a different train-test split, do you train a new model? Or do you use a different train-test split (with the same model) just when estimating SAGE values? And also, is the estimator running to convergence so that you get pretty narrow confidence intervals?
Assuming that the SAGE values are known with high confidence (narrow confidence intervals), here's what I think you can do.
If it's the first situation, then it may mean that your model depends quite a bit on the train-test split. Ideally that wouldn't happen, especially if there's enough data, but averaging the SAGE values is a reasonable approach. (For the confidence intervals, I would calculate the standard deviations by taking the square root of the average variance.)
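The averaging described above can be sketched as follows. This is a minimal illustration with made-up per-fold numbers: it assumes you have collected, for each cross-validation fold, the SAGE value estimates and their standard deviations as arrays of shape `(n_folds, n_features)` (the array names and values here are hypothetical, not part of the sage API).

```python
import numpy as np

# Hypothetical per-fold results: rows are CV folds, columns are features.
# Each fold yields SAGE value estimates (means) and their standard deviations.
fold_means = np.array([
    [0.10, 0.05, 0.02],
    [0.12, 0.04, 0.03],
    [0.08, 0.06, 0.01],
])
fold_stds = np.array([
    [0.010, 0.008, 0.005],
    [0.012, 0.007, 0.006],
    [0.009, 0.009, 0.004],
])

# Average the SAGE values across folds.
avg_means = fold_means.mean(axis=0)

# Combine the uncertainties by taking the square root of the
# average variance, as suggested above.
avg_stds = np.sqrt((fold_stds ** 2).mean(axis=0))
```

Note that averaging the variances (rather than the standard deviations directly) is the statistically consistent way to pool per-fold uncertainty estimates.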
If it's the second situation, then I would put more trust in the SAGE values that are calculated using data that was not touched during training (the test data), because the loss values (and therefore the SAGE values) may be artificially changed by overfitting to the train set.
Hello, thank you for the answer! It is the first situation. Maybe there is not enough data. I will average the values and calculate confidence intervals as you suggest. Thanks!