Hi! Thanks for the great package. I want to clarify a point of confusion I have before proceeding. I found the sample code you posted here and ran it locally. Quick reprex:
```r
library(xgboost)
library(fastshap)
library(SHAPforxgboost)  # to load the data

y_var <- "diffcwv"
dataX <- as.matrix(dataXY_df[, -..y_var])

# hyperparameter tuning results
param_list <- list(
  objective = "reg:squarederror",  # For regression
  eta = 0.02,
  max_depth = 10,
  gamma = 0.01,
  subsample = 0.95
)

mod <- xgboost(
  data = dataX, label = as.matrix(dataXY_df[[y_var]]),
  params = param_list, nrounds = 10, verbose = FALSE,
  nthread = parallel::detectCores() - 2, early_stopping_rounds = 8
)

# Grab SHAP values directly from XGBoost
shap <- predict(mod, newdata = dataX, predcontrib = TRUE)

# Compute Shapley values
shap2 <- explain(mod, X = dataX, exact = TRUE, adjust = TRUE)

# Compute bias term; difference between predictions and sum of SHAP values
pred <- predict(mod, newdata = dataX)
head(bias <- pred - rowSums(shap2))
#> [1] 0.4174776 0.4174775 0.4174775 0.4174775 0.4174775 0.4174776

# Compare to output from XGBoost
head(shap[, "BIAS"])
#> [1] 0.4174775 0.4174775 0.4174775 0.4174775 0.4174775 0.4174775

# Check that SHAP values sum to the difference between pred and mean(pred)
head(cbind(rowSums(shap2), pred - mean(pred)))
#>             [,1]        [,2]
#> [1,] -0.03048085 -0.03053582
#> [2,] -0.08669319 -0.08674819
#> [3,] -0.05410853 -0.05416352
#> [4,] -0.09465271 -0.09470773
#> [5,] -0.01655553 -0.01661054
#> [6,] -0.01729831 -0.01735327
```
In this code, the sum of the SHAP values is not equal to the difference between `pred` and `mean(pred)` as suggested. Instead, the sum is (nearly) equal to the `BIAS` term from the `stats::predict(object, X, predcontrib = TRUE, ...)` call in `explain.xgb.Booster` when `exact = TRUE`. So, quick questions:
Should `adjust = TRUE` have the same effect for `exact = TRUE` output as it does for `exact = FALSE` output? In the line above (`explain(mod, X = dataX, exact = TRUE, adjust = TRUE)`), `adjust = TRUE` has no effect. Is it simply passed on to the predict method of `xgb.Booster` and silently swallowed? Is this the intended behavior?
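For context, here is what I'd expect an adjustment to accomplish. One simple scheme (a sketch only, not necessarily fastshap's actual method; all numbers below are hypothetical) spreads the additivity residual evenly across the features so the adjusted values sum to `f(x) - baseline`:

```python
def adjust(phi, fx, baseline):
    """Spread the additivity residual evenly across features so that
    sum(adjusted) == fx - baseline holds exactly.
    (Illustrative scheme only; fastshap's adjustment may differ.)"""
    resid = (fx - baseline) - sum(phi)
    return [p + resid / len(phi) for p in phi]

phi = [0.4, -0.1, 0.2]                       # hypothetical approximate Shapley values
adj = adjust(phi, fx=1.0, baseline=0.3)
assert abs(sum(adj) - (1.0 - 0.3)) < 1e-12   # additivity now holds
```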
Can you briefly explain the difference between the baseline/bias term (produced by `predict(xgb.Booster, newdata = X, predcontrib = TRUE)` as the last matrix column) and `mean(pred)`? I scoured the xgboost/lightgbm docs but couldn't find much.
Hi @dfsnow, thanks for the note. Setting `adjust = TRUE` has no effect on the output when using `exact = TRUE`, since exact SHAP values are already supposed to be additive. I'm not sure why the SHAP values aren't additive here (I get the same issue when using XGBoost directly), so it may be better to ask on the XGBoost issues page. The bias column/term should be the average of all the training predictions (i.e., E(f(x))), which also corresponds to the difference between a particular prediction and the sum of its corresponding Shapley values.
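To make the additivity (efficiency) property concrete — exact Shapley values summing to f(x) - E[f(X)], with E[f(X)] playing the role of the bias term — here is a minimal pure-Python sketch for a toy two-feature model (the model, background data, and evaluation point are all hypothetical):

```python
import itertools
import math

# Toy model with an interaction term (hypothetical, for illustration only)
def f(x):
    return 2.0 * x[0] + 3.0 * x[1] + 0.5 * x[0] * x[1]

# Small "training" background sample
background = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]

def value(S, x):
    """v(S): expected prediction with features in S fixed to x and the
    rest drawn from the background data (interventional expectation)."""
    total = 0.0
    for b in background:
        z = [x[i] if i in S else b[i] for i in range(len(x))]
        total += f(z)
    return total / len(background)

def exact_shapley(x):
    """Exact Shapley values via the classic subset-weighted formula."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            for S in itertools.combinations(others, r):
                w = (math.factorial(len(S)) * math.factorial(n - len(S) - 1)
                     / math.factorial(n))
                phi[i] += w * (value(set(S) | {i}, x) - value(set(S), x))
    return phi

x = (2.0, 2.0)
phi = exact_shapley(x)
baseline = value(set(), x)  # E[f(X)] over the background = the "bias" term
# Efficiency: Shapley values sum to f(x) minus the baseline
assert abs(sum(phi) - (f(x) - baseline)) < 1e-9
```

The final assertion is the property at issue in this thread: when the values are exact, the sum of per-feature contributions plus the bias term recovers the prediction, so no adjustment is needed.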