Something off with coefficients? #9
Comments
|
Yes, actually there is a pretty cool reason why you do not want to have beta*value as separate contributions (see below). In the [...] This is to make contributions resistant to shifting of the X variables. It is easy to get such individual contributions.
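To make the shift-resistance argument concrete, here is a minimal sketch in R. It assumes the centered contributions correspond to what `predict(..., type = "terms")` returns for an `lm` fit, and it uses the built-in `mtcars` data purely for illustration (neither the dataset nor the variable names come from this thread).

```r
# Illustrative model on built-in data (an assumption; not the data from this thread)
fit <- lm(mpg ~ wt + hp, data = mtcars)
obs <- mtcars[1, ]

# Centered contributions: beta_j * (x_j - mean(x_j)), plus a constant baseline
centered <- predict(fit, newdata = obs, type = "terms")
baseline <- attr(centered, "constant")

# Raw contributions: beta_j * x_j, plus the model intercept
raw <- coef(fit)[c("wt", "hp")] * unlist(obs[c("wt", "hp")])

# Both decompositions add up to the same prediction
pred <- unname(predict(fit, newdata = obs))
sum_centered <- baseline + sum(centered)
sum_raw <- unname(coef(fit)["(Intercept)"] + sum(raw))

# Shift wt by 100 units and refit: the prediction itself does not change
shifted <- transform(mtcars, wt = wt + 100)
fit_s <- lm(mpg ~ wt + hp, data = shifted)
centered_s <- predict(fit_s, newdata = shifted[1, ], type = "terms")
raw_s <- coef(fit_s)[c("wt", "hp")] * unlist(shifted[1, c("wt", "hp")])

# Compare the contributions before and after the shift
print(all.equal(as.numeric(centered), as.numeric(centered_s)))
print(isTRUE(all.equal(unname(raw), unname(raw_s))))
```

The first comparison should come out equal and the second should not: adding a constant to `wt` moves the intercept and the raw `beta*value` term for `wt`, while the centered terms stay where they were.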
|
|
Thank you for the explanation! May I suggest giving the user the option to use the centered or regular x values, as well as providing some explanation in the documentation. This is a great chart, but it is confusing without any explanation of the use of type = "terms".
|
Yes, some documentation is required. Winter semester has just ended so I will have some time to work on it.
|
Dear @pbiecek, following @alathrop, it would be great to have an option to show the direct application of the different terms rather than the centered values. I completely understand your point of view, but in other contexts such a plot would be relevant, e.g. for pedagogic purposes. When teaching, I often need to explain to my students how a single prediction is obtained from a model, in particular when explaining how to interpret interactions. Thanks for this package.
|
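Maybe some code could be helpful. I have tried the following.
betas <- function(object, newdata) {
  # Build the model matrix for newdata from the model's terms (response dropped)
  tt <- terms(object)
  Terms <- delete.response(tt)
  mm <- model.matrix(Terms, newdata)
  ass <- attr(mm, "assign")          # maps each column to its term (0 = intercept)
  tl <- attr(Terms, "term.labels")
  co <- coef(object)
  # Multiply column j of mm by co[j]; plain co * mm would recycle co
  # down the rows and is only correct for a single-row newdata
  pred <- sweep(mm, 2, co, "*")
  ret <- matrix(NA_real_, nrow = nrow(mm), ncol = length(tl))
  colnames(ret) <- tl
  rownames(ret) <- rownames(mm)
  # Sum the columns belonging to each term (e.g. all dummies of one factor)
  for (i in seq_along(tl)) {
    ret[, i] <- rowSums(pred[, ass == i, drop = FALSE], na.rm = TRUE)
  }
  # Intercept contribution, stored like predict(type = "terms") stores its constant
  attr(ret, "constant") <- rowSums(pred[, ass == 0, drop = FALSE], na.rm = TRUE)
  ret
}
At the beginning of [...] Would you consider adding such options?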
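A small check of the per-term arithmetic may be useful here. R's `*` recycles a vector down the rows of a matrix, so scaling each model-matrix column by its coefficient needs something like `sweep()` unless `newdata` has a single row. The sketch below uses the built-in `mtcars` data and a three-row `newdata`, both chosen only for illustration.

```r
# Illustrative fit on built-in data (an assumption; not the data from this thread)
fit <- lm(mpg ~ wt + hp, data = mtcars)
newdata <- mtcars[1:3, ]

mm <- model.matrix(delete.response(terms(fit)), newdata)
co <- coef(fit)

# Column-wise scaling: column j of mm is multiplied by co[j]
by_column <- sweep(mm, 2, co, "*")

# Plain `*` recycles co in column-major order, which misaligns the
# coefficients as soon as newdata has more than one row
recycled <- co * mm

# Compare the row sums of each version with the model's predictions
pred <- unname(predict(fit, newdata))
ok_sweep <- isTRUE(all.equal(unname(rowSums(by_column)), pred))
ok_recycled <- isTRUE(all.equal(unname(rowSums(recycled)), pred))
print(c(sweep = ok_sweep, recycled = ok_recycled))
```

Only the column-wise version should reproduce `predict()`; with a one-row `newdata` the two versions coincide.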
|
I have prepared a pull request, just in case |
|
Thanks, merged. |
|
thanks |
Hi, I don't understand how the broken function calculates the coefficients (or is something off?).
This is my test result from the lm function:
Call:
lm(formula = TotalCharges ~ ., data = data_in_test)
Residuals:
     Min       1Q   Median       3Q      Max
-1943.33  -453.71   -94.64   490.26  1887.26
Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)    -2162.4583    21.9717 -98.420  < 2e-16 ***
MonthlyCharges    36.1234     0.3080 117.301  < 2e-16 ***
tenure            65.3606     0.3683 177.476  < 2e-16 ***
SeniorCitizen    -86.7050    24.3449  -3.562 0.000371 ***
Test user:
-2162.4583 + (data_in_test[analysed_user,]$MonthlyCharges * 36.1234) +
data_in_test[analysed_user,]$tenure * 65.3606 +
data_in_test[analysed_user,]$SeniorCitizen * (-86.7050)
[1] 721.2045
While you get: (you can see that the intercept is different)
Surely one would expect the contributions in a waterfall plot to be simply Y = intercept + beta*value + ... etc., taken from the summary output?
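One way to see why the intercept in the plot need not match the summary output: if the contributions are centered at the training means $\bar{x}_j$, as `predict(..., type = "terms")` does for an `lm` fit (an assumption based on the replies in this thread, not a statement about the package's exact implementation), the same prediction is simply regrouped:

```latex
\hat{y}
  = \beta_0 + \sum_j \beta_j x_j
  = \underbrace{\Bigl(\beta_0 + \sum_j \beta_j \bar{x}_j\Bigr)}_{\text{baseline}}
  + \sum_j \underbrace{\beta_j \,(x_j - \bar{x}_j)}_{\text{contribution of variable } j}
```

Under that assumption the baseline is the prediction at the average observation rather than $\beta_0$, so it differs from the (Intercept) row of the summary whenever the weighted means $\sum_j \beta_j \bar{x}_j$ are nonzero, while the total still equals intercept + beta*value.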