Dear @msalibian, @kollerma and @mmaechler,
I’ve got a question about the lmrob.S function in robustbase. Perhaps it relates to the goals of this repo, so I’ll take a chance on raising it here.
S-estimates of scale should give a “loss.S” value (to use terms of fasts.R in this repo) equal to b:
$$\frac{1}{n} \sum_1^n \tilde{\rho}(r_i/s) = b,$$
where $\{ r_i : i \}$ are the residuals of the S-fit. Indeed, the S-estimate of scale is defined to be the smallest value of $s$ for which the mean at left is no larger than $b$. However, this appears not to be the case for the S-estimates generated in the examples of robustbase’s lmrob.
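To make the definition concrete, here is a minimal base-R sketch of solving the scale equation by root finding. It is not the algorithm lmrob.S uses; `rho_bisq` and `m_scale` are hypothetical helper names, the bisquare $\tilde{\rho}$ is scaled to a maximum of 1, and the tuning constant 1.54764 with $b = 0.5$ is assumed. The residuals are simulated, not taken from a fit.

```r
# Tukey bisquare rho, scaled so that its maximum is 1
# (hypothetical helper, written out by hand rather than calling Mchi)
rho_bisq <- function(u, cc = 1.54764) {
  t <- pmin(abs(u) / cc, 1)
  1 - (1 - t^2)^3
}

# Solve mean(rho(r / s)) = b for s; the left side decreases in s,
# so a wide bracket around the residual magnitudes contains the root
m_scale <- function(r, b = 0.5, cc = 1.54764) {
  f <- function(s) mean(rho_bisq(r / s, cc)) - b
  top <- max(abs(r))
  uniroot(f, lower = 1e-6 * top, upper = 1e3 * top, tol = 1e-10)$root
}

set.seed(42)
r <- rnorm(50)          # simulated stand-in for residuals
s <- m_scale(r)
mean(rho_bisq(r / s))   # the mean loss at the solved scale
```

Under this definition the mean loss at the returned `s` matches `b` up to the root-finder tolerance, which is the behaviour I expected from lmrob.S.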
I take robustbase’s bb to correspond to the $b$ in my equation above, and its Mchi to my $\tilde{\rho}$. So in the examples below, the loss.S value looks too small:
Correspondingly, there are smaller values of $s$ satisfying $\frac{1}{n} \sum_1^n \tilde{\rho}(r_i/s) \leq b$.
Looking at detailed feedback from the fit by increasing control$trace.lev to 2 (not shown), the difference between scale * .7 and scale does seem large by lmrob.S’s standards for this problem.
I see something similar with lmrob’s other data example, if a bit less extreme.
> m2 <- lmrob(Y ~ ., data=coleman, setting="KS2011") # to evolve the seed as in the examples
> RlmST <- lmrob(log.light ~ log.Te, data=starsCYG)
> RlmST$init.S$control$bb
[1] 0.5
> mean(with(RlmST$init.S, Mchi(residuals/scale, cc=control$tuning.chi, psi=control$psi)))
[1] 0.4787234
> mean(with(RlmST$init.S, Mchi(residuals/(scale*.97), cc=control$tuning.chi, psi=control$psi)))
[1] 0.4917663
These calculations were done with robustbase version 0.92.5, R version 3.2.3.
Was I wrong to think $s$ should solve loss.S = b? Perhaps the solutions are only approximate, with the quality of the solution increasing with sample size? -Ben
@benthestatistician @mmaechler @kollerma I will check this more carefully later today, but this could be related to the fact that the S-estimator in lmrob doesn't solve mean( loss ) = b, but rather sum( loss )/(n-p) = b, where n is the sample size and p is the number of explanatory variables (including the intercept if present).
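If that is the normalization in use, it would account for the numbers in the starsCYG example. A quick check, assuming $n = 47$ (the number of rows in starsCYG) and $p = 2$ (intercept plus log.Te):

$$\frac{1}{n} \sum_1^n \tilde{\rho}(r_i/s) = \frac{n-p}{n}\, b = \frac{45}{47} \times 0.5 \approx 0.4787234,$$

which agrees with the mean loss reported above to the digits shown.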
I think the robustbase function .vcov.avar1 has the same expectation I did, that one should have mean( loss ) == b, at this line:
u4 <- mean(w0^2 - bb^2) * tcrossprod(a)
Here the vector w0 encodes the loss. The mean part of this calculation aims to recover an empirical variance of that vector, but only does so if mean(w0) == bb, which isn't the case (as seen on this thread).
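A small base-R sketch of why this matters, using made-up numbers: `w0` below is a simulated stand-in for the loss vector, not output from .vcov.avar1, and the sample size 47 is chosen only to echo the starsCYG example. The shortcut `mean(w0^2 - bb^2)` differs from the empirical variance by exactly `mean(w0)^2 - bb^2`, so the two agree only when `mean(w0) == bb`.

```r
set.seed(1)
w0 <- runif(47)   # simulated losses in [0, 1], for illustration only
bb <- 0.5

emp_var  <- mean((w0 - mean(w0))^2)   # empirical variance (divisor n)
shortcut <- mean(w0^2 - bb^2)         # the form used in the quoted line

# The gap between the two is mean(w0)^2 - bb^2
gap <- shortcut - emp_var
all.equal(gap, mean(w0)^2 - bb^2)
```

With the (n-p) normalization discussed above, `mean(w0)` would sit slightly below `bb`, so the shortcut would understate the empirical variance by that gap.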
Perhaps you or @mmaechler could refer this further as appropriate? I'd also be happy to reach out to whoever's in the best position to act on this, but in that case I'd appreciate your advice on who that should be.