
Strange bug in PLS algorithm #88

Closed
svkucheryavski opened this issue Jun 9, 2020 · 3 comments
svkucheryavski commented Jun 9, 2020

This code:

data(people)
set.seed(6)
people <- as.data.frame(people)
X <- people[, -4]
y <- people[,  4, drop = FALSE]
m <- pls(X, y, cv = 8)
m <- pls(X, y, cv = 8)
m <- pls(X, y, cv = 8)
m <- pls(X, y, cv = 8)
m <- pls(X, y, cv = 8)

This leads to the following error (only when the last command is executed):

Error in solve.default(crossprod(object$xloadings, object$weights)) : 
  system is computationally singular: reciprocal condition number = 2.74318e-32

The error occurs in predict.cal(), inside the cross-validation loop (when m.loc is used):

xscores <- x %*% (object$weights %*% solve(crossprod(object$xloadings, object$weights)))
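A minimal, hypothetical illustration (not the package's actual code) of why this call fails: if the loadings and weights are rank deficient, their crossproduct is singular and solve() aborts with exactly this kind of error:

```r
# Hypothetical illustration: solve() fails when the crossproduct of
# rank-deficient loadings and weights is singular.
P <- cbind(c(1, 2, 0), c(2, 4, 0), c(3, 6, 1))  # column 2 = 2 * column 1, rank 2
W <- P
A <- crossprod(P, W)  # t(P) %*% W is singular because P is rank deficient
res <- tryCatch(solve(A), error = function(e) conditionMessage(e))
# res holds an error message about a singular system instead of the inverse
```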
@svkucheryavski
Owner Author

It seems to be caused by too-small eigenvalues in the SIMPLS algorithm when cross-validation is applied, so the number of observations is smaller than for the calibration set. I will fix it by adding a check inside the cross-validation loop: if the number of components is too large, it will raise an error and ask the user to limit the number.

@klebyn
klebyn commented Jun 10, 2020

Perhaps the kappa function can help you by measuring how poorly conditioned the matrix is.
?kappa
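For example (a sketch using base R only), kappa() can flag a nearly singular crossproduct before solve() is attempted; the guard function below is a hypothetical helper, not part of any package:

```r
# Sketch: estimate the condition number with kappa() before calling solve().
set.seed(1)
A <- crossprod(matrix(rnorm(30), 10, 3))  # full rank, well conditioned
B <- A
B[, 3] <- B[, 1] + B[, 2]                 # make columns exactly collinear

kappa(A)  # moderate value: solve(A) is safe
kappa(B)  # enormous value: solve(B) would be unreliable or fail

# Hypothetical guard that could be applied before inverting:
is_invertible <- function(M, tol = 1e12) kappa(M) < tol
```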

@svkucheryavski
Owner Author

Thanks for the suggestion. Actually the problem was not in the line I mentioned in the first comment but a bit deeper, in the SIMPLS algorithm. When the maximum number of components, estimated or provided by the user, is too large (within the limits set by the number of observations and variables, but larger than the effective rank), it of course causes computational issues in the algorithm. There is already a corresponding check inside the algorithm; if this happens, the algorithm reduces the number of components and warns the user about it.

But if random cross-validation is used and some variables are discrete, one of the steps can produce a local calibration set in which all values of one or several such variables are constant. This further reduces the effective rank and leads to the error mentioned above. The situation is very unlikely, which is why all tests had passed until I started experimenting with some things.

I will simply add another check inside cross-validation and ask the user to limit the number of components in this case.
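A sketch of such a check (hypothetical helper name, not the actual mdatools code): detect constant columns in the local calibration subset, compute the effective rank of what remains, and stop with an informative error if the requested number of components exceeds it:

```r
# Hypothetical check for the cross-validation loop: constant columns in the
# local calibration subset reduce the effective rank, so the requested
# number of components must not exceed it.
check_ncomp_local <- function(X.loc, ncomp) {
  # columns whose values are (numerically) constant in this subset
  is_const <- apply(X.loc, 2, function(v) diff(range(v)) < .Machine$double.eps^0.5)
  # effective rank of the mean-centered subset without constant columns
  rank_eff <- qr(scale(X.loc[, !is_const, drop = FALSE], scale = FALSE))$rank
  if (ncomp > rank_eff) {
    stop("Number of components is too large for cross-validation, ",
         "limit it to ", rank_eff, " or fewer.", call. = FALSE)
  }
  invisible(TRUE)
}

# Example: ten observations, third variable constant in this subset
set.seed(42)
X.loc <- cbind(rnorm(10), rnorm(10), rep(1, 10))
check_ncomp_local(X.loc, 2)  # passes
# check_ncomp_local(X.loc, 3) would stop with the error above
```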

svkucheryavski added a commit that referenced this issue Jun 10, 2020