Error "Error: no valid set of coefficients has been found: please supply starting values" for model that runs OK in nnet::multinom #28

Closed
tomwenseleers opened this issue Sep 2, 2022 · 2 comments

@tomwenseleers

I recently bumped into a model for which no valid set of coefficients could be found, even though the model runs OK in nnet::multinom:

library(nnet)
library(mclogit)
library(splines)
data(mtcars)
dat = mtcars
dat$cyl = factor(dat$cyl)
dat$am = factor(dat$am)

fit_multinom = multinom(cyl ~ ns(mpg, df=2) + am, data = dat) # runs OK

fit_mblogit = mblogit(cyl ~ ns(mpg, df=2) + am, data = dat, 
                       from.table=FALSE) 
# returns error "Error: no valid set of coefficients has been found: please supply starting values"

Is there actually any way to pass starting coefficients?

@melff (Owner) commented Oct 7, 2022

The warning about the missing starting values occurs after the 8th iteration, where it should not occur. This is a bug that I am able to fix. Another bug is that there is currently no way to provide starting values at all. I am going to fix both in the next few days.

The real issue here is that the algorithm diverges in this situation. I am able to tweak the algorithm so that it converges with the help of some step-size halving. However, this is a "false convergence", in so far as the algorithm does not stop at a true optimum of the objective function. The objective function does not seem to have an optimum at all: we have an instance of separation at hand, so regular ML estimates do not exist and the algorithm should not converge.
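
For illustration, a minimal sketch of the step-size halving idea (step_halving_update is a hypothetical helper for a generic Newton/IRLS update, not mclogit's actual code):

step_halving_update <- function(coef_old, step, deviance_fun, max_halvings = 10) {
  # try the full step first, then halve it until the deviance improves
  dev_old <- deviance_fun(coef_old)
  for (k in 0:max_halvings) {
    coef_new <- coef_old + step / 2^k    # full step, then 1/2, 1/4, ...
    if (deviance_fun(coef_new) < dev_old)
      return(coef_new)                   # improving step found
  }
  coef_old                               # no improvement: stop at boundary value
}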

If I inspect the results of multinom() with summary(fit_multinom), I find that the standard errors are huge in comparison to the estimates, which may be indicative of a flat log-likelihood function, as can occur with separation. To be specific, summary(fit_multinom) gives me:

Call:
multinom(formula = cyl ~ ns(mpg, df = 2) + am, data = dat)

Coefficients:
  (Intercept) ns(mpg, df = 2)1 ns(mpg, df = 2)2        am1
6   -2.045219        -21.44797        -77.41784  0.0177848
8   17.592329        -52.62519        -58.36619 -1.8778491

Std. Errors:
  (Intercept) ns(mpg, df = 2)1 ns(mpg, df = 2)2      am1
6    100.0897         148.5055         112.9144 2.048988
8    100.4552         148.8285         118.5161 3.381616
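
As a quick check (assuming the fit_multinom object from above), the |standard error / estimate| ratios can be computed directly; values far above 1 are consistent with a flat log-likelihood:

s <- summary(fit_multinom)
abs(s$standard.errors / s$coefficients)  # ratios >> 1 hint at separation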

With 7356b69, I get the following with your model and data after 27 iterations:

Iteration 27 - deviance = 10.1658 - criterion = 5.345477e-09
converged
Warning: algorithm stopped at boundary value
Warning: fitted probabilities numerically 0 occurred

and summary(fit_mblogit) gives:

Call:
mblogit(formula = cyl ~ ns(mpg, df = 2) + am, data = dat, maxit = 100)

Equation for 6 vs 4:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)           2.745  11024.528   0.000    1.000
ns(mpg, df = 2)1  -2476.730 249497.665  -0.010    0.992
ns(mpg, df = 2)2  -7753.087 780664.603  -0.010    0.992
am1                -108.745  83881.630  -0.001    0.999

Equation for 8 vs 4:
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)          19.31   11024.52   0.002    0.999
ns(mpg, df = 2)1  -2508.33  249497.66  -0.010    0.992
ns(mpg, df = 2)2  -7748.32  780664.60  -0.010    0.992
am1                -110.33   83881.63  -0.001    0.999

Null Deviance:     70.31 
Residual Deviance: 10.17 

The deviance that multinom() gives is 15.31, while the deviance that mblogit() gives is 10.17. Arguably mblogit() now fits the data more closely than multinom(), but it is not a "better" fit, because an MLE does not exist for this model and these data.
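
The two deviances can also be read off directly (assuming both fit objects store a $deviance component, which the default deviance() method returns):

deviance(fit_multinom)  # 15.31
deviance(fit_mblogit)   # 10.17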

@tomwenseleers (Author)

Many thanks for the info! So it's a case of complete separation. Do you think anything could be added to deal with such cases a bit better? E.g., allowing one to add a small ridge or adaptive ridge penalty (equivalent to adding a Gaussian prior) to make such models fit? I recall that with binomial GLMs one can add a ridge penalty by row-augmenting the model matrix with a diagonal matrix with sqrt(lambda) along the diagonal and augmenting the observations with some zeros (see the sketch after the links below); I presume the same is possible in the context of a multinomial GLM? I see a ridge penalty being used, e.g., in https://ieeexplore.ieee.org/abstract/document/1424458, aside from other penalties that promote sparsity (which is also cool). I haven't seen any R ports of that algorithm, though there are some in Matlab, C & .NET:
https://github.com/slavomir-sidor/princeton-mvpa-toolbox/blob/master/core/learn/smlr.m
https://github.com/slavomir-sidor/princeton-mvpa-toolbox/blob/master/core/learn/smlr_mex.c
https://github.com/inzxx/adastra/blob/master/src/Accord24/Accord.Statistics/Models/Regression/Nonlinear/Fitting/LowerBoundNewtonRaphson.cs
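
For what it's worth, a minimal sketch of the row-augmentation idea for a ridge-penalized logistic regression (ridge_logistic is a hypothetical helper, not part of mclogit or nnet): inside each IRLS iteration, sqrt(lambda) * I is appended to the weighted design and zeros to the working response, so each weighted least-squares step solves (X'WX + lambda*I) beta = X'Wz.

ridge_logistic <- function(X, y, lambda, maxit = 50, tol = 1e-8) {
  # X: model matrix; y: 0/1 response; lambda: ridge penalty (hypothetical helper)
  p <- ncol(X)
  beta <- rep(0, p)
  for (i in seq_len(maxit)) {
    eta <- drop(X %*% beta)
    mu  <- plogis(eta)
    w   <- pmax(mu * (1 - mu), 1e-10)   # IRLS weights (floored for stability)
    z   <- eta + (y - mu) / w           # working response
    X_aug <- rbind(sqrt(w) * X, sqrt(lambda) * diag(p))  # penalty rows
    z_aug <- c(sqrt(w) * z, rep(0, p))                   # zero pseudo-observations
    beta_new <- qr.solve(X_aug, z_aug)  # penalized weighted least-squares step
    if (max(abs(beta_new - beta)) < tol) break
    beta <- beta_new
  }
  beta
}
# e.g. with the mtcars data from above:
# ridge_logistic(model.matrix(~ mpg, data = dat), as.numeric(dat$am) - 1, lambda = 1)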

melff closed this as completed Oct 27, 2022