non-positive definite Hessian #240

BertvanderVeen · 2025-12-22T09:45:43Z

BertvanderVeen
Dec 22, 2025
Collaborator

Thanks for the insight @BertvanderVeen! I hadn't thought to look at the Hessian (and before this didn't know what to look for), but I think you were right, as I was seeing both negative and positive values. I was indeed setting Power = NULL already.

Very briefly, I'm modelling diet (biomass) in salmon collected over multiple seasons and years in a large sound (lots and lots of zero biomass and small values for different taxa, then some large values when a salmon eats something big like a fish). So I have biological variables (condition, body size), temporal variables (day of year, year), and spatial variables (I've ended up using PCNM values as predictors rather than an autocorrelation structure on the row effect of collection site, as DHARMa scaled residual plots/tests and AIC both seem to think PCNM does a better job). I was avoiding interactions because I got to a place where DHARMa seemed OK and I didn't want to further complicate matters. However, trying emmeans made me realize that my 'best' model likely did have a non-positive-definite Hessian, as you say. Because I know biologically that the salmon were using different areas (spatially) depending on body size, I have just tried putting a body size interaction on all my PCNM values, and now I can get emmeans with confidence intervals. DHARMa still seems about as OK as it is likely to get for 15 taxa, and AIC is lower than for the no-interaction model.

So, probably still more exploring to make sure I'm not bungling up anything else, but thanks a ton for pointing me in a productive direction again!

Originally posted by @GCov in #159 (reply in thread)

BertvanderVeen · 2025-12-22T09:47:30Z

BertvanderVeen
Dec 22, 2025
Collaborator Author

Thanks. You'll get lower AIC when you include extra fixed effects compared to a random effects autocorrelation structure, because that is usually how it works - random effects add a penalty to the likelihood whereas fixed-effects do not. As such, comparing models with random vs. fixed effects is challenging. In fact, you shouldn't do it; comparing a model with and without random effects constitutes a test on the boundary of the valid parameter space, which is where AIC breaks down (i.e., is invalid to use).

Residual diagnostics (as in DHARMa, or the native residuals included in gllvm) do not measure goodness-of-fit, only validity of assumptions.

Usually, if a model is finicky it is a sample size problem. How many non-zero observations do you have for each species, and how many effects in the model?
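The question above can be answered with a couple of lines of base R. This is a minimal sketch on a made-up matrix: `Y` and the taxon names are hypothetical, and with real data `Y` would be the response matrix passed to gllvm (Yvars later in this thread).

```r
# Toy biomass matrix: rows are fish, columns are taxa (hypothetical values)
Y <- matrix(
  c(0,   0,   1.2,
    0,   3.4, 0,
    0,   0,   0.5,
    2.1, 0,   0,
    0,   0,   0.8),
  nrow = 5, byrow = TRUE,
  dimnames = list(NULL, c("taxonA", "taxonB", "taxonC"))
)

# Number and percentage of non-zero observations per taxon
n_nonzero <- colSums(Y > 0)
pct_nonzero <- 100 * colMeans(Y > 0)

print(n_nonzero)
print(pct_nonzero)
```

Setting the per-taxon counts next to the number of coefficients estimated for each taxon gives a rough sense of which taxa may be data deficient.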

0 replies

GCov · 2025-12-22T17:42:02Z

GCov
Dec 22, 2025

Thanks Bert!

That makes sense about the random vs. fixed effects and AIC. I do understand that DHARMa checks validity of assumptions, but if fixed effects are capturing variation better than a random effects autocorrelation structure and thus showing that assumptions are better met according to DHARMa, that should suggest preferring the fixed effect structure, shouldn't it? I took it to mean that there were complex spatial effects beyond just closer sites being more correlated with one another. So fixed spatial effects with multiple dimensions might be better at capturing spatial gradients. I tend to think about things biologically first (what makes sense), then use DHARMa to check that the structure I've chosen is appropriately capturing variation. If not, I adjust the model structure. This is what I've picked up after reading various statistics texts and online, but correct me if this sounds flawed! I will stop using AIC for random effects though; do you have a suggestion for a better metric? Or does looking at DHARMa still make sense?

You're right that it could be a sample size problem. I have 575 observations from salmon of consumed biomass in their stomachs across 15 taxa. They were sampled across 13 sites, which I'm including as a random effect, as well as including 3 PCNM values as fixed effects to account for spatial effects. They were also sampled across three years and at different times ranging from May to September, so I'm including year * day of year as fixed effects. I'm including fork length and condition factor as fixed effects, as well as whether the salmon are presumed to be hatchery or wild fish (from various tags/metrics), so two levels, and genetic stock (i.e. where they spawn), which has 5 levels. I standardized all continuous predictors. There are a lot of zeros in the biomass data, mostly over 50% for all taxa. Here's a plot of percent non-zeros for each taxon.

I've noticed that the coefficient plots have been really sensitive to model structure, but that predictive plots don't tend to change as much. As you say, this may have to do with sample size, and with whether a coefficient's 95% CI happens to overlap zero or not. Because the model predictive plots and variance partitioning plots seemed more robust to changes in model structure, I have been leaning towards interpreting those in terms of which factors are important for explaining ingested biomass, especially when DHARMa suggests that my final model is at least (mostly) meeting assumptions. For example, hatchery origin and stock consistently show low variance partitioning, while spatial and temporal effects are high. Predictive plots also show greater fork length predicting greater consumption of other fish when averaging over other effects, even when the coefficient doesn't come out as 'significant', perhaps because fork length is highly inter-related with the spatial and temporal variables, which we know it is.

Overall, I'm not trying to forecast and the people asking for this analysis are mostly interested in what the salmon are eating when and whether hatchery fish or certain stocks have different dietary habits. I think I can give them conclusions on that despite an imperfect model, and it doesn't seem like I'll get a 'perfect model' given the data and complexity of the problem. But let me know if you think I can improve my approach somehow, and sorry for the long post!!

3 replies

BertvanderVeen Jan 16, 2026
Collaborator Author

This all sounds good, sorry for the slow reply! Residual diagnostics are always valid, unlike information criteria. Let me know if there's anything else I can do to help.

GCov Mar 3, 2026

Thanks Bert! Sorry, I have also been slow to reply. The new se.fit option in the predict function has been really useful for me, thank you for adding that. One kind of unexpected use has been the warning it gives about fixed or random effect covariance matrices not being semi positive-definite. This warning made me realize that the constrained ordination models I was trying were too complicated and led to me switching to an unconstrained model, which I think was the right decision and also led to better interpretability. I wonder if there is another way I should be checking for this issue, because I hadn't realized that it was a problem before. My usual diagnostics (DHARMa) don't tell me anything about this, and I was able to get emmeans (from your code) and predictions (before having the se.fit option) fine before. This has been a fun learning process for me, and if you have any tips about checking for this kind of issue as I go I'd much appreciate it!

BertvanderVeen Mar 3, 2026
Collaborator Author

It's hard to diagnose, and there are debates in the stats community whether model fits at a point with a singular Hessian are valid. Having said that, usually in gllvm it indicates that the Hessian is not numerically stable. That can be because the model is too complex, the model is not flexible enough, or there are data deficient species. Very often, the issue can be resolved by setting n.init and n.init.max to something ridiculously large and grabbing a coffee.

More specifically, the covariance matrix of the fixed effects (the inverse of the Hessian) is stored in the model object under object$Hess$cov.mat.mod. If you want to check more extensively what is going on, or which parameters are causing issues, you can try something like setNames(diag(object$Hess$cov.mat.mod), names(object$TMBfn$par[object$Hess$incl]))[diag(object$Hess$cov.mat.mod)<0] to check which parameters have (asymptotic) variances below zero. Depending on which parameters are causing problems, there are then different things you can do to adjust your model fit and resolve the issue. Since those steps are very case-specific, it is hard for me to be more specific here.

Switching from a constrained to an unconstrained ordination should not be necessary, but constrained ordination models are a bit finicky (especially with fixed-effects canonical coefficients, which is the default in gllvm), and may require a bit more troubleshooting to get things right. If you can share the code/model specifics for your case, I can try to give more in-depth advice.
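The diagonal check described in this reply can be tried out on a toy matrix before applying it to a fitted model. The sketch below is an illustration only: `V` and `par_names` are hypothetical stand-ins for object$Hess$cov.mat.mod and the parameter names, and the eigenvalue check is an addition that is not part of the reply. It relies on the fact that a symmetric matrix with any negative eigenvalue is not positive semi-definite.

```r
# Toy symmetric "covariance" matrix with one negative variance
# (hypothetical values; stands in for object$Hess$cov.mat.mod)
V <- matrix(
  c(0.50,  0.10, 0.00,
    0.10, -0.20, 0.05,
    0.00,  0.05, 0.30),
  nrow = 3, byrow = TRUE
)
par_names <- c("b_length", "b_condition", "b_year")  # hypothetical names

# Diagonal check: parameters with (asymptotic) variance below zero
vars <- setNames(diag(V), par_names)
print(vars[vars < 0])

# Eigenvalue check: a negative eigenvalue means V is not positive
# semi-definite, which can happen even if every diagonal entry is positive
ev <- eigen(V, symmetric = TRUE, only.values = TRUE)$values
print(min(ev) < 0)
```

The diagonal check points to specific parameters, while the eigenvalue check only says whether the matrix as a whole has a problem, so the two are probably best used together.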

GCov · 2026-03-03T18:10:26Z

GCov
Mar 3, 2026

Thank you!

For this example I'm looking at factors explaining salmon diet across a sound. After removing empty stomachs, there are 575 fish samples from 13 locations during May-September over three years. It's not a balanced sampling design and some 'sites' are very close to one another and farther from others, so I've converted latitude and longitude to four PCNM (principal coordinates of neighbour matrices) values. The salmon are genetically IDed to five stocks (based on where they spawn), although one of those groups is 'unknown'. I also have data on the fork length and condition index of the salmon and whether each fish is likely from a hatchery or a natural population. I'm interested in how spatial, temporal, and biological (size, condition, stock, hatchery) factors affect diet. Diet is mass in grams so I'm using a Tweedie distribution.

The variables in the model are:

length (fork length)
condition (condition index)
year
DOY (day of year standardized to be from 0-1)
stock
hatchery (yes or no)
PCNM1
PCNM2
PCNM3
PCNM4
location (collection site)

Comparing unconstrained and constrained models:

Unconstrained
UCmod <- gllvm(y = Yvars, X = Xvars, family = "tweedie", num.RR = 0, num.lv = 2, studyDesign = Ranef, formula = ~ stock + hatchery + length + condition + year * st_doy + PCNM1 + PCNM2 + PCNM3 + PCNM4, lvCor = ~ (1 | set_location), method = "EVA", sd.errors = TRUE, Power = NULL, control.start = list(n.init = 3), seed = 2452)

Constrained
Cmod <- gllvm(y = Yvars, X = Xvars, family = "tweedie", num.RR = 2, num.lv = 0, studyDesign = Ranef, lv.formula = ~ stock + hatchery + length + condition + year * st_doy + PCNM1 + PCNM2 + PCNM3 + PCNM4, row.eff = ~ (1 | set_location), method = "EVA", sd.errors = TRUE, Power = NULL, control.start = list(n.init = 3), seed = 2452)

UCmod gives me 4/15 taxa with mild DHARMa residual issues, predict(se.fit = TRUE) works. Also nice because I can make a species correlation plot.

Cmod gives me 6/15 taxa with DHARMa residual issues, predict(se.fit = TRUE) does not work and tells me that the fixed effect covariance matrix is not semi positive definite.

In the end, simulating from either model ends up telling a similar story about the impact of most variables (larger fish are more likely to eat bigger things as they grow later in the season, with some spatial differences in what they're eating). The major difference is that the unconstrained model suggests that stock explains 18% of variation, on average (from a variance partitioning plot), while the constrained model is closer to 1%. That said, running emmeans on the unconstrained model shows overlapping 95% CIs on the stock means. I'm trusting the unconstrained model here because DHARMa seems a bit better, and it's not throwing the error about singularity in the Hessian. A single-LV unconstrained model actually has a lower AICc, but I'm going with the 2-LV model because it's more interpretable (I can look at the degree of species covariance), with the caveat that it doesn't necessarily explain the data better.

What are your thoughts on this?

10 replies

GCov Mar 6, 2026

Running randomCoefplot gives me the error:
Error in 1:nr[re] : argument of length 0

summary is a bit helpful but hard to interpret with so many variables and levels. For the random effects LV predictors part, it shows the same CLV standard deviations and variances for all predictors. Is that correct?

plot(summary(model)) is also useful, although again I find interpretation difficult. For example, it suggests fork length is not significant; however, when I simulate from the model, there is clearly a strong increase in fish consumption with fork length (which makes biological sense).

BertvanderVeen Mar 6, 2026
Collaborator Author

Yes, that's correct.

I'm not sure what's going on with the coefplot, I'll check it out. If you're able to before I check it out, can you construct a reproducible example?

It's possible that your interval from predict was somewhat too narrow, I realised a mistake in the predict.gllvm calculation of the intervals that's now been corrected, although I need to have another look on Monday with fresh eyes.

GCov Mar 6, 2026

Hmm, I tried to generate an example but randomCoefplot worked on my example data and I'm not sure how to generate a more complicated example that might produce this error, sorry!

BertvanderVeen Mar 7, 2026
Collaborator Author

Alternatively, can you save and send me your model object?

GCov Mar 9, 2026

Ok! I just emailed you!
