sampleMcmc() error in chol.default(W) #8
Comments
@keilast thanks for posting this here instead of the personal email.
I suspect the error you encounter is simply due to the numerical precision of the spatial covariance matrix. Since you have both very close and quite distant locations (I am guessing this from the fact that you have a Region random effect), the covariance matrix for your spatial random effect is going to be close to singular, which makes the Cholesky decomposition fail.
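To illustrate the near-singularity argument, here is a minimal base R sketch. It assumes an exponential covariance `exp(-d / alpha)`; the helper `cov_exp`, the value of `alpha` and the coordinates are all made up for illustration and are not taken from the model in this issue. It shows the condition number rising sharply when two sites nearly coincide, and `chol()` stopping with an error when two sites are identical.

```r
# Illustrative only: exponential covariance on made-up points.
# `cov_exp`, `alpha` and the coordinates are hypothetical.
cov_exp <- function(xy, alpha) exp(-as.matrix(dist(xy)) / alpha)

far   <- rbind(c(0, 0), c(0, 1),    c(5, 5))  # well-separated sites
close <- rbind(c(0, 0), c(0, 1e-6), c(5, 5))  # two sites almost on top of each other
dup   <- rbind(c(0, 0), c(0, 0),    c(5, 5))  # two identical sites

# The condition number grows sharply as two sites approach each other
kappa_far   <- kappa(cov_exp(far,   alpha = 10), exact = TRUE)
kappa_close <- kappa(cov_exp(close, alpha = 10), exact = TRUE)
print(c(far = kappa_far, close = kappa_close))

# With exact duplicates the matrix is singular and chol() stops with an error
chol_dup <- try(chol(cov_exp(dup, alpha = 10)), silent = TRUE)
print(inherits(chol_dup, "try-error"))
```

Nearly coincident sites do not necessarily break `chol()` by themselves, but they leave much less numerical headroom once the matrix gets large.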
Yes, I suspected the same re: the precision of the spatial points. I have considered converting the coordinates to UTM; however, they span two UTM zones, and I am not sure this would do anything to fix the problem. Yes, the code throws the error instantaneously, before any "Computing chain" outputs. Removing sites so that the minimum cutoff distance is >10% of the maximum distance does not fix the problem either, yet the reduced-site model fits fine when I leave out the spatial rL. Thank you!
@keilast, so now I am almost 100% sure that your problem happens in the internal function Given your description of the spatial data you input to the model fitting, there should be no problem, yet you still encounter it. Thus, I suggest that you execute that part of the code outside of the package, in a separate script, which would let you check explicitly what goes wrong and why. So, just copy-paste lines 47-75 from that file and run them after you have defined your model as you did earlier, but save it to If you get the same error message, check the content of variables
Hello, I have done as you recommended, and the same error came up. W is a square matrix of mostly NAs, except for the diagonal elements, which equal 1. Since you mentioned the coordinates may not be read properly, I saved them as a matrix (previously a data frame) before passing it as sData in rL4, and ran the exact same code. Now this error comes up:
, but I checked that hM$rL[[4]]$s gives the correct spatial coordinates. I have tried transforming the coordinate data (i.e. multiplying by 10 000), but that did not change the outcome. Thanks
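One plausible way to end up with a W that is 1 on the diagonal and NA everywhere else can be reproduced in base R. This is only a sketch under assumptions: that coordinates are looked up by row name using the unit labels, and that the kernel is exponential. The coordinates, the labels in `units` and the range value are invented.

```r
# Hypothetical coordinates in a data frame whose row names ("1","2","3")
# do NOT match the unit labels used elsewhere
xy <- data.frame(lat = c(60.1, 60.2, 61.0), long = c(24.9, 25.0, 26.0))
units <- c("siteA", "siteB", "siteC")  # labels as they might appear in studyDesign

# Data-frame lookup by unknown row names silently returns rows of NA
s <- xy[units, ]
print(s)

# Assumed exponential kernel with range 1: zero distance on the diagonal gives 1,
# NA distances everywhere else give NA
W <- exp(-as.matrix(dist(s)) / 1)
print(W)

# The same lookup on a matrix fails loudly instead of returning NA rows
matrix_lookup_fails <- inherits(try(as.matrix(xy)[units, ], silent = TRUE), "try-error")
print(matrix_lookup_fails)
```

If this is what happens, the coordinates stored in the random level can look perfectly correct while the lookup by unit label still returns nothing useful.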
Hello,
This clearly indicates that there is some issue with the input of the coordinates. I believe that the reason is due to reading-in some non we almost figured out the issue.
Are you sure that your matrix consists of numbers and not strings, and that its rownames are the same values as you plug to the Thanks for co-operation!
Hello, sorry for the delayed response on this! A colleague tinkered with my data and the function ran successfully:
Cheers!
Dear @gtikhonov, I am having a similar problem when running sampleMcmc on a model with 2 random effects: one is the spatial coordinates and the other is a regional categorical variable. Since I have >1000 community samples, the spatial method is set to GPP, using about 117 knots. It is a hurdle model, so I first fit the presence/absence data and then species abundances (individuals ha⁻¹, from 0.001 to ~2000) and relative densities (0-1 values). I log-transformed abundances and relative densities to use distr = "normal". All my coordinates are in meters, and I have jittered coordinates that were too close to each other to avoid problems related to the precision of the spatial coordinates. However, I still have distances between units that vary from 5 meters to 3e+06 meters. Anyway, I am not sure whether this spatial precision is indeed an issue when using GPP. Notably, the error only appeared when I increased the thin x samples combination, and only for the abundance part of the model. The error appears towards the end of chain 1, sometimes of chain 2: I ran the tests you suggested for keilast on the computeDataParameters part of the sampleMcmc function. Here are the results: In conclusion, I think the problem is probably the small distance between some of my sites. But there may be an interaction between these distances and the data type (the problem does not arise with presence/absence data, only with abundances). Any clues? You mentioned that you had more detailed advice on how to handle the whole data. I have spent the entire morning trying to solve this issue, so any help would be very welcome. Thanks in advance,
Hi @LimaRAF, Generally, I would recommend reconsidering how you include the spatial effect in the model. However, how you do that depends on what you want to capture with the spatial effect. One option is, instead of having a spatial effect at the level of samples, to have it at the level of "clusters of samples", plus an extra non-spatial effect at the level of samples. The other option is to have a single spatial latent effect, but restrict its spatial range to something fairly small (although GPP is a poor choice for this one). The first option aims to capture long-range spatial correlation, the second short-range. You can have both as well. But the key problem is that doing both in the straightforward way causes numerical instabilities, hence the idea of separating it into two levels.
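A rough sketch of the first option (spatial effect on clusters, non-spatial effect on samples) could look like the following. The coordinates, the 1 km threshold and the use of `hclust()`/`cutree()` to form clusters are assumptions made purely for illustration; any sensible grouping would do. The `Hmsc::HmscRandomLevel()` calls are guarded so that the clustering part runs even without the package installed.

```r
set.seed(1)
# Made-up coordinates in metres: 3 tight groups of 4 samples each
centres <- rbind(c(0, 0), c(5e4, 0), c(0, 8e4))
xy <- centres[rep(1:3, each = 4), ] + matrix(runif(24, -10, 10), ncol = 2)
rownames(xy) <- sprintf("sample_%02d", 1:12)
colnames(xy) <- c("x", "y")

# Assumption: samples closer than ~1 km belong to the same cluster
cluster_id <- cutree(hclust(dist(xy)), h = 1000)
cluster <- factor(sprintf("cluster_%02d", cluster_id))

# One coordinate pair per cluster (centroid); row names are the cluster labels
xy_cluster <- rowsum(xy, cluster) / as.vector(table(cluster))

studyDesign <- data.frame(sample = factor(rownames(xy)), cluster = cluster)

if (requireNamespace("Hmsc", quietly = TRUE)) {
  # Spatial effect between clusters, non-spatial effect per sample
  rL_cluster <- Hmsc::HmscRandomLevel(sData = xy_cluster)
  rL_sample  <- Hmsc::HmscRandomLevel(units = levels(studyDesign$sample))
}
```

With this layout the smallest distance the spatial level ever sees is the distance between cluster centroids, not between neighbouring samples.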
Dear @gtikhonov, Thanks for the quick reply, and sorry for posting in a different issue (do you want me to post it separately for the record?). However, I am not sure I fully understood your suggestion. Are you proposing to nest my samples in my regional categorical variable? The region has a big impact on species distribution, but setting it as a fixed effect led to predictions in parts of the regions where the species does not actually occur, or created very sharp differences in the predictions at the transitions/contacts between these regions. So the solution was to use it as an extra random effect, hoping to still capture the effect without creating such 'weird' predictions. Currently, I am defining my study design and random effects like this: My Sdata: Defining the random effects and setting the spatial model: Are you suggesting to do: and then drop rL.region from the 'ranLevels' of my model? Thanks again,
I have opened and will reply in a separate issue.
For people reading this in the future: I had the same issue as keilast. I was able to fix this error by making sure that the rownames of the latlong matrix matched the spatial plots/routes/whatever the coordinates are supposed to mark in the studyDesign data frame. My code looks like this:
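As a minimal sketch of that kind of fix (not the commenter's original code), the checks below use hypothetical names: a `route` column in `studyDesign` and made-up coordinates in `latlong`. The point is that the coordinate matrix is numeric, has one row per unit, and carries row names equal to the unit labels.

```r
# Hypothetical study design: two observations per route
studyDesign <- data.frame(route = factor(c("r1", "r1", "r2", "r2", "r3", "r3")))

# One row of coordinates per route, as a numeric matrix
latlong <- cbind(lat  = c(45.1234, 45.1236, 46.0000),
                 long = c(-75.1000, -75.1002, -74.5000))
rownames(latlong) <- c("r1", "r2", "r3")  # row names = route labels

# Checks worth running before building the random level
stopifnot(is.numeric(latlong), !anyNA(latlong))                    # numbers, not strings
stopifnot(setequal(rownames(latlong), levels(studyDesign$route)))  # labels match
stopifnot(!anyDuplicated(latlong))                                 # no identical coordinates

if (requireNamespace("Hmsc", quietly = TRUE)) {
  rL_route <- Hmsc::HmscRandomLevel(sData = latlong)
}
```

If any of the `stopifnot()` lines fails, that is worth fixing before calling sampleMcmc().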
I am trying to fit a model with 4 random effects: one is the spatial coordinates, specified with the sData argument, and three are hierarchical (Quadrat, Site, and Region). The input for the spatial random effect object (sData) is a matrix with two columns (lat and long) and a row for each observation. The spatial coordinates are in conventional lat and long to four decimal places, as some points are as close as 15 m apart. However, each row has a distinct set of coordinates; I have tested this. The object "Pi" below describes my three hierarchical random effects.
When I use the sampleMcmc() function I get the error:
Error in chol.default(W) : the leading minor of order 2 is not positive definite
I do not see this error when I attempt to fit the exact same model but with no spatial random effect (rL4 not included). I have also tried swapping the order of the rL's when specifying my Hmsc object, but this doesn't change anything.