Random Error in Function Bsptime #8
I have tested this with the pause inside the foreach loop, but unfortunately it has no positive effect on the random error. So the problem must be somewhere else...
I have now traced the error back to the function "Bsptime/BspTimer_sptime/spTimer::spT.Gibbs". It seems that the underlying package 'spTimer' causes the error...
I extracted the problem function that causes the random error from the 'Bsptime/BspTimer_sptime' function and set it up to test with the same settings and data I used in my first post. I think this should help clarify the error.
I think the current behavior of the package 'spTimer' occurs with all models, not just the GPP model. I did not test that, though, because with my data the GPP model currently gives the best results. Here is a short summary of what I want to use your package for. I am trying to build habitat models for different plants. The uploaded example dataset for the plant 'Allium ursinum' shows what kind of data I have available. The dependent variable always lies in the range 0 <= y <= 1 (or, equivalently, 0% <= y <= 100% cover). The model I am currently using is the GPP model, as it has so far given the best results with a reasonable computation time. The goal is to find a regression model with which I can make spatial and/or temporal predictions. I am not sure whether the truncatedGPP model would be better for my kind of data. I did some tests with it, but the results were always worse than with the GPP model. Also, in the current implementation of the truncatedGPP model you can only define one boundary, not two. For these reasons I decided to use the GPP model for now. What do you advise me to do? Following your book and other habitat models I have already created, my further procedure for identifying the 'best' regression model is currently as follows:
For the implementation I use several nested loops in which I fit a variety of models with your package inside a tryCatch block, then compare them and select the best one afterwards. And exactly at this point, despite the tryCatch block, I always get a random error, so that the outermost loop ends with an error and the whole script terminates.
I think I am beginning to understand the problem. I think this is a classic convergence problem of the regression function that is not properly intercepted or handled. By that I mean that the regression model does not find a solution for the given data. I don't know the correct technical term for this in Bayesian statistics, as I am not well versed in this branch of statistics. My hypothesis is based on the example and the dataset from the first post here. For testing, I scaled the independent variables in the dataset (standardization using the mean and standard deviation) and then ran the example from the first post in a loop 100000 times. Result: not a single error! After that I added a second loop over different model formulas. Unfortunately this test was not successful. This means that one or more models inside the loop do not converge properly (do not find a solution), which causes an error and terminates the whole loop. If my hypothesis is true, a quick solution would be to write a work-around that intercepts the non-converging regression model from the package "spTimer" (and, I think, from the other packages too), outputs an appropriate message, returns NA as the result, and thus prevents the current error. This work-around could be extended so that a non-converging regression model is retried x times, and only then a message about the non-converging model is output and NA is returned.
I have made a fork of your repository and implemented the described work-around there. I am currently testing whether it works. I won't have any results until Monday, though. If it works, you can take a look at my commits and see whether you want to adopt this work-around for the official repository...
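The retry idea described above could be sketched in base R roughly as follows. This is only a minimal illustration, not the code from the fork: `fit_with_retry` and `fit_fun` are hypothetical names, and `fit_fun` stands in for whatever zero-argument function performs the actual model fit.

```r
# Hypothetical helper (not part of bmstdr or spTimer): call a model-fitting
# function up to `max_tries` times and return NA if every attempt fails.
fit_with_retry <- function(fit_fun, max_tries = 3) {
  for (attempt in seq_len(max_tries)) {
    outcome <- tryCatch(
      list(ok = TRUE, value = fit_fun()),
      error = function(e) {
        # Report the failure but do not abort the surrounding loop.
        message("Attempt ", attempt, " failed: ", conditionMessage(e))
        list(ok = FALSE)
      }
    )
    if (outcome$ok) {
      return(outcome$value)
    }
  }
  message("No result after ", max_tries, " attempts; returning NA.")
  NA
}
```

A caller would wrap the real fit in a closure, for example `fit_with_retry(function() my_fit(formula, data))`, where `my_fit` is again a placeholder, and check the result with `is.na()` before comparing models. Note that a model that fails for a structural reason (rather than by chance) will simply fail `max_tries` times, so the retry count should stay small.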
Hello, a very general question: does it make sense to transform (or rather, normalise) the input data for the Bayesian regression in advance using Tukey's Ladder of Powers transformation, or is this not necessary for a Bayesian regression? My tests have shown that standardization (scale() in R) has a positive influence on the model building, but what about a transformation?
Sorry, I have not been able to replicate the random error you reported in the foreach loop. I ran the code again on both my Windows and Linux machines. Could you please try to run this example on a different machine and tell me whether it is still a problem? It will take me some time to implement the power transformation. I will talk to you over email before adding this feature. Please let me know if it is okay to close this issue.
I have tested the example from the first post on different computers, and the error appears every now and then, whenever spTimer does not find a solution and outputs various NA values in the model. This then causes various subsequent errors, which I catch in the fork. Please have a look at the change in the fork that solves the problem. It works very well and should be included in the main code if possible, unless you have a better solution.
Sorry, I am not able to see your fork or solution. I did not see your pull request either. Could you please post it again? I will have a look.
Oh, my bad. I thought you could see the fork on GitHub. I make a pull request on...
Please take another look at the closed thread about the error in the prediction. I have written something else about it....
I could not see that thread. Please feel free to open a new issue.
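For readers unfamiliar with it, the standardization mentioned above can be applied to the independent variables like this. The data frame and its column names (`elevation`, `rainfall`) are made up purely for illustration; only `scale()` itself is taken from the comment.

```r
# Made-up example data; the column names are assumptions for illustration.
dat <- data.frame(
  y         = c(0.10, 0.35, 0.80, 0.55),
  elevation = c(120, 450, 900, 610),
  rainfall  = c(600, 750, 1100, 820)
)

predictors <- c("elevation", "rainfall")

# scale() centers each column to mean 0 and rescales it to standard
# deviation 1; the response y is deliberately left untouched.
dat_scaled <- dat
dat_scaled[predictors] <- as.data.frame(scale(dat[predictors]))

colMeans(dat_scaled[predictors])    # approximately 0 for each predictor
sapply(dat_scaled[predictors], sd)  # 1 for each predictor
```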
I looked at it all again, and it looks like the total time is not the problem after all (Issue). What the actual problem is, I don't know. However, I have observed that if you run the "Bsptime" function with the same data and settings in a loop, it sporadically produces errors.
Here is a minimal example with my data where the error occurs from time to time:
My data: data.csv
Note: The error occurs randomly. If you do not get the error immediately just increase the number of loop passes.
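One way to quantify how often such a sporadic error occurs is to run the same call repeatedly and record each failure instead of stopping at the first one. The following is a minimal base-R sketch under stated assumptions: `count_failures` is a name introduced here, and `run_once` is a hypothetical zero-argument stand-in for the actual model call.

```r
# Hypothetical diagnostic helper: run `run_once` n_passes times and collect
# the error message of every failing pass.
count_failures <- function(run_once, n_passes = 100) {
  errors <- character(0)
  for (i in seq_len(n_passes)) {
    msg <- tryCatch(
      {
        run_once()
        NA_character_            # NA marks a pass that completed without error
      },
      error = function(e) conditionMessage(e)
    )
    if (!is.na(msg)) {
      errors <- c(errors, sprintf("pass %d: %s", i, msg))
    }
  }
  list(n_failed = length(errors), errors = errors)
}
```

Calling `set.seed()` with a known value before each pass (inside `run_once`) would additionally make a failing pass reproducible on its own, provided the failure depends on R's random number stream.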
Here is a screenshot of a minimal example with 3 passes with the error:
You can see that the model was executed twice without error and only on the 3rd pass does the model produce an error. All settings and data are the same!
Here is a screenshot of the same function being executed twice outside a loop (same data and settings), once without error and once with an error:
What is very strange is that sometimes the model works without error and then again it doesn't... or is it my data?
Maybe the functions must not be executed so quickly one after the other, because something has not yet finished processing in the background. I do not know...
Please take a look at this, otherwise I can't use your wonderful package because it doesn't work reliably.
My system:
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)
Version bmstdr: 0.2.2