Error in plotSimulatedResiduals for very few unique predictions #42

Closed
florianhartig opened this issue Jan 31, 2018 · 16 comments

Comments

@florianhartig
Owner

From a user: It's working fine most of the time but sometimes I get this error using:
plotSimulatedResiduals

Error in qrnn::qrnn.fit(x = as.matrix(pred), y = as.matrix(res), n.hidden = 4, :
zero variance column(s) in "x"

@florianhartig florianhartig changed the title Error message Error message in plotSimulatedResiduals Jan 31, 2018
@florianhartig
Owner Author

This error message comes from the quantile regression that draws these lines on the residual plot; the function probably fails in some cases. I will add some code to catch the error in the next version of the package.

In the meantime, you can suppress the quantile regression by setting quantreg = F in the arguments of plotSimulatedResiduals.

@mcauchoix

Thank you so much for your super quick reply.
I tried that argument and the new error I get is:
Error in smooth.spline(pred, res, df = 10) :
need at least four unique 'x' values

@florianhartig
Owner Author

Hmm ... that seems weird; it may be more a problem of the models you are trying to plot. How many unique values do you have? Can you do a

unique(predict(fittedModel))

If you really have <4 unique predictions (which could happen if you have only one categorical predictor), the plotSimulatedResiduals function won't work at the moment, but you can use

plotResiduals

specifying the x values by hand with as.factor(predict(fittedModel)). If you specify x as a factor, the function will draw a boxplot instead of the normal plot.

At the moment, I have no option implemented to do the same in the main function. If you want to do the quantile plot alone, just use

gap::qqunif(simulationOutput$scaledResiduals,pch=2,bty="n", logscale = F, col = "black", cex = 0.6, main = "QQ plot residuals", cex.main = 1)

If that is indeed the problem, I guess I could implement something that automatically detects when you have very few unique predictions and, in that case, switches the res vs. pred plot to a categorical plot option, or at least suppresses the lines.
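
A minimal sketch of the automatic switch described above, using base R only. The helper name plot_res_vs_pred and the min_unique threshold are hypothetical and are not part of DHARMa; the simulated residuals and predictions merely stand in for a fitted model.

```r
# Hypothetical helper, not part of DHARMa: choose the plot type from the
# number of unique predicted values.
plot_res_vs_pred <- function(pred, res, min_unique = 4) {
  if (length(unique(pred)) < min_unique) {
    # Too few distinct x values for smooth.spline: treat pred as categorical
    boxplot(res ~ as.factor(pred), xlab = "Predicted", ylab = "Scaled residual")
    return(invisible("boxplot"))
  }
  # Enough distinct x values: scatter plot with a smoothing spline on top
  plot(pred, res, xlab = "Predicted", ylab = "Scaled residual")
  lines(smooth.spline(pred, res, df = 10), col = "red")
  invisible("scatter")
}

set.seed(1)
res <- runif(200)                         # stand-in for scaled residuals
pred_few <- rep(c(0.2, 0.8), each = 100)  # only two unique predictions
pred_many <- runif(200)                   # many unique predictions

pdf(NULL)  # null graphics device, so no file is written
type_few <- plot_res_vs_pred(pred_few, res)
type_many <- plot_res_vs_pred(pred_many, res)
invisible(dev.off())
```

With only two unique predictions the helper draws a boxplot and never calls smooth.spline, so the "need at least four unique 'x' values" error cannot occur.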

@florianhartig
Owner Author

If you don't mind and if that's possible, could you send me a fitted model (use the save function) so that I can have a try myself? Note that the data will be attached to this object, not sure if that's an issue.

@mcauchoix

mcauchoix commented Jan 31, 2018 via email

@florianhartig
Owner Author

Hi Maxime - OK, I have looked at this. As you see above, in principle, you have enough variation in the response. So, a naive residual plot

res = simulateResiduals(Rn_Mod)
plotResiduals(predict(Rn_Mod), res$scaledResiduals)

would work. However, if you try this, you will see that this plot shows a pattern. This is not because of a problem in your model, but because your variation in x stems mostly from the random effect in the model, which you could view as a kind of residual as well, so you are plotting res against residual. Therefore, the DHARMa default plot plots only the fixed effect predictions on x. You can emulate this via

plotResiduals(predict(Rn_Mod, re.form = ~0), res$scaledResiduals)

Now, we have only two values for the predictions, and this is what seems to have caused your problems. Curiously, for me the fitted splines / quantile regressions never produced an error, so I don't know if this is a problem specific to your R / package version - I'll have to write a test to see if this is somehow platform-dependent.

Anyway, this is the reason for the problems. The quick fix for you would be to do the plots by hand and convert the predictions to a factor; DHARMa will then produce boxplots instead of scatter plots, so you can do

plotResiduals(as.factor(predict(Rn_Mod, re.form = ~0)), res$scaledResiduals)

and the qq plot via

gap::qqunif(res$scaledResiduals,pch=2,bty="n", logscale = F, col = "black", cex = 0.6, main = "QQ plot residuals", cex.main = 1)

I will try to implement some kind of fix in the next package version, I'm just not sure yet if I should catch the errors or rather switch the plots if there are only a few unique values for pred.

Will leave this ticket open until this is done

@florianhartig florianhartig changed the title Error message in plotSimulatedResiduals Error message in plotSimulatedResiduals for few unique prediction values Jan 31, 2018
@florianhartig florianhartig changed the title Error message in plotSimulatedResiduals for few unique prediction values Error in plotSimulatedResiduals for very few unique predictions Jan 31, 2018
@mcauchoix

mcauchoix commented Feb 1, 2018 via email

@florianhartig
Owner Author

Sorry, I do get the error from smooth.spline, but the default plot with the quantile regression works fine. In any case, something has to be done; I'm just not sure yet whether to suppress the lines or switch to the boxplot if there are fewer than 4 or so unique values.

@mcauchoix

Switching to boxplot would be great! I can test it for you if you want to update that ;-)

@mcauchoix

I'm running nearly 300 models on different datasets with all kinds of distributions for a meta-analysis, so it might be useful for testing the generality of the code.

florianhartig added a commit that referenced this issue Feb 1, 2018
* significance reported in qq plot
* qqplot moved to extra function
* catch error in regression splines / quantile regression, see #42
@florianhartig
Owner Author

OK, I have introduced error catching in the plot function, so at least this suppresses the error so that your script doesn't stop. It will take a bit until this is pushed to CRAN, but you can get this feature already now by installing the development version of DHARMa from GitHub, see https://github.com/florianhartig/DHARMa.
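
As a rough illustration of what such error catching can look like (a sketch in base R, not the actual DHARMa code), the spline fit can be wrapped in tryCatch so that a failure becomes a warning and the script keeps running. The helper name safe_spline is hypothetical.

```r
# Hypothetical helper, not DHARMa's implementation: return the fitted
# spline, or NULL (with a warning) if smooth.spline throws an error.
safe_spline <- function(pred, res, df = 10) {
  tryCatch(
    smooth.spline(pred, res, df = df),
    error = function(e) {
      warning("Could not fit smoothing spline: ", conditionMessage(e),
              call. = FALSE)
      NULL
    }
  )
}

set.seed(42)
res <- runif(100)  # stand-in for scaled residuals

# Two unique x values: smooth.spline fails, the helper returns NULL
fit_bad <- suppressWarnings(safe_spline(rep(c(0, 1), each = 50), res))

# Many unique x values: the fit succeeds
fit_ok <- safe_spline(runif(100), res)
```

A plotting function built this way would draw the line only when the returned fit is not NULL.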

Of course, there are also some other changes in the development version - if you just want to get the plot function, you can load DHARMa as usual, and then overwrite the plot functions with the current development version via source("https://raw.githubusercontent.com/florianhartig/DHARMa/master/DHARMa/R/plotResiduals.R")

@florianhartig florianhartig added this to the 0.1.6 Release milestone Feb 6, 2018
@florianhartig
Owner Author

Todo:

  • Write unit test to test this problem
  • Decide on final solution for the plot

@mcauchoix

mcauchoix commented Feb 13, 2018 via email

@florianhartig
Owner Author

The uniformity test performs a KS test for uniformity, see the help. You can think of this as the equivalent of a Shapiro test in a linear regression, where you test the residuals for normality. But in DHARMa, we expect the residuals to be uniform (see the vignette for explanations), therefore we test for uniformity.
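
To make the idea concrete, here is a small base R sketch of a KS test against the uniform distribution. It uses stats::ks.test on simulated vectors that stand in for scaled residuals; it is not the DHARMa function itself, and the variable names are made up for illustration.

```r
set.seed(123)
uniform_res <- runif(500)        # what residuals of a well-specified model should look like
skewed_res <- rbeta(500, 2, 5)   # clearly non-uniform values on [0, 1]

# One-sample Kolmogorov-Smirnov test against the Uniform(0, 1) distribution
ks_uniform <- ks.test(uniform_res, "punif")
ks_skewed <- ks.test(skewed_res, "punif")

ks_uniform$p.value  # p-value for the uniform sample
ks_skewed$p.value   # p-value for the skewed sample, expected to be very small
```

A small p-value indicates a detectable deviation from uniformity; a large p-value only means that no deviation could be shown.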

I have no formal test for heteroskedasticity yet, but you should of course look out for it in the res vs. predicted and res vs. variable plots.

Overdispersion: basically yes, p > 0.05 = no overdispersion, although strictly speaking, as for any null hypothesis, it just means you can't show that H0 = no overdispersion is wrong; it doesn't mean that H0 is right.

@mcauchoix

mcauchoix commented Feb 13, 2018 via email

florianhartig added a commit that referenced this issue Mar 5, 2018
Various plot updates

* rank transformation #44
* asFactor option and warning #42
* new plot layout with help lines
* help updates
@florianhartig
Owner Author

OK, I think this is now working, will be included in the 0.1.6 release


3 participants