Error in plotSimulatedResiduals for very few unique predictions #42
Comments
This error message comes from the quantile regression that draws these lines on the residual plot; in some cases this function probably fails. I will add some code to catch the error in the next version of the package. In the meantime, you can suppress the quantile regression by setting quantreg = F in the arguments of plotSimulatedResiduals. |
Thank you so much for your super quick reply. |
hmm ... that seems weird, maybe more a problem of the models you are trying to plot. How many unique values do you have? can you do a
If you really have <4 unique predictions (which could happen if you have only one categorical predictor), the plotSimulatedResiduals function won't work at the moment, but you can use
specifying the x values by hand with as.factor(predict(fittedModel)) - if you specify the x as factor, the function will do a boxplot instead of the normal plot. At the moment, I have no option implemented to do the same in the main function. If you want to do the quantile plot alone, just use
If that is indeed the problem, I guess I could implement something that automatically detects when you have very few unique predictions and, in that case, switches the res vs. pred plot to a categorical plot option, or at least suppresses the lines. |
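The automatic switch described here could look roughly like the sketch below. It uses base R only; `plot_res_vs_pred` and its `min_unique` argument are hypothetical names for illustration, not DHARMa functions, and the simulated values stand in for real predictions and residuals.

```r
# Hypothetical helper (not part of DHARMa): choose the plot type from the
# number of distinct predicted values.
plot_res_vs_pred <- function(pred, res, min_unique = 4) {
  n_unique <- length(unique(pred))
  if (n_unique < min_unique) {
    # Too few distinct x values for smoothing lines: treat x as categorical.
    boxplot(res ~ as.factor(pred), xlab = "Predicted", ylab = "Residual")
    plot_type <- "boxplot"
  } else {
    plot(pred, res, xlab = "Predicted", ylab = "Residual")
    plot_type <- "scatter"
  }
  invisible(plot_type)
}

pdf(NULL)  # draw to a null device so the example runs without a display
set.seed(1)
res <- runif(40)
type_few  <- plot_res_vs_pred(rep(c(-0.2, 0.3), each = 20), res)
type_many <- plot_res_vs_pred(rnorm(40), res)
dev.off()
```

The threshold of four is taken from the discussion in this thread; any other cut-off would work the same way.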
If you don't mind and if that's possible, could you send me a fitted model (use the save function) so that I can have a try myself? Note that the data will be attached to this object, not sure if that's an issue. |
Thanks! I will try that. Attached the fitted model.
Browse[2]> unique(predict(Rn_Mod))
[1] 0.395069351 -0.274755256 -0.129444404 0.135819345
[5] -0.471134318 -0.136551554 -0.482839849 -0.244911294
[9] -0.004249219 0.394139831 0.424494156 -0.297700412
[13] -0.127675511 0.223951042 0.292125137 -0.065351270
[17] 0.425362814 -0.200097983 -0.191881019 -0.034607072
[21] 0.315443759 -0.025441888 -0.111515551 -0.412508223
[25] 0.049440135 0.554819254 0.395069351 -0.274755256
[29] -0.129444404 0.135819345 -0.471134318 -0.136551554
[33] -0.482839849 -0.244911294 -0.004249219 0.394139831
[37] 0.424494156 -0.297700412 -0.127675511 0.223951042
[41] 0.292125137 -0.065351270 0.425362814 -0.200097983
[45] -0.191881019 -0.034607072 0.315443759 -0.025441888
[49] -0.111515551 -0.412508223 0.049440135 0.554819254
…________________________________________________
Maxime Cauchoix
PhD, Station d’écologie experimentale du CNRS à Moulis
07 85 23 51 43
???!
(°v°) ? ! ...
(O) (°v°) (°v°) (°v°)
_II_ \\ #############
________________________________________________
|
Hi Maxime - OK, I have looked at this. As you see above, in principle, you have enough variation in the response. So, a naive residual plot

    res = simulateResiduals(Rn_Mod)
    plotResiduals(predict(Rn_Mod), res$scaledResiduals)

would work. However, if you try this, you will see that this plot shows a pattern. This is not because of a problem in your model, but because your variation in x stems mostly from the random effect in the model, which you could view as a kind of residual as well, so you are plotting residuals against residuals. Therefore, the DHARMa default plot puts only the fixed-effect predictions on x. You can emulate this via

    plotResiduals(predict(Rn_Mod, re.form = ~0), res$scaledResiduals)

Now we have only two values for the predictions, and this is what seems to have caused your problems. Curiously, for me the fitted splines / quantile regressions never produced an error, so I don't know if this is a problem specific to your R / package version - I'll have to write a test to see if this is somehow platform-dependent.

Anyway, this is the reason for the problems. The quick fix for you would be to do the plots by hand and convert the predictions to a factor; DHARMa will then produce boxplots instead of scatter plots, so you can do

    plotResiduals(as.factor(predict(Rn_Mod, re.form = ~0)), res$scaledResiduals)

and the QQ plot via

    gap::qqunif(res$scaledResiduals, pch = 2, bty = "n", logscale = F, col = "black", cex = 0.6, main = "QQ plot residuals", cex.main = 1)

I will try to implement some kind of fix in the next package version; I'm just not sure yet if I should catch the errors or rather switch the plots if there are only a few unique values for pred. Will leave this ticket open until this is done.
|
Thank you Florian.
I had a look into smooth.spline:
The x vector should contain at least four distinct values. ‘Distinct’ here is controlled by tol: values which are regarded as the same are replaced by the first of their values and the corresponding y and w are pooled accordingly.
defaults to 1e-4 (formerly 1e-3).
Maybe it should be adapted to x range?
Thanks again!
|
Sorry, I do get the smooth.spline error, but the default plot with the quantile regression works fine. In any case, something has to be done; I'm just not sure yet whether to suppress the lines or switch to the boxplot if there are fewer than 4 or so unique values. |
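For reference, the failure with few distinct x values can be reproduced in base R. The sketch below assumes the spline in question is `stats::smooth.spline`, whose help page is the source of the text quoted earlier in this thread; the data are simulated and are not the model discussed here.

```r
set.seed(1)
y <- runif(40)
x_two  <- rep(c(-0.2, 0.3), each = 20)     # only two distinct predictions
x_many <- seq(-0.5, 0.5, length.out = 40)  # forty distinct predictions

# try() keeps the script running and returns a "try-error" object on failure.
fit_two  <- try(smooth.spline(x_two, y), silent = TRUE)
fit_many <- try(smooth.spline(x_many, y), silent = TRUE)

inherits(fit_two, "try-error")       # the two-value case is rejected
inherits(fit_many, "smooth.spline")  # the forty-value case is fitted

# The tol argument that decides distinctness defaults to 1e-6 * IQR(x) in the
# function signature, so it already scales with the spread of x.
length(unique(x_two))
```

A pre-check on the number of distinct values, as discussed above, avoids relying on the error at all.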
Switching to boxplot would be great! I can test it for you if you want to update that ;-) |
I'm running nearly 300 models on different datasets with all kinds of distributions for a meta-analysis, so it might be useful for testing the generality of the code. |
* significance reported in qq plot * qqplot moved to extra function * catch error in regression splines / quantile regression, see #42
OK, I have introduced error catching in the plot function, so at least this suppresses the error so that your script doesn't stop. It will take a bit until this is pushed to CRAN, but you can get this feature already now by installing the development version of DHARMa from GitHub, see https://github.com/florianhartig/DHARMa. Of course, there are also some other changes in the development version - if you just want to get the plot function, you can load DHARMa as usual, and then overwrite the plot functions with the current development version via source("https://raw.githubusercontent.com/florianhartig/DHARMa/master/DHARMa/R/plotResiduals.R") |
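The error catching mentioned here follows a common R pattern. The sketch below is an assumption about the general approach, not the code in DHARMa: it draws the scatter plot first and wraps the smoothing step in `tryCatch`, so a failed line fit only produces a message and the calling script continues. `plot_with_optional_line` is a hypothetical name.

```r
# Hypothetical helper (not part of DHARMa): scatter plot with a smoothing
# line that is skipped, rather than fatal, when the fit fails.
plot_with_optional_line <- function(pred, res) {
  plot(pred, res, xlab = "Predicted", ylab = "Residual")
  tryCatch({
    fit <- smooth.spline(pred, res)
    lines(fit, col = "red")
    TRUE   # line was drawn
  }, error = function(e) {
    message("Smoothing line skipped: ", conditionMessage(e))
    FALSE  # plot is kept, line is omitted
  })
}

pdf(NULL)  # null device, so no display is needed
set.seed(1)
res <- runif(40)
line_few  <- plot_with_optional_line(rep(c(-0.2, 0.3), each = 20), res)
line_many <- plot_with_optional_line(seq(-0.5, 0.5, length.out = 40), res)
dev.off()
```

Returning a logical makes it easy to count, across many models, how often the line had to be dropped.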
Todo:
|
Dear Florian,
I have difficulty finding information on what uniformity of residuals is, and therefore on what your uniformity test is actually testing. Would that be like a test of normality, or more like a test of homoscedasticity of the residuals?
Do you have a statistical test for normality of residuals, or is it just visual inspection of QQ plots?
Also, for the parametric test of overdispersion, I'm not certain about H0. p > 0.05 would mean that there is no overdispersion, right?
Many thanks
Maxime
|
testUniformity does a KS test for uniformity, see help. You can think of this as the equivalent of a Shapiro test in a linear regression, where you test residuals for normality. But in DHARMa, we expect the residuals to be uniform (see vignette for explanations), therefore we test for uniformity. I have no formal test for heteroskedasticity yet, but you should of course look out for it in the res vs. predicted and res vs. variable plots. Overdispersion: basically yes, p > 0.05 = no overdispersion, although strictly speaking, as for any null hypothesis, it just means you can't show that H0 = no overdispersion is wrong; it doesn't mean that H0 is right. |
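To make the analogy concrete, here is a small base R sketch. It assumes the scaled residuals are numbers in [0, 1] and applies `ks.test` against the uniform distribution, which is the kind of test described above; the residuals are simulated rather than taken from a DHARMa fit, and `shapiro.test` on normal draws is shown only for comparison.

```r
set.seed(42)

# Stand-in for res$scaledResiduals from a well-specified model.
scaled_res <- runif(200)
ks_uniform <- ks.test(scaled_res, "punif")

# The analogous normality check for classical linear-model residuals.
lm_res <- rnorm(200)
sw_normal <- shapiro.test(lm_res)

# Residuals piled up near 0 and 1 are clearly not uniform.
skewed_res <- rbeta(200, 0.3, 0.3)
ks_skewed <- ks.test(skewed_res, "punif")

c(uniform = ks_uniform$p.value, skewed = ks_skewed$p.value)
```

In both tests a large p-value only means that no deviation from the null distribution was detected, which matches the remark about H0 above.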
Excellent! thank you so much.
|
OK, I think this is now working, will be included in the 0.1.6 release |
From a user: It's working fine most of the time but sometimes I get this error using:
plotSimulatedResiduals
Error in qrnn::qrnn.fit(x = as.matrix(pred), y = as.matrix(res), n.hidden = 4, :
zero variance column(s) in "x"