Suggested option to use oneStepPredict(..., method="cdf" ) with delta-models #322

Closed

James-Thorson-NOAA

Kasper and all,

As we briefly discussed by email, this pull request provides a diff for a few changes that appear to extend oneStepPredict(.) with method="cdf" to delta-models, or to other continuous distributions with probability mass at a user-supplied set of locations (e.g., a zero-and-one inflated proportion for stomach content samples). The user supplies deltaSupport (NULL by default); in the limit where deltaSupport = {all supported integers}, the result should be identical to discrete=TRUE.

I very much do not understand the statistical theory underlying oneStepPredict(.), so please review this PR with caution! However, following the coding logic of method="cdf", there does not appear to be any fundamental distinction between how discrete=TRUE and discrete=FALSE are handled on the R side. It therefore seems easy enough to provide the appropriate CDF in TMB and then evaluate it as if discrete=TRUE at the user-supplied probability-mass locations, and as if discrete=FALSE at other locations where a continuous distribution applies. This only requires that the user correctly code a CDF for the distribution on the TMB side, which the method requires anyway.

I have done some limited testing of this modification for a delta-model without random effects, and in this case it appeared to behave as expected, i.e., give a uniform distribution for residuals for those observations of response = 0 under the correctly specified model. However, I again emphasize that I cannot vouch for the statistical basis for the suggested modification; it's just based on my reading of its implementation.
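The idea can be sketched outside TMB in base R. This is only an illustration under assumed values: p0, mu and sigma are hypothetical, and the CDF is written by hand rather than obtained from oneStepPredict(.). At the atom y = 0 the residual is drawn uniformly between F(0-) and F(0), as in the discrete case; elsewhere the continuous CDF value is used directly.

```r
set.seed(1)

## Hypothetical delta-lognormal model (assumed values, not from the PR):
## P(y = 0) = p0, otherwise y ~ lognormal(mu, sigma)
p0 <- 0.3; mu <- 0.5; sigma <- 1
n <- 2000
is_zero <- runif(n) < p0
y <- ifelse(is_zero, 0, rlnorm(n, meanlog = mu, sdlog = sigma))

## CDF of the mixed distribution, F(y) = P(Y <= y)
cdf <- function(y) ifelse(y < 0, 0, p0 + (1 - p0) * plnorm(y, mu, sigma))

## Left limit F(y-) = P(Y < y); it differs from F(y) only at the atom y = 0
cdf_left <- function(y) ifelse(y <= 0, 0, cdf(y))

## Randomized probability integral transform:
## uniform on (F(y-), F(y)) at the atom, equal to F(y) elsewhere
u <- cdf_left(y) + runif(n) * (cdf(y) - cdf_left(y))

## Quantile residuals on the normal scale
resid <- qnorm(u)

## Without randomization, every zero observation maps to the same value p0
u_naive_zero <- cdf(y[y == 0])
```

Under the correctly specified model, u should look uniform on (0, 1) and resid standard normal, including for the observations with y = 0; hist(u) or qqnorm(resid) gives a quick visual check.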

Thanks for your time in reviewing the suggestion.

@kaskr
Owner

kaskr commented Aug 14, 2020

Some preliminary comments:

The proposed PR adds an option to apply the missing randomization step in the continuous case when atoms are present.
I agree that this is useful and the implementation appears to be correct.

However, it's important to keep the oneStepPredict interface as simple as possible, and in particular to avoid adding options that target special cases. Delta distributions are important, but what about cases where the deltaSupport varies among observations?

There are already many options and I think the existing ones can be tweaked to provide the same effect as the PR.
E.g. for a delta distribution one can do two passes:

## First pass: residuals valid only for observations on the delta support
res1 <- oneStepPredict(obj, method="cdf", discrete=TRUE)
## Second pass: residuals valid for the remaining (continuous) observations
res2 <- oneStepPredict(obj, method="cdf", discrete=FALSE)
## Combine: 'obs' is the observation vector, 'deltaSupport' the atom locations
resid <- ifelse(obs %in% deltaSupport, res1$residual, res2$residual)
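The combine step can also be written so that the support differs between observations. The sketch below is an assumption-laden illustration: res1 and res2 are mock data frames standing in for the two oneStepPredict results, and support is a hypothetical list holding the atom locations for each observation.

```r
## Mock stand-ins for the two passes (hypothetical values, not TMB output)
obs  <- c(0, 0.4, 1, 2.5, 1)
res1 <- data.frame(residual = c(-0.3, 9.9, 0.8, 9.9, 1.1))  # discrete=TRUE pass
res2 <- data.frame(residual = c(9.9, 0.1, 9.9, -0.7, 0.2))  # discrete=FALSE pass

## Hypothetical per-observation atom locations
support <- list(0, 0, c(0, 1), c(0, 1), 0)

## TRUE where the observation sits on one of its own atoms
## (exact matching via %in%; assumes atoms are stored exactly, e.g. 0 or 1)
on_atom <- mapply(function(y, s) y %in% s, obs, support)

## Take the discrete-pass residual on atoms, the continuous-pass one elsewhere
resid <- ifelse(on_atom, res1$residual, res2$residual)
```

Here the value 1 counts as an atom for the third observation but not for the fifth, so the two observations draw their residuals from different passes.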

@James-Thorson-NOAA
Author

James-Thorson-NOAA commented Aug 14, 2020 via email

@kaskr
Owner

kaskr commented Aug 15, 2020

Sounds great.

FWIW Roxygen is located around here:

https://github.com/kaskr/adcomp/blob/master/TMB/R/validation.R#L95

Alternatively it could fit in the book around here:

https://github.com/kaskr/adcomp/blob/master/dox/06-Validation.Rmd#L104
