Rewrite functions for sampling from discrete truncated distributions #77

rho62 · 2022-02-21T10:03:13Z

Approach pursued so far

Sample from the original (untruncated) distribution, then truncate. This is inefficient: it draws a surplus of elements that are then discarded, and it is difficult to predict how many draws are required to reach the target sample size.
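For illustration, a minimal sketch of the sample-then-truncate idea for a binomial, counting how many untruncated draws are spent. The function name `sample_then_truncate` and its arguments are hypothetical and not taken from the package.

```r
# Hypothetical helper (not part of TruncExpFam): draw from the untruncated
# binomial in batches and keep only the values that fall inside [a, b].
# Loops forever if [a, b] has zero probability, so use sensible limits.
sample_then_truncate <- function(n, size, prob, a, b) {
  kept <- integer(0)
  drawn <- 0L
  while (length(kept) < n) {
    batch <- rbinom(n, size = size, prob = prob)
    drawn <- drawn + length(batch)
    kept <- c(kept, batch[batch >= a & batch <= b])
  }
  list(sample = kept[seq_len(n)], drawn = drawn)
}

set.seed(1)
res <- sample_then_truncate(n = 100, size = 10, prob = 0.5, a = 7, b = 10)
res$drawn # total untruncated draws spent to obtain 100 truncated values

# Probability that a single untruncated draw is kept: P(a <= X <= b)
p_accept <- pbinom(10, 10, 0.5) - pbinom(6, 10, 0.5)
p_accept
```

The smaller `p_accept` is, the more draws are wasted, and the number of batches needed is itself random.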

Solution

Sample directly from the truncated distribution:

$$ f_T(x; \theta) = \frac{f(x; \theta)}{[F(b) - F(a)]} $$

Use sample() to sample from $a, ..., b$ with weights $f_T(a, ..., b; \theta)$

Implemented for the binomial; see the code there. Still needs to be implemented for the other discrete distributions: Poisson, negative binomial, others?

Binomial
Negative binomial
Poisson

Note: the weights $f_T(x; \theta)$ are already implemented as the dtrunc.XXXX() functions.
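A rough sketch of what direct sampling could look like for the binomial, using base R only. The name `rtrunc_binomial_direct` and its signature are assumptions made for this example, not the package's API. It also assumes the truncated support includes both `a` and `b`, which is why the normalising constant is written with `F(a - 1)`.

```r
# Hypothetical helper (name and signature are assumptions for illustration).
rtrunc_binomial_direct <- function(n, size, prob, a = 0, b = size) {
  support <- a:b
  # Truncated pmf on the support: f(x) / [F(b) - F(a - 1)]
  w <- dbinom(support, size, prob) /
    (pbinom(b, size, prob) - pbinom(a - 1, size, prob))
  # sample(x, ...) treats a length-one x as 1:x, so draw indices instead
  idx <- sample.int(length(support), size = n, replace = TRUE, prob = w)
  support[idx]
}

set.seed(1)
x <- rtrunc_binomial_direct(1000, size = 10, prob = 0.5, a = 3, b = 6)
table(x) # every value lies in 3..6; no draws are discarded
```

Exactly `n` values are generated, so the sample size no longer depends on how much probability mass lies inside the truncation limits.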

wleoncio · 2022-02-21T10:10:53Z

This is the same as #72, isn't it? Also, are there any impediments to implementing this for continuous distributions as well?

rho62 · 2022-02-21T10:16:25Z

Perhaps... not sure... Seems to me, that there is a coding issue (only calling rtrunc vs sampleFromUntruncated) and a content/solution issue: How do we actually do it?

wleoncio · 2022-02-21T10:21:10Z

On second thought, I think you're right. Seems wise to separate things and leave #72 for the duplicated coding issue and #77 and #78 for the slow-sampling issue.

wleoncio · 2023-01-03T12:22:57Z

If I understood correctly, the Binomial implementation is here:

TruncExpFam/R/binomial.R

Lines 15 to 31 in 087ad1b

    
    dtrunc.trunc_binomial <- function(
      y, size, prob, eta, a = 0, b = attr(y, "parameters")$size, ...
    ) {
      if (missing(eta)) {
        eta <- parameters2natural.trunc_binomial(c("size" = size, "prob" = prob))
      }
      nsize <- attr(y, "parameters")$size
      my.dbinom <- function(nsize) dbinom(y, size = nsize, prob = proba)
      my.pbinom <- function(z, nsize) pbinom(z, size = nsize, prob = proba)
      proba <- 1 / (1 + exp(-eta))
      dens <- ifelse((y < a) | (y > b), 0, my.dbinom(nsize))
      F.a <- my.pbinom(a - 1, nsize)
      F.b <- my.pbinom(b, nsize)
      dens <- dens / (F.b - F.a)
      attributes(dens) <- attributes(y)
      return(dens)
    }

The f(x) / [F(b) - F(a)] part is clearly defined on line 28 of that file (`dens / (F.b - F.a)`), and f(x) itself (i.e., dens) is computed on line 25 using my.dbinom(). If that is correct, then this idea could be replicated for rtrunc(). Unless I'm missing something, though, the function above involves no sampling, only rescaling of the densities (as expected, since sampling belongs to the r* functions).

So a DRY solution might involve the following steps:

1. Extract the calculation of f_T(x) from the dtrunc methods into its own function. It could be a generic, since the x argument would have different rtrunc_ classes.
2. Use the extracted function from the previous step in a new version of rtrunc(), temporarily coexisting with the current implementation.
3. Phase out the old function in favor of the new implementation.
4. Adjust unit test expectations.
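As a sketch of the first step, the rescaling could live in an S3 generic with one method per distribution class. Everything here is hypothetical: the generic name `rescaled_density`, the argument names, and the use of `trunc_poisson` as a class name are assumptions made for illustration, and the methods take plain distribution parameters rather than eta.

```r
# Hypothetical generic: computes f_T(x) for whichever class x carries.
rescaled_density <- function(x, ...) UseMethod("rescaled_density")

rescaled_density.trunc_binomial <- function(x, size, prob, a = 0, b = size, ...) {
  x <- unclass(x) # work on the bare values
  dens <- ifelse(x < a | x > b, 0, dbinom(x, size, prob))
  dens / (pbinom(b, size, prob) - pbinom(a - 1, size, prob))
}

rescaled_density.trunc_poisson <- function(x, lambda, a = 0, b = Inf, ...) {
  x <- unclass(x)
  dens <- ifelse(x < a | x > b, 0, dpois(x, lambda))
  dens / (ppois(b, lambda) - ppois(a - 1, lambda))
}

# The same call works for both classes; a sampler could reuse it for weights.
xp <- structure(2:5, class = "trunc_poisson")
wp <- rescaled_density(xp, lambda = 3, a = 2, b = 5)
sum(wp) # weights over the full truncated support add up to 1 (up to rounding)

xb <- structure(0:4, class = "trunc_binomial")
wb <- rescaled_density(xb, size = 10, prob = 0.3, a = 0, b = 4)
```

Both a density method and a sampling method could then call the same function, which is the DRY point of the steps above.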

One thing that worries me about this approach is that it will probably make the output of rtrunc() no longer match its stats counterparts, since the untruncated distribution will no longer be the basis for the sampling. Is this acceptable?

wleoncio · 2023-02-06T05:59:51Z

An alternative to phasing out the old algorithm is to give rtrunc() an argument (such as a boolean legacy) that runs the old stats-compatible algorithm. This gives the user control over the trade-off between comparability with stats results and the speed of the new implementation.
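A minimal sketch of such a switch for the Poisson case. The function `rtrunc_pois_sketch`, the `legacy` argument as written here, and the way an infinite upper limit is capped are all assumptions for illustration, not the package's implementation.

```r
# Hypothetical wrapper showing a `legacy` switch between the two algorithms.
rtrunc_pois_sketch <- function(n, lambda, a = 0, b = Inf, legacy = TRUE) {
  if (legacy) {
    # Old approach: draw with rpois() and discard values outside [a, b]
    out <- integer(0)
    while (length(out) < n) {
      y <- rpois(n, lambda)
      out <- c(out, y[y >= a & y <= b])
    }
    return(out[seq_len(n)])
  }
  # Direct approach needs a finite support, so an infinite b is capped at a
  # far upper quantile (assumption: the mass beyond it is negligible)
  upper <- if (is.finite(b)) b else max(a, qpois(1 - 1e-12, lambda))
  support <- a:upper
  w <- dpois(support, lambda) # sample.int() normalises the weights itself
  support[sample.int(length(support), size = n, replace = TRUE, prob = w)]
}

# Without truncation, the legacy path reproduces rpois() under the same seed
set.seed(42)
x_legacy <- rtrunc_pois_sketch(5, lambda = 3)
set.seed(42)
x_stats <- rpois(5, lambda = 3)
identical(x_legacy, x_stats)

# The direct path stays within the limits but uses the RNG differently
set.seed(42)
x_direct <- rtrunc_pois_sketch(5, lambda = 3, a = 1, b = 6, legacy = FALSE)
```

With no truncation the legacy path accepts every draw, which is why it can match `rpois()` exactly; the direct path draws through `sample.int()` and so cannot be expected to match.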


wleoncio added the duplicate label Feb 21, 2022

wleoncio removed the duplicate label Feb 21, 2022

wleoncio mentioned this issue Feb 21, 2022

Remove duplication of calls to resampling functions #72

Closed

wleoncio added the enhancement label Feb 21, 2022

rho62 mentioned this issue Feb 21, 2022

dtrunc() uses eta as argument instead of distribution parameters #79

Closed

wleoncio added a commit that referenced this issue Jan 3, 2023

Added function to calc scaled densities (#77, #78)

941faac

wleoncio added a commit that referenced this issue Jan 3, 2023

Unified common dtrunc bits (#77, #78)

180ace7

wleoncio added a commit that referenced this issue Feb 6, 2023

Incorporated rtrunc_direct() into rtrunc() (#78)

3a883af

Still defaults to the old algo, at least until #77 and #78 are implemented and the old algo is superseded.

wleoncio self-assigned this Feb 6, 2023

wleoncio added a commit that referenced this issue Feb 22, 2023

Added function to rescale quantiles (#78)

1f1cf61

Function is similar to `rescaledDensities()`, though the latter is supposed to serve #77 (i.e., discrete distributions). Some eventual merging may be in order.

wleoncio added this to the MVP 1.2.0 milestone Apr 13, 2023

wleoncio added a commit that referenced this issue Sep 25, 2023

Implemented direct sampling for poisson (#77)

7925ce1

wleoncio added a commit that referenced this issue Sep 25, 2023

Separating b and practical_b (#77)

79606be

wleoncio added a commit that referenced this issue Sep 25, 2023

Adding unit tests for direct poisson (#77)

9947c5e

wleoncio added a commit that referenced this issue Sep 25, 2023

Added direct sampling for binomial (#77)

fc00b33

wleoncio added a commit that referenced this issue Sep 25, 2023

Added direct sampling for negative binomial (#77)

2576265

wleoncio added a commit that referenced this issue Sep 25, 2023

Test faster = TRUE truncation limits (#77, #78)

8e39e13

wleoncio added a commit that referenced this issue Sep 25, 2023

Updated NEWS (#77)

e25fcd7

wleoncio closed this as completed in ac9f4f4 Sep 25, 2023
