Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ML estimation not working for some cases of the Beta distribution #85

Closed
wleoncio opened this issue Feb 25, 2022 · 8 comments
Closed

ML estimation not working for some cases of the Beta distribution #85

wleoncio opened this issue Feb 25, 2022 · 8 comments
Labels
bug Something isn't working critical This issue should be prioritized

Comments

@wleoncio
Copy link
Member

See example below:

bilde

@wleoncio wleoncio added the bug Something isn't working label Feb 25, 2022
@wleoncio
Copy link
Member Author

wleoncio commented Apr 7, 2022

This might be caused by a runaway delta, see example below:

r$> sample.beta <- rtruncbeta(1000, shape1 = 15, shape2 = 4, a = .7, b = .9)

r$> ml_beta <- mlEstimationTruncDist(
      sample.beta, print.iter = TRUE, tol = 1e-7, max.it = 1e3
    )
Estimating parameters for the beta distribution
it:  1 delta:  24.87277  - parm:  43.313 10.702 
it:  2 delta:  61.23992  - parm:  48.136 11.969 
it:  3 delta:  200.9796  - parm:  55.718 13.908 
it:  4 delta:  1332.409  - parm:  69.465 17.373 
it:  5 delta:  376016.3  - parm:  104.87 26.254 
it:  6 delta:  NaN  - parm:  -489.678 -123.842 
Error in while ((delta.L2 > tol) & (it < max.it)) { : 
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In dbeta(y, shape1 = parm[1], shape2 = parm[2]) : NaNs produced
2: In pbeta(a, shape1 = parm[1], shape2 = parm[2]) : NaNs produced
3: In pbeta(b, shape1 = parm[1], shape2 = parm[2]) : NaNs produced

I'll write Rene on this, but I found this function to be strange:

natural2parameters.trunc_beta <- function(eta) {
  # eta: The natural parameters in a beta distribution
  # returns (alpha,beta)
  parms <- c(shape1 = eta[1], shape2 = eta[2])
  class(parms) <- class(eta)
  return(parms)
}

Usually parms does a transformation on eta. Here it just splits it. I suspect this could be causing the issue (not the case, see comment below).

@wleoncio wleoncio added the critical This issue should be prioritized label Apr 7, 2022
@wleoncio
Copy link
Member Author

wleoncio commented Apr 7, 2022

There's a nice table on https://en.wikipedia.org/wiki/Exponential_family#Table_of_distributions that suggests two variants of the Beta: one has no transformation and the other has a ±1 transformation between natural and inverted mapping. Neither solution solves this problem, so it might be somewhere else.

@wleoncio
Copy link
Member Author

wleoncio commented May 3, 2022

Hi @rho62,

Is there a reference to the mothodology behind the getYseq() and getGradETinv() functions I could read? Trying to check if the following code chunk is correct:

TruncExpFam/R/beta.R

Lines 77 to 95 in 4ec917e

getYseq.trunc_beta <- function(y, y.min = 0, y.max = 1, n = 100) {
# needs chekking
mean <- mean(y, na.rm = TRUE)
sd <- var(y, na.rm = TRUE)^0.5
lo <- max(y.min, mean - 5 * sd)
hi <- min(y.max, mean + 5 * sd)
out <- seq(lo, hi, length = n)
class(out) <- class(y)
return(out)
}
getGradETinv.trunc_beta <- function(eta) {
# eta: Natural parameter
# return the inverse of E.T differentiated with respect to eta' : p x p matrix
term.1 <- sum(1 / (((1:10000) + eta[1]))^2)
term.2 <- sum(1 / (((1:10000) + eta[2]))^2)
term.12 <- sum(1 / (((1:10000) + eta[1] + eta[2]))^2)
return(A = solve(matrix(c(term.1 - term.12, -term.12, -term.12, term.2 - term.12), ncol = 2)))
}

wleoncio added a commit that referenced this issue May 3, 2022
wleoncio added a commit that referenced this issue May 3, 2022
Most commented out until #85 and #90 are resolved.
wleoncio added a commit that referenced this issue May 10, 2022
This reaches 100% coverage on all files except those related to
the five distributions under investigation (see issues #85 and #90).
wleoncio added a commit that referenced this issue May 10, 2022
@wleoncio
Copy link
Member Author

Hi René,

Great news regarding the beta estimation.

I didn't manage to confirm the previous implementation, so I recoded the whole function following this procedure:

  1. Calculated the dE(T)/d(eta) matrix by hand:
    image

  2. Got the moments from here and used the approximation for the digamma described here

  3. Calculated the derivatives using Wolfram Alpha.

This seems to solve this issue, the last of those critical ML problems we found. The issues list currently contain only one small annoying bug regarding parameter naming (#74), and the rest are new/improved features. Therefore, I think we should release version 1.0.1 with the current fixes before starting work on the next thing, what do you say?

@rho62
Copy link
Contributor

rho62 commented Aug 23, 2022 via email

@wleoncio
Copy link
Member Author

Sure thing, we can go over this on Friday.

In the meantime, you can install the latest development version with remotes::install_github("ocbe-uio/TruncExpFam") if you wish to take it for a spin.

W

@rho62
Copy link
Contributor

rho62 commented Aug 23, 2022 via email

@rho62
Copy link
Contributor

rho62 commented Oct 11, 2022 via email

wleoncio added a commit that referenced this issue Feb 6, 2023
When y.seq is 0 or 1, it generates infinite densities which make the calculation of T.f on getTminusET() (which is part of mlEstimationTruncDist()) generate NaN values.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working critical This issue should be prioritized
Projects
None yet
Development

No branches or pull requests

2 participants