Optimizing seems to get stuck at boundaries, dependent on seed #1928
I'm using pystan version 2.9.0. I have a model based on a truncated incomplete gamma distribution.
When I run optimizing(), the results depend heavily on which seed I choose, and this seems to be related to the boundary values. For one seed (121010), the optimization terminates after 2 iterations with all four parameters at their boundaries. At first I got "Convergence detected: gradient norm is below tolerance", then when I set
When I set the algorithm to "Newton", it works (it doesn't seem to hit the boundaries). With "BFGS", the outcome is the same as with LBFGS.
With a seed of 1234 (and none of those options specified), the result comes out nicely.
I've verified that the lp__ values are correct at several of the iterations (up to an additive constant).
The LBFGS method seems to work every time when using the function from scipy.
I can list my stan code:
I am using data with ~2e5 variates:
from a distribution with Expected Values:
I set the bounds as
and V is 45794539.9853329.
I was previously setting the bounds to be wider on each parameter, but contracting the support has not solved the problem.
I'm hoping I've provided enough information. Unfortunately I haven't tested if this happens for simpler problems -- so I don't know if it is problem-specific, or a more general problem within Stan. Since it doesn't always happen for my own problem, I would be hesitant to try to randomly find it in a simpler problem.
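To make the seed dependence described above easier to reason about, here is a minimal sketch of how a seed could determine the starting point of a bounded parameter. It assumes Stan's documented defaults: random initial values drawn uniformly on (-2, 2) on the unconstrained scale, and a scaled inverse-logit map onto (lower, upper). The bounds are hypothetical placeholders (the actual bounds are not shown in this thread), and Python's random module is not Stan's generator, so the printed values will not match what Stan draws for the same seeds.

```python
import math
import random


def inv_logit(y):
    """Logistic function, mapping the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y))


def constrain(y, lower, upper):
    """Map an unconstrained value y to the interval (lower, upper)."""
    return lower + (upper - lower) * inv_logit(y)


def random_inits(seed, bounds, radius=2.0):
    """Draw one unconstrained start per parameter, uniform on
    (-radius, radius), and return the constrained starting values."""
    rng = random.Random(seed)
    return [constrain(rng.uniform(-radius, radius), lo, hi)
            for lo, hi in bounds]


# Hypothetical bounds for four parameters (assumption, not from the issue).
bounds = [(0.0, 1.0), (0.0, 10.0), (-5.0, 5.0), (1.0, 100.0)]

for seed in (121010, 1234):
    print(seed, [round(v, 3) for v in random_inits(seed, bounds)])
```

Each seed gives a different, reproducible starting point inside the bounds, so a seed-dependent failure may simply mean that some starting points lead the optimizer along a path toward the boundary while others do not.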
Thanks for reporting. It doesn't look like anyone has responded. Did the comparison with the other L-BFGS implementation use the same transformations implied by the constrained parameters?
I marked this a feature and a bug, at least until someone does some more investigating.
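For reference, the transformation asked about above can be sketched as follows. This assumes the scaled inverse-logit transform that Stan documents for parameters with lower and upper bounds; the toy objective with a constant slope in the constrained parameter is purely illustrative and is not the model from this issue. The sketch shows one plausible mechanism, not a confirmed diagnosis: by the chain rule, the gradient on the unconstrained scale is multiplied by a factor that shrinks toward zero near either bound, so a gradient-norm tolerance could be satisfied at a boundary even when the objective is not flat there.

```python
import math


def inv_logit(y):
    """Logistic function, mapping the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y))


def constrain(y, lower, upper):
    """Map an unconstrained value y to the interval (lower, upper)."""
    return lower + (upper - lower) * inv_logit(y)


def dconstrain_dy(y, lower, upper):
    """Derivative of the constrained value with respect to y."""
    s = inv_logit(y)
    return (upper - lower) * s * (1.0 - s)


def unconstrained_grad(y, lower, upper, grad_x):
    """Chain rule: df/dy = (df/dx at x = constrain(y)) * dx/dy."""
    x = constrain(y, lower, upper)
    return grad_x(x) * dconstrain_dy(y, lower, upper)


def toy_grad_x(x):
    """Toy objective f(x) = x: constant slope 1 on the constrained scale."""
    return 1.0


# As y grows, x approaches the upper bound and df/dy shrinks toward zero.
for y in (0.0, 5.0, 10.0, 20.0):
    x = constrain(y, 0.0, 1.0)
    g = unconstrained_grad(y, 0.0, 1.0, toy_grad_x)
    print(f"y={y:5.1f}  x={x:.12f}  df/dy={g:.3e}")
```

A bounded optimizer that works directly on the constrained scale, with no transformation, would still see a slope of 1 at the bound in this toy case, which is one reason the two L-BFGS implementations may behave differently.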
For the other L-BFGS implementation, I believe no transformation is performed. It seems to first evaluate the p0 you give it, then explicitly evaluate the edges, and then work its way in (sorry, I'm not too well acquainted with the exact procedure).
I would love to be more helpful with this (constructing a more minimal working example and doing a proper comparison), but I don't really have the time at the moment, as is frustratingly often the case. I'm more than happy to trial things for you on my end, though.
You can try removing the bounds from the Stan program and
It'd be interesting to see if the Stan model would be