New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add check that initial_state
allows full exploration of parameter space
#306
Comments
The test is that the determinant of the variance tensor of the walker positions (minus the mean position) is nonzero. It's easy to implement. |
ps. It will be violated when live_dangerously=True |
This is what I am using for this check before I send stuck = [p for p in range(dim) if len(np.unique(p0[:, p])) == 1]
assert not stuck, f"Direction(s) {stuck} can not evolve." |
I don't think that check is quite right. Even if there are non-unique entries, you still might be fine. |
Maybe I made a mistake! My logic was that if To frame it the same way as your earlier comment, I should have written stuck = (np.std(p0, axis=0) == 0)
assert not stuck.any(), f"Direction(s) {np.arange(dim, dtype=int)[stuck]} can not evolve." |
Okay got it! I did mis-read the code. But still, I think that the right check is something like
or maybe something more robust. |
@Alexis-Prel that is actually roughly what I had in mind before @davidwhogg made his suggestion. But after thinking about it, I don't think that is enough. Consider in a 2d space This passes your test, but using the stretch move, those walkers can never explore anything other than the |
Maybe this is too expensive, but don't you want to take the SVD of (all walkers - walker 0) and make sure all the eigenvalues are significantly non-zero? You want to make sure they span the eigenspace (no nullspace). |
I think @Christopher-Bradshaw's example does not pass my test (they lie on a line, so the determinant of the variance tensor will be zero). And my test is equivalent to @dstndstn's test. The determinant is the product of the eigenvalues. But @dstndstn's test is perhaps stronger in some sense. One challenge is to define "significantly" non-zero. |
Oh yeah, I see now, you're right :) For "significantly non-zero", you
probably want something like comparing the square root of the determinant
of the variance (which is something like a hypervolume) to, like, the
product of the ranges in each dimension (an axis-aligned hypervolume).
…On Sat, Sep 14, 2019 at 7:54 AM David W. Hogg ***@***.***> wrote:
I think @Christopher-Bradshaw <https://github.com/Christopher-Bradshaw>'s
example does not pass my test (they lie on a line, so the determinant of
the variance tensor will be zero). And my test is equivalent to @dstndstn
<https://github.com/dstndstn>'s test. The determinant is the product of
the eigenvalues. But @dstndstn <https://github.com/dstndstn>'s test is
perhaps stronger in some sense. One challenge is to define "significantly"
non-zero.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#306?email_source=notifications&email_token=AAIEH7I24VXDKSAKAANQKPDQJTGFVA5CNFSM4IUVHD32YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD6W2KHA#issuecomment-531473692>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAIEH7MMXPGJEB63D6YQG7DQJTGFVANCNFSM4IUVHD3Q>
.
|
I understand my mistake now! Another thing: This is perhaps not the simplest thing to put in emcee 3's move class hierarchy. A move-agnostic idea could be to issue a warning rather than an error, before each iteration if the current positions do not pass the check. |
If you're using the SVD, there is a natural notion of "independent enough" - the condition number of the matrix, that is, the ratio of largest to smallest singular values, can be compared to the numerical accuracy and the size of the matrix. As far as computation goes, a compact SVD is supposed to take several times as long as a determinant, but I don't think emcee works well with large enough dimensionality for this cost to be important (as it's a startup cost). If you wanted to be ultra-cautious you could do this after every set of proposed jumps and throw out proposals that lose dimensionality, but I'm not sure what this would do to your MCMC statistics (and then you might care about the speed of your test). |
I don't think condition number is quite sufficient. Because you can have a bad condition number because your ball is not a good shape, or you can have a bad condition number because your different parameters are measured in very different units (imagine one velocity is in cm/s and another is in km/s). In this latter case, emcee should be fine, but the condition number will be terrible. |
You're quite right, and I have seen this cause problems with fitting code. This particular problem can be mollified by normalizing all the rows of the matrix to have l2-norm of 1. But I'm not sure that covers all possible cases where linear algebra falls down but emcee's affine operations are fine. |
I've updated the wording of the warning to be softer, but I'm going to suggest that we close this issue for now and call it a day. If it gets too annoying in practice, let's revisit! |
When running the ensemble sampler, I occasionally accidentally initialize all walkers to have the same initial position in some of the dimensions. This is entirely my fault! But, it would be easy (I think) to add a check that the initial state allows the sampler to explore all of parameter space, rather than have me find out my mistake after running my chains for a while.
If you think this would be a useful thing to have, I'm happy to make the change and submit a PR, but wanted to check it was something you'd be interested in/I haven't missed some reason why you might not want this.
Thanks!
The text was updated successfully, but these errors were encountered: