Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add check that initial_state allows full exploration of parameter space #306

Closed
Christopher-Bradshaw opened this issue Sep 8, 2019 · 15 comments

Comments

@Christopher-Bradshaw
Copy link
Contributor

When running the ensemble sampler, I occasionally accidentally initialize all walkers to have the same initial position in some of the dimensions. This is entirely my fault! But, it would be easy (I think) to add a check that the initial state allows the sampler to explore all of parameter space, rather than have me find out my mistake after running my chains for a while.

If you think this would be a useful thing to have, I'm happy to make the change and submit a PR, but wanted to check it was something you'd be interested in/I haven't missed some reason why you might not want this.

Thanks!

@davidwhogg
Copy link
Collaborator

The test is that the determinant of the variance tensor of the walker positions (minus the mean position) is nonzero. It's easy to implement.

@davidwhogg
Copy link
Collaborator

ps. It will be violated when live_dangerously=True

@Alexis-Prel
Copy link

Alexis-Prel commented Sep 13, 2019

This is what I am using for this check before I send p0 to emcee.

stuck = [p for p in range(dim) if len(np.unique(p0[:, p])) == 1]
assert not stuck, f"Direction(s) {stuck} can not evolve."

@davidwhogg
Copy link
Collaborator

I don't think that check is quite right. Even if there are non-unique entries, you still might be fine.

@Alexis-Prel
Copy link

Alexis-Prel commented Sep 13, 2019

Maybe I made a mistake!
You seem to think the check is passed if elements are all different. Actually, the check is passed if elements are not all identical.

My logic was that if np.unique(p0[:, p]) has only one element u, then all stretch moves will return the same position because the proposal is q = c[rint] - (c[rint] - s) * zz[:, None] and then q[p] = u - (u - u) * zz[p] = u.

To frame it the same way as your earlier comment, I should have written

stuck = (np.std(p0, axis=0) == 0)
assert not stuck.any(), f"Direction(s) {np.arange(dim, dtype=int)[stuck]} can not evolve."

@davidwhogg
Copy link
Collaborator

Okay got it! I did mis-read the code. But still, I think that the right check is something like

  dp0 = p0 - np.mean(p0, axis=0)[None, :]
  var = (1. / nwalkers) * np.sum(dp[:, None, :] * dp[:, :, None], axis=0)
  sgn, logdet = np.linalg.slogdet(var)
  assert logdet > 0.

or maybe something more robust.

@Christopher-Bradshaw
Copy link
Contributor Author

@Alexis-Prel that is actually roughly what I had in mind before @davidwhogg made his suggestion. But after thinking about it, I don't think that is enough. Consider in a 2d space p0 = [(1, 1), (2, 2), (3, 3), ... (n, n)].

This passes your test, but using the stretch move, those walkers can never explore anything other than the x = y line. Let me know if that makes sense or if I have missed something!

@dstndstn
Copy link
Collaborator

Maybe this is too expensive, but don't you want to take the SVD of (all walkers - walker 0) and make sure all the eigenvalues are significantly non-zero? You want to make sure they span the eigenspace (no nullspace).

@davidwhogg
Copy link
Collaborator

I think @Christopher-Bradshaw's example does not pass my test (they lie on a line, so the determinant of the variance tensor will be zero). And my test is equivalent to @dstndstn's test. The determinant is the product of the eigenvalues. But @dstndstn's test is perhaps stronger in some sense. One challenge is to define "significantly" non-zero.

@dstndstn
Copy link
Collaborator

dstndstn commented Sep 14, 2019 via email

@Alexis-Prel
Copy link

Alexis-Prel commented Sep 14, 2019

I understand my mistake now!

Another thing: This is perhaps not the simplest thing to put in emcee 3's move class hierarchy.
The criteria for acceptable starting positions may vary on the type of move(s) allowed. And if there are several of them, some of which possibly user-defined, I am not sure how the test could factor that in.

A move-agnostic idea could be to issue a warning rather than an error, before each iteration if the current positions do not pass the check.

@aarchiba
Copy link
Contributor

aarchiba commented Oct 5, 2019

Oh yeah, I see now, you're right :) For "significantly non-zero", you probably want something like comparing the square root of the determinant of the variance (which is something like a hypervolume) to, like, the product of the ranges in each dimension (an axis-aligned hypervolume).

If you're using the SVD, there is a natural notion of "independent enough" - the condition number of the matrix, that is, the ratio of largest to smallest singular values, can be compared to the numerical accuracy and the size of the matrix.

As far as computation goes, a compact SVD is supposed to take several times as long as a determinant, but I don't think emcee works well with large enough dimensionality for this cost to be important (as it's a startup cost). If you wanted to be ultra-cautious you could do this after every set of proposed jumps and throw out proposals that lose dimensionality, but I'm not sure what this would do to your MCMC statistics (and then you might care about the speed of your test).

@davidwhogg
Copy link
Collaborator

I don't think condition number is quite sufficient. Because you can have a bad condition number because your ball is not a good shape, or you can have a bad condition number because your different parameters are measured in very different units (imagine one velocity is in cm/s and another is in km/s). In this latter case, emcee should be fine, but the condition number will be terrible.

@aarchiba
Copy link
Contributor

I don't think condition number is quite sufficient. Because you can have a bad condition number because your ball is not a good shape, or you can have a bad condition number because your different parameters are measured in very different units (imagine one velocity is in cm/s and another is in km/s). In this latter case, emcee should be fine, but the condition number will be terrible.

You're quite right, and I have seen this cause problems with fitting code. This particular problem can be mollified by normalizing all the rows of the matrix to have l2-norm of 1. But I'm not sure that covers all possible cases where linear algebra falls down but emcee's affine operations are fine.

@dfm
Copy link
Owner

dfm commented Oct 28, 2019

I've updated the wording of the warning to be softer, but I'm going to suggest that we close this issue for now and call it a day. If it gets too annoying in practice, let's revisit!

@dfm dfm closed this as completed Oct 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants