Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Does `math.invlogit` actually use the `eps` argument (and does this mean `t_stick_breaking` actually cannot be different from default `StickBreaking`)? #3001

Open
spearsem opened this issue Jun 3, 2018 · 11 comments

Comments

@spearsem
Copy link

commented Jun 3, 2018

In the code from the math submodule for the function invlogit (linked here), the function has a keyword argument eps, but the function body does not use it:

def invlogit(x, eps=sys.float_info.epsilon):
    """The inverse of the logit function, 1 / (1 + exp(-x))."""
    return tt.nnet.sigmoid(x)

This variable eps is not in a lexical scope such that internally to tt.nnet.sigmoid it would find the local variable eps, and there don't appear to be any other metaprogramming facilities at play here that could somehow provide this eps context to the call to tt.nnet.sigmoid.

As a result, I conjecture that eps is never used for invlogit.

This matters as a side question because I am looking into the transform t_stick_breaking for the so-called "numerically stable" version of stick breaking for sampling from a Dirichlet distribution. But from the source code for t_stick_breaking, it looks like it only results in a change to eps from the regular StickBreaking implementation. And then inside the class definition of StickBreaking, eps is only ever used twice, as the keyword argument to invlogit (here and here).

Meaning that I believe t_stick_breaking cannot ever actually be different from StickBreaking with its default eps value.

I don't know of a way to set up a specific sampling experiment that would definitely demonstrate if I am wrong by showing t_stick_breaking definitely sampling differently due specifically to a different floating point precision threshold. But if anyone can point me in the right direction, I am happy to give that a try.

Thanks!

@junpenglao

This comment has been minimized.

Copy link
Member

commented Jun 3, 2018

I think invlogit ought to have use eps, for example see implementation here:

def invlogit(x, eps=sys.float_info.epsilon):
return (1. - 2. * eps) / (1. + np.exp(-x)) + eps

Would you like to send a PR to fix this?

As for the stick_breaking one, cc @AustinRochford

@AustinRochford

This comment has been minimized.

Copy link
Member

commented Jun 3, 2018

Agree about invlogit. For details on the stick breaking transform, check out the Stan manual. Nothing obvious comes to mind as a sampling experiment, but I'll give it some thought.

@spearsem

This comment has been minimized.

Copy link
Author

commented Jun 3, 2018

@junpenglao I am happy to work on a PR for this. For the analogous implementation of invlogit, will it be a straight translation of the test_distributions.py example you shared, just adjusted to use Theano operations? Or is there any other pitfall I need to look out for.

(I assume it is OK to move away from using tt.nnet.sigmoid in order to get this feature of using eps-- but if that decision needs to involve others, let me know.)

@junpenglao

This comment has been minimized.

Copy link
Member

commented Jun 3, 2018

Yes please go ahead with porting the test_distributions.py version into theano - If there is any problem the test should catch it.

@spearsem

This comment has been minimized.

Copy link
Author

commented Jun 3, 2018

@junpenglao Thanks for following up. I am currently dealing with a few issues with Dockerfile for pymc3 from the scripts/ folder (it is giving me errors with installing libav-tools). Once I figure them out, I should have a PR very shortly.

@AustinRochford

This comment has been minimized.

Copy link
Member

commented Jun 4, 2018

@spearsem I'll see if I can reproduce and help you out. Feel free to post the Docker problem as a separate issue and @ me

@AustinRochford

This comment has been minimized.

Copy link
Member

commented Jun 4, 2018

See #3003

@junpenglao

This comment has been minimized.

Copy link
Member

commented Jun 5, 2018

@spearsem the invlogit current implementation is causing another error so I am reimplementing the eps version in #3006 - thanks again for reporting the problem ;-)

@spearsem

This comment has been minimized.

Copy link
Author

commented Jun 5, 2018

@junpenglao I finally got my local environment working for running the pytest suite. I am seeing significant test errors in two specific test file:

  • pymc3/tests/sampler_fixtures.py
  • pymc3/tests/backend_fixtures.py

It amounts to 23 test failures overall, though none of them seem to involve any use of invlogit (which is my only source code change).

Do you have any hints for someone new to this part of the test code, for why the "fixture" modules could have a lot of errors.

One note is that I needed to provide an environment variable for $FLOATX (I am testing with this set to 'float32' currently) in order to get the tests to work from the test.sh script. I am not sure if this has anything to do with it.

My Python environment is using Python 3.6.2 as well (everything else is setup using conda and pip exactly as in the script create_testenv.sh).

I'll continue looking into it, but if there are some fixtures I need to build or something, let me know. (And apologies if this is a naive question!)

@spearsem

This comment has been minimized.

Copy link
Author

commented Jun 5, 2018

@junpenglao Ah bummer. I just saw your comment about #3006 -- sorry I wasn't faster on this PR.

If the feedback is worth anything, I am proficient with Docker and pytest from many Python projects, but still found it very difficult to get tests running for pymc3, with some undocumented edge cases like the FLOATX thing and on-going problems with the Dockerfile not correctly COPYing the pymc3 folder into the container.

Maybe I can work on cleaning up some of those items and updating documentation for instructions and shell commands that make the tests run.

Either way, thanks for your help and for implementing that fix to invlogit.

@junpenglao

This comment has been minimized.

Copy link
Member

commented Jun 5, 2018

Thanks for reporting back. You are right there are some edge cases in the test suit (eg, the domain of the distribution were not covered completely, discovered by @gBokiau). The fixture are not directly runable I think. For example sampler_fixtures.py is used in test_posteriors.py.

FYI, to reproduce the test on the travis enviorment, you can run eg:
THEANO_FLAGS="floatX='float32',gcc.cxxflags='-march=core2'" pytest -xv --cov=pymc3 pymc3/tests/test_distributions_random.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.