
Make slice sampler sample from 1D conditionals as it should #2446

Merged: 5 commits into pymc-devs:master on Jul 28, 2017

Conversation

@madanh (Contributor) commented Jul 26, 2017

The previous implementation sampled non-scalar variables (size > 1) jointly. Judging from the code, this was not intended. When sampling high-dimensional variables it would hang, because the probability of drawing a joint sample that lands inside the slice, when sampling uniformly from an enclosing hypercube, becomes vanishingly small.

This PR fixes that by sampling from the 1D conditionals of multidimensional variables, one component at a time.

Admittedly not ideal, since only the last sample is returned while all the intermediate samples are also valid (effectively performing thinning), but at least it does not hang.

Done while discussing #2189
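For context, the per-component scheme described above can be sketched as follows. This is a hypothetical, self-contained illustration of univariate slice sampling (stepping-out plus shrinkage) applied coordinate by coordinate, not the actual pymc3 implementation; all names are made up for the sketch:

```python
import numpy as np

def slice_sample_1d(logp, x0, w=1.0, rng=None):
    """One univariate slice-sampling update (stepping-out + shrinkage)."""
    rng = np.random.default_rng() if rng is None else rng
    y = logp(x0) + np.log(rng.uniform())  # log-height of the slice
    # Place an interval of width w around x0, then step out until
    # both ends are outside the slice.
    left = x0 - w * rng.uniform()
    right = left + w
    while logp(left) > y:
        left -= w
    while logp(right) > y:
        right += w
    # Draw uniformly from the interval, shrinking it on rejection.
    while True:
        x1 = rng.uniform(left, right)
        if logp(x1) > y:
            return x1
        if x1 < x0:
            left = x1
        else:
            right = x1

def slice_sample_nd(logp, q0, w=1.0, rng=None):
    """One sweep: update each coordinate from its 1D conditional."""
    rng = np.random.default_rng() if rng is None else rng
    q = np.array(q0, dtype=float)
    for i in range(len(q)):
        def conditional(xi, i=i):  # freeze all coordinates except i
            qi = q.copy()
            qi[i] = xi
            return logp(qi)
        q[i] = slice_sample_1d(conditional, q[i], w, rng)
    return q
```

Updating one coordinate at a time keeps every proposal a 1D problem, so the shrinkage loop terminates quickly regardless of the dimensionality of the variable; drawing from the whole enclosing hypercube jointly is what made the old implementation hang.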

In the previous implementation it would sample jointly from non-scalar variables, and hang when the size is high (due to the low probability of getting a joint sample within the slice in high dimensions).
Fix broken indentation due to copy-paste
@junpenglao (Member)

LGTM. @madanh, could you please format it according to PEP8?

@junpenglao (Member)

It would be ideal to also add a test for sampling from a high-dimensional variable.

@ColCarroll (Member)

I'd also guess you'll need to update test_step.py, particularly the samples from TestStepMethods.master_samples. Those make sure the first 100 samples from various step methods do not change. I'll try to do some testing locally to make sure the sampler isn't obviously biased.

@madanh (Contributor, Author) commented Jul 26, 2017

@junpenglao
I'm an absolute zero with respect to automated testing. Could you give me a starting point (i.e., an example of a simple test in the pymc3 code)?

@junpenglao (Member) commented Jul 26, 2017

The first step is to check whether the tests pass locally.
@ColCarroll If I am not mistaken, we only have a test for the Slice sampler here? It is a fixture test, so I am not sure what the best way to edit it is.
[edit] Just saw your comment; yes, in test_step.py as well.

@ColCarroll (Member) left a comment

This looks good! Thanks for the helpful comments.

@@ -34,8 +34,8 @@ def __init__(self, vars=None, w=1., tune=True, model=None, **kwargs):
self.model = modelcontext(model)
self.w = w
self.tune = tune
self.w_sum = 0
self.n_tunes = 0
# self.w_sum = 0 #probably we don't need it now
Member:

I'd rather you just delete this (and line 48); tests should fail if they are needed.

q = np.copy(q0) # TODO: find out if we need this
ql = np.copy(q0) # l for left boundary
qr = np.copy(q0) # r for right boundary
for i in range(len(q0)):
Member:

Feel free to ignore, but this loop might be cleaner using

for i, (point, width) in enumerate(zip(q, self.w)):

# breaking markovianness. Can we do it regardless of self.tune?(@madanh)
self.w[i] = self.w[i] * (self.n_tunes / (self.n_tunes + 1)) +\
(qr[i] - ql[i]) / (self.n_tunes + 1) # same as before
# unobvious and important: return qr and ql to the same point
Member:

Thanks for this comment.
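As an aside, the tuning line quoted above maintains `self.w[i]` as an incremental running mean of the observed slice widths. A tiny standalone check (with hypothetical width values) makes that explicit:

```python
# The PR's update  w = w * (n / (n + 1)) + (qr - ql) / (n + 1)
# keeps w equal to the mean of all slice widths observed so far.
widths = [1.4, 0.6, 2.2, 0.9]  # hypothetical observed widths (qr - ql)

w, n = 0.0, 0
for d in widths:
    w = w * (n / (n + 1)) + d / (n + 1)  # same form as in slicer.py
    n += 1  # n plays the role of self.n_tunes

assert abs(w - sum(widths) / len(widths)) < 1e-12
```

This is why the counter must only advance while tuning: incrementing `n_tunes` without folding in a new width would silently down-weight future observations.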

(qr[i] - ql[i]) / (self.n_tunes + 1) # same as before
# unobvious and important: return qr and ql to the same point
qr[i] = q[i]
ql[i] = q[i]
if self.tune:
Member:

Can remove this second if self.tune, right?

Contributor Author:

In principle, yes. Might also leave it there for clarity. I'm sure that performance effects are atomic-scale.

Member:

My objection is that it makes reading it a little funny; I had to go back and see what the condition above had been.

Contributor Author:

Yes, but if it's not there then we will increment the tune counter even when not tuning, which is logically wrong and could distract somebody reading the code even more!

Member:

Ah shoot! You're absolutely right. I was looking at the diff, and it made it look like you were doing

if self.tune:
    some_logic
if self.tune: 
    self.n_tunes += 1

Carry on!

@junpenglao (Member)

Tested locally with the simple model below; yes, the samples are much more accurate.

with pm.Model():
    mu = pm.Normal('mu', 0., 10., shape=(100,))
    trace = pm.sample(step=pm.Slice(mu))

@ColCarroll (Member)

A good place to start on local testing is to just run the test_sample_exact, probably using

py.test -vsx tests/test_step.py::TestStepMethods::test_sample_exact

The three flags (v, s, x) tell pytest to give verbose output, print standard output, and quit on the first failure, respectively.

Once that passes, try running the whole file with

py.test -vx tests/test_step.py

(note I dropped the -s, since it tends to make things noisy). Finally you might

py.test -vx tests/

to run the whole test suite (which takes tens of minutes).

@madanh (Contributor, Author) commented Jul 26, 2017

@ColCarroll

py.test -vsx tests/test_step.py::TestStepMethods::test_sample_exact fails with


(theanobleeding)madanh@madanh ~/.virtualenvs/theanobleeding/lib/python3.4/site-packages/pymc3 $ py.test -vsx tests/test_step.py::TestStepMethods::test_sample_exact
======================================================================================================= test session starts ========================================================================================================
platform linux -- Python 3.4.3, pytest-3.1.3, py-1.4.34, pluggy-0.4.0 -- /home/madanh/.virtualenvs/theanobleeding/bin/python3
cachedir: .cache
rootdir: /home/madanh/.virtualenvs/theanobleeding/lib/python3.4/site-packages/pymc3, inifile:
collected 1 item 

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 3856.37it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 7535.58it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 2697.30it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 4217.33it/s]
FAILED

============================================================================================================= FAILURES =============================================================================================================
________________________________________________________________________________________________ TestStepMethods.test_sample_exact _________________________________________________________________________________________________

self = <pymc3.tests.test_step.TestStepMethods object at 0x7f01a223b208>

    @pytest.mark.xfail(condition=(theano.config.floatX == "float32"), reason="Fails on float32")
    def test_sample_exact(self):
        for step_method in self.master_samples:
>           self.check_trace(step_method)

tests/test_step.py:144: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pymc3.tests.test_step.TestStepMethods object at 0x7f01a223b208>, step_method = <class 'pymc3.step_methods.slicer.Slice'>

    def check_trace(self, step_method):
        """Tests whether the trace for step methods is exactly the same as on master.
    
            Code changes that effect how random numbers are drawn may change this, and require
            `master_samples` to be updated, but such changes should be noted and justified in the
            commit.
    
            This method may also be used to benchmark step methods across commits, by running, for
            example
    
            ```
            BENCHMARK=100000 ./scripts/test.sh -s pymc3/tests/test_step.py:TestStepMethods
            ```
    
            on multiple commits.
            """
        n_steps = 100
        with Model():
            x = Normal('x', mu=0, sd=1)
            if step_method.__name__ == 'SMC':
                Deterministic('like', - 0.5 * tt.log(2 * np.pi) - 0.5 * x.T.dot(x))
                trace = smc.ATMIP_sample(n_steps=n_steps, step=step_method(random_seed=1),
                                         n_jobs=1, progressbar=False,
                                         homepath=self.temp_dir)
            else:
                trace = sample(0, tune=n_steps,
                               discard_tuned_samples=False,
                               step=step_method(), random_seed=1)
    
        assert_array_almost_equal(
            trace.get_values('x'),
            self.master_samples[step_method],
>           decimal=select_by_precision(float64=6, float32=4))
E       AssertionError: 
E       Arrays are not almost equal to 6 decimals
E       
E       (mismatch 100.0%)
E        x: array([ -5.952524e-01,  -1.818949e-01,  -4.982115e-01,  -1.022628e-01,
E               -4.267260e-01,   1.754469e+00,  -1.300225e+00,   8.356580e-01,
E                8.958796e-01,  -8.852145e-01,  -6.635309e-01,  -8.393031e-01,...
E        y: array([ -8.130874e-01,  -3.089219e-01,  -6.793771e-01,   6.508126e-01,
E               -7.635776e-01,  -8.131998e-01,  -1.638235e+00,  -7.038637e-02,
E                2.051078e+00,   1.685982e+00,   6.924637e-01,  -7.751208e-01,...

tests/test_step.py:178: AssertionError
========================================================================================================= warnings summary =========================================================================================================
tests/test_step.py::TestStepMethods::()::test_sample_exact
  /home/madanh/.virtualenvs/theanobleeding/lib/python3.4/site-packages/pymc3/step_methods/hmc/nuts.py:418: UserWarning: Chain 0 contains only 0 samples.
    % (self._chain_id, n))
  /home/madanh/.virtualenvs/theanobleeding/lib/python3.4/site-packages/pymc3/step_methods/hmc/nuts.py:420: UserWarning: Step size tuning was enabled throughout the whole trace. You might want to specify the number of tuning steps.
    warnings.warn('Step size tuning was enabled throughout the whole '
  /home/madanh/.virtualenvs/theanobleeding/lib/python3.4/site-packages/numpy/core/fromnumeric.py:2909: RuntimeWarning: Mean of empty slice.
    out=out, **kwargs)

-- Docs: http://doc.pytest.org/en/latest/warnings.html
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
============================================================================================== 1 failed, 3 warnings in 16.04 seconds ===============================================================================================

But it might be that I'm not testing the exact PR I'm sending, because I didn't clone the repo locally and am testing whatever I had when I added the change. Anyway, I'm off to investigate.

@ColCarroll (Member)

Yes, that's an expected failure. We have those tests because rigorous statistical analysis is not really in line with automated testing, so instead they flag if the sampling output changes at all (this is called a "regression test", which is a confusing term in the machine learning world :) ).

You should be able to change the first key/value pair in test_step.TestStepMethods.master_samples to

Slice: np.array([                                       
    -5.95252353e-01,  -1.81894861e-01,  -4.98211488e-01,
    -1.02262800e-01,  -4.26726030e-01,   1.75446860e+00,
    -1.30022548e+00,   8.35658004e-01,   8.95879638e-01,
    -8.85214481e-01,  -6.63530918e-01,  -8.39303080e-01,
     9.42792225e-01,   9.03554344e-01,   8.45254684e-01,
    -1.43299803e+00,   9.04897201e-01,  -1.74303131e-01,
    -6.38611581e-01,   1.50013968e+00,   1.06864438e+00,
    -4.80484421e-01,  -7.52199709e-01,   1.95067495e+00,
    -3.67960104e+00,   2.49291588e+00,  -2.11039152e+00,
     1.61674758e-01,  -1.59564182e-01,   2.19089873e-01,
     1.88643940e+00,   4.04098154e-01,  -4.59352326e-01,
    -9.06370675e-01,   5.42817654e-01,   6.99040611e-03,
     1.66396391e-01,  -4.74549281e-01,   8.19064437e-02,
     1.69689952e+00,  -1.62667304e+00,   1.61295808e+00,
     1.30099144e+00,  -5.46722750e-01,  -7.87745494e-01,
     7.91027521e-01,  -2.35706976e-02,   1.68824376e+00,
     7.10566880e-01,  -7.23551374e-01,   8.85613069e-01,
    -1.27300146e+00,   1.80274430e+00,   9.34266276e-01,
     2.40427061e+00,  -1.85132552e-01,   4.47234196e-01,
    -9.81894859e-01,  -2.83399706e-01,   1.84717533e+00,
    -1.58593284e+00,   3.18027270e-02,   1.40566006e+00,
    -9.45758714e-01,   1.18813188e-01,  -1.19938604e+00,
    -8.26038466e-01,   5.03469984e-01,  -4.72742758e-01,
     2.27820946e-01,  -1.02608915e-03,  -6.02507158e-01,
     7.72739682e-01,   7.16064505e-01,  -1.63693490e+00,
    -3.97161966e-01,   1.17147944e+00,  -2.87796982e+00,
    -1.59533297e+00,   6.73096114e-01,  -3.34397247e-01,
     1.22357427e-01,  -4.57299104e-02,   1.32005771e+00,
    -1.29910645e+00,   8.16168850e-01,  -1.47357594e+00,
     1.34688446e+00,   1.06377551e+00,   4.34296696e-02,
     8.23143354e-01,   8.40906324e-01,   1.88596864e+00,
     5.77120694e-01,   2.71732927e-01,  -1.36217979e+00,
     2.41488213e+00,   4.68298379e-01,   4.86342250e-01,
    -8.43949966e-01]),                                  
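The fixture pattern described above can be sketched generically. The names below (draws_from_sampler, and this toy master_samples) are hypothetical stand-ins for a real step method and pymc3's test fixture:

```python
import numpy as np

def draws_from_sampler(seed, n):
    # Stand-in for "run a step method for n steps with a fixed seed".
    rng = np.random.default_rng(seed)
    return rng.standard_normal(n)

# Reference draws frozen from a known-good run ("master").
master_samples = draws_from_sampler(seed=1, n=5)

def test_sample_exact():
    # Any change to how random numbers are consumed breaks this test,
    # which is the point: it flags behavioural changes in the sampler.
    np.testing.assert_array_almost_equal(
        draws_from_sampler(seed=1, n=5), master_samples, decimal=6)
```

When a change intentionally alters the draws, as this PR does, the frozen array is regenerated and committed alongside the code change.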

@madanh (Contributor, Author) commented Jul 27, 2017

@ColCarroll OK, Travis also fails on this test. I was thinking: if it's an expected failure, should we remove it, or should I raise an issue?
Also, since all other tests on Travis pass, I think we are good to merge.

@madanh (Contributor, Author) commented Jul 27, 2017

@ColCarroll Oh, sorry, I did not get until now that I have to change the "master sample" in the test itself! Will look into it.

@madanh (Contributor, Author) commented Jul 28, 2017

@junpenglao @ColCarroll Important update: I will have no spare time in the next few days, so if the minimum requirements are satisfied, please consider merging, as I'm not going to be able to add tests or anything else.

@junpenglao (Member)

@madanh No problem. Thank you for the contribution!

@junpenglao junpenglao merged commit b988ba9 into pymc-devs:master Jul 28, 2017
@ColCarroll (Member)

Thanks!

@madanh (Contributor, Author) commented Jul 29, 2017

Thanks for the package! Cheers :)

denadai2 pushed a commit to denadai2/pymc3 that referenced this pull request on Aug 9, 2017 (…s#2446)

* Make Slice sampler sample from 1D conditionals 

In the previous implementation it would sample jointly from non-scalar variables, and hang when the size is high (due to the low probability of getting a joint sample within the slice in high dimensions).

* slicer.py

Fix broken indentation due to copy-paste

* Apply autopep8

* Delete a superfluous commented line

* Update the master sample for Slice in test_step.py