Sampling Output checks and Normality tests to check for bugs #1618
Conversation
@@ -39,26 +41,26 @@ class TestStepMethods(object): # yield test doesn't work subclassing unittest.T
7.04959179e-01, 8.37863464e-01, -5.24200836e-01, 1.28261340e+00, 9.08774240e-01,
8.80566763e-01, 7.82911967e-01, 8.01843432e-01, 7.09251098e-01, 5.73803618e-01]),
HamiltonianMC: np.array([
-0.74925631, -0.2566773 , -2.12480977, 1.64328926, -1.39315913,
Can you remove these from the diff?
assert np.isclose(np.median(trace.beta, 0), beta_true, rtol=0.1).all()
assert np.isclose(np.median(trace.alpha), alpha_true, rtol=0.1)
assert np.isclose(np.median(trace.sigma), sigma_true, rtol=0.1)
np.random.seed(987654321)
Since we inherit from SeededTest, we shouldn't need to set it again here. Sorry for being confusing.
test_normal = stats.kstest(trace.alpha, 'norm', alternative='greater')
test_normal_beta = stats.kstest(trace.beta[0], 'norm', alternative='greater')
test_uniform = stats.kstest(trace.sigma, 'uniform', alternative='greater')
assert np.less(np.median(test_normal[1]), 0.05)
What are the p-values? Could we make the test more stringent, perhaps by increasing the number of samples?
Also, I think https://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.testing.assert_array_less.html is better here.
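As a side note, here is a minimal sketch of the `assert_array_less` pattern being suggested, with a hypothetical normal sample standing in for `trace.alpha`:

```python
import numpy as np
from scipy import stats

# Hypothetical posterior sample standing in for trace.alpha.
sample = np.random.RandomState(0).normal(size=1000)

# One-sided KS test against the uniform distribution; since the sample
# is clearly not uniform(0, 1), the p-value should be tiny.
_, p_uniform = stats.kstest(sample, 'uniform', alternative='greater')

# np.testing.assert_array_less prints both operands on failure, unlike
# a bare `assert np.less(...)`, which makes CI logs easier to read.
np.testing.assert_array_less(p_uniform, 0.05)
```

The assertion reads "p_uniform is less than 0.05", i.e. the test rejects the (wrong) uniform hypothesis.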
assert np.isclose(np.median(trace.alpha), alpha_true, rtol=0.1)
assert np.isclose(np.median(trace.sigma), sigma_true, rtol=0.1)
np.random.seed(987654321)
test_normal = stats.kstest(trace.alpha, 'norm', alternative='greater')
How about `_, p_normal = ...`? Then you can drop the index below.
test_normal_beta = stats.kstest(trace.beta[0], 'norm', alternative='greater')
test_uniform = stats.kstest(trace.sigma, 'uniform', alternative='greater')
assert np.less(np.median(test_normal[1]), 0.05)
assert np.less(np.median(test_normal_beta[1]) / 2, 0.5)
Why `/ 2`? And why `0.5`?
That was to hack it together. I'm fixing this now. Thanks for the feedback.
Force-pushed from b380975 to e1dcb23.
@twiecki Thanks for the feedback. I changed the tests slightly based on your suggestions; they should now be a lot more robust. We may have to relax the conditions slightly, but they passed on my machine.
[Slice([alpha, sigma]), Metropolis([beta])]):
trace = sample(1000, step=step_method, progressbar=False)

assert np.isclose(np.median(trace.beta, 0), beta_true, rtol=0.1).all()
np.testing.assert_array_almost_equal()
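A minimal sketch of that suggestion, with hypothetical numbers standing in for the trace medians and true values:

```python
import numpy as np

# Hypothetical stand-ins for the sampled medians and the true values.
beta_true = np.array([1.0, 0.5])
beta_median = np.array([1.02, 0.48])

# Passes when abs(actual - desired) < 1.5 * 10**(-decimal) elementwise,
# and prints both arrays on failure instead of a bare AssertionError.
np.testing.assert_array_almost_equal(beta_median, beta_true, decimal=1)
```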
_, test_normal = stats.kstest(trace.alpha, 'norm', alternative='greater')
_, test_normal_beta = stats.kstest(trace.beta[0], 'norm', alternative='greater')
_, test_uniform = stats.kstest(trace.sigma, 'uniform', alternative='greater')
np.testing.assert_array_almost_equal(np.median(test_normal), 4.9960036108132044e-15, decimal=2)
What is this testing? Shouldn't those be just p-values?
Hmm maybe something went wrong. Let me find out...
Actually I'm using the wrong variant of the test, https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.kstest.html - I must have gotten mixed up in my experiments.
assert np.isclose(np.median(trace.sigma), sigma_true, rtol=0.1)
_, test_normal = stats.kstest(trace.alpha, 'norm', alternative='greater')
_, test_normal_beta = stats.kstest(trace.beta[0], 'norm', alternative='greater')
_, test_uniform = stats.kstest(trace.sigma, 'uniform', alternative='greater')
Shouldn't this also test for a normal distribution?
sigma = Uniform('sigma', lower=0.0, upper=1.0) - surely I need to test for the uniform distribution? Or am I mistaken?
But you're testing the posterior which should be Normal-ish. Log-normal I guess. Maybe we'll leave that one out here.
Ok fair point :) I'll add in something normalish.
There were some errors caused during my rebase. Fixing now...
Having trouble getting the p-values to work as we intended... seems to be quite tricky.
I'm having trouble diagnosing this error. I'll try to write up something though - even though it's over the holidays and I'm sleep deprived.
Ahh, I think I get it. From the scipy kstest docs, *Testing t distributed random variables against normal distribution*:

With 100 degrees of freedom the t distribution looks close to the normal distribution, and the K-S test does not reject the hypothesis that the sample came from the normal distribution:

>>> np.random.seed(987654321)
>>> stats.kstest(stats.t.rvs(100, size=100), 'norm')
(0.072018929165471257, 0.67630062862479168)

With 3 degrees of freedom the t distribution looks sufficiently different from the normal distribution that we can reject the hypothesis that the sample came from the normal distribution at the 10% level:

>>> np.random.seed(987654321)
>>> stats.kstest(stats.t.rvs(3, size=100), 'norm')
(0.131016895759829, 0.058826222555312224)

So this tests whether the two distributions are different: a p-value below, say, 10% indicates that they are different distributions - it does not show that they are the same distribution. Does anyone know another test, or am I misunderstanding this?
Ohh, right, I forgot about that. How high are the p-values?
I'll include the diagnostics later - boarding a plane atm :)
Submitting an updated PR now.
Force-pushed from d59e535 to f7e3a7d.
@twiecki Ready to merge?
# See `https://github.com/scipy/scipy/blob/v0.18.1/scipy/stats/stats.py#L1352`
_, test_normal_beta = stats.normaltest(trace.beta[:,1])
_, test_uniform = stats.normaltest(trace.sigma)
np.testing.assert_array_less(test_normal_beta, 0.005)
But I thought we realized that these should not be significant, so this means the posteriors are not normal, indicating a problem with our sampling. CC @fonnesbeck @jsalvatier
From https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.normaltest.html: "This function tests the null hypothesis that a sample comes from a normal distribution."
I think this test indicates it is normal, because it is a different test.
Nevertheless, can someone else check this? @colcarrol @fonnesbeck.
On 26 Dec 2016 6:13 AM, Thomas Wiecki commented on pymc3/tests/test_step.py (#1618 review), quoting this context:
+ sigma = Uniform('sigma', lower=0.0, upper=1.0)
+ mu = alpha + beta[0] * X1 + beta[1] * X2
+ Y_obs = Normal('Y_obs', mu=mu, sd=sigma, observed=Y)
+
+ for step_method in (NUTS(), Metropolis(),
+ [Slice([alpha, sigma]), Metropolis([beta])]):
+ trace = sample(1000, step=step_method, progressbar=False)
+
+ np.testing.assert_array_almost_equal(np.median(trace.beta, 0), beta_true, decimal=1)
+ np.testing.assert_array_almost_equal(np.median(trace.alpha), alpha_true, decimal=1)
+ np.testing.assert_array_almost_equal(np.median(trace.sigma), sigma_true, decimal=1)
+ # Using the Normal test from SciPy `scipy.stats.normaltest`
+ # See `https://github.com/scipy/scipy/blob/v0.18.1/scipy/stats/stats.py#L1352`
+ _, test_normal_beta = stats.normaltest(trace.beta[:,1])
+ _, test_uniform = stats.normaltest(trace.sigma)
+ np.testing.assert_array_less(test_normal_beta, 0.005)
Ok we've got problems then with the samplers. These tests are worth doing.
Good point.
for step_method in (NUTS(), Metropolis(),
[Slice([alpha, sigma]), Metropolis([beta])]):
trace = sample(1000, step=step_method, progressbar=False)
I would draw more samples and add generous burn-in.
I'll add more generous burn-in.
# Using the Normal test from SciPy `scipy.stats.normaltest`
# See `https://github.com/scipy/scipy/blob/v0.18.1/scipy/stats/stats.py#L1352`
_, test_normal_beta = stats.normaltest(trace.beta[:,1])
_, test_uniform = stats.normaltest(trace.sigma)
Again, I would test for alpha, not sigma.
Thanks I'll do that.
Refactored this into two tests - one for NUTS and one for Metropolis - added burn-in, a longer sampler run, and changed the statistical test. I'll push in a minute or two. These tests pass when the sampler runs longer. Let me know if you need anything else here. Update: tests pass.
Force-pushed from f7e3a7d to 27191f8.
# We define very small as 0.05, for the sake of this argument.
# We test this by evaluating if the p-value is above 0.05.
_, test_normal = stats.normaltest(trace[-3000:].alpha)
np.testing.assert_array_less(0.05, test_normal, verbose=True)
So this tests that `0.05 < p_normal`, which is what we want - a nice, normal posterior. I would add this test for alpha, beta_0 and beta_1.
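A sketch of that pattern, with hypothetical arrays standing in for the trace (the review's 0.05 threshold is loosened here, since any single run can dip below it by chance):

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(0)
# Hypothetical stand-ins for trace.alpha and the two beta columns;
# real draws would come from pm.sample().
alpha = rng.normal(size=3000)
beta = rng.normal(size=(3000, 2))

for name, draws in [('alpha', alpha),
                    ('beta_0', beta[:, 0]),
                    ('beta_1', beta[:, 1])]:
    _, p_normal = stats.normaltest(draws)
    # The PR asserts 0.05 < p_normal (fail to reject normality); a very
    # loose bound is used in this sketch so it is robust to the seed.
    np.testing.assert_array_less(1e-6, p_normal)
```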
I'll add that.
I'd also rename `test_normal` to `p_normal`.
Good suggestion :)
Force-pushed from 96fd65f to f57947d.
I made some commits - adding nose_parameterized stopped me from being able to test locally in my current conda environment. It should work on Travis, though.
This is the last PR with a 3.0 milestone. Once this is done, I will do one more RC and (hopefully) release if all goes well.
Do we need another RC or can we go straight to a release?
We did do some fixes, so probably best to do another RC.
[Slice([alpha, sigma]), Metropolis([beta])]):
trace = sample(100000, step=step_method, progressbar=False)

np.testing.assert_array_almost_equal(np.median(trace.beta, 0), beta_true, decimal=1)
Why not move these tests into the one below and remove this whole test?
Once we remove this test I think this is ready.
# We define very small as 0.05, for the sake of this argument.
# We test this by evaluating if the p-value is above 0.05.
_, test_p = stats.normaltest(trace[-3000:].alpha)
_, test_p_beta_0 = stats.normaltest(trace[-3000:].beta[0])
Does that return an array or just a single sample? I thought it would have to be something like `.beta[0, :]` or some such.
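A quick shape check makes the point (a hypothetical array stands in for the trace's `beta`, assuming a PyMC3 trace stores one row per draw and one column per coefficient):

```python
import numpy as np

# Hypothetical stand-in for trace.beta with shape (n_draws, 2):
# one row per posterior draw, one column per coefficient.
beta = np.random.RandomState(0).normal(size=(3000, 2))

beta_0 = beta[:, 0]  # all draws of the first coefficient
beta_1 = beta[:, 1]  # all draws of the second coefficient

# beta[0] is something else entirely: a single draw of BOTH coefficients.
assert beta[0].shape == (2,)
assert beta_0.shape == (3000,)
```

Passing `beta[0]` (two numbers) to a normality test would be meaningless; the column slice is the one to test.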
Good point.
from numpy.testing import assert_array_almost_equal
from nose_parameterized import parameterized
Need to add `nose_parameterized` to Travis and the dependencies. Or just revert to the list...
I added it to Travis and the dependencies. Getting context errors though.
Added it to the dependencies for future use, even though I'm probably not going to use it for this. Perhaps we should remove it, though.
I chose to use a list instead.
Force-pushed from f57947d to 2acf88d.
@@ -11,3 +11,4 @@ recommonmark
sphinx
nbsphinx
numpydoc
nose_parameterized
Don't need this anymore.
Is this older test not needed then?
On 30 Dec 2016 10:29 PM, Thomas Wiecki commented on pymc3/tests/test_step.py (#1618), quoting this context:
+ X1 = np.random.randn(size)
+ X2 = np.random.randn(size) * 0.2
+ Y = alpha_true + beta_true[0] * X1 + beta_true[1] * X2 + np.random.randn(size) * sigma_true
+
+ with Model() as model:
+ alpha = Normal('alpha', mu=0, sd=10)
+ beta = Normal('beta', mu=0, sd=10, shape=2)
+ sigma = Uniform('sigma', lower=0.0, upper=1.0)
+ mu = alpha + beta[0] * X1 + beta[1] * X2
+ Y_obs = Normal('Y_obs', mu=mu, sd=sigma, observed=Y)
+
+ for step_method in (NUTS(), Metropolis(),
+ [Slice([alpha, sigma]), Metropolis([beta])]):
+ trace = sample(100000, step=step_method, progressbar=False)
+
+ np.testing.assert_array_almost_equal(np.median(trace.beta, 0), beta_true, decimal=1)
It is, but you can just move the asserts inside the other one.
Force-pushed from 2acf88d to b54ffcf.
Ok, I added those changes: moved the asserts, removed an argument that wasn't needed anymore, and removed the references to nose_parameterized. We should (if all the tests pass) be ready to go.
# We test this by evaluating if the p-value is above 0.05.
_, test_p = stats.normaltest(trace[-3000:].alpha)
_, test_p_beta_0 = stats.normaltest(trace[-3000:].beta[0, :])
_, test_p_beta_1 = stats.normaltest(trace[-3000:].beta[:, 1])
They should both be sliced along the same dimension. What's the shape of beta?
Yeah oops - I meant to change that.
Force-pushed from b54ffcf to 1aabf65.
Commit messages:
- Updating tests
- Updating tests again
- Added in one test for each sampler
- Updating tests
- Sampling errors
- Trying this commit
- Removing parameterized nose tests as they didn't work with the model context
- Moving asserts and removing arguments which are unnecessary; removing requirements.txt reference to nose_parameterized
- Changing shape
- Removing unnecessary test, and checking they run locally
Force-pushed from 1aabf65 to 7854624.
Here you go. This should be mergeable now.
Ok, the tests pass. I'm going to close the related PR.
It was quicker for me to do it. I think this is what it should look like, but it doesn't pass yet for Slice.
Yeah, it's fair that Slice should be there too. Maybe Slice applied to alpha and beta only?
Ahh, I never thought of thinning.
I'll merge this later if it passes.
Based on #1424 - I couldn't rebase that correctly for some reason - added in some KS tests etc.
Needs a review.