Seed HSGP bootstrap tests #6679
base: main
Conversation
Force-pushed from fe336b9 to 48c4b25
Codecov Report
```
@@            Coverage Diff             @@
##             main    #6679      +/-   ##
==========================================
- Coverage   91.99%   89.52%    -2.47%
==========================================
  Files          94       94
  Lines       15944    15944
==========================================
- Hits        14667    14274     -393
- Misses       1277     1670     +393
```
Weird that it still fails with a different seed. Maybe very unlucky...? I couldn't get it to fail in the original PR. Does it fail for, like, any seed?
My last commit tries 5 seeds, and it fails in one of them. The failure rate seems higher than 0.01; even with 5 tests it should only fail about 5% of the time.
Yup, something's wrong; will pull and give it a try.
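The failure-rate argument above can be checked with a quick back-of-envelope sketch (assuming, as stated, a nominal per-seed failure rate of p = 0.01 for the MMD test at its critical value):

```python
# Sanity check of the flakiness argument, assuming a nominal per-seed
# failure rate of p = 0.01 for the MMD two-sample test.
p = 0.01

# One test run with a single randomly chosen seed fails with probability p.
single_run = p

# Trying 5 independent seeds: P(at least one failure) = 1 - (1 - p)^5.
at_least_one_of_5 = 1 - (1 - p) ** 5

print(single_run)                   # 0.01
print(round(at_least_one_of_5, 3))  # 0.049
```

So seeing 1 failure out of 5 seeds is already unusual if the true per-seed failure rate were 0.01, which supports the suspicion that something else is off.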
```python
seeds = np.arange(5) + 197  # 5 possible seeds
rng = np.random.default_rng(np.random.choice(seeds))
```
It seems cleaner to use `@parametrize`, like for `test_prior`?
The `@parametrize` was just to show which seed caused the test to fail. The idea was to revert to choosing one of 5 possible seeds once the root cause was found.
I imagine we don't want to run each test 5 times, but instead allow some variation across tests.
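A minimal sketch of the parametrized variant being discussed — the test name and body here are placeholders, not the actual suite code; only the seed list comes from the diff:

```python
import numpy as np
import pytest

# Hypothetical sketch: parametrizing over the five candidate seeds so a
# failing seed shows up directly in the pytest test ID. The body is a
# stand-in for the real model-building and MMD assertion.
@pytest.mark.parametrize("seed", (np.arange(5) + 197).tolist())
def test_prior_sketch(seed):
    rng = np.random.default_rng(seed)
    draws = rng.normal(size=100)  # stand-in for the real prior draws
    assert np.isfinite(draws).all()
```

The trade-off noted above: this runs the test body once per seed (5x the cost), whereas `np.random.choice(seeds)` runs it once but hides which seed was drawn.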
"""Compare HSGP prior to unapproximated GP prior, pm.gp.Latent. Draw samples from the | ||
prior and compare them using MMD two sample test. Tests both centered and non-centered | ||
parameterizations. | ||
""" | ||
rng = np.random.default_rng(seed) | ||
|
||
with model: | ||
hsgp = pm.gp.HSGP(m=[200], c=2.0, parameterization=parameterization, cov_func=cov_func) |
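For context, the MMD (maximum mean discrepancy) two-sample test mentioned in the docstring can be illustrated with a minimal biased RBF-kernel estimator; this is only an illustrative sketch, not the helper the test suite actually uses:

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    """Biased MMD^2 estimate between samples X and Y with an RBF kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma**2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
same = rbf_mmd2(rng.normal(size=(200, 1)), rng.normal(size=(200, 1)))
diff = rbf_mmd2(rng.normal(size=(200, 1)), rng.normal(3.0, size=(200, 1)))
assert same < diff  # same-distribution samples give a much smaller MMD^2
```

When the HSGP approximation is good, draws from `f1` and `f2` come from (nearly) the same distribution, so the MMD statistic should fall below the test's critical value except for rare unlucky draws.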
Try increasing `m` here to 500.
```diff
     with model:
         hsgp = pm.gp.HSGP(m=[200], c=2.0, parameterization=parameterization, cov_func=cov_func)
         f1 = hsgp.prior("f1", X=X1)

         gp = pm.gp.Latent(cov_func=cov_func)
         f2 = gp.prior("f2", X=X1)

-        idata = pm.sample_prior_predictive(samples=1000)
+        idata = pm.sample_prior_predictive(samples=1000, random_seed=rng)
```
I think it's ok to decrease `samples` here to 500; it will run faster, which will be nice for trying the different random seeds.
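For context on why 500 samples is a reasonable cut: halving the number of prior draws roughly halves runtime, while the Monte Carlo error of the MMD estimate grows only by a factor of about sqrt(2) (a back-of-envelope sketch, assuming the usual ~1/sqrt(n) error scaling):

```python
import math

# Monte Carlo error scales roughly as 1/sqrt(n), so halving the number of
# prior draws widens the error bar by only ~41% while halving the runtime.
se_ratio = math.sqrt(1000 / 500)
print(round(se_ratio, 2))  # 1.41
```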
```diff
@@ -194,17 +197,20 @@ def test_conditional(self, model, cov_func, X1, parameterization):
     prior and compare them using MMD two sample test. Tests both centered and non-centered
     parameterizations. The conditional should match the prior when no data is observed.
     """
     seeds = np.arange(5) + 197  # 5 possible seeds
     rng = np.random.default_rng(np.random.choice(seeds))

     with model:
         hsgp = pm.gp.HSGP(m=[100], c=2.0, parameterization=parameterization, cov_func=cov_func)
```
Try increasing `m` here to 500 too. This got rid of the random failures for me locally.
Increasing `m` seems to have fixed things for me locally. Then, decreasing the number of samples in `pm.sample_posterior_predictive` helps it go a bit faster. I think that should fix it.
Failed in #6678
📚 Documentation preview 📚: https://pymc--6679.org.readthedocs.build/en/6679/