Sampler hanging above some hazy number of samples #97

Open
leerosenthalj opened this issue Nov 22, 2019 · 6 comments

Comments

leerosenthalj commented Nov 22, 2019

I am running the sampler using MultiPool(), and keeping an eye on my activity monitor. I see eight python processes run for a few minutes, and then they stop -- but the notebook cell in which I am running the process does not complete. Is it possible that the joker.rejection_sample() code is stalling somewhere at the end of the process, after the actual sampling has been completed?

EDIT: code snippet shared via email

with pm.Model():
    dv0_1 = xu.with_unit(pm.Normal('dv0_1', 0, 5.), u.m/u.s)
    s = xu.with_unit(pm.Lognormal('s', 0, 0.5), u.m/u.s)
    baseline = np.amax(time_j) - np.amin(time_k)
    params = tj.JokerPrior.default(P_min=baseline, P_max=8*baseline, s=s,
                                   v0_offsets=[dv0_1], sigma_K0=1*u.km/u.s, sigma_v = 10*u.km/u.s)

%%time
prior_samples = params.sample(2*10**8)
with schwimmbad.MultiPool() as pool:
    joker = tj.TheJoker(params, pool=pool)
    samples = joker.rejection_sample([rvdata_j, rvdata_k], prior_samples)

leerosenthalj commented

Note: sampling using a SerialPool process does not reproduce this issue.

adrn commented Nov 25, 2019

If you use conda, could you dump your environment info with: conda list > conda_env.txt and send that to me? Also, it'd be handy if you could send me a small snippet of code that reproduces the behavior you are seeing. Thanks!

adrn changed the title from "Sampler hanging above some hazy number of samples, ~5*10^6." to "Sampler hanging above some hazy number of samples" on Nov 26, 2019

adrn commented Nov 26, 2019

I think there are a few things going on here.

(1) I fixed some minor speed issues in prior.sample(), so please try updating with: pip install git+https://github.com/adrn/thejoker

(2) Generating that many prior samples is always going to be somewhat slow, so I would do it once and store the samples in a file on disk. I would use something like this:

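import os
import schwimmbad

# Generate the prior samples once and cache them on disk
prior_cache_file = 'prior_samples.hdf5'
if not os.path.exists(prior_cache_file):
    # `params` is the JokerPrior object defined in the original snippet
    prior_samples = params.sample(2*10**8)
    prior_samples.write(prior_cache_file)

# Pass the cache file name instead of the in-memory samples
with schwimmbad.MultiPool() as pool:
    joker = tj.TheJoker(params, pool=pool)
    samples = joker.rejection_sample([rvdata_j, rvdata_k], prior_cache_file)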

(3) When you pass in a JokerSamples object to rejection_sample(), the first thing it does (by default) is write it out to a temporary file. In this case, that is a ~7 GB file, so that is probably slow. This will be fixed if you use the code snippet above.
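As a rough sanity check on the ~7 GB figure, here is a back-of-the-envelope estimate. It is only a sketch: it assumes every stored value is a 64-bit float, and the number of columns written per sample (4 or 5 here) is a guess, not something stated in this thread. The helper name `samples_size_gb` is hypothetical.

```python
def samples_size_gb(n_samples, n_columns, bytes_per_value=8):
    """Approximate in-memory / on-disk size of a sample table, in GB (1e9 bytes).

    Assumes a dense table of fixed-width values with no compression.
    """
    return n_samples * n_columns * bytes_per_value / 1e9


n_samples = 2 * 10**8  # number of prior samples requested in the original snippet

# The column counts below are assumptions for illustration only
for n_columns in (4, 5):
    size = samples_size_gb(n_samples, n_columns)
    print(f"{n_columns} float64 columns: ~{size:.1f} GB")
```

Under these assumptions the table lands in the 6 to 8 GB range, which is consistent with the size quoted above and explains why writing it to a temporary file is slow.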

leerosenthalj commented Nov 27, 2019 via email

leerosenthalj commented

I am still seeing the theano lock error message when attempting to run with MultiPool(). I am using a dataset with ~60 RVs; could that be part of the issue?

adrn commented Jul 28, 2020

@leerosenthalj Coming back to this after a long time, I'm realizing that this is likely an issue with running from a Jupyter notebook. If you switch to a script, and set the compiledir via $THEANO_FLAGS for each process (see #105), it should work.
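A minimal sketch of what setting a per-process compile directory could look like in a script. The exact approach discussed in #105 is not shown in this thread, so treat this as an assumption: the helper name `theano_flags_for_process` and the directory naming scheme are hypothetical, and `base_compiledir` is used here as the Theano config option. The environment variable has to be set before `theano` (or anything that imports it) is imported, since Theano reads `THEANO_FLAGS` at import time.

```python
import os
import tempfile


def theano_flags_for_process(root=None, pid=None):
    """Build a THEANO_FLAGS entry that points this process at its own compile dir.

    The directory layout is an illustrative assumption, not thejoker's convention.
    """
    root = tempfile.gettempdir() if root is None else root
    pid = os.getpid() if pid is None else pid
    compiledir = os.path.join(root, f"theano_compiledir_{pid}")
    return f"base_compiledir={compiledir}"


# Keep any flags that are already set and append the per-process compile dir.
# This must run before the first `import theano`.
existing = os.environ.get("THEANO_FLAGS", "")
new_flag = theano_flags_for_process()
os.environ["THEANO_FLAGS"] = f"{existing},{new_flag}" if existing else new_flag
```

Giving each process a separate directory means the processes no longer contend for the same compile lock, which is the likely source of the lock message reported above.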

@AstroSong I made a new issue to discuss your question: #106

Repository owner deleted a comment from AstroSong Jul 28, 2020
Repository owner deleted a comment from AstroSong Jul 28, 2020