Slowdown when fitting fBHB #187
These are my settings for the fBHB parameter: fbhb_param = {"name": "fbhb", [...]

I saw in another issue that the MIST isochrones can be slow when the metallicity changes, but I was already fitting metallicity and I've been using the Padova isochrones, so I can't really see that being the problem. I did a drastically reduced test run (16 walkers, 100 iterations) and it completed one source after 4 hours.
There are a set of parameters -- [...]
[...] you might consider conducting multiple fits with different fixed values of [...]
Can you elaborate on what you mean by combining the posteriors normalized by the evidence from dynesty? As I understand it, the evidence is the normalization factor in the denominator of Bayes' theorem, but it isn't actually involved in an MCMC run. My interpretation is that the prior changes between the runs with different fbhb values, and dynesty provides a tool to calculate the evidence to correct for this so the posteriors can be fairly compared. Is this correct, or am I missing something? Does this mean I need to do the fit with dynesty rather than emcee?
Yes, that's what I meant, and yes, you'd need dynesty fits rather than emcee, since the posterior PDFs from the latter are unnormalized and can't be easily compared in absolute terms once the model (not necessarily even the priors) has changed (by changing fBHB). The dynesty docs or paper may provide useful info. To be honest, I've never done this before, but I think in principle it should be possible. Tagging @joshspeagle, who may have thoughts or suggestions.
Yes, in principle this should be possible. The terminology would be “Bayesian Model Averaging”, and you’d just be assuming that each model (with fixed fbhb) should be weighted according to the expectation value of the likelihood of that model averaged over the prior (i.e. the evidence). This would implicitly be marginalizing over fbhb.
Thanks! And then the spacing of the grid in [...]
I was assuming the spacing would be uniform, but yeah, I guess so! You could also just impose a prior weighting scheme.
Thanks, this gives me a good jumping-off point. I'll leave this issue open for now in case anything else comes up.
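For concreteness, here is a minimal sketch of what the evidence-weighted ("Bayesian Model Averaging") combination discussed above could look like, assuming each fixed-fBHB fit is run with dynesty and its results object is available. The names combine_runs and results, and the uniform weighting over the fBHB grid, are illustrative assumptions, not Prospector or dynesty API.

```python
import numpy as np

def combine_runs(results, prior_weights=None):
    """Combine posteriors from dynesty runs at different fixed fBHB values.

    results: dict mapping an fBHB value to that run's dynesty results object
    (hypothetical container). Each run is weighted by its evidence times an
    optional prior weight over the fBHB grid, i.e. Bayesian Model Averaging.
    """
    fbhb_values = sorted(results)
    # final ln-evidence estimate from each run
    logz = np.array([results[f]["logz"][-1] for f in fbhb_values])
    if prior_weights is None:
        prior_weights = np.ones(len(fbhb_values))  # uniform prior over the fBHB grid
    logw = np.log(prior_weights) + logz
    model_weights = np.exp(logw - logw.max())      # subtract max for numerical stability
    model_weights /= model_weights.sum()

    samples, weights = [], []
    for f, mw in zip(fbhb_values, model_weights):
        res = results[f]
        wts = np.exp(res["logwt"] - res["logz"][-1])  # within-run posterior weights
        wts /= wts.sum()
        samples.append(res["samples"])
        weights.append(mw * wts)
    return np.vstack(samples), np.concatenate(weights)
```

The returned samples and weights could then go into a weighted corner plot or be resampled with dynesty.utils.resample_equal; since each sample's fBHB is known from the run it came from, this is the implicit marginalization over fBHB mentioned above.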
I've been trying different things for this and I've found that the dynesty runs fail on some sources with some fBHB values. They go through the initial dynesty run, then hang at the end of the final run (typically iter > 5000). There's no exception thrown and, as far as I can tell, it doesn't seem to be exceeding the resources of the server. It doesn't seem like a problem with the data either, since it occurs for some fBHB values and not others, even on the same object. Do you have any ideas?

As for merging the ones that finished, my thought is to manually edit the ['sampling']['chain'] and ['bestfit']['parameter'] fields to append the fBHB value, in order to fool dynesty.utils.merge_runs(result_list) into thinking they are just parallel runs with the same prior, so I can merge them as described here: https://dynesty.readthedocs.io/en/latest/quickstart.html#combining-runs. What do you think of this plan?
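For reference, the dynesty call that plan leans on is roughly the following (a thin, hypothetical wrapper around the documented function; as the replies below note, it is not obviously valid when the fixed fBHB value, and hence the model, differs between runs):

```python
from dynesty import utils as dyfunc

def merge_fixed_fbhb_runs(result_list):
    """Merge dynesty results objects as if they were parallel runs of one problem.

    result_list: dynesty results objects from the separate fixed-fBHB fits.
    This follows the dynesty quickstart recipe linked above; whether it is
    statistically meaningful across different models is the open question here.
    """
    return dyfunc.merge_runs(result_list)
```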
Hmm, there was a problem with excessive memory usage causing runs to fail, but that was fixed a few months ago (4d62832). Otherwise I'm not sure. Are you fitting spectroscopic data? I don't know enough about how [...]
Yes, I'm fitting spectra.
I would discourage you from trying this, since I'm not 100% sure it would work. I would instead recommend just using [...]
For the stalling issue, a couple of things. Do you see the efficiency (as listed in the progress line) dropping to very low values? Is the hang actually at the "end", i.e. has the stopping criterion been satisfied and the problem is somehow with final computations? Is it in the dynamic part (batch > 0) or the initial pass? You might try setting a maximum number of model calls or sampling iterations by adding the nested_maxiter_call or nested_maxiter_batch argument and setting it to something like int(5e6) (or 3000 for number of iterations). For the objects that stall only on some fBHB values, do the fits look reasonable for the fBHB values where they don't stall? Or are they pegged against the edge of some prior?

Some possible mitigation strategies: make nested_posterior_thresh a larger number, inflate the uncertainties on the spectra, and make sure all possible bad pixels (residual sky emission, NaD ISM absorption, etc.) are masked. Beyond that you might need a spectroscopic noise jitter term or a spectroscopic outlier model, but try the others first.
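A minimal sketch of the "inflate the uncertainties / mask bad pixels" suggestions, assuming the standard Prospector obs-dictionary keys (wavelength, unc, mask); the wavelength windows and inflation factor are placeholders, not recommendations:

```python
import numpy as np

def harden_obs(obs, bad_windows=(), unc_factor=2.0):
    """Mask wavelength windows (e.g. residual sky lines, NaD) and inflate spectral errors."""
    wave = np.asarray(obs["wavelength"])
    mask = np.asarray(obs.get("mask", np.ones(len(wave), dtype=bool)), dtype=bool)
    for lo, hi in bad_windows:
        mask &= ~((wave > lo) & (wave < hi))   # False = exclude this pixel from the fit
    obs["mask"] = mask
    obs["unc"] = np.asarray(obs["unc"]) * unc_factor  # inflate spectral uncertainties
    return obs

# e.g. obs = harden_obs(obs, bad_windows=[(5570., 5590.), (5885., 5900.)], unc_factor=2.0)
```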
Here are my dynesty run params:

```python
run_params = {'verbose': True,
              'debug': True,
              # dynesty fitter parameters
              'nested_bound': 'multi',                 # bounding method
              'nested_sample': 'unif',                 # sampling method
              'nested_nlive_init': 100,                # live points for the initial run
              'nested_nlive_batch': 100,               # live points per dynamic batch
              'nested_bootstrap': 0,                   # bootstrap realizations for bound expansion
              'nested_dlogz_init': 0.05,               # stopping threshold (dlogz) for the initial run
              'nested_weight_kwargs': {"pfrac": 1.0},  # put all dynamic weight on the posterior
              }
run_params["nested_posterior_thresh"] = 0.05  # posterior stopping threshold for the dynamic run
run_params["nested_maxcall"] = int(1e6)       # cap on total likelihood calls
run_params["nested_maxiter_call"] = int(7e5)  # cap on sampling iterations
```
Here's the output from the last frozen one:

<my home directory>/.local/lib/python3.8/site-packages/prospect-0.4.0-py3.8.egg/prospect/models/priors.py:104: RuntimeWarning: divide by zero encountered in log
iter: 5637 | batch: 0 | nc: 1 | ncall: 12821 | eff(%): 43.967 | logz: -3763882.020 +/- 0.996 | dlogz: 0.000 > 0.050
done dynesty (initial) in 155.2293312549591s
iter: 6382 | batch: 2 | nc: 1 | ncall: 13936 | eff(%): 45.336 | loglstar: -3763834.355 < -3763828.209 < -3763830.886 | logz: -3763881.842 +/- 0.992 | stop: 1.758
In contrast, here's the last one that succeeded:

<my home directory>/.local/lib/python3.8/site-packages/prospect-0.4.0-py3.8.egg/prospect/models/priors.py:104: RuntimeWarning: divide by zero encountered in log
iter: 5684 | batch: 0 | nc: 1 | ncall: 12131 | eff(%): 46.855 | logz: -3746314.831 +/- 1.000 | dlogz: 0.000 > 0.050
done dynesty (initial) in 149.3077552318573s
iter: 7854 | batch: 5 | nc: 3 | ncall: 15092 | eff(%): 50.889 | loglstar: -3746268.283 < -3746262.956 < -3746264.188 | logz: -3746314.553 +/- 0.988 | stop: 1.163
done dynesty (dynamic) in 58.89828610420227s
They are on the same source, with spectra included and rescale_spectrum on (by the way, is the rescaling factor stored in the output? I can't find it for making my plots), with different fbhb values (0.1 and 0.2 respectively; I'm trying to figure out the problem on reduced runs of just fbhb = 0, 0.1, and 0.2), and I think I've masked out all the bad values from my spectra. The successful ones on this object do seem to be bumping up against my age prior, and I'm suspicious of the logzsol (it's ~0.18 and the object is supposed to be a globular cluster), but my other objects seem fine and are also less prone to freezing.

It's worth noting that we have a slightly negative redshift for this object in some archival spectral redshift data, while all the others are positive. Maybe that's relevant? The fbhb = 0 run for this object required about 6e5 of its maximum 7e5 iterations. For completeness, that one's output was:
<my home directory>/.local/lib/python3.8/site-packages/prospect-0.4.0-py3.8.egg/prospect/models/priors.py:104: RuntimeWarning: divide by zero encountered in log
iter: 5421 | batch: 0 | nc: 1 | ncall: 12030 | eff(%): 45.062 | logz: -3728262.062 +/- 0.972 | dlogz: 0.000 > 0.050
done dynesty (initial) in 134.1306962966919s
iter: 666470 | batch: 3 | nc: 1 | ncall: 1000116 | eff(%): 66.635 | loglstar: -3728217.568 < -3728213.324 < -3728213.090 | logz: -3728261.898 +/- 0.964 | stop: 1.418
done dynesty (dynamic) in 8021.169002532959s
Something seems off; the efficiency is very high, and 6e5 iterations is very large. You should probably be fitting for redshift, and using a [...] The [...] Can you send the model description (i.e. the result of [...])?
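On the "fitting for redshift" point, a minimal sketch of freeing zred with a narrow prior in a Prospector model dictionary; the template and prior bounds are illustrative only, with the bounds chosen simply to bracket a slightly negative archival redshift like the one mentioned above:

```python
from prospect.models import priors
from prospect.models.templates import TemplateLibrary

# Start from any parameter template; "parametric_sfh" is just an example.
model_params = TemplateLibrary["parametric_sfh"]
model_params["zred"]["isfree"] = True
model_params["zred"]["init"] = 0.0
model_params["zred"]["prior"] = priors.TopHat(mini=-0.005, maxi=0.005)  # placeholder bounds
```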
Assuming this is resolved, please reopen if there are still problems.
@joshspeagle
I realized that in past test runs I had forgotten to pass the model parameter describing the fraction of blue horizontal branch stars (fBHB) in my stellar population, so it had been fixed at the default of zero. Now that I'm actually fitting the fBHB parameter rather than just mass, logzsol, and tage (as in the demo), my runs have slowed drastically --- from ~40 minutes for 17 object fits to days for a single object. This is without changing the number of emcee walkers or iterations. I also added in a spectrum (previously I only had photometry), but there was a test run in between with just the fBHB changed, which seemed incredibly slow as well.
Why do you think this is happening? If this is unavoidable I can submit my fits to a computing cluster, but it seems strange enough that there may be something weird going on. It doesn't throw any exceptions, and it does show that it's progressing, just very slowly.
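For reference, a minimal sketch of adding fBHB as a free parameter in a Prospector model dictionary; fbhb is passed through to FSPS as the fraction of blue horizontal branch stars, and the template and prior range here are purely illustrative, not the settings used in this issue:

```python
from prospect.models import priors
from prospect.models.templates import TemplateLibrary

model_params = TemplateLibrary["parametric_sfh"]   # demo-like template: mass, logzsol, tage, ...
model_params["fbhb"] = {"N": 1, "isfree": True, "init": 0.0,
                        "prior": priors.TopHat(mini=0.0, maxi=0.5)}  # illustrative range
```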