Slowdown when fitting fBHB #187
These are my settings for the fBHB parameter: fbhb_param = {"name": "fbhb", [...]

I saw in another issue that the MIST isochrones can be slow when the metallicity changes, but I was already fitting metallicity and I've been using the Padova isochrones, so I can't really see that being the problem. I did a drastically reduced test run (16 walkers, 100 iterations) and it completed one source after 4 hours.
There are a set of parameters -- [...]
[...] you might consider conducting multiple fits with different fixed values of [...]
Can you elaborate on what you mean by combining the posteriors normalized by the evidence from dynesty? As I understand it, the evidence is the normalization factor in the denominator of Bayes' theorem, but it isn't actually involved in an MCMC run. My interpretation is that the prior changes between the runs with different fbhb values, and dynesty provides a tool to calculate the evidence to correct for this so the posteriors can be fairly compared. Is this correct, or am I missing something? Does this mean I need to do the fit with dynesty rather than emcee?
Yes, that's what I meant, and yes, you'd need dynesty fits rather than emcee, since the posterior PDFs from the latter are unnormalized and can't be easily compared in absolute terms once the model (not necessarily even the priors) has changed (by changing fBHB). The dynesty docs or paper may provide useful info. To be honest, I've never done this before, but I think in principle it should be possible. Tagging @joshspeagle, who may have thoughts or suggestions.
Yes, in principle this should be possible. The terminology would be “Bayesian Model Averaging”, and you’d just be assuming that each model (with fixed fbhb) should be weighted according to the expectation value of the likelihood of that model averaged over the prior (i.e. the evidence). This would implicitly be marginalizing over fbhb.
Thanks! And then the spacing of the grid in [...]
I was assuming the spacing would be uniform, but yeah, I guess so! You could also just impose a prior weighting scheme.
Thanks, this gives me a good jumping-off point. I'll leave this issue open for now in case anything else comes up.
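For concreteness, here is a minimal sketch of what the evidence-weighted ("Bayesian Model Averaging") combination discussed above could look like, assuming each fixed-fBHB fit is run with dynesty and its results object is available. The names combine_runs and results, and the uniform weighting over the fBHB grid, are illustrative assumptions, not Prospector or dynesty API.

```python
import numpy as np

def combine_runs(results, prior_weights=None):
    """Combine posteriors from dynesty runs at different fixed fBHB values.

    results: dict mapping an fBHB value to that run's dynesty results object
    (hypothetical container). Each run is weighted by its evidence times an
    optional prior weight over the fBHB grid, i.e. Bayesian Model Averaging.
    """
    fbhb_values = sorted(results)
    # final ln-evidence estimate from each run
    logz = np.array([results[f]["logz"][-1] for f in fbhb_values])
    if prior_weights is None:
        prior_weights = np.ones(len(fbhb_values))  # uniform prior over the fBHB grid
    logw = np.log(prior_weights) + logz
    model_weights = np.exp(logw - logw.max())      # subtract max for numerical stability
    model_weights /= model_weights.sum()

    samples, weights = [], []
    for f, mw in zip(fbhb_values, model_weights):
        res = results[f]
        wts = np.exp(res["logwt"] - res["logz"][-1])  # within-run posterior weights
        wts /= wts.sum()
        samples.append(res["samples"])
        weights.append(mw * wts)
    return np.vstack(samples), np.concatenate(weights)
```

The returned samples and weights could then go into a weighted corner plot or be resampled with dynesty.utils.resample_equal; since each sample's fBHB is known from the run it came from, this is the implicit marginalization over fBHB mentioned above.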
I've been trying different things for this and I've found that the dynesty runs fail on some sources with some fBHB values. They go through the initial dynesty run, then hang at the end of the final run (typically iter > 5000). There's no exception thrown and, as far as I can tell, it doesn't seem to be exceeding the resources of the server. It doesn't seem like a problem with the data either, since it occurs for some fBHB values and not others, even on the same object. Do you have any ideas?

As for merging the ones that finished, my thought is to manually edit the ['sampling']['chain'] and ['bestfit']['parameter'] fields to append the fBHB value, in order to fool dynesty.utils.merge_runs(result_list) into thinking they are just parallel runs with the same prior, so I can merge them as described here: https://dynesty.readthedocs.io/en/latest/quickstart.html#combining-runs. What do you think of this plan?
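For reference, the dynesty call that plan leans on is roughly the following (a thin, hypothetical wrapper around the documented function; as the replies below note, it is not obviously valid when the fixed fBHB value, and hence the model, differs between runs):

```python
from dynesty import utils as dyfunc

def merge_fixed_fbhb_runs(result_list):
    """Merge dynesty results objects as if they were parallel runs of one problem.

    result_list: dynesty results objects from the separate fixed-fBHB fits.
    This follows the dynesty quickstart recipe linked above; whether it is
    statistically meaningful across different models is the open question here.
    """
    return dyfunc.merge_runs(result_list)
```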
Hmm, there was a problem with excessive memory usage causing runs to fail, but that was fixed a few months ago (4d62832). Otherwise I'm not sure. Are you fitting spectroscopic data? I don't know enough about how [...]
Yes, I'm fitting spectra.
I would discourage you from trying this, since I'm not 100% sure it would work. I would instead recommend just using [...]
For the stalling issue, a couple of things. Do you see the efficiency (as listed in the progress line) dropping to very low values? Is the hang actually at the "end", i.e. has the stopping criterion been satisfied and the problem is somehow with final computations? Is it in the dynamic part (batch > 0) or the initial pass? You might try setting a maximum number of model calls or sampling iterations by adding the nested_maxiter_call or nested_maxiter_batch argument and setting it to something like int(5e6) (or 3000 for number of iterations). For the objects that stall only on some fBHB values, do the fits look reasonable for the fBHB values where they don't stall? Or are they pegged against the edge of some prior?

Some possible mitigation strategies: make nested_posterior_thresh a larger number, inflate the uncertainties on the spectra, and make sure all possible bad pixels (residual sky emission, NaD ISM absorption, etc.) are masked. Beyond that you might need a spectroscopic noise jitter term or a spectroscopic outlier model, but try the others first.
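A minimal sketch of the "inflate the uncertainties / mask bad pixels" suggestions, assuming the standard Prospector obs-dictionary keys (wavelength, unc, mask); the wavelength windows and inflation factor are placeholders, not recommendations:

```python
import numpy as np

def harden_obs(obs, bad_windows=(), unc_factor=2.0):
    """Mask wavelength windows (e.g. residual sky lines, NaD) and inflate spectral errors."""
    wave = np.asarray(obs["wavelength"])
    mask = np.asarray(obs.get("mask", np.ones(len(wave), dtype=bool)), dtype=bool)
    for lo, hi in bad_windows:
        mask &= ~((wave > lo) & (wave < hi))   # False = exclude this pixel from the fit
    obs["mask"] = mask
    obs["unc"] = np.asarray(obs["unc"]) * unc_factor  # inflate spectral uncertainties
    return obs

# e.g. obs = harden_obs(obs, bad_windows=[(5570., 5590.), (5885., 5900.)], unc_factor=2.0)
```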
Here are my dynesty run params:

```python
run_params = {'verbose': True,
              'debug': True,
              # dynesty fitter parameters
              'nested_bound': 'multi',                 # bounding method
              'nested_sample': 'unif',                 # sampling method
              'nested_nlive_init': 100,                # live points for the initial run
              'nested_nlive_batch': 100,               # live points per dynamic batch
              'nested_bootstrap': 0,                   # bootstrap realizations for bound expansion
              'nested_dlogz_init': 0.05,               # stopping threshold (dlogz) for the initial run
              'nested_weight_kwargs': {"pfrac": 1.0},  # put all dynamic weight on the posterior
              }
run_params["nested_posterior_thresh"] = 0.05  # posterior stopping threshold for the dynamic run
run_params["nested_maxcall"] = int(1e6)       # cap on total likelihood calls
run_params["nested_maxiter_call"] = int(7e5)  # cap on sampling iterations
```
Here's the output from the last frozen one:

<my home directory>/.local/lib/python3.8/site-packages/prospect-0.4.0-py3.8.egg/prospect/models/priors.py:104: RuntimeWarning: divide by zero encountered in log
iter: 5637 | batch: 0 | nc: 1 | ncall: 12821 | eff(%): 43.967 | logz: -3763882.020 +/- 0.996 | dlogz: 0.000 > 0.050
done dynesty (initial) in 155.2293312549591s
iter: 6382 | batch: 2 | nc: 1 | ncall: 13936 | eff(%): 45.336 | loglstar: -3763834.355 < -3763828.209 < -3763830.886 | logz: -3763881.842 +/- 0.992 | stop: 1.758
In contrast, here's the last one that succeeded:

<my home directory>/.local/lib/python3.8/site-packages/prospect-0.4.0-py3.8.egg/prospect/models/priors.py:104: RuntimeWarning: divide by zero encountered in log
iter: 5684 | batch: 0 | nc: 1 | ncall: 12131 | eff(%): 46.855 | logz: -3746314.831 +/- 1.000 | dlogz: 0.000 > 0.050
done dynesty (initial) in 149.3077552318573s
iter: 7854 | batch: 5 | nc: 3 | ncall: 15092 | eff(%): 50.889 | loglstar: -3746268.283 < -3746262.956 < -3746264.188 | logz: -3746314.553 +/- 0.988 | stop: 1.163
done dynesty (dynamic) in 58.89828610420227s
They are on the same source, with spectra included and rescale_spectrum on (by the way, is the rescaling factor stored in the output? I can't find it for making my plots), with different fbhb values (0.1 and 0.2 respectively; I'm trying to figure out the problem on reduced runs of just fbhb = 0, 0.1, and 0.2), and I think I've masked out all the bad values from my spectra. The successful ones on this object do seem to be bumping up against my age prior, and I'm suspicious of the logzsol (it's ~0.18 and the object is supposed to be a globular cluster), but my other objects seem fine and are also less prone to freezing.

It's worth noting that we have a slightly negative redshift for this object in some archival spectral redshift data, while all the others are positive. Maybe that's relevant? The fbhb = 0 run for this object required about 6e5 of its maximum 7e5 iterations. For completeness, that one's output was:
<my home directory>/.local/lib/python3.8/site-packages/prospect-0.4.0-py3.8.egg/prospect/models/priors.py:104: RuntimeWarning: divide by zero encountered in log
iter: 5421 | batch: 0 | nc: 1 | ncall: 12030 | eff(%): 45.062 | logz: -3728262.062 +/- 0.972 | dlogz: 0.000 > 0.050
done dynesty (initial) in 134.1306962966919s
iter: 666470 | batch: 3 | nc: 1 | ncall: 1000116 | eff(%): 66.635 | loglstar: -3728217.568 < -3728213.324 < -3728213.090 | logz: -3728261.898 +/- 0.964 | stop: 1.418
done dynesty (dynamic) in 8021.169002532959s
Something seems off; the efficiency is very high, and 6e5 iterations is very large. You should probably be fitting for redshift, and using a [...] The [...] Can you send the model description (i.e. the result of [...])?
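On the "fitting for redshift" point, a minimal sketch of freeing zred with a narrow prior in a Prospector model dictionary; the template and prior bounds are illustrative only, with the bounds chosen simply to bracket a slightly negative archival redshift like the one mentioned above:

```python
from prospect.models import priors
from prospect.models.templates import TemplateLibrary

# Start from any parameter template; "parametric_sfh" is just an example.
model_params = TemplateLibrary["parametric_sfh"]
model_params["zred"]["isfree"] = True
model_params["zred"]["init"] = 0.0
model_params["zred"]["prior"] = priors.TopHat(mini=-0.005, maxi=0.005)  # placeholder bounds
```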
Assuming this is resolved, please reopen if there are still problems.
@joshspeagle
I realized that in past test runs I had forgotten to pass the model parameter describing the fraction of blue horizontal branch stars (fBHB) in my stellar population, so it had been fixed at the default of zero. Now that I'm actually fitting the fBHB parameter rather than just mass, logzsol, and tage (as in the demo), my runs have slowed drastically --- from ~40 minutes for 17 object fits to days for a single object. This is without changing the number of emcee walkers or iterations. I also added in a spectrum (previously I only had photometry), but there was a test run in between with just the fBHB changed, which seemed incredibly slow as well.
Why do you think this is happening? If this is unavoidable I can submit my fits to a computing cluster, but it seems strange enough that there may be something weird going on. It doesn't throw any exceptions, and it does show that it's progressing, just very slowly.
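For reference, a minimal sketch of adding fBHB as a free parameter in a Prospector model dictionary; fbhb is passed through to FSPS as the fraction of blue horizontal branch stars, and the template and prior range here are purely illustrative, not the settings used in this issue:

```python
from prospect.models import priors
from prospect.models.templates import TemplateLibrary

model_params = TemplateLibrary["parametric_sfh"]   # demo-like template: mass, logzsol, tage, ...
model_params["fbhb"] = {"N": 1, "isfree": True, "init": 0.0,
                        "prior": priors.TopHat(mini=0.0, maxi=0.5)}  # illustrative range
```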