Slowdown when fitting fBHB #187

Closed
RGleis opened this issue Jul 14, 2020 · 18 comments

@RGleis

RGleis commented Jul 14, 2020

I realized that in past test runs I had forgotten to pass the model parameter describing the fraction of blue horizontal branch stars (fBHB) in my stellar population, so it had been fixed at the default of zero. Now that I'm actually fitting the fBHB parameter rather than just mass, logzsol, and tage (as in the demo), my runs have slowed drastically --- from ~40 minutes for 17 object fits to days for a single object. This is without changing the number of emcee walkers or iterations. I also added a spectrum (previously I only had photometry), but there was a test run in between with only fBHB changed, and that one seemed incredibly slow as well.

Why do you think this is happening? If this is unavoidable I can submit my fits to a computing cluster, but it seems strange enough that there may be something weird going on. It doesn't throw any exceptions, and it does show that it's progressing, just very slowly.

@RGleis
Author

RGleis commented Jul 15, 2020

These are my settings for the fBHB parameter.

fbhb_param = {"name": "fbhb",
"N": 1,
"isfree": True,
"init": 0.1,
"prior": priors.TopHat(mini=0.0, maxi=0.5),
"init_disp": 0.05,
"disp_floor": 0.01,
"units": "unitless fraction",
}

I saw in another issue that the MIST isochrones can be slow when the metallicity changes, but I was already fitting metallicity and I've been using the Padova isochrones, so I can't really see that being the problem. I did a drastically reduced test run (16 walkers, 100 iterations) and it completed one source after 4 hours.

@bd-j
Owner

bd-j commented Jul 15, 2020

There is a set of parameters -- fsps.StellarPopulation().params.ssp_params -- that, if changed, trigger a regeneration of the SSPs (as opposed to simply changing how they are combined). The fbhb parameter is one of these, since it modifies the isochrones. There is nothing to be done to mitigate this. Note that if you switch to MIST the runtime will likely increase much more, since those SSPs take even longer to generate.
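For reference, a minimal way to check whether a given parameter is in that set (a sketch, assuming a standard python-fsps install; note that instantiating StellarPopulation builds the SSPs, so this takes a little while):

import fsps

sp = fsps.StellarPopulation(zcontinuous=1)
print(sp.params.ssp_params)             # parameters whose change triggers SSP regeneration
print("fbhb" in sp.params.ssp_params)   # expected: True, per the comment above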

@bd-j
Owner

bd-j commented Jul 15, 2020

You might consider conducting multiple fits with different fixed values of fbhb (i.e., a grid in fbhb) and then combining the posteriors, normalized by the evidence from dynesty.
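A rough sketch of what such a grid could look like (build_model and fit_one_object are hypothetical stand-ins for the model-building and dynesty-fitting steps in your own parameter file):

import numpy as np

fbhb_grid = np.linspace(0.0, 0.5, 6)   # the grid spacing acts as an implicit prior on fbhb
dresults = []
for fbhb_val in fbhb_grid:
    fixed_fbhb = {"name": "fbhb", "N": 1, "isfree": False, "init": fbhb_val,
                  "units": "unitless fraction"}
    # model = build_model(fixed_fbhb)          # hypothetical: build the model with fbhb fixed
    # dresults.append(fit_one_object(model))   # hypothetical: run a dynesty fit, keep the results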

@RGleis
Author

RGleis commented Jul 17, 2020

Can you elaborate on what you mean by combining the posteriors normalized by the evidence from dynesty? As I understand it, the evidence is the model-independent factor in the denominator of Bayes' theorem, but it isn't actually involved in an MCMC run. My interpretation is that the prior changes between the runs with different fbhb, and dynesty provides a tool to calculate the evidence to correct for this so the posteriors can be fairly compared. Is this correct, or am I missing something? Does it mean I need to do the fit with dynesty rather than emcee?

@bd-j
Owner

bd-j commented Jul 17, 2020

Yes, that's what I meant, and yes, you'd need dynesty fits rather than emcee, since the posterior PDFs from the latter are unnormalized and can't be easily compared in absolute terms once the model (not even necessarily the priors) has changed (by changing fBHB). The dynesty docs or paper may provide useful info. To be honest, I've never done this before, but I think in principle it should be possible. Tagging @joshspeagle, who may have thoughts or suggestions.

@joshspeagle
Contributor

joshspeagle commented Jul 17, 2020

Yes, in principle this should be possible. The terminology would be "Bayesian Model Averaging", and you'd just be assuming that each model (with fixed fbhb) should be weighted according to the expectation value of the likelihood of that model averaged over the prior (i.e., the evidence). This would implicitly marginalize over fbhb.
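In code terms, the per-model weights would come from the final log-evidence of each fixed-fbhb dynesty run, e.g. (a sketch; dresults is assumed to be a list of dynesty results objects, one per grid point):

import numpy as np

logz = np.array([dr.logz[-1] for dr in dresults])   # final log-evidence of each run
w = np.exp(logz - logz.max())                       # subtract the max for numerical stability
model_weights = w / w.sum()                         # relative weight of each fixed-fbhb model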

@bd-j
Owner

bd-j commented Jul 17, 2020

Thanks! And then the spacing of the grid in fbhb would constitute an implicit prior on that parameter, yes?

@joshspeagle
Contributor

I was assuming the spacing would be uniform, but yeah, I guess so! You could also just impose a prior weighting scheme.

@RGleis
Author

RGleis commented Jul 17, 2020

Thanks, this gives me a good jumping-off point. I'll leave this issue open for now in case anything else comes up.

@RGleis
Author

RGleis commented Jul 29, 2020

I've been trying different things for this, and I've found that the dynesty runs fail on some sources with some fBHB values. They go through the initial dynesty run, then hang at the end of the final run (typically iter > 5000). There's no exception thrown, and as far as I can tell it doesn't seem to be exceeding the resources of the server. It doesn't seem like a problem with the data either, since it occurs for some fBHB values and not others, even on the same object. Do you have any ideas?

As for merging the ones that finished, my thought is to manually edit the ['sampling']['chain'] and ['bestfit']['parameter'] fields to append the fBHB value, in order to try to fool dynesty.utils.merge_runs(result_list) into thinking they are just parallel runs with the same prior, so I can merge them as described here: https://dynesty.readthedocs.io/en/latest/quickstart.html#combining-runs. What do you think of this plan?

@bd-j
Owner

bd-j commented Jul 29, 2020

Hmm, there was a problem with excessive memory usage causing runs to fail but that was fixed a few months ago (4d62832). Otherwise I'm not sure. Are you fitting spectroscopic data?

I don't know enough about how merge_runs works to be sure, but that might well work. Bat signal for @joshspeagle.
(Note that "bestfit" is not important for dynesty; to fool dynesty you'll need to create a dictionary that renames some of the vectors in "sampling": "lnprobability" -> "logl", "chain" -> "samples", and I think you need the ["sampling"]["logz"] array too.)

@RGleis
Author

RGleis commented Jul 30, 2020

Yes, I'm fitting spectra.

@joshspeagle
Contributor

> in order to try to fool dynesty.utils.merge_runs(result_list) so hopefully it will think that they are just parallel runs of the same prior

I would discourage you from trying this, since I'm not 100% sure it would work. I would instead recommend using resample_equal to get a new set of samples for each run, weighting each run's samples in proportion to the p(fBHB) value from the prior for that run, and then combining the weighted samples from all the runs into one giant set of weighted samples.
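One reading of that recipe, combining each run's evidence (per the earlier comment) with a prior weight on its fbhb value (a sketch; dresults and prior_fbhb are assumed inputs, and resample_equal lives in dynesty.utils):

import numpy as np
from dynesty.utils import resample_equal

# dresults: list of dynesty results objects, one per fixed-fbhb run (assumed)
# prior_fbhb: p(fbhb) evaluated at each run's fixed value (assumed)
logz = np.array([dr.logz[-1] for dr in dresults])
run_weights = np.exp(logz - logz.max()) * prior_fbhb   # evidence times prior weight, per run
run_weights /= run_weights.sum()

pooled_samples, pooled_weights = [], []
for dr, w in zip(dresults, run_weights):
    wts = np.exp(dr.logwt - dr.logz[-1])               # importance weights within one run
    samples = resample_equal(dr.samples, wts / wts.sum())
    pooled_samples.append(samples)
    pooled_weights.append(np.full(len(samples), w / len(samples)))

pooled_samples = np.concatenate(pooled_samples)        # one giant set of samples ...
pooled_weights = np.concatenate(pooled_weights)        # ... with weights summing to 1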

@bd-j
Owner

bd-j commented Jul 31, 2020

For the stalling issue, a couple of things. Do you see the efficiency (as listed in the progress line) dropping to very low values? Is the hang actually at the "end", i.e., has the stopping criterion been satisfied and the problem is somehow with the final computations? Is it in the dynamic part (batch > 0) or the initial pass? You might try setting a maximum number of model calls or sampling iterations by adding the nested_maxiter_call or nested_maxiter_batch argument and setting it to something like int(5e6) (or 3000 for the number of iterations). For the objects that stall only on some fBHB values, do the fits look reasonable for the fBHB values where they don't stall? Or are they pegged against the edge of some prior?

Some possible mitigation strategies: make nested_posterior_thresh a larger number, inflate the uncertainties on the spectra, and make sure all possible bad pixels (residual sky emission, NaD ISM absorption, etc.) are masked. Beyond that you might need a spectroscopic noise jitter term or a spectroscopic outlier model, but try the others first.
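For concreteness, those caps would go into the run-parameters dictionary along these lines (a sketch; the option names follow the comments above, so check them against the dynesty-related options accepted by your prospector version, and the threshold value is only illustrative):

run_params["nested_maxiter_call"] = int(5e6)   # cap on total model calls
run_params["nested_maxiter_batch"] = 3000      # cap on iterations in the dynamic batches
run_params["nested_posterior_thresh"] = 0.1    # larger than the default loosens the stopping criterion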

@RGleis
Author

RGleis commented Aug 5, 2020 via email

@bd-j
Owner

bd-j commented Aug 9, 2020

Something seems off; the efficiency is very high, and 6e5 iterations is very large.

You should probably be fitting for redshift, and using a lumdist parameter for the luminosity distance (so that it is not taken from the redshift + cosmology, which might be undefined for negative z?)
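A sketch of that parameterization, in the same style as the fbhb block above (the prior bounds and the 0.01 Mpc distance are placeholders, not recommendations):

from prospect.models import priors

zred_param = {"name": "zred", "N": 1, "isfree": True, "init": 0.0,
              "prior": priors.TopHat(mini=-1e-3, maxi=1e-3)}   # small shifts around z = 0
lumdist_param = {"name": "lumdist", "N": 1, "isfree": False,
                 "init": 0.01, "units": "Mpc"}                 # fixed luminosity distance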

The rescale_spectrum parameter simply scales the observed spectrum (and uncertainties) to have a median of 1; the factor is stored in the obs dictionary as "rescale", since it is fundamentally an operation on the data. I would actually recommend against using it. Are you also fitting photometry? Are you using PolySedModel or PolySpecModel, both of which can optimize/fit for a normalization factor? Are your spectra flux calibrated? Are you handling the line spread function?

Can you send the model description (i.e., the result of print(model))? Feel free to email as well if that's easier. See also https://github.com/bd-j/GCPro for some tips on fitting GCs with prospector (this also uses the more modern interaction via command-line options + parser instead of an explicitly edited run_params dictionary).

@bd-j
Owner

bd-j commented Nov 2, 2020

Assuming this is resolved; please reopen if there are still problems.

@joel-roediger

@joshspeagle
I am trying to implement the solution you recommended above for combining the results of Prospector runs with differing (fixed) values of the fbhb parameter. I am not sure what you meant by "... weight them proportional to the p(fBHB) value from the prior for that run, then just combine the weighted samples from all the runs as just a giant set of weighted samples". Am I to weight the likelihoods of the samples, or something else? And would it be straightforward to use the plotting tools offered in dynesty on the combined set of weighted samples?
