Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong nubar_estimators (potentially also inconsistencies in transition probabilities) -- follow-up of #455 #464

Closed
unoebauer opened this issue Jan 16, 2016 · 7 comments

Comments

@unoebauer
Copy link
Contributor

This is a re-launch of #455, reviewing the current status of the debugging process and outlining the next steps. The reason for this re-launch was the increase in confusion in the vast discussion of #455.

Important Conclusions from #455

Here are the most important conclusions from the discussion of #455:

  • for certain setups, wrong values of the nubar_estimators are produced at the end of a Monte Carlo cycle
  • this leads to non-converging plasma-state iterations
  • this affects the current Tardis version
  • 45eca24 still produces the correct values of nubar_estimators
  • problems in the calculation of the transition_probabilities have been discovered during the debugging process; however: it is currently not clear whether this triggers the nubar_estimator behaviour

Immediate Steps

I propose the following strategy:

Immediate Steps:

  • [ ]: confirm that PR Fixing plasma in 41d8fcd (not for merging) #462 creates an identical plasma state (tau_sobolevs, electron_densities, transition_probabilities, ion_populations, level_populations) as 45eca24 on the python level when using plasma_fail_test.yml; @aoifeboyle found that there are no differences (based on 2e7a85f ), @unoebauer found differences in the transition_probabilities (based on 85c0c1e)
    • this may be due to problems introduced between 2e7a85f and 85c0c1e
    • another possibility is the use of different metrics to determine differences (see guidelines below)
  • [ ]: once a version of Tardis after the plasma restructure has been found/produced, which produces the same plasma state as 45eca24 and still produces different nubar_estimators, one should compare the plasma inputs on the C level -- this is clearly a task for @yeganer

Ultimate Goals:

  • [ ]: fix the nubar_estimator behaviour
  • [ ]: understand why the issue did not manifest in any of the Travis test problems and in the tardis_example

Guidelines

  • Please only use the setups provided by PR Setups for debugging #455 and (not for merging) #463 to debug this problem. For now, only plasma_fail_test setup is provided. If additional configurations are absolutely necessary, they will be added to the PR. Always refer to the setups by the name given to them in the PR to avoid confusion
  • we should use a consistent metric for determining differences in plasma state quantities:
    • I am currently using the following prescription when determining differences of pandas.DataFrames (I am not saying that this is the best solution):
np.fabs(((plasma_property_old - plasma_property_new) / plasma_property_old).fillna(0)).max()
  • to avoid bloating the discussion here, only append few lines of python or ipython output to your posts (I am the first one to blame for this). If a detailed output is necessary, save it and append it as a zip archive
  • It may be useful to also provide information about your python setup, in particular about the versions of the following packages: pandas, astropy, numpy; Using different versions of these packages has lead to some confusion during Inconsistencies between Tardis versions #455
@unoebauer
Copy link
Contributor Author

Update 1:

Next Steps:

  • establish, whether the "correct" plasma state fixes the nubar_estimators problem:
    • explore whether differences in nubar_estimators are compatible with the expected MC noise, i.e. rerun with different seeds, and with a higher number of packets

@unoebauer
Copy link
Contributor Author

Update 2:

@unoebauer
Copy link
Contributor Author

Update 3:

@unoebauer
Copy link
Contributor Author

Update 4:

  • the nubar_estimator deviations seem to be of a statistical nature:
    • the appear in the commit history (see discussion of PR Fixing plasma in 41d8fcd (not for merging) #462) after switching from the old black body sampling scheme to the new one (f7ca576)
    • we explicitly verified that the new black body sampling scheme produces the expected packet nu distribution - thus, implementation errors are excluded
    • naturally, the new scheme does not produce the same nus as the old one, but the overall distribution of packet frequencies is the same
    • when increasing the number of packets used in the spectral run of plasma_fail_test, the discrepancies decrease (expected Monte Carlo behaviour)
    • when using 1e5 packets in the spectral run, the plasma_fail_test virtual spectra are identical on the MC noise level between 45eca24 and PR Probably fix for #455. ( Based on current master) #466 -- strong indication, that @aoifeboyle was on the right track with here Lines quick-fix

@unoebauer
Copy link
Contributor Author

Update 5:

Looks as the issue is finally resolved

When running the original problem (initially referred to as tardis_00173_2 in issue #455), no differences above the MC noise level are detected between 45eca24 and PR #466:

virt_spec_cmp.pdf

Remaining tasks:

  • double-check that the Lines fix solves the problem, i.e. re-run all setups posted under Inconsistencies between Tardis versions #455:
  • make a clean PR which introduces the fix into the current master (with working tests, etc.)
  • understand why the re-indexing of the lines only posed a problem under certain circumstances:
    • why did it not manifest in any of the Travis tests?
    • why did it not show up in the tardis_example?

@unoebauer
Copy link
Contributor Author

Update 6:

We seem to have identified and understood the problem (see discussion at PR #466). It will take a bit of time to propagate the fixes properly into the master since some legacy code should be removed at the same time.

For now, if you are planning to use the current development version of Tardis, be sure to get the quickfix of PR #466.

yeganer pushed a commit to yeganer/tardis that referenced this issue Feb 1, 2016
Deleted atom_data._lines and atom_data._levels
This change ensures that the BasePlasma always gets prepared data.
A proper fix will take more time and will be done later.
yeganer pushed a commit to yeganer/tardis that referenced this issue Feb 3, 2016
Changed behavior to raise a custom exception instead of a generic one.
This is done to better reflect the error.

Related Issues: tardis-sn#455 tardis-sn#464
Pull Request tardis-sn#471
unoebauer added a commit that referenced this issue Feb 3, 2016
@unoebauer
Copy link
Contributor Author

This issue is officially fixed by PR #471 - closing this issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant