
Tweak test for issue 717 to be repeatable #802

Merged: 2 commits merged into sherpa:master on Aug 14, 2020

Conversation

@DougBurke (Contributor) commented Apr 29, 2020

Summary

Make a test more robust.

Details

The following description is from before #772 (adding XSPEC 12.11.0) was merged; that PR ended up making most of the changes that were originally in this one.

While working on other issues I found that the fix for issue #717 - added in #728 - was not sufficiently isolated, so that adding other tests (or running tests in parallel) could cause this test to fail because of changes to the Sherpa state made by other tests.

There are two commits that

a) ensure the test is run with a "known" environment (e.g. statistic)
b) switch from an APEC model (from the XSPEC library) to a power-law model (from Sherpa's model library). This means the test can be run without XSPEC, and it has other benefits: the power-law model is a much better description of the data, and it is much less likely to change over time (the APEC model depends on external data tables and other components that have been known to change between XSPEC releases).

As the original problem (#717) was to do with evaluating any model - rather than a particular model - the change from APEC to powerlaw shouldn't change the validity of the test.
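
For illustration, here is a minimal sketch of the kind of setup those commits describe, using the sherpa.astro.ui API. This is not the PR's actual test code: the statistic choice ('cstat') and the data file name '3c273.pi' are assumptions for the example.

    # Hedged sketch of the setup described above, not the PR's actual test.
    # The file name "3c273.pi" and the "cstat" statistic are assumptions.
    from sherpa.astro import ui

    ui.clean()                    # (a) start from a known Sherpa state
    ui.set_stat('cstat')          # pin the statistic so other tests can't leak in
    ui.set_method('levmar')       # pin the optimiser too

    ui.load_pha('3c273.pi')
    ui.set_source('powlaw1d.pl')  # (b) Sherpa's own power law; no XSPEC needed
    ui.fit()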

@DougBurke (Contributor, Author):

I've pulled this out of #754 since it is a small standalone update, unrelated to sample_flux.

@DougBurke (Contributor, Author):

For completeness, I have also run the new version of the test in CIAO 4.12 and confirmed that with the power-law model the fit call fails, so it is still a good test of the fix for #717:

sherpa In [20]: fit()
TypeError: startup() takes 1 positional argument but 2 were given

sherpa In [21]: get_source(1)
Out[21]: <PowLaw1D model instance 'powlaw1d.pl'>

sherpa In [22]: get_source(2)
Out[22]: <PowLaw1D model instance 'powlaw1d.pl'>

sherpa In [23]: get_rmf(1, 1).name
Out[23]: '3c273.rmf'

sherpa In [24]: get_rmf(1, 2).name
Out[24]: '3c273.rmf'

sherpa In [25]: get_rmf(2, 1).name
Out[25]: '3c273.rmf'

sherpa In [26]: get_rmf(2, 2).name
Out[26]: '3c273.rmf'

sherpa In [27]: import sherpa

sherpa In [28]: sherpa._version.get_versions()
Out[28]: {'full': '4eda546e4266197581a029770910996ba91aa187', 'version': '4.12.0'}

@DougBurke (Contributor, Author):

The reason for this PR (that is, making this test "repeatable" by enforcing the method/stat/...) was found while developing #754, where a run of pytest -n 4 fails with the following error for this test:

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

data = array([17., 15., 16., 15., 16., 15., 18., 18., 15., 18., 15., 15., 19.,
       15., 15., 17., 16., 16., 17., 15., 19.,...17.,
       15., 18., 16., 15., 15., 16., 15., 15., 15., 16., 16., 15., 15.,
       16., 16., 15., 16., 15., 15., 20.])

    @staticmethod
    def calc_staterror(data):
>       raise StatErr('chi2noerr')
E       sherpa.utils.err.StatErr: If you select chi2 as the statistic, all datasets must provide a staterror column

sherpa/stats/__init__.py:543: StatErr

presumably because a previous test had changed the statistic to chi2. Note that this doesn't happen if you run all the tests serially (e.g. pytest or python setup.py test), which is why we haven't seen this on Travis.
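
One general way to guard against this kind of cross-test contamination is an autouse pytest fixture that resets the Sherpa session around each test. A hedged sketch of the pattern (not the fixture actually used in the Sherpa test suite):

    # Sketch: reset the Sherpa session before and after every test in a module,
    # so a neighbouring test's set_stat('chi2') cannot leak into this one.
    import pytest
    from sherpa.astro import ui

    @pytest.fixture(autouse=True)
    def clean_sherpa_session():
        ui.clean()   # known state going in
        yield
        ui.clean()   # don't leave our own changes behind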

@wmclaugh wmclaugh added this to the 4.12.2 milestone Jul 2, 2020
@DougBurke DougBurke added the 'note: easy review' label Aug 11, 2020
@dtnguyen2 (Contributor) left a comment:

By default, the levmar optimization method will be used, but I guess it doesn't hurt to have it just in case we ever change the default opt method (although I verified that neldermead does give the same answer):

    ui.set_method('levmar')
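
As a small illustration of the point (a sketch, assuming the sherpa.astro.ui API), the current default can be queried and then pinned explicitly:

    from sherpa.astro import ui

    print(ui.get_method_name())   # 'levmar' is the current default
    ui.set_method('levmar')       # pin it anyway, in case the default ever changes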

@DougBurke (Contributor, Author):

Thanks for the review. I'll look at rebasing this to fix whatever the conflict is.

The test did not set up the method and statistic, and this caused
problems when the order of tests was changed (when running in
parallel) and an unwanted change to the default statistic caused
the test to fail. This commit now tries to ensure a clean session
and forces a particular optimiser and statistic.
@DougBurke (Contributor, Author):

@dtnguyen2 - I've had to rebase this because I made almost the same change in a different PR (#772, adding XSPEC 12.11.0 support), so this makes the changes even smaller (now just ensuring the test gets cleaned before/after running and that the method is set).

@codecov (bot) commented Aug 11, 2020

Codecov Report

Merging #802 into master will decrease coverage by 0.01%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master     #802      +/-   ##
==========================================
- Coverage   62.98%   62.96%   -0.02%     
==========================================
  Files          70       70              
  Lines       24315    24315              
  Branches     3509     3509              
==========================================
- Hits        15315    15311       -4     
- Misses       7862     7866       +4     
  Partials     1138     1138              
Impacted Files Coverage Δ
sherpa/astro/sim/pragbayes.py 84.23% <0.00%> (-2.18%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cf66c00...2961ae7

@DougBurke (Contributor, Author):

The apparent failure of the codecov/project run just reflects some randomness in our tests. Coverage of the pragbayes.py module has decreased, even though none of the tests in this PR exercise the code in sherpa.astro.sim.pragbayes. In this case a test no longer raises a LimitError in some Monte Carlo run, so we no longer cover that error path. It's nothing to do with this PR.

@wmclaugh wmclaugh merged commit 463c270 into sherpa:master Aug 14, 2020
@DougBurke DougBurke deleted the tweak-test-for-issue-717 branch August 17, 2020 02:58
Labels: area:tests, note: easy review
Participants: 3