bugfixes for the save_all command #98
Closed
The sherpa.astro.ui.utils module is large enough that the two routines involved with state serialization (save_all and save_session) have been moved out to a separate module (sherpa.astro.ui.serialize).
Remove ` == True` from boolean checks and replace `x == False` by `not x` (the latter form was also used in the Python output and has been changed there too).
This change was also done to the Python code that is created by these routines.
More refactoring of common code. This revealed a bug in the serialization of PSF models, where the wrong id was assigned to the load_psf call in the ASCII output if multiple data sets are loaded and the PSF is not for the last model.
This adds a warning to the user when the serialization code is run, as well as the existing message added to the output file. The message now includes the model name.
Change the order of the checks on the model typename to reduce redundancy and make the logic of the checks clearer.
The original code referred to self.the_source in two places; these were changed to state.the_source in an earlier commit. This change removes the "state." prefix as it's obviously a local variable, rather than taken from an object.
Ensure that Sherpa is loaded, so that there are more ways that the script can be used (e.g. "python foo.py").
Refactored code for serializing PHA files and fixed what appears to be a bug, where the grouping and quality arrays for background files were being taken from the source data set instead. This is only an issue for people who fit the background, rather than subtract it (since in the latter case the background grouping and quality arrays are ignored). The Python serialization is also a bit more explicit about the optional function arguments, in that the form name=value is now used in several places related to PHA data files, rather than relying on position.
This should enable easier testing (but this has not been implemented yet).
The OGIP standard has these two columns as 2-byte ints, whereas the default serialization tends to write them out as 8-byte floats, presumably because this is how the DataPHA object stores them.
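The size difference can be illustrated with the standard-library `array` module. This is only a sketch of the two storage widths, using made-up grouping values; it is not how Sherpa or a FITS writer handles the columns.

```python
from array import array

# Made-up grouping flags: 1 starts a group, -1 continues it.
grouping = [1.0, -1.0, -1.0, 1.0, -1.0]

# The same values stored as C doubles and as C signed shorts.
as_float = array("d", grouping)
as_short = array("h", (int(g) for g in grouping))

float_bytes = as_float.itemsize
short_bytes = as_short.itemsize

# The conversion is lossless because the flags are whole numbers.
values_match = list(as_short) == [int(g) for g in grouping]
```

Since grouping and quality flags are small integers, nothing is lost by writing them in the narrower type.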
Allow save_all to write to a file-like object, primarily for testing but it may be useful for advanced use cases. Includes a minor fix, going from `type(foo) == string` to `isinstance(foo, basestring)`.
Change `type(foo) == str` to `isinstance(foo, basestring)`.
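A minimal sketch of why the `isinstance` form is more forgiving than an exact type comparison. It is written for Python 3, which has no `basestring`, so `str` is used instead; `LabelStr` is a hypothetical subclass invented for the illustration and is not part of Sherpa.

```python
class LabelStr(str):
    """Hypothetical str subclass, standing in for any string-like type."""


def is_text_exact(value):
    # Exact comparison: rejects subclasses of str.
    return type(value) == str


def is_text(value):
    # isinstance accepts str and anything derived from it.
    return isinstance(value, str)


name = LabelStr("out.save")
exact_result = is_text_exact(name)
isinstance_result = is_text(name)
```

With the exact comparison a caller passing a string subclass is treated as if it had passed a non-string, which is the kind of surprise the change avoids.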
This adds minimal tests:
- store/restore the basic Sherpa configuration
- ditto but after changing a few statistic and method settings
- a very-basic PHA case

The `test_astro.py` script was changed so as to restore the XSPEC settings (if XSPEC is enabled), since the threads it runs change the default settings, which can cause the serialize tests to fail.
This is CXC ticket #12146.
First attempt at storing data set with load_arrays, rather than read in from a file. This does not handle all cases, but is a start.
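Writing array values into a script as text raises the question of precision. The sketch below is an assumption about the general technique, not the code used by save_all: `arrays_to_source` is a hypothetical helper that turns a list of floats into Python source, either with `repr` (which round-trips exactly in Python 3) or with a shorter format string.

```python
import ast


def arrays_to_source(values, fmt=None):
    """Return Python source text for a list of floats (hypothetical helper).

    With fmt=None each value is written with repr(); a format such as
    "%g" gives shorter text but can drop digits.
    """
    if fmt is None:
        items = [repr(v) for v in values]
    else:
        items = [fmt % v for v in values]
    return "[" + ", ".join(items) + "]"


x = [0.1, 1.0 / 3.0, 2.5e-7]
exact = arrays_to_source(x)
short = arrays_to_source(x, fmt="%g")

# Read the text back, as a restored script would.
restored_exact = ast.literal_eval(exact)
restored_short = ast.literal_eval(short)
```

Here `restored_exact` equals the input, while `restored_short` does not, because `%g` keeps only six significant digits of 1/3.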
This is a fix for 40b6d9d - we want add_user_pars not add_user_model - and also adds in a test of loading in the save_all output for this test.
Determining how the model was set should probably be recorded in the state itself, rather than trying to reconstruct it here.
The output of save_all is not correct in this case, which means that the output can not be run (the model expression for the background component is not written out correctly). The test checks for the current, incorrect, output. The "restore" test - which loads in the output of save_all - is skipped for now.
The background model wasn't being queried correctly (incorrect use of arguments) and so could be written out as using set_bkg_full_model (which does not serialize correctly for PHA data sets at this time) instead of set_bkg_source. This was seen for data sets with a non-default id (or with multiple background components).
Remove the skipif constraint on the test_restore_pha_back now that the background model is restored correctly. Clean up the test. The background filtering has been updated to match the current behavior (since setting the source filter also sets the background filter).
Oops: looks like I need a few
I have created a greatly simplified version of this patch that only addresses two of these changes, but it does fix a bug which renders save_all fairly useless for most cases. See PR #100
Replaced by #138
Note
This PR is not going to be merged as is, since #100 has been merged (which includes a small number of the changes here, as well as removing the `set_session` code). The plan is to add a new PR with the remaining changes, once the CIAO 4.8 release has settled down.

Release Notes
This commit fixes and enhances the output of the `save_all` command.

User-visible changes:
- `save_all`: if `outfile` is `None` then the `outfh` argument is used to define the output handle (the argument can be any file-like argument, such as a file handle like `sys.stdout` or the output of `open`, or a `StringIO` object)
- the `clobber` argument to `save_all` now means that the output file (the `outfile` argument, if not `None`) is deleted if it already exists; prior to this, the file would be appended to instead
- [...] `set_full_model`); this is bug #97 ("save_all does not save the source model for PHA data") but also affects non-PHA data sets
- [...] data sets [...] than a floating-point value (this has no effect on the results, but matches the OGIP standard)
- [...] be an issue if the background is being fit, rather than subtracted)
- [...] `load_arrays` function are now written out by `save_all` as part of the script; this is intended for small datasets and may have problems with precision if used with floating-point arrays
- [...] `load_psf` are now correctly restored (they may not have been written out correctly if multiple data sets were loaded)
- [...] not, a place-holder function is added to the output and a warning displayed).
- [...] `load_user_model` and `add_user_pars` are now included in the output
- [...] `save_all` has undergone several minor changes:
  - [...] `sherpa.astro.ui` module, so that it can be run from the IPython prompt using the `%run <filename>` command, or directly as `python <filename>`
  - [...] `create_model_component` function rather than `eval` to create model components (this is CXC bug 12146)
  - [...] `name=value` rather than being a positional argument, to make it clearer what the script is doing.
  - [...] `load_data` have been replaced by more-specific versions - e.g. `load_pha` and `load_image` - if appropriate

When writing out code that defines a user-model, there is no attempt to make sure that modules used by the function are available. These will need to be added manually to the output, either directly or as imports.
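The `outfile`, `outfh` and `clobber` behaviour described in the user-visible changes can be sketched as follows. This is not the Sherpa implementation: `write_script` is a hypothetical helper, only the argument names are taken from the notes, and raising an error when the file exists and `clobber` is not set is an assumption made for the sketch.

```python
import io
import os


def write_script(text, outfile=None, outfh=None, clobber=False):
    """Write text to a named file or to a file-like object (hypothetical)."""
    if outfile is None:
        # No file name: use the supplied handle (any object with write()).
        if outfh is None:
            raise ValueError("either outfile or outfh must be given")
        outfh.write(text)
        return

    if not isinstance(outfile, str):
        raise TypeError("outfile must be a file name or None")

    # Assumption: refuse to touch an existing file unless clobber is set.
    if os.path.exists(outfile) and not clobber:
        raise IOError("file '%s' exists and clobber is not set" % outfile)

    # Mode "w" truncates, so an existing file is replaced, not appended to.
    with open(outfile, "w") as fh:
        fh.write(text)


# Capture the output in memory, which is convenient for tests.
buf = io.StringIO()
write_script("set_stat('chi2')\n", outfh=buf)
captured = buf.getvalue()
```

Accepting a handle means a test can compare `captured` against an expected string without creating a temporary file.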
Internal changes:
- [...] `test_astro.py` test script just to make sure that the default XSPEC settings are restored on test tear-down (to avoid the new tests from failing). The test suite covers some of the basic situations expected to be served by `save_all` - such as basic arrays, PHA datasets - but does not cover all functionality
- [...] `sherpa/astro/ui/serialize.py` from `sherpa/astro/ui/utils.py`, as the latter is rather large
- [...] `x == True` and `y == False` by `x` and `not y`
- [...] `isinstance(x, basestring)` rather than `type(x) == str`, which has been a problem for other parts of the Sherpa UI when given a string stored in a numpy array (since these aren't treated as `str` but do match the `isinstance` check)

Notes
I believe the `save_session` command is either redundant or an experiment that was never finished. It can probably be marked as deprecated, but that can be considered later. This is why the release notes do not mention `save_session`, even though virtually all the changes are relevant for it.

The test suite has only been tested on one machine. It is likely that it will need changing to account for running on different configurations (in particular 32 vs 64 bit or different OS values); in particular, the regression tests that ensure the textual output is correct may need updating to allow numerical differences or changed to a different approach.

ADDED NOTE the tests include regression tests, where the output is compared against an expected string. These strings are currently stored within `test_serialize.py` itself, but it may make sense to move them into the `sherpa-test-data` repository, once the patch has been accepted (and the approach vindicated by more testing, to make sure it is a worthwhile approach).

Not all functionality is tested - e.g. serializing sessions that use image data, or an iterative fit, or more corner cases in various options. This set of changes is large enough as is, and it improves the overall coverage of Sherpa enough (from around 50% to 55%) that I think it's worth looking at now. The coverage of `serialize.py` is 68% (when the equivalent before this PR was 0% for the code that is now in `serialize.py`).

A lot of the logic in `serialize.py` should really be moved into the classes, so that rather than this code being changed whenever a new statistic needs to be serialized (say), the object itself can be queried. I'm wondering about a `Serializable` class which could be imported from to provide a `serialize` method (names open to discussion), or if not a class, just make sure that each object we want to serialize provides a method. This is way outside the scope of this PR, but some of the refactoring done in this PR was done to highlight places where this would be useful.

The code will conflict once #91 is accepted (and possibly other commits). The commit history is also not ideal (since the tests are not added straight away, and there are some changes that could perhaps be squashed together). I submitted this on my own branch to provide more freedom in rebasing the commit.
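The `Serializable` idea floated in the notes could look something like the sketch below. Everything here is hypothetical: the class names are stand-ins rather than Sherpa classes, and the only real names are the `set_stat`, `set_method` and `set_method_opt` calls that appear in the generated text.

```python
class Serializable:
    """Hypothetical base class: each object reports the code that recreates it."""

    def serialize(self):
        raise NotImplementedError


class Statistic(Serializable):
    """Stand-in for a statistic object; not a Sherpa class."""

    def __init__(self, name):
        self.name = name

    def serialize(self):
        return "set_stat(%r)" % self.name


class Method(Serializable):
    """Stand-in for an optimiser and its options; not a Sherpa class."""

    def __init__(self, name, options):
        self.name = name
        self.options = options

    def serialize(self):
        lines = ["set_method(%r)" % self.name]
        # Sort the options so the generated text is reproducible.
        for key in sorted(self.options):
            lines.append("set_method_opt(%r, %r)" % (key, self.options[key]))
        return "\n".join(lines)


def save_state(objects):
    """Ask each object for its own code instead of inspecting its type."""
    return "\n".join(obj.serialize() for obj in objects) + "\n"


script = save_state([Statistic("chi2gehrels"), Method("levmar", {"maxfev": 200})])
```

With this layout, supporting a new kind of object means giving it a `serialize` method, and the code that writes the file does not need to change.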
Example
With the following code
the difference in the output of `save_all` from CIAO 4.7 (`out.ciao47`, which is what `master` gives) and this PR (`out.pr98`) is:
which shows several of the enhancements and bug fixes, namely:
- [...] `sherpa.astro.ui` module
- [...] `load_arrays` are now written out to file
- `create_model_component` is used rather than `eval`
- [...] `set_source` and not `set_full_model`)