Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

save_all improvements for set_psf and load_xstable_model #1874

Merged
merged 21 commits into from Sep 28, 2023

Conversation

DougBurke
Copy link
Contributor

@DougBurke DougBurke commented Sep 6, 2023

Summary

Ensure that save_all includes set_psf calls, supports XSPEC table models, and improve the support for data sets created with load_arrays or dataspace1d/dataspace2d. Fixes #1873.

Details

This was meant to be a simple fix for #1873... It does not solve all known issues with save_all (e.g. #320 #1882), but they would require more time than I have to spend now (and they are not new problems),

There are a number of code clean ups here, including removing code that made no sense (bkg_id is only a parameter for PHA data sets), and changes to improve coverage testing of the save_all code (as this was the reason why this bug was not fixed, as the code was intended to work, it's just it was in the wrong place and ended up doing nothing, and with no tests we couldn't see it).

There is a minor change to the serialization output (set_analysis calls now include the parameter names, so set_analysis(1, quantity="energy"", type='rate', factor=0), which does not change the behaviour of the script. I also added bkg_id argument names to group calls for background datasets (again no semantic difference).

The actual fix for #1873 was "simple", and just required re-thinking the code organization slightly.

I also decided to simplify notice_id(1, None, None) to notice_id(1), and also to not include a notice range if all the data is included (for backgrounds it's a bit different). Again, these do not change the behaviour of the script.

Then there are changes to improves support for data that was created with load_arrays or dataspace1d/dataspace2d rather than read from a file. There's two major changes

  • checking for whether it is read from a file - previously we assumed that if a name field was not the empty string then we had a file, but that is way too simple. We now actually check if a file exists with that name
    • this is not going to catch the case where we used a relative path name and the directory has changed, but it is better than the old behaviour
    • there's some funky-ness with crates, as that can add "Virtual File Syntax", so to check if a name is valid we use crates to try and read in the file (rather than try to remove any VFS syntax)
    • there's some funkiness for pyfits dealing with automatically finding .gz files
  • I've re-worked how it creates data with load_arrays, as it was currently untested and now has better coverage

Then there are commits to remove "banners" which are not needed (these are comments, so there's no semantic changes to the output). One slightly-larger change is that we only include the XSPEC section if we actually use any XSPEC models.

There is clean up of grouping code, since the old version said "if you can group this data then group it" whereas we actually know that the data can be grouped, so we can ignore the check.

The internals keep on getting changed - but that's okay because the only routine we guarantee is save_all and everything else is private. There are more changes that could be done, but I'd want to do that once we work out how to deal with

The last commit adds support for XSPEC table models (which I don't think we have an issue about).

I am relying on the relatively good coverage of this module to check the code clean ups and re-orderings I did. The mypy work actually found one or two minor issues, and I used it as a check of things, but it is somewhat limited in scope without adding annotations to other parts of the system. It's also got some interesting issues regarding testing optional behaviour (in this case related to XSPEC), which limit the expressiveness of the mypy tests.


Examples

These are to show the change in the comments more than anything.

load_arrays 1D

CIAO 4.15

>>> from sherpa.astro import ui
>>> ui.load_arrays(1, [10, 20, 30], [2, 7, 4])
>>> ui.set_stat("cash")
>>> ui.set_source(ui.const1d.foo)
>>> ui.fit()
Dataset               = 1
Method                = levmar
Statistic             = cash
Initial fit statistic = 6
Final fit statistic   = -12.1248 at function evaluation 9
Data points           = 3
Degrees of freedom    = 2
Change in statistic   = 18.1248
   foo.c0         4.33333      +/- 1.20985     
>>> ui.save_all()
import numpy
from sherpa.astro.ui import *

######### Load Data Sets

load_arrays(1,
            [10, 20, 30],
            [2, 7, 4],
            Data1D)

######### Set Image Coordinates


######### Data Spectral Responses


######### Load Background Data Sets


######### Set Energy or Wave Units


######### Filter Data

notice_id(1, "10.0000:30.0000")


######### Set Statistic

set_stat("cash")


######### Set Fitting Method

set_method("levmar")

set_method_opt("epsfcn", 1.1920928955078125e-07)
set_method_opt("factor", 100.0)
set_method_opt("ftol", 1.1920928955078125e-07)
set_method_opt("gtol", 1.1920928955078125e-07)
set_method_opt("maxfev", None)
set_method_opt("numcores", 1)
set_method_opt("verbose", 0)
set_method_opt("xtol", 1.1920928955078125e-07)


######### Set Model Components and Parameters

create_model_component("const1d", "foo")
foo.integrate = True

foo.c0.default_val = 1.0
foo.c0.default_min = -3.4028234663852886e+38
foo.c0.default_max = 3.4028234663852886e+38
foo.c0.val     = 4.333327498564645
foo.c0.min     = -3.4028234663852886e+38
foo.c0.max     = 3.4028234663852886e+38
foo.c0.units   = ""
foo.c0.frozen  = False



######### Set Source, Pileup and Background Models

set_source(1, const1d.foo)


######### XSPEC Module Settings

set_xschatter(10)
set_xsabund("angr")
set_xscosmo(70, 0, 0.73)
set_xsxsect("vern")

PR

>>> from sherpa.astro import ui
WARNING: imaging routines will not be available, 
failed to import sherpa.image.ds9_backend due to 
'RuntimeErr: DS9Win unusable: Could not find ds9 on your PATH'
>>> ui.load_arrays(1, [10, 20, 30], [2, 7, 4])
>>> ui.set_stat("cash")
>>> ui.set_source(ui.const1d.foo)
>>> ui.fit()
Dataset               = 1
Method                = levmar
Statistic             = cash
Initial fit statistic = 6
Final fit statistic   = -12.1248 at function evaluation 9
Data points           = 3
Degrees of freedom    = 2
Change in statistic   = 18.1248
   foo.c0         4.33333      +/- 1.20985     
>>> ui.save_all()
import numpy
from sherpa.astro.ui import *

######### Load Data Sets

load_arrays(1,
            [10, 20, 30],
            [2, 7, 4],
            Data1D)


######### Set Statistic

set_stat("cash")


######### Set Fitting Method

set_method("levmar")

set_method_opt("epsfcn", 1.1920928955078125e-07)
set_method_opt("factor", 100.0)
set_method_opt("ftol", 1.1920928955078125e-07)
set_method_opt("gtol", 1.1920928955078125e-07)
set_method_opt("maxfev", None)
set_method_opt("numcores", 1)
set_method_opt("verbose", 0)
set_method_opt("xtol", 1.1920928955078125e-07)


######### Set Model Components and Parameters

create_model_component("const1d", "foo")
foo.integrate = True

foo.c0.default_val = 1.0
foo.c0.default_min = -3.4028234663852886e+38
foo.c0.default_max = 3.4028234663852886e+38
foo.c0.val     = 4.333327498564645
foo.c0.min     = -3.4028234663852886e+38
foo.c0.max     = 3.4028234663852886e+38
foo.c0.units   = ""
foo.c0.frozen  = False



######### Set Source, Pileup and Background Models

set_source(1, const1d.foo)

>>> 

Simple PHA

There's things I'd like to improve on this (we don't need to load the responses or background) but that's significantly more work, is partly related to #320 and #1882 but also #1885.

What you can see here are

  • a lot less unused comments
  • less logic in the script (e.g. for the group(1) line
  • no XSPEC section as this doesn't use an XSPEC model

CIAO 4.15

>>> from sherpa.astro import ui
>>> ui.load_pha("sherpa-test-data/sherpatest/3c273.pi")
WARNING: systematic errors were not found in file 'sherpa-test-data/sherpatest/3c273.pi'
statistical errors were found in file 'sherpa-test-data/sherpatest/3c273.pi' 
but not used; to use them, re-read with use_errors=True
read ARF file sherpa-test-data/sherpatest/3c273.arf
read RMF file sherpa-test-data/sherpatest/3c273.rmf
WARNING: systematic errors were not found in file 'sherpa-test-data/sherpatest/3c273_bg.pi'
statistical errors were found in file 'sherpa-test-data/sherpatest/3c273_bg.pi' 
but not used; to use them, re-read with use_errors=True
read background file sherpa-test-data/sherpatest/3c273_bg.pi
>>> ui.notice(0.5, 6)
dataset 1: 0.00146:14.9504 -> 0.4672:6.57 Energy (keV)
>>> ui.ignore(hi=1)
dataset 1: 0.4672:6.57 -> 1.0366:6.57 Energy (keV)
>>> ui.set_source(ui.powlaw1d.pl)
>>> ui.save_all()
import numpy
from sherpa.astro.ui import *

######### Load Data Sets

load_pha(1, "sherpa-test-data/sherpatest/3c273.pi")

######### Set Image Coordinates

if get_data(1).grouping is not None and not get_data(1).grouped:
    ######### Group Data
    group(1)

######### Data Spectral Responses

load_arf(1, "sherpa-test-data/sherpatest/3c273.arf", resp_id=1)
load_rmf(1, "sherpa-test-data/sherpatest/3c273.rmf", resp_id=1)

######### Load Background Data Sets

load_bkg(1, "sherpa-test-data/sherpatest/3c273_bg.pi", bkg_id=1)
if get_bkg(1, 1).grouping is not None and not get_bkg(1, 1).grouped:
    ######### Group Background
    group(1, 1)

######### Background Spectral Responses

load_arf(1, "sherpa-test-data/sherpatest/3c273.arf", resp_id=1, bkg_id=1)
load_rmf(1, "sherpa-test-data/sherpatest/3c273.rmf", resp_id=1, bkg_id=1)

######### Set Energy or Wave Units

set_analysis(1, 'energy', "rate", 0)

######### Filter Data

notice_id(1, "1.036599993706:6.570000171661")
notice_id(1, None, None, bkg_id=1)
notice_id(1, "1.036599993706:6.570000171661", bkg_id=1)


######### Set Statistic

set_stat("chi2gehrels")


######### Set Fitting Method

set_method("levmar")

set_method_opt("epsfcn", 1.1920928955078125e-07)
set_method_opt("factor", 100.0)
set_method_opt("ftol", 1.1920928955078125e-07)
set_method_opt("gtol", 1.1920928955078125e-07)
set_method_opt("maxfev", None)
set_method_opt("numcores", 1)
set_method_opt("verbose", 0)
set_method_opt("xtol", 1.1920928955078125e-07)


######### Set Model Components and Parameters

create_model_component("powlaw1d", "pl")
pl.integrate = True

pl.gamma.default_val = 1.0
pl.gamma.default_min = -10.0
pl.gamma.default_max = 10.0
pl.gamma.val     = 1.0
pl.gamma.min     = -10.0
pl.gamma.max     = 10.0
pl.gamma.units   = ""
pl.gamma.frozen  = False

pl.ref.default_val = 1.0
pl.ref.default_min = -3.4028234663852886e+38
pl.ref.default_max = 3.4028234663852886e+38
pl.ref.val     = 1.0
pl.ref.min     = -3.4028234663852886e+38
pl.ref.max     = 3.4028234663852886e+38
pl.ref.units   = ""
pl.ref.frozen  = True

pl.ampl.default_val = 1.0
pl.ampl.default_min = 0.0
pl.ampl.default_max = 3.4028234663852886e+38
pl.ampl.val     = 1.0
pl.ampl.min     = 0.0
pl.ampl.max     = 3.4028234663852886e+38
pl.ampl.units   = ""
pl.ampl.frozen  = False



######### Set Source, Pileup and Background Models

set_source(1, powlaw1d.pl)




######### XSPEC Module Settings

set_xschatter(10)
set_xsabund("angr")
set_xscosmo(70, 0, 0.73)
set_xsxsect("vern")

this PR

>>> from sherpa.astro import ui
WARNING: imaging routines will not be available, 
failed to import sherpa.image.ds9_backend due to 
'RuntimeErr: DS9Win unusable: Could not find ds9 on your PATH'
>>> ui.load_pha("sherpa-test-data/sherpatest/3c273.pi")
WARNING: systematic errors were not found in file 'sherpa-test-data/sherpatest/3c273.pi'
statistical errors were found in file 'sherpa-test-data/sherpatest/3c273.pi' 
but not used; to use them, re-read with use_errors=True
read ARF file sherpa-test-data/sherpatest/3c273.arf
read RMF file sherpa-test-data/sherpatest/3c273.rmf
WARNING: systematic errors were not found in file 'sherpa-test-data/sherpatest/3c273_bg.pi'
statistical errors were found in file 'sherpa-test-data/sherpatest/3c273_bg.pi' 
but not used; to use them, re-read with use_errors=True
read background file sherpa-test-data/sherpatest/3c273_bg.pi
>>> ui.notice(0.5, 6)
dataset 1: 0.00146:14.9504 -> 0.4672:6.57 Energy (keV)
>>> ui.ignore(hi=1)
dataset 1: 0.4672:6.57 -> 1.0366:6.57 Energy (keV)
>>> ui.set_source(ui.powlaw1d.pl)
>>> ui.save_all()
import numpy
from sherpa.astro.ui import *

######### Load Data Sets

load_pha(1, "sherpa-test-data/sherpatest/3c273.pi")
group(1)

######### Data Spectral Responses

load_arf(1, "sherpa-test-data/sherpatest/3c273.arf", resp_id=1)
load_rmf(1, "sherpa-test-data/sherpatest/3c273.rmf", resp_id=1)

######### Load Background Data Sets

load_bkg(1, "sherpa-test-data/sherpatest/3c273_bg.pi", bkg_id=1)
group(1, bkg_id=1)

######### Background Spectral Responses

load_arf(1, "sherpa-test-data/sherpatest/3c273.arf", resp_id=1, bkg_id=1)
load_rmf(1, "sherpa-test-data/sherpatest/3c273.rmf", resp_id=1, bkg_id=1)

######### Set Energy or Wave Units

set_analysis(1, quantity="energy", type="rate", factor=0)

######### Filter Data

notice_id(1, "1.036599993706:6.570000171661")
notice_id(1, bkg_id=1)
notice_id(1, "1.036599993706:6.570000171661", bkg_id=1)


######### Set Statistic

set_stat("chi2gehrels")


######### Set Fitting Method

set_method("levmar")

set_method_opt("epsfcn", 1.1920928955078125e-07)
set_method_opt("factor", 100.0)
set_method_opt("ftol", 1.1920928955078125e-07)
set_method_opt("gtol", 1.1920928955078125e-07)
set_method_opt("maxfev", None)
set_method_opt("numcores", 1)
set_method_opt("verbose", 0)
set_method_opt("xtol", 1.1920928955078125e-07)


######### Set Model Components and Parameters

create_model_component("powlaw1d", "pl")
pl.integrate = True

pl.gamma.default_val = 1.0
pl.gamma.default_min = -10.0
pl.gamma.default_max = 10.0
pl.gamma.val     = 1.0
pl.gamma.min     = -10.0
pl.gamma.max     = 10.0
pl.gamma.units   = ""
pl.gamma.frozen  = False

pl.ref.default_val = 1.0
pl.ref.default_min = -3.4028234663852886e+38
pl.ref.default_max = 3.4028234663852886e+38
pl.ref.val     = 1.0
pl.ref.min     = -3.4028234663852886e+38
pl.ref.max     = 3.4028234663852886e+38
pl.ref.units   = ""
pl.ref.frozen  = True

pl.ampl.default_val = 1.0
pl.ampl.default_min = 0.0
pl.ampl.default_max = 3.4028234663852886e+38
pl.ampl.val     = 1.0
pl.ampl.min     = 0.0
pl.ampl.max     = 3.4028234663852886e+38
pl.ampl.units   = ""
pl.ampl.frozen  = False



######### Set Source, Pileup and Background Models

set_source(1, powlaw1d.pl)


dataspace1d

In this case the old code was unable to save the "state" correctly:

CIAO 4.15

>>> from sherpa.astro import ui
>>> ui.dataspace1d(1, 6)
>>> ui.set_dep([12, 2, 3, 5, 7])
>>> ui.save_all()
import numpy
from sherpa.astro.ui import *

######### Load Data Sets

load_data(1, "dataspace1d")

######### Set Image Coordinates


######### Data Spectral Responses


######### Load Background Data Sets


######### Set Energy or Wave Units


######### Filter Data

notice_id(1, "1.0000:6.0000")


######### Set Statistic

set_stat("chi2gehrels")


######### Set Fitting Method

set_method("levmar")

set_method_opt("epsfcn", 1.1920928955078125e-07)
set_method_opt("factor", 100.0)
set_method_opt("ftol", 1.1920928955078125e-07)
set_method_opt("gtol", 1.1920928955078125e-07)
set_method_opt("maxfev", None)
set_method_opt("numcores", 1)
set_method_opt("verbose", 0)
set_method_opt("xtol", 1.1920928955078125e-07)


######### Set Model Components and Parameters



######### Set Source, Pileup and Background Models




######### XSPEC Module Settings

set_xschatter(10)
set_xsabund("angr")
set_xscosmo(70, 0, 0.73)
set_xsxsect("vern")

This PR

>>> from sherpa.astro import ui
WARNING: imaging routines will not be available, 
failed to import sherpa.image.ds9_backend due to 
'RuntimeErr: DS9Win unusable: Could not find ds9 on your PATH'
>>> ui.dataspace1d(1, 6)
>>> ui.set_dep([12, 2, 3, 5, 7])
>>> ui.save_all()
import numpy
from sherpa.astro.ui import *

######### Load Data Sets

load_arrays(1,
            [1.0, 2.0, 3.0, 4.0, 5.0],
            [2.0, 3.0, 4.0, 5.0, 6.0],
            [12.0, 2.0, 3.0, 5.0, 7.0],
            Data1DInt)


######### Set Statistic

set_stat("chi2gehrels")


######### Set Fitting Method

set_method("levmar")

set_method_opt("epsfcn", 1.1920928955078125e-07)
set_method_opt("factor", 100.0)
set_method_opt("ftol", 1.1920928955078125e-07)
set_method_opt("gtol", 1.1920928955078125e-07)
set_method_opt("maxfev", None)
set_method_opt("numcores", 1)
set_method_opt("verbose", 0)
set_method_opt("xtol", 1.1920928955078125e-07)

@DougBurke DougBurke marked this pull request as draft September 6, 2023 13:13
@codecov
Copy link

codecov bot commented Sep 6, 2023

Codecov Report

Merging #1874 (cadb965) into main (585c709) will increase coverage by 0.26%.
Report is 9 commits behind head on main.
The diff coverage is 87.83%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1874      +/-   ##
==========================================
+ Coverage   81.38%   81.64%   +0.26%     
==========================================
  Files          75       75              
  Lines       27155    27321     +166     
  Branches     4119     4163      +44     
==========================================
+ Hits        22099    22307     +208     
+ Misses       4866     4826      -40     
+ Partials      190      188       -2     
Files Coverage Δ
sherpa/astro/ui/utils.py 97.50% <100.00%> (+0.02%) ⬆️
sherpa/astro/ui/serialize.py 87.80% <87.75%> (+10.34%) ⬆️

... and 9 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 585c709...cadb965. Read the comment docs.

@DougBurke DougBurke changed the title Tests: add an example of issue #1873 Ensure that save_all output includes set_psf calls Sep 6, 2023
@DougBurke DougBurke force-pushed the save-all-psf branch 5 times, most recently from db95a1e to babfdbb Compare September 9, 2023 15:32
@DougBurke DougBurke marked this pull request as ready for review September 9, 2023 21:15
@DougBurke DougBurke force-pushed the save-all-psf branch 4 times, most recently from 7126697 to 8fb77e9 Compare September 11, 2023 18:53
@DougBurke DougBurke changed the title Ensure that save_all output includes set_psf calls save_all inprovements for set_psf and load_xstable_model Sep 11, 2023
@DougBurke DougBurke changed the title save_all inprovements for set_psf and load_xstable_model save_all improvements for set_psf and load_xstable_model Sep 11, 2023
@DougBurke DougBurke added the CI arch Set on a PR if the arch tests should be run (to test aarch64 and others) label Sep 11, 2023
Add tests of the current behavior related to sherpa#1789 and sherpa#1880.
These are regression tests so will need updating once we address
these issues.
The problem is that we do not include the "set_psf" call in the
ASCII serialization. Since it's not obvious what the text will be
we can't check the output serialization, but we can at least
check that the PSF has not been applied when we restore the
saved file.
Check
  - linked parameters
  - PHA with no response
  - PHA with type=counts (as part of the previous test)
  - PHA from CSC
  - "full model" handling (currently XFAIL)
  - table model
  - xspec table model (currently a regression test as the
    code does not create a valid save file)
  - pileup model (currently XFAIL because of inability to
    recreate data not read from a file)
  - dataspace1d/2d use (currently XFAIL)
  - example of load_data with an ascii file, which has lead
    to viwing interesting differences in the two I/O backends
  - PHA grating dataset (multiple backgrounds) using wavelength
    XFAIL see sherpa#320
  - PHA2 file XFAIL see sherpa#1882 (issue sherpa#320 plus more)

For those tests where we know they fail because the save_all
output is wrong in some way, we ensure they fail in the
expected manner. This is to make sure that they don't suddenly
start failing for a different reason (due to an error elsewhere)
which we don't see. This will only be noted when running with
the --runxfail pytest flag.
In changing save_all I have found it useful to check that we are
still failing in the same way for issues sherpa#320 and sherpa#1882, so make
the current tests regression tests (so they pass, even though the
output is incorrect) rather than mark them as xfail.

We do not do this for all XFAIL tests.
This slightly-changes the save_all output for PHA background data
but semantically it has the same meaning. Internally it cleans up
the code as we were trying to support possibilities that we know
can't happen (e.g. 2D data with backgrounds).
We know expicitly set the arguments (other than the dataset
identifier) when calling set_analysis. There is no semantic
change here, but it make code more future-proof (there are
some changes I'd like to make to set_analysis at some point
so may as well make sure they won't mess up save_all output).
Since we have the same form for the "banner" lines, we can encode
this as the _output_banner routine. Also some f-string changes.
This is mostly f-string changes but there are a few other code-style
changes.
This was a logical error in the existing code, coupled with
incomplete testing.

The fix is simple: we just need to loop through the session's
_psf dictionary and use the key,value as the arguments to
set_psf call(s).

There is still something weird going on with the warning messages
for this test. I don't understand but have got it to a point
where it works for me.
The DataIMG case avoided setting a filter if it didn't filter
any data, so let's apply that to the other data types. We
use the same scheme for the source and background filters.
Add basic support for calling save_all for a session where one
or more datasets were created manually - e.g. with load_arrays,
dataspace1d, dataspace2d, or set_data.

This only provides basic support, as things like header keywords
are not copied over and there are likely PHA settings that are
not handled.

The aim is to provide basic support and hope that is enough
to be usable, with additions added later. This really needs
changes to the Data object class, to identify when a dataset
has been changed from it's original / read-from-file values,
as

a) we can simplify a lot of the pha code if we don't need to
   reset grouping/quality

b) there are cases where a used can have tweaked a file
   and it would be nice to catch that

although I think there will always be possible problems (e.g.
an on-disk change to the file), so this will never be perfect.

The DataIMG restoration is more-complicated since we have to
worry about data where the coord has been changed from logical,
since we need to send in logical coordinates to the DataIMG
constructor. Fortunately this is relatively easy (although
testing shows some unrelated issues like sherpa#1880 and sherpa#1789).
We now make sure that all imports are done first. This is only
relevant for cases which need to write out manually-created image
files, and so need to also import the sherpa.astro.io.wcs.WCS
symbol.

It is not clear whether we require this symbol to be always
present or it is optional (this is only relevant for the
code generation, not the code created by save_all).

This is actually a large code change, as we now create the output
contents before writing it out to a file. There's more that
could be done [*], but for now leave as just this change.

[*] for instance, the "output state" could be built up by a
    class rather than just manually adding to a dictionary,
    but we don't yet need the class (e.g. to over-ride it)
There is no "functional" change to the output as it only removes
header sections that are not needed (related to data input).
This avoids writing out banners for components and model
expressions, and removes some blank lines that were being
added.

The approach taken to avoiding printing out the banned is
rather hacky, but it works. The code could be re-written
to only add the banner if we know we need it, but that is
a larger change I didn't want to make here.
There's no need to add an "if <check>" for grouping and background
subtraction since we know the result of the call, so we just make
them instead.
Separate out some of the code for setting up datasets, and take
advantage of knowing we have a DataPHA object to avoid quite so
many try/except blocks. We know prefer to send the DataPHA
object around, and use that API rather than the UI API, to
access the data we need.

As part of this we have removed more "blank" sections for PHA
files (no semantic changes).
@DougBurke
Copy link
Contributor Author

Rebased to address conflicts in the tests (so no code changes needed).

@DougBurke
Copy link
Contributor Author

@anetasie - can you try this PR out and see if it makes sense (assuming you use save_all?). Hopefully the PR text provides enough context. I understand you wouldn't want to look at the actual code changes!

@anetasie
Copy link
Contributor

@DougBurke I don't use save_all, but I should! I just tried with 2D image models and 2D psf and the save_all file looks good.
I was wondering what is the difference between save_all to a python file and save to binary file and when one would use each option. This question is for the docs and not related to this PR.

@DougBurke
Copy link
Contributor Author

So, save_all is technically the better approach for long-term use, as you can load it into a newer Sherpa 9maybe modifying it if needed) or on a different architecture (e.g. intel vs macOS ARM) and save/restore is just for "quick" saving (e.g. I want to be able to restore this tomorrow of after the weekend on the same machine).

However, save_all does not "save everything": there are some settings we do not try to save, there are features of the UI code we just don't test and hence fail to restore properly (e.g. issue #1873, which this PR fixes, but there are other known problems), and there are some ways of using the UI code that we don't really handle nicely (e.g. creating data "on the fly" rather than reading it in from a file, which this PR makes some small improvements to but it's still not great). So unfortunately save_all isn't perfect. However, we do not guarantee that the save/restore file can be read in on newer (or older) Sherpa versions.

@DougBurke
Copy link
Contributor Author

For @anetasie - you can compare the documentation at

to suggest improvements (technically you should change to the "current" version of the documentation but I don't believe it's changed since 4.15.1 for these two functions).

@anetasie
Copy link
Contributor

@DougBurke my only addition to the docs would be an example of how to 'restore' the session using the save_all file.
I used run -i save.session.

@DougBurke
Copy link
Contributor Author

So, re-running the output of save_all is actually surprisingly tricky, since

%run -i foo.py

only works for ipython. If you are running python it fails. Python 3 explicitly took out an easy way to do what we want here !

Copy link
Contributor

@hamogu hamogu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good. I left some suggetions for typos in comments, just to proove to you that I indeed read the code. However, with thousands of lines of string formatting, I could still have missed something, but there are ample tests, so I fill that's sufficient.

There are ~2 actual comments on the code, but both are judgement calls that could be done either this way or that way. So, I don't expct them to accepted, I just wanted to bring it up while I was reading through the code.


# The par parameter is hard to type as it's Union[Parameter,
# XSBaseParameter] but the latter is only defined if XSPEC support is
# enabled. One way around this is to create a Protocol for defining
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just read up on Protocols, as I wanted to know if there was a way to make it more obscure and more complicated! Didn't use to be in the old days of Python, but now there is!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sherpa/astro/ui/serialize.py Outdated Show resolved Hide resolved
sherpa/astro/ui/serialize.py Outdated Show resolved Hide resolved
cmd = 'load_user_model({}, "{}")'.format(
mod.calc.__name__, modelname)
_output(cmd, fh)
_output(out, _reindent(pycode))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this suggestion is applied it also needs a import textwrap at the top.

Suggested change
_output(out, _reindent(pycode))
textwarp.dedent(out, _reindent(pycode))

The behaviour is not exactly identical to _reindent, e.g.

    def f(a):
# Not sure why I would write this
        return a    

would be detended in line 1 and 3 by _reindent, but not by textwrap.detent. On the other hand, _reindent above (I'm making the comment here because this is the only line that uses it, so if we switch to textwarp.detend the entire definition of _reindent could be removed) also has a comment that it's fragile, just in other ways, for example if the first line does not start with "def":

    @some_decorator
    def func(a):
        pass

or

    class A():
        def __call__(self, a):
            return a

would be treated correctly by textwrap.detend, but not by _reindent. I tried one of the the example, and we could actually get such a case here:

In [3]: inspect.getsource(func)
Out[3]: '@functools.cache\ndef func(a):\n    return a\n'

On balance, I don't care. I'm always for removing code and using the standard library where possible, but I'm not going to argue that this problem is better than that problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with this code is that I haven't been able to easily trigger it, so if we can replace code with standard-specific stuff - I was looking when I wrote it originally but missed textwrap. As I always do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not change this at this stage (better to go with the devil you know) given that although this line was changed in this PR it isn't related to the suggested change here. We can do this post the 4.16 release.

sherpa/astro/ui/serialize.py Show resolved Hide resolved
Comment on lines 1158 to 1162
# This case shpuld not happen, but just in case.
#
if xspec is None:
return

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe?

Suggested change
# This case shpuld not happen, but just in case.
#
if xspec is None:
return
# This case shpuld not happen, but just in case.
assert xspec is not None

potentially more disrupting to users if this case happens, but for us, we would know we've run into a problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this use case I don't want the user to be disrupted in case something strange has happened.

Comment on lines +1212 to 1213
"""Import the WCS symbol if not done already.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Import the WCS symbol if not done already.
"""Import the WCS symbol.

There is no test if it's done already. But there is also little harm in putting out that import statement more thna once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But there is a test (implicitly) because out["imports"] is a set, so it only gets added once. So I'm keeping the original text.

Co-authored-by: Hans Moritz Günther <moritz.guenther@gmx.de>
@DougBurke
Copy link
Contributor Author

Argh - one of the suggested changes broke a number of tests. Will have to fix (probably by reverting the problem change).

@DougBurke DougBurke removed the CI arch Set on a PR if the arch tests should be run (to test aarch64 and others) label Sep 27, 2023
Co-authored-by: wmclaugh <wmclaugh@cfa.harvard.edu>
@wmclaugh wmclaugh merged commit 6fa5a0c into sherpa:main Sep 28, 2023
17 checks passed
@DougBurke DougBurke deleted the save-all-psf branch September 28, 2023 10:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

save_all for 2D dataset does not save set_psf()
4 participants