Importing Bugs #13924 - SegFault with fake_pha #56

anetasie · 2015-06-26T22:02:40Z

SegFault with fake_pha on some systems was reported.
The original bug and the data are located here:
/data/lenin2/Bugs/Sherpa/Nustar/HD_15934

and the script 'test2.sph' fails on MAC.

This bug is specific to X-ray spectral analysis.
In CIAO version the screen output when running on the command line shows the following:

% sherpa 
----------------------------------------------------- 
Welcome to Sherpa: CXC's Modeling and Fitting Package 
----------------------------------------------------- 
CIAO 4.6 Sherpa version 2 Wednesday, January 22, 2014 
sherpa-1> set_source("faked", "xsapec.A") 
sherpa-2> fake_pha("faked", arf="tvcol_A.arf", rmf="tvcol_A.rmf", exposure=1.e8, 
grouped=False) 
/soft/ciao-4.6/binexe/sherpa: line 199: 28184 Segmentation fault (core 
dumped) ipython --profile sherpa

The text was updated successfully, but these errors were encountered:

olaurino · 2015-07-08T15:27:18Z

Might this be related to issue #55, in particular this comment from @DougBurke?

DougBurke · 2015-07-14T17:18:28Z

The data files (tvcol_A.arf and tvcol_A.rmf) plus an example script are available at https://gist.github.com/DougBurke/04a41831a70e7a59d3e4

DougBurke · 2015-07-14T17:51:58Z

So, it appears that this is related to #62. The problem here is that the ARF grid is not contiguous, that is the energ_hi value for row i is not always the same as the energ_lo column for row i+1. It appears that this makes Sherpa send in a non-contiguous grid of lo/hi values to the models. Since the apec model is in use here, it triggers #62.

Using the data from the gist at https://gist.github.com/DougBurke/04a41831a70e7a59d3e4 and the latest master branch (so using astropy for file I/O, but the same result is seen in CIAO 4.7 with crates as the I/O back end):

In [1]: import numpy as np

In [2]: from sherpa.astro import ui
WARNING: imaging routines will not be available, 
failed to import sherpa.image.ds9_backend due to 
'RuntimeErr: DS9Win unusable: Could not find ds9 on your PATH'

In [3]: arf = ui.unpack_arf('tvcol_A.arf')

In [4]: elo = arf.energ_lo

In [5]: ehi = arf.energ_hi

In [6]: elo.size
Out[6]: 4096

Are the energy bins for the ARF (the energ_lo and energ_hi columns) ordered correctly?

a) Are the low edge of each bin less than the high edge?

In [7]: (elo < ehi).all()
Out[7]: True

b) Are the low edges monotonically increasing?

In [8]: (elo[1:] > elo[:-1]).all()
Out[8]: True

c) Does the high edge of bin i match the low edge of the next bin (i+1)?

In [9]: (elo[1:] == ehi[:-1]).all()
Out[9]: False

How many are there like this?

In [10]: (elo[1:] == ehi[:-1]).sum()
Out[10]: 3079

In [11]: idx, = np.where(elo[1:] != ehi[:-1])

In [12]: idx.size
Out[12]: 1016

So there's a lot. Where are they (the first few)?

In [13]: idx[:10]
Out[13]: array([ 1,  5,  7, 13, 18, 27, 30, 40, 43, 54])

Let's look at the first 10 pairs since this contains three problem bins:

In [14]: zip(elo,ehi)[:10]
Out[14]: 
[(1.6000000238418579, 1.6399999856948853),
 (1.6399999856948853, 1.6799999475479126),
 (1.6800000667572021, 1.7200000286102295),
 (1.7200000286102295, 1.7599999904632568),
 (1.7599999904632568, 1.7999999523162842),
 (1.7999999523162842, 1.8399999141693115),
 (1.8400000333786011, 1.8799999952316284),
 (1.8799999952316284, 1.9199999570846558),
 (1.9200000762939453, 1.9600000381469727),
 (1.9600000381469727, 2.0)]

Looking at the second and third pairs, we have

 (1.6399999856948853, 1.6799999475479126),
 (1.6800000667572021, 1.7200000286102295),

and you can see that there is a gap: the first bin ends at 1.679999... and the next one starts at 1.68000....

It appears that the RMF does not have these issues, as shown below (this time using the response-model created by Sherpa to access the lo/hi bins used for evaluating the model, since this is what the unpack_pha example uses, but I manually create a DataPHA dataset). First with just a RMF, where the grid has no gaps:

In [1]: import numpy as np

In [2]: from sherpa.astro import ui
WARNING: imaging routines will not be available, 
failed to import sherpa.image.ds9_backend due to 
'RuntimeErr: DS9Win unusable: Could not find ds9 on your PATH'

In [3]: x = np.arange(1, 4097, 1)

In [4]: y = np.ones(x.size)

In [5]: ui.load_arrays(1, x, y, ui.DataPHA)

In [6]: ui.load_rmf(1, 'tvcol_A.rmf')

In [7]: ui.set_source(1, ui.powlaw1d.mdl1)

In [8]: eval_model = ui.get_model(1)

In [9]: print(eval_model)
apply_rmf(powlaw1d.mdl1)
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   mdl1.gamma   thawed            1          -10           10           
   mdl1.ref     frozen            1 -3.40282e+38  3.40282e+38           
   mdl1.ampl    thawed            1            0  3.40282e+38           

In [10]: xlo = eval_model.xlo

In [11]: xhi = eval_model.xhi

In [12]: (xlo < xhi).all()
Out[12]: True

In [13]: (xlo[1:] == xhi[:-1]).all()
Out[13]: True

The xlo/xhi grid is contiguous because the output of line 13 is True. If I now add in an ARF, we see gaps (the equivalent of line 13, which is line 20, is now False):

In [14]: ui.load_arf(1, 'tvcol_A.arf')

In [15]: eval_model2 = ui.get_model(1)

In [16]: print(eval_model2)
apply_rmf(apply_arf(powlaw1d.mdl1))
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   mdl1.gamma   thawed            1          -10           10           
   mdl1.ref     frozen            1 -3.40282e+38  3.40282e+38           
   mdl1.ampl    thawed            1            0  3.40282e+38           

In [17]: xlo2 = eval_model2.xlo

In [18]: xhi2 = eval_model2.xhi

In [19]: (xlo2 < xhi2).all()
Out[19]: True

In [20]: (xlo2[1:] == xhi2[:-1]).all()
Out[20]: False

DougBurke · 2015-07-14T18:32:20Z

This suggests that there's two issues here

the crash, which is being dealt with in XSpec model evaluation for non-continuous grids (Xspec models that call sumdem, such as apec, can crash) #62 (and is apparently fixed in the next XSpec release)
what to do with ARF and RMF grids that do not match (I think there has previously been discussion of this, a long time ago, but don't have anything to point to). It may be worth a new ticket.

When run with the code and data files given in issue sherpa#56 - e.g. https://gist.github.com/DougBurke/04a41831a70e7a59d3e4 - then the model now fails with the following message ValueError: Grid cells overlap: cell 360 (16 to 16.04) and cell 361 (16.04 to 16.08)

DougBurke · 2015-07-15T13:50:43Z

So, if PR #63 is used - using @6f1884e - then we get

a) with a 32-byte, FORTRAN-style XSpec model

The following works (note: as this was an "occasional crash" previously, just running to the end doesn't necessarily mean it is fixed, but it is now only calling the model once with "gaps" being dealt with):

import numpy as np
from sherpa.astro import ui

# this is a 32-byte float model
ui.set_source("faked", "xsapec.A")

A.abundanc=1.0
A.redshift=0.0

arfdata = ui.unpack_arf("tvcol_A.arf")
rmfdata = ui.unpack_rmf("tvcol_A.rmf")

A.kT=1.0
A.norm=1.0
ui.fake_pha("faked", arf=arfdata, rmf=rmfdata, exposure=1.e8, grouped=False)

b) with a 64-byte, C++ style XSpec model

If the set_source line is changed to

ui.set_source("faked", "xspowerlaw.mdl2")

then the fake_pha call will now error out (in fact, any attempt to evaluate the model will do so) with the following error message

ValueError: Grid cells overlap: cell 360 (16 to 16.04) and cell 361 (16.04 to 16.08)

This particular row in the ARF is

% dmlist tvcol_A.arf data,clean rows=359:363
#  ENERG_LO             ENERG_HI             SPECRESP
        15.9200000763        15.9600000381       199.3535614014
        15.9600000381                 16.0       198.8935852051
                 16.0        16.0400009155       198.4293060303
        16.0399990082        16.0799999237       197.9608459473
        16.0799999237        16.1200008392       197.4918365479

So you can see that there is an overlap: 15.96 - 16.0, 16.0 - 16.040xxx, 16.0399xxx - 16.07999

The output format used by the error message doesn't show the overlap, since it has rounded down to 2 decimal places, so you just see 16.04 repeated.

So, as there is this overlap, why doesn't the xsapec model trigger the warning? Well, the code for the 32-byte float models now deals with the energy array as 32-byte floats not 64-byte floats, including checking for overlap - so using FLT_EPSILON rather than DBL_EPSILON. For this ARF, the overlap bins, such as shown below, are all within FLT_EPSILON, but not DBL_EPSILON, hence the difference in behaviour.

Note that the columns are stored as 32-byte floats in the ARF:

% dmlist tvcol_A.arf cols

--------------------------------------------------------------------------------
Columns for Table Block SPECRESP
--------------------------------------------------------------------------------

ColNo  Name                 Unit        Type             Range
   1   ENERG_LO             keV          Real4          -Inf:+Inf            label for field 1
   2   ENERG_HI             keV          Real4          -Inf:+Inf            label for field 2
   3   SPECRESP             cm**2        Real4          -Inf:+Inf            label for field 3

DougBurke · 2015-10-09T20:48:37Z

Fixed by #101

anetasie added type:bug priority:high area:code note:cxc-ticket labels Jun 26, 2015

anetasie added this to the 4.8rc1 milestone Jun 26, 2015

DougBurke mentioned this issue Jul 8, 2015

XSpec model evaluation for non-continuous grids (Xspec models that call sumdem, such as apec, can crash) #62

Closed

olaurino added the dep:xspec label Jul 9, 2015

olaurino assigned DougBurke Jul 10, 2015

DougBurke mentioned this issue Jul 14, 2015

Importing Bugs # 13901 - Seg.Fault with group/ignore #55

Closed

DougBurke mentioned this issue Jul 15, 2015

Fix and clean up of the interface to XSpec models #63

Closed

DougBurke mentioned this issue Sep 11, 2015

Fix handling of non-contiguous grids in XSPEC models #101

Merged

DougBurke closed this as completed Oct 9, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Importing Bugs #13924 - SegFault with fake_pha #56

Importing Bugs #13924 - SegFault with fake_pha #56

anetasie commented Jun 26, 2015

olaurino commented Jul 8, 2015

DougBurke commented Jul 14, 2015

DougBurke commented Jul 14, 2015

DougBurke commented Jul 14, 2015

DougBurke commented Jul 15, 2015

DougBurke commented Oct 9, 2015

Importing Bugs #13924 - SegFault with fake_pha #56

Importing Bugs #13924 - SegFault with fake_pha #56

Comments

anetasie commented Jun 26, 2015

olaurino commented Jul 8, 2015

DougBurke commented Jul 14, 2015

DougBurke commented Jul 14, 2015

DougBurke commented Jul 14, 2015

DougBurke commented Jul 15, 2015

DougBurke commented Oct 9, 2015