PSF rebinning (fix #43) #607
Conversation
I just pushed a basic implementation of the fix. Some of the tests will fail for a number of reasons:
The bulk of the change is in … Two other interesting places to visit are in the same file, in …

It will be important to add a mechanism for disabling this behavior, because the performance impact is significant. We are also using a …

Generally speaking, I think I should get better results. I had to set the tolerance to 3% for all the tests I expect to pass to actually pass. Nothing obviously wrong is happening, and maybe that is all I can hope for considering the effects of the convolution, but the feeling is that the results are somewhat biased, with the source's variance always ending up being overestimated by as much as 3%. I'll see if I can find some diagnostics for Sherpa to produce to give me a better idea of where things might be getting off, but again, maybe it's all good to begin with.

When comparing pixel sizes I am assuming pixels are squares, and I am only using the first component of … PSFSpace2D might be a little bit too simplistic. I am not sure whether setting the end of the linspace to … I would be surprised if I hadn't introduced any issues that have to do with WCS or anything else. I will need to collect more realistic test cases for assessing this.

@juramaga and @anetasie I'd appreciate comments on this, and I'll get in touch to get additional requirements and test cases. |
I was under the impression that convolution is commutative (see, for example, https://ccrma.stanford.edu/~jos/st/Commutativity_Convolution.html), so I don't understand the following statement: "...If the PSF has a coarser resolution then Sherpa will issue an Exception, as that behavior is not currently defined...." But hey, what do I know. |
@dtnguyen2 I don't understand what commutativity has to do with my statement. The implementation evaluates the model and convolves it with the kernel in the "PSF space" (as per comments) when the PSF has a better resolution. If the PSF has a worse resolution the implementation would evaluate the model in a space that has a worse resolution than the data, which would be wrong anyway, so I just stopped there and hard-coded an exception: https://github.com/sherpa/sherpa/pull/607/files#diff-35f9d411d670ea181d14083553f381e5R565 |
That convolution is commutative means there is nothing special about the PSF space vs. the data space: you can switch the models around and you should get the same result. If you were able to get a satisfactory result when the data space is coarser than the PSF, then switch the names around and you should get the same result. If not, then I would say there is an issue with the implementation. If you don't think that is true then perhaps you should publish it; just make sure to leave my name off the paper. |
Dan, it's not just convolution that is going on here, but rebinning of the result to match the data space. This is where the constraint on the pixel size comes in: we support degrading the resolution of the convolution (in other words, the convolution is done with smaller pixels than the data) but do not support the case where the convolution would be done with larger pixels than the data (where we'd have to interpolate or come up with some other scheme). It is this latter case where the code is going to throw an exception. |
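A minimal numpy/scipy sketch of the scheme described here (function names are illustrative, not Sherpa's actual API): convolve on the fine PSF grid, block-sum the result down to the data grid, and reject the unsupported coarser-PSF case.

```python
import numpy as np
from scipy.signal import fftconvolve

def convolve_then_rebin(model_fine, psf_fine, factor):
    """Convolve on the fine (PSF) grid, then degrade to the data grid.

    model_fine -- model evaluated on the fine grid (shape divisible by factor)
    psf_fine   -- PSF image sampled on the same fine grid
    factor     -- integer ratio data_pixel / psf_pixel (must be >= 1)
    """
    if factor < 1:
        # PSF coarser than the data: not supported, mirroring the PR's behavior
        raise ValueError("PSF pixel size is larger than the data pixel size")
    conv = fftconvolve(model_fine, psf_fine / psf_fine.sum(), mode="same")
    ny, nx = conv.shape
    # Block-sum: each data pixel collects a factor x factor tile of fine pixels
    return conv.reshape(ny // factor, factor, nx // factor, factor).sum(axis=(1, 3))
```

Because the kernel is normalized, the block sum preserves the model's total counts (away from the edges), which is the invariant the rebin assertion discussed later in the thread checks.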
That is my point: whatever you do when the data space is coarser than the PSF, you should do the same when the PSF is coarser than the data space. Folding the PSF with respect to the data space is the same as the other way around. As for the 3% offset: if you are not adding noise to your model and PSF, then it doesn't sound right. BTW, if you are using LevMar, try setting epsfcn to double_epsilon; the default is float_epsilon because XSPEC models are mostly single precision, at least they used to be. |
Dan, if I understand correctly, you are suggesting that when the PSF is coarser than the data we rebin the PSF to match the data space and evaluate the model (with the convolution) in that space. Well yes, I expect that to be one of the options. I had even tried to implement it, as mentioned in my notes:
Note that this is not really symmetrical: the evaluation plus convolution in the higher resolution PSF space and the rebinning back to the data space must happen at each step of the minimization, while the rebinning of the coarser PSF to the higher resolution data space can be done only once, at the beginning of the fit.

If instead you are suggesting that we should just apply the same logic to both cases, I don't think that's the right approach, because then the model would be evaluated in a much coarser space and then rebinned to match the "high-res" data. I'd expect that to introduce a lot of noise and in any case to reduce the accuracy of the fit. Maybe for a noiseless Gaussian I should still get good results, but that is why I am publishing the WIP. Note that I only get "perfect" results when I have plenty of bins in the image, so the source and the PSF are effectively 0 at the edges. But then the convolution and the rebinning would take forever, so I reduced the number of bins, and boundary effects might become more significant. I'll run some more tests.

It's really up to SDS to define the behavior anyway. Sherpa might even just error out and ask the user to do something, e.g. rebin the PSF themselves or instruct Sherpa to assume the PSF has the same pixel size as the data.

In any case, yes, I am concerned that the fit parameters I get are 3% off the true ones. It could be something in the rebinning function, or some wrong assumption in my implementation, or it could be expected given that rebinning is involved. And yes, as I mentioned, I am looking for a critical review of the current implementation, because there might be errors that a) introduce that 3% "error" and b) make the inverse logic for the coarser PSF work even worse than it should, whatever the desired behavior in that case should be. |
Is the failure an indication that the PR is still a work in progress? Is this ready for review? |
As far as I know this is still in progress. Omar was going to implement it based on the conclusion from my report.
-- Rafael Martínez-Galarza, Chandra X-Ray Data Center, Center for Astrophysics | Harvard & Smithsonian
|
@anetasie for the description of the failure, and for a number of open questions, please see the first comment I added above. The only thing not current in there is that we now think the loss in precision is expected because of the rebinning. All the other comments, questions, and the request for review still stand and should be addressed. |
I get an AssertionError when I try to use the higher resolution PSF with the Sherpa build from this branch.
|
The test data set is in the local directory /pool14/aneta/PSF
Regarding point 3 above: I believe the PSF resolution is always at least as good as the data resolution. But the user could always (on purpose or by mistake) input a PSF that is coarser. In that case we have two options:
I prefer the second option, as I can imagine a case in which a user would like to know how the fit would behave with a poorer resolution (with another instrument, for example).

Also, I just discussed with Omar that we should also be testing that the MCMC part is improved by the use of a finer grid. I showed this in my report, but Omar has been checking the optimization part only, not the sampling. I believe, however, that we should get the improvement in MCMC sampling for free, because the sampling is performed on the convolved model. Am I right?

Finally, I do believe that we should keep the finer grid fit optional, due to concerns about computing time. Namely, if the PSF resolution is finer than the data, we should ask the user if they want to perform the convolution on the PSF grid, but warn them that this might take a while to compute. We should also ask them to provide a PSF whose resolution is an integer multiple of the data resolution, as the rebinning code we have implemented requires this. |
@anetasie that code is in the rebin function and it checks that the rebinning is not "leaking" flux. You can reduce the error by changing this line: https://github.com/sherpa/sherpa/pull/607/files#diff-7bd52aa2f9422f0c42ca70db9ef0038bR647 The original code had 0.1 |
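For illustration, a sketch of this kind of flux-conservation guard. The function name is hypothetical: the real check is inside the linked rebin code, and the tolerance is the value on the linked line (originally 0.1).

```python
import numpy as np

def check_flux_conserved(original, rebinned, rtol=1e-3):
    """Raise if rebinning 'leaked' flux beyond a relative tolerance.

    Hypothetical helper sketching the check discussed above; the
    tolerance value is the one being tweaked in the linked diff.
    """
    total_in = float(original.sum())
    total_out = float(rebinned.sum())
    if not np.isclose(total_in, total_out, rtol=rtol):
        raise AssertionError(f"flux not conserved: {total_in} -> {total_out}")
```

Summing 2x2 blocks of a uniform image conserves total flux exactly, so such a check passes; a rebin that drops counts would trip the assertion, which matches the AssertionError reported above.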
@olaurino I updated the line and retested. I don't get the error when I run the fit for the first time in the session with the default method |
Here is the screen output - I updated the test script in /pool14/aneta/PSF
|
I haven't traced through the code, but it looks like the failure comes from the assert statement that compares two floating point numbers for closeness. If that is the case then one has a lot of options; choose your poison: https://docs.scipy.org/doc/numpy/reference/generated/numpy.isclose.html, or sao_fcmp or Knuth_close defined in sherpa/utils/__init__.py |
Oops, just realized that there is also the function Knuth_boost_close, which I think is the Boost interpretation of Knuth's algorithm. BTW, it looks like I forgot to include Knuth_boost_close in the definition of __all__ in the sherpa/ui/__init__.py file. |
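For reference, the two stock options behave slightly differently: numpy.isclose tests |a - b| <= atol + rtol * |b| (asymmetric in a and b), while math.isclose tests |a - b| <= rel_tol * max(|a|, |b|) (symmetric, no absolute term by default). A quick comparison (sao_fcmp and Knuth_close are Sherpa internals and are not shown here):

```python
import math
import numpy as np

a, b = 1.0000001, 1.0  # differ by about 1e-7

# numpy.isclose: |a - b| <= atol + rtol * |b|
print(np.isclose(a, b, rtol=1e-5, atol=0.0))  # True: 1e-7 <= 1e-5
print(np.isclose(a, b, rtol=1e-9, atol=0.0))  # False: 1e-7 > 1e-9

# math.isclose: |a - b| <= rel_tol * max(|a|, |b|)
print(math.isclose(a, b, rel_tol=1e-5))       # True
```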
@anetasie you can also just comment out the assertion to begin with, and maybe print out the difference. I'll add this to the list of things to investigate, but yesterday @juramaga was telling me we could probably switch the rebinning function to one that does not do additional calculations and just adds the bins together rather than the … |
@olaurino I can get the fit to run with the assertion commented out in the code.
I expected |
I'm wondering if the hit in performance when working with a higher resolution PSF is mainly due to the rebin_flux step, and whether performance will improve with the simple summation. Now, the convergence is significantly slower (with a similar number of evals, 912 vs. 818) for the PSF image with a factor of 2 smaller pixel size than the data (~550 sec) vs. the same binning of the PSF and the data (~1 sec). |
My tests indicate that rebin_flux is very inefficient. rebin_int is much faster and the results are almost identical.
|
What does "much faster" translate to: a factor of 10, 100, 1000, etc.? |
@anetasie For the computation times (550 vs 1 sec), what size image were you using? I haven't read through rebin_flux, nor the much faster rebin_int, but I suspect that they contain two nested loops (since we are talking about an image, after all), so there is reason to suspect that they will scale quadratically. Shouldn't the aforementioned rebin functions be translated to C/C++ land? If for some crazy reason the rebin function must be in Python, then shouldn't it be parallelized? |
@anetasie - with regards to the binning choice for … As we're not using integrated models, if you display the background (e.g. …
Here is rebin_int:

def rebin_int(array, factorx, factory):
    …
|
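The body of rebin_int did not survive the paste above; only the signature is visible. Given that the thread describes it as simple summation of bins, a reconstruction (assuming the array shape is divisible by the factors, and that factory applies to the y axis) might look like:

```python
import numpy as np

def rebin_int(array, factorx, factory):
    """Downsample a 2D array by integer factors, summing each block.

    Reconstruction of the snippet above, of which only the signature
    survived; assumes array.shape is divisible by (factory, factorx).
    """
    ny, nx = array.shape
    return (array
            .reshape(ny // factory, factory, nx // factorx, factorx)
            .sum(axis=(1, 3)))
```

A reshape-and-sum like this is fully vectorized in numpy, which would explain why it is "practically instantaneous" compared to a Python-loop implementation, and the total flux is conserved by construction.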
Without having tested rebin_int, it looks like it should be practically instantaneous. |
@DougBurke - well, I want to display the image of the 'source' at the resolution of the psf used in the convolution, so before the rebinning is applied. It will be more complex if you have no convolved background component, or use expmap. For expmap - does the resolution of expmap need to match the psf resolution or the data image resolution? |
I performed additional tests using Frank's simulated sources. In my tests I used both the case where the data pixel is an integer multiple of the PSF pixel and the case where it is not (so Sherpa selects between rebin_int and rebin_flux as appropriate). I ran fits with the data and PSF having the same pixel size, and also with different combinations of the pixel size ratio. Results look good. MCMC sampling is still not too efficient, but it runs fine in all cases (as long as a good covariance matrix is available). So you have my green light for freezing the code. |
@DougBurke requested a way to get the information on the data/psf pixel ratio. I added it to the PSFSpace2D class, so that:
will give a tuple with the x/y ratios ( |
@juramaga yes I tested |
I will add a separate issue about access to the model evaluation on the high-res PSF grid, so one can get the images of model components at either the high or the low resolution.
I pushed a couple of changes. First, I renamed the … Secondly, I added code that rounds the pixel size ratio to the closest integer. The default tolerance is 0.1, and users can change it (see https://github.com/sherpa/sherpa/pull/607/files#diff-dd7454052b970f18eea424de5396e853R225)
Data2D does not have WCS information, so I am assuming that it is OK to consider the pixel sizes to be the same. This should be the equivalent of a regular application of a kernel in image manipulation: when working with photographs there is no such concept as a "pixel size", and the kernel is just applied pixel by pixel. So I changed the test to use DataIMG instead of Data2D. The tests are failing for the same reason; however, there is now a warning issued because the PSF we created does not have any pixel size information.
Note the tests now also issue a warning, which is expected given the current implementation. Once the bug is fixed, the warning will be gone.
When the ratio is an integer use rebin_int, which is faster. Otherwise use rebin_flux. Note rebin_flux is useful when the model is using an arbitrary grid with a non-integer ratio with the data grid.
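The selection rule in that note, combined with the ratio-rounding tolerance (default 0.1) mentioned above, could be sketched as follows. The function names are illustrative, not the actual Sherpa code:

```python
def is_integer_ratio(psf_pixel, data_pixel, tol=0.1):
    """Is the data/PSF pixel ratio close enough to an integer?

    Illustrative helper; tol mirrors the 0.1 default tolerance
    mentioned in the thread.
    """
    ratio = data_pixel / psf_pixel
    nearest = round(ratio)
    return nearest >= 1 and abs(ratio - nearest) <= tol, nearest

def choose_rebin(psf_pixel, data_pixel):
    """Pick the fast summing rebin for integer ratios, else rebin_flux."""
    ok, factor = is_integer_ratio(psf_pixel, data_pixel)
    return ("rebin_int", factor) if ok else ("rebin_flux", None)
```

For example, a PSF pixel half the data pixel selects rebin_int with factor 2, while a ratio like 1.43 falls back to the slower, flux-conserving rebin_flux.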
… more counts were introduced by evaluating the model on a bigger grid without scaling the results to account for that. In principle we would be better off integrating the models, but that is not possible because some of the models are not available as integrated functions. I also updated PSFSpace2D to produce a better spacing between the pixels, as well as decoupling the x and y pixel sizes. Finally, when the PSF has a worse resolution than the data, Sherpa errors out. Tests were updated so they now pass.
Release Note
Sherpa now supports using a PSF with a finer resolution than the data for 2D images. If Sherpa detects that the PSF has a smaller pixel size than the data, it will evaluate the model in a "PSF space" that has the same resolution as the PSF and the same footprint as the data, then rebin the evaluated model back to the data space for calculating the statistic.
This PR fixes #43.
See the comments for more details.