
Spectrum rebinning #29

Closed
bplimley opened this issue Mar 8, 2017 · 23 comments · Fixed by #78
@bplimley (Contributor) commented Mar 8, 2017

  • Rebinning according to an input bin edges vector
  • rebin_like(spec) to use the binning from another spectrum

Please expand on this; I'm not very familiar with rebinning methods or needs.

@markbandstra (Member) commented Mar 10, 2017

@thjoshi and @jccurtis might have some thoughts on this. There are many ways one could do this, off the top of my head:

  • Preserve bin edges: Require the preservation of Poisson statistics by not splitting any bins. This of course would highly restrict the possible rebinnings one could do; e.g., a spectrum of 1024 bins can be downsampled to 512 bins by combining adjacent pairs of bins.
  • Interpolation: Allow counts to be split into fractions that are moved into the new bins. The spectrum is converted from an array of ints to an array of floats.
  • Stochastic rebinning: Convert the histogram to list-mode energies by sampling energies uniformly from each bin, then bin those energies into the new binning. The statistical properties are largely preserved.

Someone else might know more about what is practically and mathematically desirable.
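The stochastic option above could be sketched like this (a minimal illustration, not proposed API; the function name and signature are made up):

```python
import numpy as np

def stochastic_rebin(counts, old_edges, new_edges, rng=None):
    """Rebin a histogram by sampling list-mode energies (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=int)
    old_edges = np.asarray(old_edges, dtype=float)
    # Draw one uniform energy per count, within that count's original bin
    lows = np.repeat(old_edges[:-1], counts)
    highs = np.repeat(old_edges[1:], counts)
    energies = rng.uniform(lows, highs)
    # Bin the list-mode energies into the new binning
    new_counts, _ = np.histogram(energies, bins=new_edges)
    return new_counts
```

If the new edges cover the full old range, the total number of counts is conserved exactly.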

This comment was moved over from #28 and expanded. Thanks @chunhochow for the references and finding a relevant discussion in Knoll, 4th Edition, 18.IV.B.

@chunhochow (Contributor) commented Mar 10, 2017

Also found this: https://github.com/jhykes/rebin, which claims to implement the (interpolation) routine described in Knoll (4th ed) 18.IV.B. It has a permissive license (attribution + copyright notice + disclaimer).

@jccurtis (Collaborator)

@thjoshi - I'll put a version of the interpolation rebin from GRPP into bq unless you have an objection.

@bplimley (Contributor, author) commented Apr 5, 2017

I will try to start using Bq for the FERMI project if we can get this feature soon. @jccurtis @thjoshi can we start putting this together? I can help integrate code if needed.

@jccurtis (Collaborator) commented Apr 6, 2017 via email

@bplimley (Contributor, author) commented Apr 10, 2017

If a spectrum is rebinned with arbitrary (unequal) bin widths, you can't assign a calibration curve based on those new bins, right? This is in contrast to bins of equal width (in channel or energy), which just rescale a variable in the calibration curve.

For arbitrary bin widths, it seems to me we would either abandon the idea of a "curve" and just keep track of the bin edges in keV, or remember the original bins that the calibration curve applies to.

Looking forward to seeing the GRPP version when you have a chance, @jccurtis; the code will help address my questions/thoughts.
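To illustrate the "keep track of the bin edges in keV" option: evaluate the calibration curve once at the original edges, then carry the keV edges alongside the rebinned counts. Everything here is a made-up example (the linear calibration, its coefficients, and the variable names are all assumptions):

```python
import numpy as np

# Hypothetical linear calibration curve: channel -> keV
def cal_kev(channels):
    gain, offset = 0.35, 1.2  # assumed coefficients
    return gain * np.asarray(channels, dtype=float) + offset

old_edges_ch = np.arange(1025) - 0.5   # 1024 channels
old_edges_kev = cal_kev(old_edges_ch)  # curve evaluated once, at the old edges
# After an arbitrary rebin, the spectrum keeps explicit keV edges
# rather than a calibration curve (64 new bins as an example):
new_edges_kev = np.linspace(old_edges_kev[0], old_edges_kev[-1], 65)
```

The original curve is only ever applied to the bins it was fit for; everything downstream works with explicit keV edges.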

@jccurtis (Collaborator) commented Apr 12, 2017

I've created a new branch (feature-rebinning) to integrate the interpolation rebinning functionality as written by @thjoshi. I added two tests (one for count summation and another for plotting). I also added a tool to spectrum.py to turn bin edges and heights (counts) into x/y for stepped plotting. I will optimize the rebin function with numba, but it is already pretty fast (and definitely fast enough for RadWatch's use).

@thjoshi - Please advise if you think usage of this code is inappropriate or poses another conflict. I think we should squash the commits before merging to remove the code which is currently a blatant rip from GRPP.
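For reference, the edges-and-heights-to-steps conversion can be done with two np.repeat calls. This is only a sketch of the idea; the function name and signature are made up and may not match the actual spectrum.py helper:

```python
import numpy as np

def bin_edges_and_heights_to_steps(edges, heights):
    """Expand N+1 bin edges and N heights into 2N x/y points for a stepped plot.
    (Hypothetical signature; the actual helper in spectrum.py may differ.)"""
    edges = np.asarray(edges)
    heights = np.asarray(heights)
    x = np.repeat(edges, 2)[1:-1]  # each interior edge appears twice
    y = np.repeat(heights, 2)      # each height spans its whole bin
    return x, y
```

For example, edges [0, 1, 2] with heights [3, 5] become x = [0, 1, 1, 2] and y = [3, 3, 5, 5].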

@bplimley (Contributor, author)

The counts aren't quite lining up. The relative difference in total counts ranges from about 2e-5 to 2e-3. I'm documenting here for reference.

# some rearranging has been done for clarity
In [1]: import becquerel as bq   # on feature-rebinning branch
In [2]: import numpy as np

In [3]: old_data = np.random.poisson(lam=10, size=8192).astype(int)
In [4]: old_edges = np.arange(8193) - 0.5
In [5]: old_edges[:3]
Out[5]: array([-0.5,  0.5,  1.5])
In [6]: old_edges[-3:]
Out[6]: array([ 8189.5,  8190.5,  8191.5])
In [7]: new_edges = np.arange(-1, 8195, 0.3)
In [8]: new_data = bq.core.rebin(old_data, old_edges, new_edges)
In [10]: np.sum(old_data)
Out[10]: 82572.0
In [11]: np.sum(new_data)
Out[11]: 82573.600000000006
In [17]: (np.sum(new_data) - np.sum(old_data)) / np.sum(old_data)
Out[17]: 1.9377028532745009e-05

In [12]: old_counts = np.random.poisson(lam=50, size=8192)
In [13]: new_counts = bq.core.rebin(old_counts, old_edges, new_edges)
In [14]: np.sum(old_counts)
Out[14]: 408267
In [15]: np.sum(new_counts)
Out[15]: 408276.59999999998
In [16]: (np.sum(new_counts) - np.sum(old_counts)) / np.sum(old_counts)
Out[16]: 2.3514023910766036e-05

In [19]: old_counts = np.random.poisson(lam=5000, size=8192)
In [20]: new_counts = bq.core.rebin(old_counts, old_edges, new_edges)
In [23]: np.sum(old_counts), np.sum(new_counts)
Out[23]: (40961581, 40962619.199999996)
In [21]: (np.sum(new_counts) - np.sum(old_counts)) / np.sum(old_counts)
Out[21]: 2.5345701377969997e-05

In [24]: old_counts = np.random.poisson(lam=5000, size=256)
In [25]: old_edges = np.arange(257) - 0.5
In [26]: new_edges = np.arange(-1, 267, 1.3)
In [27]: new_counts = bq.core.rebin(old_counts, old_edges, new_edges)
In [28]: np.sum(old_counts), np.sum(new_counts)
Out[28]: (1279119, 1281618.0)
In [29]: (np.sum(new_counts) - np.sum(old_counts)) / np.sum(old_counts)
Out[29]: 0.0019536884371196112

The test is also not passing because of this. At least on my computer - are you getting a passing test @jccurtis?

@jccurtis (Collaborator)

@bplimley The test passes for me. I'll need to look at this tomorrow.

@jccurtis (Collaborator)

Hmmm, I just ran my tests again and they failed. Some trickery is afoot!

@bplimley (Contributor, author)

I just pushed more thorough parametrized tests for rebin. Rebinning seems fine when the first and last edge values of the new bins are identical to those of the old bins, but when the new bins cover a larger range, they somehow gain counts.

@thjoshi (Collaborator) commented Apr 18, 2017

Yo, the negative values in new_edges are the problem.
This could be fixed with a small modification to the code; @jccurtis has volunteered to do it.
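For reference, a fractional-overlap rebin can be written so that new bins outside the old range simply receive zero counts, which sidesteps this class of bug. This is a minimal sketch of the general technique, not the GRPP/becquerel code:

```python
import numpy as np

def rebin_interp(counts, old_edges, new_edges):
    """Redistribute counts by fractional bin overlap (minimal sketch).
    New bins lying outside the old range receive zero counts."""
    counts = np.asarray(counts, dtype=float)
    old_edges = np.asarray(old_edges, dtype=float)
    new_edges = np.asarray(new_edges, dtype=float)
    new_counts = np.zeros(len(new_edges) - 1)
    for i in range(len(counts)):
        lo, hi = old_edges[i], old_edges[i + 1]
        # Overlap of old bin [lo, hi) with every new bin at once
        left = np.clip(new_edges[:-1], lo, hi)
        right = np.clip(new_edges[1:], lo, hi)
        new_counts += counts[i] * (right - left) / (hi - lo)
    return new_counts
```

Because each old bin's counts are distributed in proportion to overlap, the total is conserved whenever the new edges span the old range, even if they extend past it (into negative values, say).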

@jccurtis (Collaborator) commented May 5, 2017

I've updated rebin.py to resolve the bug noted above. A remaining issue is overflow/underflow: what to do when the output bins don't fully cover the input bin structure. I could add a flag that optionally handles this. Thoughts?

@jccurtis (Collaborator) commented May 5, 2017

I also have a couple more edge cases to flesh out, which I'll get to next week.

@bplimley (Contributor, author) commented May 5, 2017

The overflow/underflow bins sound useful. What would the default be? No overflow/underflow?

If we are unsure how we want to implement it, we can also make it an issue and postpone it for later.

@jccurtis do you want to make a pull request so we can take a look?

@jccurtis (Collaborator) commented May 5, 2017 via email

@jccurtis (Collaborator)

I've updated the rebin code with numba support, docs, and a 2D option for multiple spectra. I still need to flush out some bugs in the underflow/overflow edge cases.

@chunhochow (Contributor)

np.searchsorted() may be faster than

while in_edges[in_idx] < out_edges[out_idx]:
    in_idx += 1
in_idx -= 1

searchsorted is numba-compatible. I feel that since in_idx moves slowly, it would probably be faster to do it the current way (benchmarks needed), but searchsorted() should definitely be faster for the initial guess than the current in_idx = 1.

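The equivalence can be checked directly: np.searchsorted with side="left", minus one, lands on the same index as the loop, namely the last input edge strictly below the output edge (toy edge values assumed here):

```python
import numpy as np

in_edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0])  # toy input edges
out_edge = 2.5                                   # one output edge of interest

# Loop version, as in the snippet above
in_idx = 0
while in_edges[in_idx] < out_edge:
    in_idx += 1
in_idx -= 1  # last index with in_edges[in_idx] < out_edge

# Single-call version
idx = np.searchsorted(in_edges, out_edge, side="left") - 1

assert idx == in_idx  # both are 2 here
```

So searchsorted is a drop-in for the initial guess, while the incremental loop may still win inside the hot inner iteration where the index only moves by a step or two.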

@chunhochow (Contributor) commented May 22, 2017

I rewrote @thjoshi's and @jccurtis's code. I need help merging the pytest files, though, because they've diverged too much and I have no clue what's going on in there.

======

I also looked into stochastic rebinning and the two papers referenced in Knoll. See Mark's comment above, which is actually slightly inaccurate.

There are two stochastic methods here:

  • The list-mode method (Mark's method) is the one described in his comment above.
  • The Knoll/São Paulo method (since both papers are from São Paulo, I'm christening it that) takes the exact interpolation result and samples from it, while still conserving the total counts.

The paper says that the São Paulo method should preserve the statistical properties of the bin counts (e.g., Poisson errors). The deterministic interpolation (Tenzing/Joey's method) would give a smaller chi-squared value because the statistical fluctuation is less.

I'm not sure/haven't thought out whether Mark's method preserves the statistical properties, or whether the São Paulo method solves the moiré issue when the in/out bin widths are different.
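If I understand the São Paulo method correctly, the count-conserving sampling step could look like this. This is a sketch under my reading of the description above (the published method may differ in detail), where frac_counts stands for the deterministic interpolation result:

```python
import numpy as np

def sao_paulo_sample(frac_counts, rng=None):
    """Draw integer bin counts from the fractional (interpolated) rebin
    result, conserving the total number of counts. Sketch only."""
    rng = np.random.default_rng() if rng is None else rng
    frac_counts = np.asarray(frac_counts, dtype=float)
    total = int(round(frac_counts.sum()))
    probs = frac_counts / frac_counts.sum()
    # One multinomial draw distributes exactly `total` counts over the bins
    return rng.multinomial(total, probs)
```

Each bin count is then a sum over the multinomial, so the total is conserved by construction while the per-bin values fluctuate around the interpolated expectation.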

@bplimley (Contributor, author)

Thanks @chunhochow! I haven't had time to finish absorbing what you're saying about the methods. But if you have a bunch of rebin code that you are happy with, can you make a pull request? (Despite the impending Travis build crisis, which isn't really a crisis: just run the tests yourself before pushing or merging.)

@chunhochow (Contributor)

It should probably be cleaned up a bit, and tests written, before making a pull request.

@chunhochow (Contributor) commented Sep 12, 2017

Todo/Moved:

@thjoshi (Collaborator) commented Aug 21, 2018

@chunhochow @jccurtis when is rebinning coming to Spectrum? I was trying to use Becquerel last week for a project and was quite surprised to find that no form of rebinning was in the primary branch. If additional features come later, that makes sense, but in my view basic functionality should already be there.
