Speedup from_lc further #757

Merged
merged 4 commits into from Sep 20, 2023

Member

@matteobachetti matteobachetti commented Sep 20, 2023

Working further on #756, I came up with a faster way to allocate an event list from a list of counts.

Here I compare the performance of the method used now (an iterator) and the one used before #756 (a list) with two new methods: one fills a preallocated numpy array in a plain Python loop, and the other uses Numba to further cut the loop execution time.

In [1]: from numba import njit
   ...: from stingray import Lightcurve, EventList
   ...: import numpy as np
   ...: 
   ...: def from_lc(lc):
   ...:         # Multiply times by number of counts
   ...:         times = ([i] * int(j) for i, j in zip(lc.time, lc.counts))
   ...:         # Concatenate all lists
   ...:         times = list(i for j in times for i in j)
   ...: 
   ...:         return EventList(time=times, gti=lc.gti)
   ...: 
   ...: def from_lc_list(lc):
   ...:         # Multiply times by number of counts
   ...:         times = [[i] * int(j) for i, j in zip(lc.time, lc.counts)]
   ...:         # Concatenate all lists
   ...:         times = list(i for j in times for i in j)
   ...: 
   ...:         return EventList(time=times, gti=lc.gti)
   ...: 
   ...: def from_lc_array(lc):
   ...:         # Preallocate one slot per count, then fill one slice per bin
   ...:         times = np.zeros(np.sum(lc.counts), dtype=float)
   ...:         last = 0
   ...:         for t, c in zip(lc.time, lc.counts):
   ...:             times[last:c + last] = t
   ...:             last = c + last
   ...:         return EventList(time=times, gti=lc.gti)
   ...: 
   ...: 
   ...: @njit
   ...: def _from_lc_numba(times, counts, empty_times):
   ...:         last = 0
   ...:         for t, c in zip(times, counts):
   ...:             val = c + last
   ...:             empty_times[last:val] = t
   ...:             last = val
   ...:         return empty_times
   ...: 
   ...: def from_lc_numba(lc):
   ...:     times = _from_lc_numba(lc.time, lc.counts, np.zeros(np.sum(lc.counts), dtype=float))
   ...:     return EventList(time=times, gti=lc.gti)
   ...: 
   ...: counts = np.random.poisson(10, 100000)
   ...: times = np.arange(0, 100000, 1)
   ...: lc = Lightcurve(times, counts)
   ...: %time from_lc_numba(lc)
   ...: %time from_lc_array(lc)
   ...: %time from_lc(lc)
   ...: %time from_lc_list(lc)
   ...: 
CPU times: user 50.5 ms, sys: 1.53 ms, total: 52 ms
Wall time: 52 ms
CPU times: user 34.3 ms, sys: 888 µs, total: 35.2 ms
Wall time: 35.2 ms
CPU times: user 60.7 ms, sys: 2.63 ms, total: 63.3 ms
Wall time: 63.4 ms
CPU times: user 119 ms, sys: 4.53 ms, total: 124 ms
Wall time: 124 ms
Out[1]: <stingray.events.EventList at 0x10484e020>

On this first execution, the simple numpy implementation is the fastest, while the Numba version, whose first call includes JIT compilation, is only comparable to the iterator solution. But the big change happens after the first execution:

In [2]: counts = np.random.poisson(10, 100000)
   ...: lc = Lightcurve(times, counts)  # New light curve, new counts. No caching involved
   ...: %time from_lc_numba(lc)
   ...: %time from_lc_array(lc)
   ...: %time from_lc(lc)
   ...: %time from_lc_list(lc)
CPU times: user 3.37 ms, sys: 4.85 ms, total: 8.22 ms
Wall time: 6.91 ms 
CPU times: user 43.6 ms, sys: 1.82 ms, total: 45.4 ms
Wall time: 45.4 ms
CPU times: user 61.3 ms, sys: 3.26 ms, total: 64.6 ms
Wall time: 64.7 ms
CPU times: user 116 ms, sys: 4.54 ms, total: 121 ms
Wall time: 121 ms
Out[2]: <stingray.events.EventList at 0x103dc6b30>

Here, the numba-compiled version is about 6.5 times faster than the numpy-only solution (6.91 ms vs. 45.4 ms wall time), about 9 times faster than the iterator, and more than 17 times faster than the list-based version.
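
The difference between the first and the second run is consistent with Numba compiling the function on its first call for a given set of argument types. A minimal sketch of that warm-up effect on plain arrays, without stingray, could look like the following. The name `fill_times` is hypothetical and not part of this PR, and the `ImportError` fallback is an assumption added only so that the sketch still runs where Numba is not installed (in which case both calls take about the same time).

```python
import time

import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: without Numba, fall back to the undecorated Python function.
    def njit(func):
        return func


@njit
def fill_times(times, counts, out):
    """Write times[i] into `out` counts[i] times, in order (hypothetical helper)."""
    last = 0
    for i in range(times.size):
        stop = last + counts[i]
        out[last:stop] = times[i]
        last = stop
    return out


rng = np.random.default_rng(0)
counts = rng.poisson(10, 100_000)
times = np.arange(counts.size, dtype=np.float64)

durations = []
for _ in range(2):
    out = np.zeros(counts.sum(), dtype=np.float64)
    start = time.perf_counter()
    result = fill_times(times, counts, out)
    durations.append(time.perf_counter() - start)

# With Numba installed, the first duration includes the compilation step.
print(f"first call: {durations[0] * 1e3:.1f} ms, second call: {durations[1] * 1e3:.1f} ms")
```

When benchmarking, it is usually the second and later calls that are representative of the compiled function's speed.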

In general, the speedup with respect to the iterators is maintained for larger arrays and higher numbers of counts. The numpy-only solution can be faster than Numba with a large number of counts per bin and fewer bins:

In [5]: counts = np.random.poisson(1000, 10000)
   ...: times = np.arange(0, 10000, 1)
   ...: lc = Lightcurve(times, counts)

In [6]: %time from_lc_numba(lc)
   ...: %time from_lc_array(lc)
   ...: %time from_lc(lc)
   ...: %time from_lc_list(lc)
CPU times: user 5.79 ms, sys: 15.9 ms, total: 21.7 ms
Wall time: 26.7 ms
CPU times: user 9.95 ms, sys: 7.52 ms, total: 17.5 ms
Wall time: 17.1 ms
CPU times: user 452 ms, sys: 20.3 ms, total: 472 ms
Wall time: 472 ms
CPU times: user 488 ms, sys: 31.2 ms, total: 519 ms
Wall time: 520 ms
Out[6]: <stingray.events.EventList at 0x106afcca0>

The Numba solution wins (by a lot) for very long light curves with relatively few counts per bin:

In [11]: counts = np.random.poisson(10, 10000000)
    ...: times = np.arange(0, 10000000, 1)
    ...: lc = Lightcurve(times, counts)

In [12]: %time from_lc_numba(lc)
    ...: %time from_lc_array(lc)
    ...: %time from_lc(lc)
    ...: %time from_lc_list(lc)
CPU times: user 156 ms, sys: 87.9 ms, total: 244 ms
Wall time: 261 ms
CPU times: user 3.27 s, sys: 81 ms, total: 3.35 s
Wall time: 3.35 s
CPU times: user 5.97 s, sys: 263 ms, total: 6.24 s
Wall time: 6.24 s
CPU times: user 10 s, sys: 595 ms, total: 10.6 s
Wall time: 10.7 s
Out[12]: <stingray.events.EventList at 0x15a743880>
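
All variants are meant to produce the same flattened array of event times, so a quick consistency check is useful alongside the timings. The sketch below works on plain arrays without stingray and compares the iterator-style and preallocated-array approaches against `np.repeat`. Here `np.repeat` serves only as an independent reference and is not one of the methods benchmarked above; the function names are hypothetical.

```python
import numpy as np


def times_from_iterator(times, counts):
    # Same idea as from_lc above: repeat each time stamp, then flatten.
    repeated = ([t] * int(c) for t, c in zip(times, counts))
    return np.array([t for chunk in repeated for t in chunk], dtype=float)


def times_from_array(times, counts):
    # Same idea as from_lc_array above: preallocate, then fill one slice per bin.
    out = np.zeros(int(np.sum(counts)), dtype=float)
    last = 0
    for t, c in zip(times, counts):
        out[last:last + c] = t
        last += c
    return out


rng = np.random.default_rng(42)
counts = rng.poisson(3, 1000)
times = np.arange(counts.size, dtype=float)

# Independent reference: repeat times[i] exactly counts[i] times.
reference = np.repeat(times, counts)
same_iterator = np.array_equal(times_from_iterator(times, counts), reference)
same_array = np.array_equal(times_from_array(times, counts), reference)
print(same_iterator, same_array)
```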

codecov bot commented Sep 20, 2023

Codecov Report

Merging #757 (b0c30f5) into main (c6e71d5) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #757   +/-   ##
=======================================
  Coverage   97.13%   97.14%           
=======================================
  Files          42       42           
  Lines        7932     7941    +9     
=======================================
+ Hits         7705     7714    +9     
  Misses        227      227           
Files Changed Coverage Δ
stingray/events.py 100.00% <100.00%> (ø)

Collaborator

@mgullik mgullik left a comment

Hi @matteobachetti,

very interesting upgrade to generate an EventList from the lightcurve.
Thanks!

@mgullik mgullik added this pull request to the merge queue Sep 20, 2023
Merged via the queue into main with commit c89490b Sep 20, 2023
15 checks passed
@matteobachetti matteobachetti deleted the speedup_to_lc_further branch September 20, 2023 17:12