# Advanced Event 'Loops'

Iterating in Python is relatively slow, though it can be useful for prototyping.

First, let's declare a projection function that takes an event and projects out the neutrino energy from it.

In [1]:
import pyNUISANCE as pn
import pyProSelecta as pps

evs = pn.EventSource("dune_argon_sf_10mega.nuwro.pb.gz")
if not evs:
    print("Error: failed to open input file")

def enu_GeV(ev):
    bpart = pps.sel.Beam(ev,14)
    if bpart:
        return bpart.momentum().e() * 1E-3 # event momenta are always in MeV, so convert to GeV
    return -0

print("Found muon neutrino beam particle with %.2f GeV of energy" % enu_GeV(evs.first()[0]))

Found muon neutrino beam particle with 2.27 GeV of energy


## DataFrames

NUISANCE provides the `FrameGen` facility for defining functional event loops and then executing them in batch. Let's see an example of how it works. We include the `limit` call to stop the internal loop from running over the entire file.

In [2]:
print(pn.FrameGen(evs).limit(10).all())

 --------------
 | evt# | cvw |
 --------------
 |    0 |   1 |
 |    1 |   1 |
 |    2 |   1 |
 |    3 |   1 |
 |    4 |   1 |
 |    5 |   1 |
 |    6 |   1 |
 |    7 |   1 |
 |    8 |   1 |
 |    9 |   1 |
 --------------


### New Columns
The Frame returned from `FrameGen.all` always contains the event number and the CV weight for all processed events. These are a useful start, but we can define a new column to hold the neutrino energy for each event.

In [3]:
print(pn.FrameGen(evs).limit(10).add_column("enu_GeV",enu_GeV).all())

 ------------------------
 | evt# | cvw | enu_GeV |
 ------------------------
 |    0 |   1 |   2.275 |
 |    1 |   1 |    14.3 |
 |    2 |   1 |    2.86 |
 |    3 |   1 |   3.728 |
 |    4 |   1 |    9.08 |
 |    5 |   1 |   3.237 |
 |    6 |   1 |   2.473 |
 |    7 |   1 |   1.916 |
 |    8 |   1 |   1.988 |
 |    9 |   1 |   3.671 |
 ------------------------


## Filters

We can apply filters to the batched loop in a similar way:

In [4]:
print(pn.FrameGen(evs).limit(50).filter(lambda x : not (x.event_number() % 5)).add_column("enu_GeV",enu_GeV).all())

 ------------------------
 | evt# | cvw | enu_GeV |
 ------------------------
 |    0 |   1 |   2.275 |
 |    5 |   1 |   3.237 |
 |   10 |   1 |   2.506 |
 |   15 |   1 |   2.682 |
 |   20 |   1 |   1.528 |
 |   25 |   1 |   3.214 |
 |   30 |   1 |   3.033 |
 |   35 |   1 |   2.014 |
 |   40 |   1 |   14.87 |
 |   45 |   1 |   1.661 |
 ------------------------
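
In the output above, `limit(50)` combined with the filter yields 10 rows with event numbers up to 45, which suggests that the limit counts events read rather than rows kept. The following is a toy model of that behaviour in plain Python, with integers standing in for events. The function `toy_frame` and its parameters are hypothetical and are not part of pyNUISANCE; this is a sketch for intuition, not a description of how `FrameGen` is implemented.

```python
# Toy model only: plain integers stand in for events, and none of these
# names belong to pyNUISANCE.
def toy_frame(event_numbers, limit, keep, columns):
    """Build rows from at most `limit` events, keeping those that pass `keep`."""
    rows = []
    for n_read, evnum in enumerate(event_numbers):
        if n_read >= limit:  # the limit counts events read, not rows kept
            break
        if not keep(evnum):  # filtered events are read but produce no row
            continue
        row = {"evt#": evnum}
        for name, func in columns.items():
            row[name] = func(evnum)  # one new column per projection function
        rows.append(row)
    return rows

rows = toy_frame(range(1000), limit=50,
                 keep=lambda n: not (n % 5),
                 columns={"twice": lambda n: 2 * n})
print([r["evt#"] for r in rows])
```

In this model it does not matter whether the limit or the filter is declared first, because the limit is applied to the input stream and the filter to each event read from it.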


## Chunked Processing

For long-running processes that would produce very large data frames, it might be better to do secondary processing on chunks of the full data frame rather than waiting for the whole thing to be ready. Internally, `FrameGen::all` calls `FrameGen::first` and then `FrameGen::next` until there are no more events in the input event stream to process into new rows. We can steer this chunked processing loop manually from Python.

The chunk size can be set with the second parameter to the FrameGen constructor and it defaults to 50,000.

In [5]:
chunk_size = 100
fg = pn.FrameGen(evs,chunk_size).limit(4*chunk_size)

chunk = fg.first()
nrows = 0
while chunk.rows() > 0:
    nrows += chunk.rows()
    print("fetched %s new rows, total fetched: %s" % (chunk.rows(), nrows))
    chunk = fg.next()

print("processed %s rows in total" % nrows)

fetched 100 new rows, total fetched: 100
fetched 100 new rows, total fetched: 200
fetched 100 new rows, total fetched: 300
fetched 100 new rows, total fetched: 400
processed 400 rows in total
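
For intuition, the same first/next pattern can be modelled in plain Python. The `ToyChunker` class and the `process_in_chunks` helper below are hypothetical stand-ins written for this sketch, not pyNUISANCE types; the only thing they share with the cell above is the shape of the loop, where an empty chunk signals that the input is exhausted.

```python
from itertools import islice

class ToyChunker:
    """Hypothetical stand-in for a chunked frame generator (not pyNUISANCE)."""

    def __init__(self, events, chunk_size):
        self._it = iter(events)
        self._chunk_size = chunk_size

    def next(self):
        # Pull up to chunk_size items; an empty list signals exhaustion.
        return list(islice(self._it, self._chunk_size))

    first = next  # in this toy, first() is simply the first call to next()

def process_in_chunks(chunker, handle):
    """Call handle(chunk) on every non-empty chunk and return the row count."""
    total = 0
    chunk = chunker.first()
    while len(chunk) > 0:
        handle(chunk)  # secondary processing happens per chunk
        total += len(chunk)
        chunk = chunker.next()
    return total

sums = []
total = process_in_chunks(ToyChunker(range(400), 100),
                          lambda c: sums.append(sum(c)))
print(total, len(sums))
```

Only one chunk is held in memory at a time here, which is the motivation for chunked processing when the full frame would be very large.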


**N.B.** `FrameGen::limit` limits the number of events read from the event stream, but the `chunk_size` limits the number of rows returned per call.

In [6]:
chunk_size = 74
fg = pn.FrameGen(evs,chunk_size).filter(lambda x : not (x.event_number() % 3)).limit(10*chunk_size)

chunk = fg.first()
nrows = 0
while chunk.rows() > 0:
    nrows += chunk.rows()
    print("fetched %s new rows, total fetched: %s" % (chunk.rows(), nrows))
    chunk = fg.next()

print("processed %s rows in total, total events read %s" % (nrows,chunk.nevents()))

fetched 74 new rows, total fetched: 74
fetched 74 new rows, total fetched: 148
fetched 74 new rows, total fetched: 222
fetched 25 new rows, total fetched: 247
processed 247 rows in total, total events read 740
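
The row counts printed above can be checked with a little arithmetic. This sketch assumes that event numbers start at 0 and are consecutive, so that of the 740 events read, the ones passing the modulo-3 filter are 0, 3, ..., 738.

```python
import math

chunk_size = 74
events_read = 10 * chunk_size            # limit(10*chunk_size) reads 740 events
rows_kept = math.ceil(events_read / 3)   # event numbers 0, 3, ..., 738 pass

# Rows are handed back in full chunks, with any remainder in a final short chunk.
full_chunks, remainder = divmod(rows_kept, chunk_size)
chunk_rows = [chunk_size] * full_chunks + ([remainder] if remainder else [])
print(events_read, rows_kept, chunk_rows)
```

Under that assumption the result is three full chunks of 74 rows and a final chunk of 25, matching the output of the cell above.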


**N.B.** If you request a chunk, you must always process every row in the chunk before breaking out of your processing loop early, or the normalization information from the last chunk will not correspond to the number of rows that you processed.

### A Short Race

Let's see if there is any appreciable difference in looping speed between a Python loop and a C++ loop:

In [7]:
import time

time_start = time.perf_counter()
for i, (ev, cvw) in enumerate(evs):
    enu_GeV(ev)
time_end = time.perf_counter()
pyloop_elapsed = (time_end-time_start)
print("pure python event loop took %.2fs" % pyloop_elapsed)

pure python event loop took 145.62s


In [8]:
time_start = time.perf_counter()
pn.FrameGen(evs).add_column("enu_GeV",enu_GeV).all()
time_end = time.perf_counter()
fg_pyfunc_elapsed = (time_end-time_start)
print("FrameGen with a python function took %.2fs" % fg_pyfunc_elapsed)

FrameGen with a python function took 226.45s


So that is actually slower than doing the loop in pure Python! This is most likely due to the overhead of calling the Python function on every event from the C++ side.

### ProSelecta

We can use ProSelecta to write and JIT-compile C++ functions, and then use the JITted functions to create new columns.

In [9]:
pps.load_text("""
double enu_GeV(HepMC3::GenEvent const &ev){
  auto bpart = ps::sel::Beam(ev,14);
  if(bpart) return bpart->momentum().e()*1E-3;
  return -0;
}
""")
enu_GeV_cpp = pps.project.enu_GeV

In [10]:
time_start = time.perf_counter()
pn.FrameGen(evs).add_column("enu_GeV",enu_GeV_cpp).all()
time_end = time.perf_counter()
fg_ps_elapsed = (time_end-time_start)
print("FrameGen with ProSelecta took %.2fs" % fg_ps_elapsed)

FrameGen with ProSelecta took 0.02s


In [11]:
time_start = time.perf_counter()
pn.FrameGen(evs).all()
time_end = time.perf_counter()
fg_noop_elapsed = (time_end-time_start)
print("FrameGen with no-op took %.2fs" % fg_noop_elapsed)

FrameGen with no-op took 0.02s


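So for the ProSelecta version, the majority of the time is spent reading the events off disk, since it takes about as long as a `FrameGen` that adds no columns at all.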

In [12]:
pyloop_elapsed -= fg_noop_elapsed
fg_pyfunc_elapsed -= fg_noop_elapsed
fg_ps_elapsed -= fg_noop_elapsed
print("IO Corrected: Pure Python %.2fs" % pyloop_elapsed)
print("IO Corrected: FrameGen with Python event processor %.2fs" % fg_pyfunc_elapsed)
print("IO Corrected: FrameGen with ProSelecta JIT compiled function %.2fs" % fg_ps_elapsed)

IO Corrected: Pure Python 145.60s
IO Corrected: FrameGen with Python event processor 226.43s
IO Corrected: FrameGen with ProSelecta JIT compiled function 0.00s
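
The start/stop timing pattern repeated in the cells above could be wrapped in a small helper. The sketch below uses only the Python standard library; the `timed` context manager and the `timings` dictionary are names introduced here for illustration, and the workload is a placeholder rather than an event loop.

```python
import time
from contextlib import contextmanager

timings = {}  # label -> elapsed wall-clock seconds

@contextmanager
def timed(label):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start

with timed("baseline"):
    pass  # placeholder for the no-op pass that only reads events

with timed("work"):
    sum(i * i for i in range(100_000))  # placeholder workload

# Subtract the baseline, as done above for the IO-corrected timings.
corrected = timings["work"] - timings["baseline"]
print("work took %.4fs, baseline-corrected %.4fs" % (timings["work"], corrected))
```

Keeping the raw timings in a dictionary, rather than overwriting them in place, means the uncorrected values remain available after the baseline is subtracted.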
