# Advanced Event 'Loops'

Iterating in Python is relatively slow, though it can be useful for prototyping.

First, let's declare a projection function that takes an event and projects out the neutrino energy from it:

In [1]:
import pyNUISANCE as pn

evs = pn.EventSource("dune_argon_sf_10mega.nuwro.pb.gz")
if not evs:
    print("Error: failed to open input file")

# just get the first event so that we can check the units
ev,_ = evs.first()
ToGeV = 1 if (ev.momentum_unit() == pn.hm.HepMC3.Units.GEV) else 1E-3

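def enu_GeV(ev):
    # select the muon neutrino (PDG code 14) beam particle
    bpart = pn.pps.sel.Beam(ev,14)
    if bpart:
        # convert the energy to GeV using the factor determined above
        return bpart.momentum().e()*ToGeV
    # no beam particle found
    return -0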

print("Found muon neutrino beam particle with %.2f GeV of energy" % enu_GeV(ev))

[2024-02-29 09:53:30.475] [info] Found eventinput plugin: /root/software/NUISANCEMC/eventinput/build/Linux/lib/plugins/nuisplugin-eventinput-GHEP3.so
[2024-02-29 09:53:30.503] [info] Found eventinput plugin: /root/software/NUISANCEMC/eventinput/build/Linux/lib/plugins/nuisplugin-eventinput-NuWroevent1.so
[2024-02-29 09:53:30.509] [info] Found eventinput plugin: /root/software/NUISANCEMC/eventinput/build/Linux/lib/plugins/nuisplugin-eventinput-neutvect.so
[2024-02-29 09:53:30.511] [info] EventSourceFactory: PathResolver::resolve filepath: dune_argon_sf_10mega.nuwro.pb.gz, exists: true
[2024-02-29 09:53:30.689] [info] Reading file dune_argon_sf_10mega.nuwro.pb.gz with native HepMC3EventSource
Found muon neutrino beam particle with 2.27 GeV of energy


## DataFrames

NUISANCE provides the `FrameGen` facility for defining functional event loops and then executing them in batch. Let's see an example of how it works. We include the `limit` call to stop the internal loop from running over the entire file.

In [3]:
print(pn.FrameGen(evs).limit(10).all())

 --------------
 | evt# | cvw |
 --------------
 |    0 |   1 |
 |    1 |   1 |
 |    2 |   1 |
 |    3 |   1 |
 |    4 |   1 |
 |    5 |   1 |
 |    6 |   1 |
 |    7 |   1 |
 |    8 |   1 |
 |    9 |   1 |
 --------------


### New Columns
The Frame returned from `FrameGen.all` always contains the event number and the CV weight for all processed events. These are a useful start, but we can define a new column to hold the neutrino energy for each event.

In [4]:
print(pn.FrameGen(evs).limit(10).add_column("enu_GeV",enu_GeV).all())

 ------------------------
 | evt# | cvw | enu_GeV |
 ------------------------
 |    0 |   1 |   2.275 |
 |    1 |   1 |    14.3 |
 |    2 |   1 |    2.86 |
 |    3 |   1 |   3.728 |
 |    4 |   1 |    9.08 |
 |    5 |   1 |   3.237 |
 |    6 |   1 |   2.473 |
 |    7 |   1 |   1.916 |
 |    8 |   1 |   1.988 |
 |    9 |   1 |   3.671 |
 ------------------------


## Filters

We can apply filters to the batched loop in a similar way. Here we keep only events whose event number is a multiple of 5:

In [5]:
print(pn.FrameGen(evs).limit(50).filter(lambda x : not (x.event_number() % 5)).add_column("enu_GeV",enu_GeV).all())

 ------------------------
 | evt# | cvw | enu_GeV |
 ------------------------
 |    0 |   1 |   2.275 |
 |    5 |   1 |   3.237 |
 |   10 |   1 |   2.506 |
 |   15 |   1 |   2.682 |
 |   20 |   1 |   1.528 |
 |   25 |   1 |   3.214 |
 |   30 |   1 |   3.033 |
 |   35 |   1 |   2.014 |
 |   40 |   1 |   14.87 |
 |   45 |   1 |   1.661 |
 ------------------------
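
Judging by the output above, `limit(50)` appears to cap the number of events read from the source rather than the number of rows that survive the filter: only 10 rows come back, with event numbers up to 45. The chaining pattern itself can be sketched in pure Python. `ToyFrameGen` and `toy_events` below are hypothetical stand-ins invented for illustration; this is not how NUISANCE implements `FrameGen`, just a minimal model of the same limit, filter, project sequence.

```python
from itertools import islice


class ToyFrameGen:
    """Hypothetical stand-in for a FrameGen-style builder (not the NUISANCE code)."""

    def __init__(self, events):
        self._events = events   # any iterable of event-like objects
        self._limit = None      # maximum number of events to read from the source
        self._filters = []      # predicates; an event must pass all of them
        self._columns = []      # (name, projection function) pairs

    def limit(self, n):
        self._limit = n
        return self             # returning self is what makes chaining possible

    def filter(self, pred):
        self._filters.append(pred)
        return self

    def add_column(self, name, func):
        self._columns.append((name, func))
        return self

    def all(self):
        """Run the loop once and return (header, rows)."""
        header = ["evt#"] + [name for name, _ in self._columns]
        rows = []
        # the limit is applied to events read, before any filtering
        for evnum, ev in enumerate(islice(self._events, self._limit)):
            if not all(pred(ev) for pred in self._filters):
                continue
            rows.append([evnum] + [func(ev) for _, func in self._columns])
        return header, rows


# toy "events": plain dicts with a made-up energy field
toy_events = [{"num": i, "enu": 1.0 + 0.5 * i} for i in range(100)]

header, rows = (
    ToyFrameGen(toy_events)
    .limit(50)
    .filter(lambda ev: ev["num"] % 5 == 0)
    .add_column("enu_GeV", lambda ev: ev["enu"])
    .all()
)
print(header)
print(len(rows), "rows, last event number", rows[-1][0])
```

Nothing runs until `all()` is called; the earlier calls only record what the loop should do, which is what lets a real implementation execute the whole chain in a single pass over the file.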


### A Short Race

Let's see if there is any appreciable difference in looping speed between a Python loop and a C++ loop:

In [6]:
import time

time_start = time.perf_counter()
for i, (ev, cvw) in enumerate(evs):
    if i >= 1E6:
        break
    enu_GeV(ev)
time_end = time.perf_counter()
print("event loop took %.2fs" % (time_end-time_start))

event loop took 19.99s


In [7]:
time_start = time.perf_counter()
pn.FrameGen(evs).limit(int(1E6)).add_column("enu_GeV",enu_GeV).all()
time_end = time.perf_counter()
print("FrameGen took %.2fs" % (time_end-time_start))

FrameGen took 22.38s


So that is actually slower than doing it directly in Python! This is probably due to the overhead of calling the Python function from the C++ side.
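
Single wall-clock measurements that differ by a couple of seconds can sit within run-to-run variation, so it may be worth repeating each timing before drawing conclusions. Below is a minimal sketch of a repeat-and-take-the-best helper built on `time.perf_counter`. The names `best_of` and `toy_workload` are made up for this example, and the workload is a stand-in, not one of the event loops above.

```python
import time


def best_of(func, repeats=3):
    """Call func() `repeats` times; return (best wall-clock seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func()
        best = min(best, time.perf_counter() - start)
    return best, result


# stand-in workload; in the notebook this would wrap one of the loops being raced
def toy_workload():
    return sum(i * i for i in range(10_000))


elapsed, value = best_of(toy_workload, repeats=5)
print("toy workload, best of 5: %.6fs" % elapsed)
```

Taking the minimum rather than the mean is a common choice for benchmarks, since slow outliers usually reflect interference from the rest of the system rather than the code under test.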

### ProSelecta

We can use ProSelecta to write and JIT-compile C++ functions, and then use the JITted functions to create new columns:

In [8]:
pn.pps.load_text("""
double enu_GeV(HepMC3::GenEvent const &ev){
  auto bpart = ps::sel::Beam(ev,14);
  if(bpart) return bpart->momentum().e()*%f;
  return -0;
}
""" % ToGeV)
enu_GeV_cpp = pn.pps.project.enu_GeV

In [9]:
time_start = time.perf_counter()
pn.FrameGen(evs).limit(int(1E6)).add_column("enu_GeV",enu_GeV_cpp).all()
time_end = time.perf_counter()
print("FrameGen with ProSelecta took %.2fs" % (time_end-time_start))

FrameGen with ProSelecta took 17.88s


Not much faster (17.88s versus 19.99s for the plain Python loop)... this could probably do with some profiling and optimization.
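
As a possible starting point for that profiling, the standard-library `cProfile` module can show where Python-level time goes. The sketch below profiles a stand-in loop; `toy_projection` and `toy_loop` are hypothetical placeholders for a per-event projection and the event loop, not NUISANCE code.

```python
import cProfile
import io
import pstats


def toy_projection(x):
    # stand-in for a per-event projection such as enu_GeV
    return x * 1e-3


def toy_loop(n):
    # stand-in for the event loop; calls the projection once per "event"
    total = 0.0
    for i in range(n):
        total += toy_projection(i)
    return total


profiler = cProfile.Profile()
profiler.enable()
toy_loop(100_000)
profiler.disable()

# sort by cumulative time and print the ten most expensive entries
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

Note that `cProfile` only resolves Python frames. Time spent inside a compiled extension shows up as a single opaque call, so understanding the C++ side of a `FrameGen` loop would likely need a native profiler as well.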