# Extraction Example
Author: Michael Larson
Last Update: 24 April 2023

This repository serves as an example for anyone who needs to build an effective area curve. This script is a minimal working example that extracts the necessary information from i3 files generated by neutrino-generator (NuGen) or GENIE.

In [1]:
import os, sys, glob
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
from tqdm.notebook import tqdm

import icecube
from icecube import dataclasses, dataio, icetray

## Find some files you'd like to generate effective areas from
This can be any set of NuGen or GENIE files. 

In [2]:
oscnext_dir = "/data/ana/LE/oscNext/pass2/genie/level7_flercnn/"
run = "140000"

# Let's just grab the first 100 files.
files = sorted(glob.glob(os.path.join(oscnext_dir, run, "*")))[:100]

## Grab the OneWeight values
Here I'll define a function to extract the "oneweight" value with its appropriate scale factors. Note that the 0.7 and 0.3 are specific to older GENIE MC. GENIE files that are produced via genie-reader do NOT need these factors (i.e., they're 1.0 for both). NuGen files that don't have `OneWeightPerType` use factors of 0.5 for both.
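
Written as a formula, the per-event weight returned by the function in the next cell looks like this. The symbols $f_{\mathrm{type}}$, $N_{\mathrm{events}}$ and $N_{\mathrm{files}}$ are my own shorthand for this sketch, not names stored in the files:

$$
w_i = \frac{\mathrm{OneWeight}_i}{f_{\mathrm{type}} \; N_{\mathrm{events}} \; N_{\mathrm{files}}}
$$

Here $f_{\mathrm{type}}$ is 0.7 (nu) or 0.3 (nubar) for older GENIE MC and 0.5 for older NuGen, $N_{\mathrm{events}}$ is the `NEvents` entry of `I3MCWeightDict`, and $N_{\mathrm{files}}$ is the number of files being read. When `OneWeightPerType` is present, it takes the place of $\mathrm{OneWeight}_i / f_{\mathrm{type}}$.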

In [3]:
def get_oneweight(frame):
    mcwd = frame['I3MCWeightDict']
    
    # These are newer NuGen files if this is found. It directly includes
    # the fraction of nu and nubar in the OneWeightPerType value, so we 
    # don't need to guess what the right values are.
    if 'OneWeightPerType' in mcwd:
        oneweight_pertype = mcwd['OneWeightPerType']
    
    # No luck. We'll have to guess at some factors, but that shouldn't be
    # too difficult.
    else:
        oneweight_pertype = mcwd['OneWeight']

        if 'GENIEResultsDict' in frame.keys():
            nu = dataclasses.get_most_energetic_neutrino(frame['I3MCTree'])
            if nu.pdg_encoding > 0: oneweight_pertype /= 0.7  # Generated with 70% nu
            else:                   oneweight_pertype /= 0.3  # and 30% nubar
        else:
            oneweight_pertype /= 0.5  # Old NuGen is generated with equal nu and nubar

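    # Normalize by the NEvents value stored in the file and by the number
    # of files read. Note that this relies on the global `files` list.
    return oneweight_pertype / mcwd['NEvents'] / len(files)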

## And now we do a general loop over i3 files
This is just an example. You could do this in a more efficient way by looking at hdf5 files instead, but here I just show a simple loop over frames.

I'll be storing the results in a numpy structured array since it's easy to save. You could just as easily use pandas or HDF5 if you choose.
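
If structured arrays are unfamiliar, here is a small sketch of how they behave. The values are made up purely for illustration; only the field names and dtypes match the array built in the loop.

```python
import os
import tempfile

import numpy as np

# Made-up values standing in for quantities extracted from i3 files.
energy = [12.5, 48.0, 103.2]   # GeV
dec = [-0.31, 0.10, 1.02]      # radians
ow = [1.2e3, 4.5e2, 9.8e1]     # already-normalized weights

# Same field names and dtypes as the array built in the loop.
example = np.array(list(zip(energy, dec, ow)),
                   dtype=[('trueE', float), ('trueDec', float), ('ow', np.float64)])

# Fields are accessed by name, and boolean masks select events.
northern = example[example['trueDec'] > 0]

# A round trip through np.save / np.load keeps field names and dtypes.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.npy")
    np.save(path, example)
    reloaded = np.load(path)
```

Because the field names travel with the file, whoever loads it later does not need to remember the column order.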

In [4]:
ow, energy, dec = [], [], []

# Loop over the files
for f in tqdm(files):
    i3file = dataio.I3File(f)
    
    # And loop through frames for this file
    while i3file.more():
        frame = i3file.pop_frame()
        if frame.Stop != icetray.I3Frame.Physics: continue
        
        # Get the relevant information from this frame
        nu = dataclasses.get_most_energetic_neutrino(frame['I3MCTree'])
        ow.append(get_oneweight(frame))
        energy.append(nu.energy)
        dec.append(nu.dir.zenith - np.pi/2) # Convert zenith to declination (at the South Pole, sin(dec) = -cos(zenith))

# Convert from basic python lists to a numpy structured array
recarr = np.array(list(zip(energy, dec, ow)), 
                  dtype = [('trueE', float), ('trueDec', float), ('ow', np.float64)])

## And save the results.

In [5]:
print(f"Number of events found: {recarr.shape}")
np.save(f"oscnext_{run}.npy", recarr)

Number of events found: (304651,)
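
From here, one way to turn the saved weights into an effective area curve is to sum them in energy bins and divide by the bin width and solid angle. The sketch below is not part of the extraction itself: `effective_area` is a hypothetical helper, the toy inputs are invented, and it assumes the weights are in GeV cm^2 sr and already normalized by the number of generated events, as in `get_oneweight`.

```python
import numpy as np

def effective_area(energy, ow, energy_bins, solid_angle=4 * np.pi):
    """Return the effective area in m^2 for each energy bin.

    Assumes `ow` is in GeV cm^2 sr and already divided by the total
    number of generated events (hypothetical helper, for illustration).
    """
    energy = np.asarray(energy, dtype=float)
    ow = np.asarray(ow, dtype=float)
    energy_bins = np.asarray(energy_bins, dtype=float)

    # Sum the weights of the events falling in each energy bin.
    summed_ow, _ = np.histogram(energy, bins=energy_bins, weights=ow)

    # Divide by bin width (GeV) and solid angle (sr), then convert cm^2 -> m^2.
    bin_widths = np.diff(energy_bins)
    return summed_ow / (bin_widths * solid_angle) * 1e-4

# Toy inputs, invented purely to exercise the function.
toy_energy = np.array([1.5, 2.5, 2.7])
toy_ow = np.array([2.0, 4.0, 6.0])
toy_bins = np.array([1.0, 2.0, 3.0])
toy_aeff = effective_area(toy_energy, toy_ow, toy_bins)
```

With the real output you would load `oscnext_140000.npy` with `np.load` and pass the `trueE` and `ow` fields along with your choice of energy bins. For a declination band instead of the full sky, select events on `trueDec` first and set `solid_angle` to `2 * np.pi * (np.sin(dec_max) - np.sin(dec_min))`.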
