In [1]:
import sys
sys.path.append("..")
from IPython.display import display

In [2]:
import numpy as np
import logging
import pickle
%matplotlib inline

In [3]:
# from scripts import linkageList

# BINARY TREE FEATURES

This notebook details the structure of the jet dictionaries produced by the Toy Generative Model for jets, which is available at:

[Toy Generative Model for jets](https://github.com/SebastianMacaluso/ToyJetsShower)

In [4]:
#Data dir
input_dir= 'data/truth'
# input_dir ="../ToyJetsShower/data/"

In [5]:
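start = 1
end = 2
Njets = 100
root_dir = "data/truth/"
filename = "tree_" + str(Njets) + "_truth_"

# Load the pickled jet lists. With start=1 and end=2 only file 1 is read.
# jetsList is overwritten on each iteration, so it holds the last file loaded.
for i in range(start, end):
    with open(root_dir + filename + str(i) + ".pkl", "rb") as fd:
        jetsList = pickle.load(fd, encoding='latin-1')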



In [6]:
# Choose 1 jet from the list
jet = jetsList[0]

jet["algorithm"]="truth"


## JET DICTIONARY STRUCTURE

BASIC FEATURES

- jet["root_id"]: root node id of the tree

- jet["content"]: list of momentum vectors for the tree nodes (particles). ToyJetsShower is a 2D model, so each entry is (py, pz), with pz along the beam axis.

- jet["tree"]: list with the tree structure. Each entry is the [left, right] pair of child ids of a node; [-1, -1] marks a leaf.
    For node_id = ii:
    jet["content"][ii] = [py, pz] for node ii
    jet["tree"][ii] = [left, right] children of ii
    jet["content"][jet["tree"][ii][0]] = [py, pz] for the left child of node ii
    jet["content"][jet["tree"][ii][1]] = [py, pz] for the right child of node ii

- jet["outers_node_id"]: node id for all the leaves of the tree, in the order that they are accessed when traversing the tree.

- jet["outers_list"]: momentum vectors for all the leaves of the tree, in the order that they are accessed when traversing the tree.

- jet["Nconst"]: Number of leaves of the tree.

- jet["name"]: Number that identifies each jet.

- jet["node_id"]: list of leaf indices, added in the order the leaves appear when traversing the reclustered tree (each number indicates the node id picked during reclustering). The index value itself gives the position of that leaf when traversing the original jet (e.g. the truth-level jet), so it is an integer between 0 and Nleaves. This is not available for the truth jet; in that case jet["node_id"] = np.arange(jet["Nconst"]).
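
The relationship between jet["tree"], jet["content"] and the leaf lists can be sketched on a small made-up jet. The dictionary below is hypothetical (it is not ToyJetsShower output), and the depth-first, left-first visiting order is an assumption about what "the order that they are accessed when traversing the tree" means.

```python
import numpy as np

# Hypothetical 5-node toy jet in the layout described above (made-up numbers).
toy_jet = {
    "root_id": 0,
    "tree": np.array([[1, 2], [3, 4], [-1, -1], [-1, -1], [-1, -1]]),
    "content": np.array([[10.0, 6.0], [7.0, 4.0], [3.0, 2.0], [4.0, 1.0], [3.0, 3.0]]),
}


def collect_leaves(jet):
    """Return leaf ids and leaf momenta in depth-first, left-first order (assumed order)."""
    leaf_ids = []
    stack = [jet["root_id"]]
    while stack:
        node = stack.pop()
        left, right = jet["tree"][node]
        if left == -1 and right == -1:
            leaf_ids.append(int(node))
        else:
            stack.append(right)  # push right first so the left child is visited first
            stack.append(left)
    return leaf_ids, jet["content"][leaf_ids]


leaf_ids, leaf_momenta = collect_leaves(toy_jet)
print("leaf ids:", leaf_ids)
print("number of leaves:", len(leaf_ids))
# The leaf momenta should add up to the root momentum.
print(leaf_momenta.sum(axis=0), toy_jet["content"][toy_jet["root_id"]])
```

On a real jet, the returned ids and momenta could be compared with jet["outers_node_id"] and jet["outers_list"], provided the traversal order matches.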


PARAMETERS TO RUN THE TOY GENERATIVE MODEL FOR JETS

- jet["Lambda"]: Decay rate for the exponential distribution.

- jet["Delta_0"]: Initial splitting scale.

- jet["pt_cut"]: Cut-off scale to stop the showering process.

- jet["M_Hard"]: Initial splitting scale for a jet coming from a heavy resonance X. Currently set to M_Hard/2.
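
To illustrate the role of the decay rate, here is a minimal sketch that samples from a plain exponential density f(r) = lambda*Exp[-lambda r] with NumPy. The value of `lam` is hypothetical, and the sampler is an assumption: the generator itself may draw r differently (for example from a truncated distribution).

```python
import numpy as np

# Hypothetical value; for a real jet this would be read from jet["Lambda"].
lam = 2.0

rng = np.random.default_rng(0)
# NumPy parametrizes the exponential by its scale, which is 1 / rate.
r = rng.exponential(scale=1.0 / lam, size=100_000)

# The sample mean should be close to the analytic mean 1 / lam.
print(r.mean(), 1.0 / lam)
```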

BINARY TREE STRUCTURE FEATURES

- jet["algorithm"]: Algorithm used to generate the tree structure, e.g. truth, kt, antikt, CA.

- jet["deltas"]: Splitting scale for each parent node according to the Toy Generative Model for Jets.

- jet["draws"]: r values drawn from an exponential distribution of the form f(r) = lambda*Exp[-lambda r] while running the generative model. More details in [Toy Generative Model for jets](https://github.com/SebastianMacaluso/ToyJetsShower).

- jet["tree_ancestors"]: List with one entry for each leaf of the tree, where each entry lists all the ancestor node ids when traversing the tree from the root to the leaf node.

- jet["linkage_list"]: linkage list to build heat clustermap visualizations. The format and link are described below: 

    [SciPy linkage list website](https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html) 
    
    Linkage list format: an (n - 1) by 4 matrix Z. At the i-th iteration, clusters with indices Z[i, 0] and Z[i, 1] are combined to form cluster n + i. A cluster with an index less than n corresponds to one of the n original observations. The distance between clusters Z[i, 0] and Z[i, 1] is given by Z[i, 2]. The fourth value Z[i, 3] is the number of original observations in the newly formed cluster.
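
As an illustration of the linkage format, the sketch below converts a small made-up binary tree into an (n - 1) by 4 matrix. The function `tree_to_linkage` is hypothetical and is not the code that fills jet["linkage_list"]. Using the subtree height as the merge distance is an assumption, chosen only so that parents sit above their children.

```python
import numpy as np

# Hypothetical toy tree: node 0 is the root, [-1, -1] marks a leaf.
tree = np.array([[1, 2], [3, 4], [-1, -1], [-1, -1], [-1, -1]])
root_id = 0


def tree_to_linkage(tree, root_id):
    """Build a SciPy-style (n - 1) x 4 linkage matrix from a binary tree.

    Assumption: the merge distance is the subtree height; the real
    linkage_list may use a different measure.
    """
    rows = []  # one [id_a, id_b, distance, size] row per merge
    n_leaves = int(np.sum(tree[:, 0] == -1))
    next_leaf = [0]  # running counter for leaf cluster ids 0..n-1

    def visit(node):
        left, right = tree[node]
        if left == -1:
            cluster_id = next_leaf[0]
            next_leaf[0] += 1
            return cluster_id, 0, 1  # (cluster id, height, size)
        id_l, h_l, size_l = visit(left)
        id_r, h_r, size_r = visit(right)
        height = max(h_l, h_r) + 1
        size = size_l + size_r
        rows.append([id_l, id_r, float(height), size])
        # Row i of the matrix forms cluster n + i.
        return n_leaves + len(rows) - 1, height, size

    visit(root_id)
    return np.array(rows, dtype=float)


Z = tree_to_linkage(tree, root_id)
print(Z)
```

Children are visited before their parent, so every row only refers to clusters that already exist, as the format requires.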

### Jet dictionary usage examples

In [7]:
# Recover the parent momentum by summing its children's momenta
ii=1
print("Node #"+str(ii)+" children =",jet["tree"][ii])
print("Node #"+str(ii)+" momentum = ",jet["content"][ii])
print("Node #"+str(ii)+" momentum from adding its children momentum = ", np.sum(jet["content"][jet["tree"][ii]],axis=0))

Node #1 children = [ 2 41]
Node #1 momentum =  [276.11334 169.69995]
Node #1 momentum from adding its children momentum =  [276.11334 169.69995]


In [8]:
# Simple validation test of the jet structure: compare the parent splitting scale Delta
# computed from the children's momenta (Delta_P) with the value stored in the jet (truth_Delta_P).

node_id = 0
tree = jet['tree']
content = jet['content']
deltas = jet['deltas']
draws = jet['draws']

left, right = tree[node_id]

# Splitting scales, drawn r values and momenta of the two children
Delta_L = deltas[left]
Delta_R = deltas[right]
r_L = draws[left]
r_R = draws[right]
p_L = content[left]
p_R = content[right]

# Delta_P read from the jet dictionary
truth_Delta_P = deltas[node_id]

# Delta_P computed from the children: half the Euclidean norm of (p_R - p_L)
Delta_P = np.sqrt(1/4 * np.sum((p_R - p_L)**2))

In [9]:
print("Left child Delta = ", Delta_L)
print("Right child Delta = ", Delta_R)
print("Left child r value  drawn = ", r_L)
print("Right child r value  drawn = ",r_R)
print("Left child momentum = ",p_L)
print("Right child momentum = ",p_R)
print("Splitting scale Delta read from jet dictionary =", truth_Delta_P)
print("Splitting scale Delta as obtained from children momentum =" , Delta_P)

Left child Delta =  28.73630142211914
Right child Delta =  10.709257125854492
Left child r value  drawn =  0.7184075
Right child r value  drawn =  0.26773143
Left child momentum =  [276.11334 169.69995]
Right child momentum =  [223.88666 230.30005]
Splitting scale Delta read from jet dictionary = 40.0
Splitting scale Delta as obtained from children momentum = 39.99999389648391
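
The two checks above (momentum conservation and Delta = |p_R - p_L| / 2) can be extended to every inner node. The sketch below runs on a small made-up tree so that it is self-contained; the arrays and the helper `check_inner_nodes` are hypothetical. On a real jet, the returned values could be compared with jet["deltas"].

```python
import numpy as np

# Hypothetical toy jet (same layout as the jet dictionary, made-up numbers).
tree = np.array([[1, 2], [3, 4], [-1, -1], [-1, -1], [-1, -1]])
content = np.array([[10.0, 6.0], [7.0, 4.0], [3.0, 2.0], [4.0, 1.0], [3.0, 3.0]])


def check_inner_nodes(tree, content, rtol=1e-5):
    """Check momentum conservation at every inner node and return its Delta."""
    deltas = {}
    for node, (left, right) in enumerate(tree):
        if left == -1:
            continue  # leaves have no splitting
        p_L, p_R = content[left], content[right]
        # The parent momentum must equal the sum of the children's momenta.
        assert np.allclose(content[node], p_L + p_R, rtol=rtol), f"node {node} fails"
        # Same formula as Delta_P above: half the Euclidean norm of (p_R - p_L).
        deltas[node] = 0.5 * np.linalg.norm(p_R - p_L)
    return deltas


deltas = check_inner_nodes(tree, content)
print(deltas)
```

In the printed output above, each child Delta equals its drawn r value times the parent Delta (0.7184075 × 40 ≈ 28.7363 and 0.26773143 × 40 ≈ 10.7093), which suggests a further consistency check between jet["draws"] and jet["deltas"].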
