# Install or upgrade libraries

You may already have recent versions of these libraries installed, and they may all work together without problems.

Running the following cells takes a minute or so, but it ensures that you have a consistent set of Python tools.

In [None]:
import sys
print(f"{sys.version = }\n")


In [None]:
# If fsspec-xrootd is not found, run the following line outside of Jupyter Notebook and then restart the kernel
# !pip install --upgrade fsspec-xrootd

In [None]:
#'''
!pip install --upgrade pip

!pip install futures

!pip install --user --upgrade coffea

!pip install --upgrade awkward
!pip install --upgrade uproot

!pip install --upgrade fsspec-xrootd

!pip install vector

!pip install --upgrade pandas


!pip install --upgrade matplotlib
#'''

We've also prepared some helper code that makes it easier to work with the data in this lesson.

You can see the code [here](https://github.com/cms-opendata-workshop/workshop2024-lesson-event-selection/blob/main/instructors/dpoa_workshop_utilities.py) but we will explain the functions and data objects in this notebook.

Let's download it first.

In [None]:
#!wget https://raw.githubusercontent.com/cms-opendata-workshop/workshop2024-lesson-event-selection/main/instructors/dpoa_workshop_utilities.py

## Imports

Import all the libraries we will need and check their versions, in case you run into issues.

In [None]:
%load_ext autoreload
%autoreload 2

# The classics
import numpy as np
import matplotlib.pyplot as plt
import matplotlib # To get the version

import pandas as pd

# The newcomers
import awkward as ak
import uproot

import vector
vector.register_awkward()

import requests
import os

import time

import json

import dpoa_workshop_utilities
from dpoa_workshop_utilities import nanoaod_filenames
from dpoa_workshop_utilities import get_files_for_dataset
from dpoa_workshop_utilities import pretty_print
from dpoa_workshop_utilities import build_lumi_mask

import sys

In [None]:
print("Versions --------\n")
print(f"{sys.version = }\n")
print(f"{ak.__version__ = }\n")
print(f"{uproot.__version__ = }\n")
print(f"{np.__version__ = }\n")
print(f"{matplotlib.__version__ = }\n")
print(f"{vector.__version__ = }\n")
print(f"{pd.__version__ = }\n")

# Opening a file

Let's open and explore a sample file.

We'll be getting the data from [here](https://opendata.cern.ch/record/67993).

This is a Monte Carlo sample that simulates top-antitop pairs produced in proton-proton collisions at CMS.

One top decays leptonically and the other decays hadronically.

**Do you know what leptonically and hadronically mean? If not, do a bit of research.**

Opening the file might take 10-30 seconds if you are working with the larger file.

In [None]:
# For testing
# Big file
#filename = 'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v1/120000/08FCB2ED-176B-064B-85AB-37B898773B98.root'

# Smaller file, better for prototyping your code as things will run faster
filename = 'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v1/120000/7D120E49-E712-B74B-9E1C-67F2D0057995.root'

# print(f"Opening...{filename}")
# f = uproot.open(filename)

# events = f['Events']

# nevents = events.num_entries

# print(f"{nevents = }")

The `events` object is a Python implementation of a ROOT `TTree` and behaves like a dictionary. This means
we can list all of its keys if we want.

In [None]:
# Uncomment the following line to print all the keys

#print(events.keys())

Again, we have provided you with a helper function called `pretty_print` that will print subsets of the keys, based on strings
that you require or ignore.

It will also format that output based on how many characters you want in a column (you are limited to 80 characters per line).

Here is some example usage.

In [None]:
# Pretty print all the keys with the default format
#pretty_print(events.keys())

# Pretty print keys with 30 characters per column, for keys that contain `FatJet`
#pretty_print(events.keys(), fmt='30s', require='FatJet')

# Pretty print keys with 40 characters per column, for keys that contain `Muon` and `Iso` but ignore ones with `HLT`
#pretty_print(events.keys(), fmt='40s', require=['Muon', 'Iso'], ignore='HLT')

# Pretty print keys with 40 characters per column, for keys that contain `HLT` and `TkMu50`
#pretty_print(events.keys(), fmt='40s', require=['HLT', 'TkMu50'])

# Pretty print keys with 40 characters per column, for keys that contain `HLT`
#pretty_print(events.keys(), fmt='40s', require='HLT')

# Pretty print keys with 40 characters per column, for keys that contain `Jet_` but ignore ones with `Fat`
#pretty_print(events.keys(), fmt='40s', require='Jet_', ignore='Fat')

# Pretty print keys with 40 characters per column, for keys that contain `PuppiMET` but ignore ones with `Raw`
#pretty_print(events.keys(), fmt='40s', require='PuppiMET', ignore='Raw')
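
To make the `require` and `ignore` arguments concrete, here is a minimal sketch of the kind of substring filtering described above. It is not the code of `pretty_print` itself: the function name `filter_keys` is hypothetical, the example key list is illustrative, and we are assuming that a key must contain every `require` string and none of the `ignore` strings.

```python
def filter_keys(keys, require=None, ignore=None):
    """Keep keys that contain every `require` substring and no `ignore` substring."""

    def as_list(value):
        # Accept None, a single string, or a list of strings
        if value is None:
            return []
        return [value] if isinstance(value, str) else list(value)

    required = as_list(require)
    ignored = as_list(ignore)
    return [
        key
        for key in keys
        if all(s in key for s in required) and not any(s in key for s in ignored)
    ]


# Illustrative key names only; a real file has many more
example_keys = ["Muon_pt", "Muon_miniIsoId", "HLT_IsoMu24", "HLT_TkMu50", "Jet_pt", "FatJet_pt"]

print(filter_keys(example_keys, require=["Muon", "Iso"], ignore="HLT"))  # ['Muon_miniIsoId']
print(filter_keys(example_keys, require="Jet_", ignore="Fat"))           # ['Jet_pt']
```

Note that `FatJet_pt` also contains the substring `Jet_`, which is why the `ignore='Fat'` argument is needed to separate the two jet collections.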

## Extract some data

We're going to pull out subsets of the data in order to do our analysis.

As a reminder, you can find a list of the variable names in each dataset on the CERN Open Data Portal page for that dataset, for example, [here](https://opendata.cern.ch/eos/opendata/cms/dataset-semantics/NanoAODSIM/75156/ZprimeToTT_M2000_W20_TuneCP2_PSweights_13TeV-madgraph-pythiaMLM-pythia8_doc.html).

We're going to work with the following sets of variables:
* `FatJet` for merged jets
* `Jet` for non-merged jets
* `Muon` for muons
* `PuppiMET` for the missing transverse energy (MET) computed with the pileup-per-particle identification (PUPPI) algorithm

Running this cell might take a little bit if you are running over the bigger file. However, once you pull out the values, later calculations are much faster.

In [None]:
# # Jets ---------------------------------------------------
# # B-tagging variable
# jet_btag = events['Jet_btagDeepB'].array()

# # Measure of quality of measurement of jet
# jet_jetid = events['Jet_jetId'].array()

# # 4-momentum in pt, eta, phi, mass
# jet_pt = events['Jet_pt'].array()
# jet_eta = events['Jet_eta'].array()
# jet_phi = events['Jet_phi'].array()
# jet_mass = events['Jet_mass'].array()


# # Muons ---------------------------------------------------
# # Muon isolation
# muon_iso = events['Muon_miniIsoId'].array()

# # Measure of quality of how well the muon is reconstructed
# muon_tightId = events['Muon_tightId'].array()

# # 4-momentum in pt, eta, phi, mass
# muon_pt = events['Muon_pt'].array()
# muon_eta = events['Muon_eta'].array()
# muon_phi = events['Muon_phi'].array()
# muon_mass = events['Muon_mass'].array()


# # MET ------------------------------------------------------
# # Transverse momentum and azimuthal angle only; eta is fixed to 0 below
# met_pt = events['PuppiMET_pt'].array()
# met_eta = 0*events['PuppiMET_pt'].array()  # Fix this to be 0
# met_phi = events['PuppiMET_phi'].array()
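
The cell above stores each 4-momentum as (pt, eta, phi, mass). As a rough illustration of what those coordinates mean, the sketch below converts one particle to Cartesian components using the standard relations px = pt·cos(phi), py = pt·sin(phi), pz = pt·sinh(eta). The function name `to_cartesian` and the example numbers are ours; in the notebook itself the `vector` library handles this conversion.

```python
import math


def to_cartesian(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to (E, px, py, pz), all in GeV."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px**2 + py**2 + pz**2 + mass**2)
    return energy, px, py, pz


# A muon-like particle travelling along the x axis (illustrative values)
energy, px, py, pz = to_cartesian(pt=50.0, eta=0.0, phi=0.0, mass=0.106)
print(f"E = {energy:.3f}, px = {px:.1f}, py = {py:.1f}, pz = {pz:.1f}")

# Setting eta = 0, as is done for the MET, forces pz = 0
_, _, _, met_pz = to_cartesian(pt=35.0, eta=0.0, phi=1.2, mass=0.0)
```

This also shows why `met_eta` is set to 0: the detector only constrains the missing momentum in the transverse plane, so the longitudinal component starts out as zero.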

# What comes next?

In [None]:
def process_file(filename):
    """
    Root file processing function;
    
    ############################
    ########## INPUTS ##########
    ############################
    
    filename (str) - Full path or URL of the ROOT file to open

    #############################
    ########## RETURNS ##########
    #############################
    
    events (uproot TTree) - The `Events` tree, or None if the file could not be opened
    """
    ############################################
    ########## OPENING SPECIFIED FILE ##########
    ############################################
    print(f"Opening...{filename}")
    
    try:
        f = uproot.open(filename)
    except Exception:
        print(f"Could not open {filename}")
        return None

    ####################################################################
    ########## ACCESSING EVENTS AND MAKING SPECIFIC VARIABLES ##########
    ####################################################################
    
    events = f['Events']

    nevents = events.num_entries

    print(f"{nevents = }")


    
    return events

def plot_func(events, pt_cut_vals=[0], btag_cut="none"):
    """
    Plotting Function

    Plots histograms of Jet and Muon Transverse Momentum and respective numbers
    
    ############################
    ########## INPUTS ##########
    ############################
    
    events (uproot TTree) - The `Events` tree returned by process_file
    
    pt_cut_vals (int/float list, default=[0]) - Transverse momentum cut values; one figure is drawn per value
    
    btag_cut (str, default="none") - None, Loose, Medium, Tight cut; a jet must pass both the Jet_btagDeepB and Jet_btagDeepFlavB thresholds

    #############################
    ########## RETURNS ##########
    #############################

    None.
    """
    
    ################################################
    ########## VARIABLES FROM EVENTS FILE ##########
    ################################################
    
    # Muons -------------------------------------------------------
    muon_pt = events['Muon_pt'].array()
    # muon_eta = events['Muon_eta'].array()
    # muon_phi = events['Muon_phi'].array()
    # muon_mass = events['Muon_mass'].array()

    # muon_iso = events['Muon_miniIsoId'].array()

    # muon_tightId = events['Muon_tightId'].array()

    
    # Jets -------------------------------------------------------
    jet_btag = events['Jet_btagDeepB'].array()
    # jet_jetid = events['Jet_jetId'].array()

    jet_pt = events['Jet_pt'].array()
    # jet_eta = events['Jet_eta'].array()
    # jet_phi = events['Jet_phi'].array()
    # jet_mass = events['Jet_mass'].array()

    ###################################
    ########## APPLYING CUTS ##########
    ###################################

    for cut in pt_cut_vals:
        pt_jet_cut = jet_pt > cut
        
        ## B-Tagging ----------------------------------------------------
        # The pt mask and the b-tag mask are combined with `&` so that the
        # combined mask keeps the same per-event shape as jet_pt
        if btag_cut == "tight" or btag_cut == "Tight":
            deep_b_tag = events["Jet_btagDeepB"].array() > 0.8767
            deep_flavb_tag = events["Jet_btagDeepFlavB"].array() > 0.6377
            btag = deep_b_tag & deep_flavb_tag

            cut_jets = jet_pt[pt_jet_cut & btag]

        elif btag_cut == "medium" or btag_cut == "Medium":
            deep_b_tag = events["Jet_btagDeepB"].array() > 0.5847
            deep_flavb_tag = events["Jet_btagDeepFlavB"].array() > 0.2489
            btag = deep_b_tag & deep_flavb_tag

            cut_jets = jet_pt[pt_jet_cut & btag]

        elif btag_cut == "loose" or btag_cut == "Loose":
            deep_b_tag = events["Jet_btagDeepB"].array() > 0.1918
            deep_flavb_tag = events["Jet_btagDeepFlavB"].array() > 0.0480
            btag = deep_b_tag & deep_flavb_tag

            cut_jets = jet_pt[pt_jet_cut & btag]

        else:
            cut_jets = jet_pt[pt_jet_cut]

        
        pt_muon_cut = muon_pt > cut
        cut_muons = muon_pt[pt_muon_cut]

        ##############################
        ########## PLOTTING ##########
        ##############################
        
        fig, (ax1, ax2, ax3, ax4) = plt.subplots(1,4, figsize=(20,5), tight_layout=True)
        
        fig.text(0.27,0.92, f"Jets : $p_T$>{cut} (GeV/$c$) | BTag: {btag_cut}", ha='center', fontsize=18)
        fig.text(0.77, 0.92, f"Muons : $p_T$>{cut} (GeV/$c$)", ha='center', fontsize=18)
        
        ## Jets -------------------------------------------------------------
        ax1.hist(ak.flatten(cut_jets), 
                 bins=100,label=f"Number of Jets:{ak.sum(ak.num(cut_jets))}",
                range=(0,400))
        ax1.set_xlabel("Transverse Momentum",fontsize=14)
        ax1.set_ylabel("Counts",fontsize=14)
        ax1.legend()

        ax2.hist(ak.num(cut_jets), 
                 bins=100,label=f"Number of Jets:{ak.sum(ak.num(cut_jets))}",
                range=(0,10))
        ax2.set_xlabel("Number of Jets",fontsize=14)
        #ax2.set_ylabel("Counts")
        #ax2.title(f"Jets : $p_T$>{cut} (GeV/$c$)")
        ax2.legend()

        ## Muons --------------------------------------------------------------
        ax3.hist(ak.flatten(cut_muons), 
                 bins=100,label=f"Number of Muons:{ak.sum(ak.num(cut_muons))}",
                range=(0,400), color="darksalmon")
        ax3.set_xlabel("Transverse Momentum",fontsize=14)
        #ax3.set_ylabel("Counts")
        ax3.legend()

        ax4.hist(ak.num(cut_muons), 
                 bins=100,label=f"Number of Muons:{ak.sum(ak.num(cut_muons))}",
                range=(0,5), color="darksalmon")
        ax4.set_xlabel("Number of Muons",fontsize=14)
        #ax4.set_ylabel("Counts")
        #ax4.title(f"Muons : $p_T$>{cut} (GeV/$c$)")
        ax4.legend()

        plt.tight_layout(rect=[0,0,1,0.93])

def main(filename=None, events_keys=None, plot=False, pt_cut_vals=[0], btag_cut="none"):
    """
    Main calling function for process_file and plot_func functions
    
    ############################
    ########## INPUTS ##########
    ############################
    
    filename (str, default=None) - Full root file destination
    
    events_keys (uproot.model (Tree), default=None) - Root file keys; Can be entered to skip re-processing of root file
    
    plot (bool, default=False) - Calls plotting function if set to True
    
    pt_cut_vals (int/float list, default=[0]) - Transverse momentum cut values
    
    btag_cut (str, default="none") - None, Loose, Medium, Tight cut using Jet_btagDeepB and Jet_btagDeepFlavB

    #############################
    ########## RETURNS ##########
    #############################
    
    events (uproot.model (Tree)) - Root file keys
    """

    # Only open the file if an already-loaded tree was not passed in
    if events_keys is None:
        events = process_file(filename)
    else:
        events = events_keys
    
    if plot:
        plot_func(events, pt_cut_vals, btag_cut)

    return events
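
The selection inside `plot_func` applies two per-jet conditions at once: a transverse momentum cut and a b-tag cut. The plain-Python sketch below shows what that means event by event, using nested lists in place of Awkward Arrays. The function name `select_jets` and the numbers are made up for illustration; with Awkward Arrays the same selection is normally written as a single indexing step with the two boolean masks combined by `&`.

```python
# Three events with 3, 0, and 2 jets (illustrative values)
jet_pt_example = [[45.0, 30.0, 12.0], [], [80.0, 22.0]]
jet_btag_example = [[0.95, 0.10, 0.70], [], [0.30, 0.90]]


def select_jets(pt_per_event, btag_per_event, pt_min, btag_min):
    """Keep the pt of each jet that passes BOTH the pt cut and the b-tag cut."""
    selected = []
    for pts, tags in zip(pt_per_event, btag_per_event):
        selected.append([pt for pt, tag in zip(pts, tags) if pt > pt_min and tag > btag_min])
    return selected


selected = select_jets(jet_pt_example, jet_btag_example, pt_min=20.0, btag_min=0.5847)
print(selected)                         # [[45.0], [], [22.0]]
print([len(jets) for jets in selected])  # [1, 0, 1]
```

The second printed list is the per-event jet multiplicity, which is the quantity that `ak.num` provides for the "Number of Jets" histograms.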

In [None]:
events = main(filename)

In [None]:
jet_pt = events["Jet_pt"].array()
muon_pt = events["Muon_pt"].array()
print(f"Number of jets: {len(ak.flatten(jet_pt))}")
print(f"Number of muons: {len(ak.flatten(muon_pt))}")

In [None]:
cut_vals = [10, 20, 25, 30]
# Smaller file, better for prototyping your code as things will run faster
small_file = 'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v1/120000/7D120E49-E712-B74B-9E1C-67F2D0057995.root'
#events = main(filename, plot=True, pt_cut_vals=cut_vals)

In [None]:
# Big file
big_file = 'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/TTToSemiLeptonic_TuneCP5_13TeV-powheg-pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v1/120000/08FCB2ED-176B-064B-85AB-37B898773B98.root'

In [None]:
small_events = main(small_file)
big_events = main(big_file)

In [None]:
main(events_keys=big_events, pt_cut_vals=[20], btag_cut="Loose", plot=True)

In [None]:
main(events_keys=big_events, pt_cut_vals=[20], btag_cut="Medium", plot=True)

In [None]:
main(events_keys=big_events, pt_cut_vals=[20], btag_cut="Tight", plot=True)

In [None]:
jet_btagB = big_events["Jet_btagDeepB"].array()
jet_btag_flavB = big_events["Jet_btagDeepFlavB"].array()
x_vals = {
    "Loose": [0.1918, 0.0480, "purple"],
    "Medium": [0.5847, 0.2489, "yellow"],
    "Tight": [0.8767, 0.6377, "red"]
}

In [None]:
plt.figure(figsize=(15,5))
plt.subplot(1,2,1)

plt.hist(ak.flatten(jet_btagB), bins=100, range=(0,1));
for val in x_vals:
    plt.axvline(x_vals[val][0], color=x_vals[val][-1], linestyle='--', label=val)
plt.title("Jet_btagDeepB")
plt.legend();

plt.subplot(1,2,2)

plt.hist(ak.flatten(jet_btag_flavB), bins=100, range=(0,1));
for val in x_vals:
    plt.axvline(x_vals[val][1], color=x_vals[val][-1], linestyle='--', label=val)
plt.title("Jet_btagDeepFlavB")
plt.legend();
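
The dashed lines in these plots mark the Loose, Medium, and Tight thresholds for each discriminator. As a small sketch of how a single score maps onto those working points, the function below returns the tightest one that a `Jet_btagDeepB` score passes. The threshold values are the ones used in this notebook; the function name `tightest_working_point` is hypothetical.

```python
# Jet_btagDeepB thresholds taken from the x_vals dictionary in this notebook
DEEPB_WORKING_POINTS = {"Loose": 0.1918, "Medium": 0.5847, "Tight": 0.8767}


def tightest_working_point(score, working_points=DEEPB_WORKING_POINTS):
    """Return the name of the tightest working point passed, or 'none'."""
    passed = [name for name, threshold in working_points.items() if score > threshold]
    if not passed:
        return "none"
    # The tightest working point is the one with the highest threshold
    return max(passed, key=lambda name: working_points[name])


for score in (0.95, 0.60, 0.20, 0.05):
    print(score, tightest_working_point(score))
```

A jet that passes Tight also passes Medium and Loose, so tightening the working point can only reduce the number of selected jets.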

# Invariant Mass of Top Quark and W Boson

In [None]:
events = big_events

In [None]:
pretty_print(events.keys(), fmt='40s', require=['Muon'], ignore='HLT')

In [None]:
# FatJet -----------------------------------------------------
fatjet_mSD = events['FatJet_msoftdrop'].array()

fatjet_tag = events['FatJet_particleNet_TvsQCD'].array()

fatjet_tau2 = events['FatJet_tau2'].array()
fatjet_tau3 = events['FatJet_tau3'].array()

fatjet_pt = events['FatJet_pt'].array()
fatjet_eta = events['FatJet_eta'].array()
fatjet_phi = events['FatJet_phi'].array()
fatjet_mass = events['FatJet_mass'].array()
    
# Muons -------------------------------------------------------
muon_pt = events['Muon_pt'].array()
muon_eta = events['Muon_eta'].array()
muon_phi = events['Muon_phi'].array()
muon_mass = events['Muon_mass'].array()

muon_iso = events['Muon_miniIsoId'].array()

muon_tightId = events['Muon_tightId'].array()

    
# Jets -------------------------------------------------------
jet_btag = events['Jet_btagDeepB'].array()
jet_jetid = events['Jet_jetId'].array()

jet_pt = events['Jet_pt'].array()
jet_eta = events['Jet_eta'].array()
jet_phi = events['Jet_phi'].array()
jet_mass = events['Jet_mass'].array()
jets_btagDeepB = events["Jet_btagDeepB"].array()
jets_btagDeepFlavB = events["Jet_btagDeepFlavB"].array()
    
# MET ---------------------------------------------------------
met_pt = events['PuppiMET_pt'].array()
met_eta = 0*events['PuppiMET_pt'].array()  # Fix this to be 0
met_phi = events['PuppiMET_phi'].array() 

ht_lep = muon_pt + met_pt

In [None]:
    ##########################
    ########## CUTS ##########
    ##########################

# # Particle-specific cuts --------------------------------------
# tau32 = fatjet_tau3/fatjet_tau2

# #cut_fatjet = (tau32>0.67) & (fatjet_eta>-2.4) & (fatjet_eta<2.4) & (fatjet_mSD>105) & (fatjet_mSD<220)
# cut_fatjet = (fatjet_pt > 500) & (fatjet_tag > 0.5)

# cut_muon = (muon_pt>20) & (muon_eta>-2.4) & (muon_eta<2.4) & \
#            (muon_tightId == True) & (muon_iso>1) & (ht_lep>150)

# cut_jet = (jet_btag > 0.5) & (jet_jetid>=4)
# # Event cuts -------------------------------------------------
# cut_met = (met_pt > 50)

# cut_nmuons = ak.num(cut_muon[cut_muon]) == 1
# cut_njets = ak.num(cut_jet[cut_jet]) == 4


# cut_trigger = (events['HLT_TkMu50'].array())
    
# cut_ntop = ak.num(cut_fatjet[cut_fatjet]) == 1

# cut_full_event = cut_trigger & cut_nmuons# & cut_met & cut_ntop
        
# #############################################
# ########## CALCULATING DI-TOP MASS ##########
# #############################################
    
# fatjets = ak.zip(
#     {"pt": fatjet_pt[cut_full_event][cut_fatjet[cut_full_event]], 
#      "eta": fatjet_eta[cut_full_event][cut_fatjet[cut_full_event]], 
#      "phi": fatjet_phi[cut_full_event][cut_fatjet[cut_full_event]], 
#      "mass": fatjet_mass[cut_full_event][cut_fatjet[cut_full_event]]},
#     with_name="Momentum4D",
# )

# muons = ak.zip(
#     {"pt": muon_pt[cut_full_event][cut_muon[cut_full_event]], 
#      "eta": muon_eta[cut_full_event][cut_muon[cut_full_event]], 
#      "phi": muon_phi[cut_full_event][cut_muon[cut_full_event]], 
#      "mass": muon_mass[cut_full_event][cut_muon[cut_full_event]]},
#     with_name="Momentum4D",
# )

# jets = ak.zip(
#     {"pt": jet_pt[cut_full_event][cut_jet[cut_full_event]], 
#      "eta": jet_eta[cut_full_event][cut_jet[cut_full_event]], 
#      "phi": jet_phi[cut_full_event][cut_jet[cut_full_event]], 
#      "mass": jet_mass[cut_full_event][cut_jet[cut_full_event]],
#      "btagCSVV2": jets_btagCSVV2[cut_full_event][cut_jet[cut_full_event]]},
#     with_name="Momentum4D",
# )

# met = ak.zip(
#     {"pt": met_pt[cut_full_event], 
#      "eta": met_eta[cut_full_event], 
#      "phi": met_phi[cut_full_event], 
#      "mass": 0}, # We assume this is a neutrino with 0 mass
#     with_name="Momentum4D",
# )
###########################################################################################
###########################################################################################
###########################################################################################
###########################################################################################
fatjets = ak.zip(
    {"pt": fatjet_pt, 
     "eta": fatjet_eta, 
     "phi": fatjet_phi, 
     "mass": fatjet_mass},
    with_name="Momentum4D",
)

muons = ak.zip(
    {"pt": muon_pt, 
     "eta": muon_eta, 
     "phi": muon_phi, 
     "mass": muon_mass},
    with_name="Momentum4D",
)

jets = ak.zip(
    {"pt": jet_pt, 
     "eta": jet_eta, 
     "phi": jet_phi, 
     "mass": jet_mass,
     "btagDeepB": jets_btagDeepB,
     "btagDeepFlavB": jets_btagDeepFlavB},
    with_name="Momentum4D",
)

met = ak.zip(
    {"pt": met_pt, 
     "eta": met_eta, 
     "phi": met_phi, 
     "mass": np.zeros(len(met_phi))}, # We assume this is a neutrino with 0 mass
    with_name="Momentum4D",
)

In [None]:
# Calculate all the different combinations
p4mu,p4fj,p4j,p4met = ak.unzip(ak.cartesian([muons, fatjets, jets, met]))

# Calculate a sum of the 4-momenta
p4tot = p4mu + p4fj + p4j + p4met
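
To see what the combination step does within a single event, here is a pure-Python sketch, assuming 4-momenta stored as (E, px, py, pz) tuples in GeV. It pairs every muon with every jet, sums each pair, and computes the invariant mass from m² = E² − |p|². The helper names `add_p4` and `invariant_mass` and the numbers are illustrative only.

```python
import itertools
import math


def add_p4(*vectors):
    """Component-wise sum of any number of (E, px, py, pz) tuples."""
    return tuple(sum(components) for components in zip(*vectors))


def invariant_mass(p4):
    """Invariant mass from m^2 = E^2 - |p|^2, clipped at zero for rounding errors."""
    energy, px, py, pz = p4
    return math.sqrt(max(energy**2 - px**2 - py**2 - pz**2, 0.0))


# One event with one muon and two jets, all treated as massless
muons_in_event = [(50.0, 50.0, 0.0, 0.0)]
jets_in_event = [(50.0, -50.0, 0.0, 0.0), (30.0, 0.0, 30.0, 0.0)]

# Every (muon, jet) pairing in this event
candidates = [add_p4(mu, jet) for mu, jet in itertools.product(muons_in_event, jets_in_event)]
masses = [invariant_mass(p4) for p4 in candidates]
print(masses)
```

One muon and two jets give two candidates. An event with an empty collection gives no candidates at all, which is why the next cell checks the number of candidates per event before plotting.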

In [None]:
# Get the mass
x = p4tot.mass

print(x)

#ncand_cut = ak.num(x)==1
ncand_cut = ak.num(x)>0

# Plot it!
# Your code here
plt.figure()
plt.hist(ak.flatten(x[ncand_cut]), bins=40, range=(0,4000));
plt.hist(x[ncand_cut][:,0], bins=40, range=(0,4000));

In [None]:
#########################################################################
##################### SELECTING DESIRED DATA (CUTS) #####################
#########################################################################

muon_mask = (muons.pt > 20) & (np.abs(muons.eta) < 2.1)
jet_mask = (jets.pt > 20) & (np.abs(jets.eta) < 2.4)

cut_muons = muons[muon_mask]
cut_jets = jets[jet_mask]

n_muons = ak.num(cut_muons)
n_jets = ak.num(cut_jets)
n_bjets = ak.sum(jets.btagDeepB > 0.5847, axis=1)

event_mask = (n_muons == 1) & (n_jets == 4) & (n_bjets == 2)

selected_jets = cut_jets[event_mask]
selected_muons = cut_muons[event_mask]

#################################################################
##################### TOP QUARK CALCULATION #####################
#################################################################

btag_cut = 0.8767
non_btag_cut = 0.2

trijets = ak.combinations(selected_jets, 3, fields=["j1","j2","j3"])
trijets["p4"] = trijets.j1 + trijets.j2 + trijets.j3
# trijets["max_btag"] = ak.max([
#     trijets.j1.btagDeepB,
#     trijets.j2.btagDeepB,
#     trijets.j3.btagDeepB
# ], axis=0)



mask_jets_btag = ((trijets.j1.btagDeepB > btag_cut) & (trijets.j2.btagDeepB < non_btag_cut) & (trijets.j3.btagDeepB < non_btag_cut)) | \
                 ((trijets.j1.btagDeepB < non_btag_cut) & (trijets.j2.btagDeepB > btag_cut) & (trijets.j3.btagDeepB < non_btag_cut)) | \
                 ((trijets.j1.btagDeepB < non_btag_cut) & (trijets.j2.btagDeepB < non_btag_cut) & (trijets.j3.btagDeepB > btag_cut))

top_trijet = trijets.p4[mask_jets_btag][ak.argmax(trijets.p4.pt[mask_jets_btag], axis=1, keepdims=True)]
invmass_top = ak.flatten(top_trijet.mass)


###############################################################
##################### W BOSON CALCULATION #####################
###############################################################

event_mask = (n_muons == 1) & (n_jets == 4)

selected_jets = cut_jets[event_mask]
selected_muons = cut_muons[event_mask]

dijets = ak.combinations(selected_jets, 2, fields=["j1","j2"])
dijets["p4"] = dijets.j1 + dijets.j2  # summed 4-momentum of the jet pair


non_mask_jets_btag = ((dijets.j1.btagDeepB < btag_cut) & (dijets.j2.btagDeepB < btag_cut))

w_boson = dijets.p4[non_mask_jets_btag][ak.argmax(dijets.p4.pt[non_mask_jets_btag], axis=1, keepdims=True)]
invmass_w = ak.flatten(w_boson.mass)
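
The top-quark part of the cell above can be followed for one event with the sketch below. It builds all three-jet combinations, keeps those with exactly one b-tagged jet and two clearly untagged jets, and takes the combination with the highest transverse momentum as the top candidate. The jets are made-up numbers stored as (E, px, py, pz) in GeV, and the helper names are hypothetical; the thresholds match the ones in the cell.

```python
import itertools
import math

BTAG_CUT = 0.8767      # a jet above this counts as b-tagged
NON_BTAG_CUT = 0.2     # a jet below this counts as untagged

# Four illustrative jets in one event
jets_in_event = [
    {"p4": (100.0, 100.0, 0.0, 0.0), "btag": 0.95},
    {"p4": (60.0, 0.0, 60.0, 0.0), "btag": 0.05},
    {"p4": (40.0, -40.0, 0.0, 0.0), "btag": 0.10},
    {"p4": (50.0, 0.0, -50.0, 0.0), "btag": 0.92},
]


def sum_p4(trio):
    return tuple(sum(c) for c in zip(*(jet["p4"] for jet in trio)))


def has_one_b_two_light(trio):
    n_b = sum(jet["btag"] > BTAG_CUT for jet in trio)
    n_light = sum(jet["btag"] < NON_BTAG_CUT for jet in trio)
    return n_b == 1 and n_light == 2


def mass(p4):
    energy, px, py, pz = p4
    return math.sqrt(max(energy**2 - px**2 - py**2 - pz**2, 0.0))


trijets = [sum_p4(trio) for trio in itertools.combinations(jets_in_event, 3) if has_one_b_two_light(trio)]

# Choose the surviving trijet with the largest transverse momentum
best = max(trijets, key=lambda p4: math.hypot(p4[1], p4[2]))
print(f"{len(trijets)} candidates, best trijet mass = {mass(best):.1f} GeV")
```

Four jets give four possible trijets. In this example two of them contain both b-tagged jets and are rejected, leaving two candidates to choose between.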



In [None]:
plt.figure(figsize=(20,5))

plt.subplot(1,2,1)
plt.title("Top Quark Invariant Mass")
plt.hist(invmass_top, bins=100, range=(0,400))
plt.axvline(173,color="red",linestyle="--",label="Known Value (173 GeV)")

plt.xticks(np.arange(0,275,25),minor=True)
plt.xlabel("Invariant Mass [GeV]")
plt.ylabel("Events")
plt.legend();

plt.subplot(1,2,2)
plt.title("W Boson Invariant Mass")
plt.hist(invmass_w, bins=100, range=(0,250))
plt.axvline(80,color="red",linestyle="--",label="Known Value (80 GeV)")

plt.xticks(np.arange(0,275,25),minor=True)
plt.xlabel("Invariant Mass [GeV]")
plt.ylabel("Events")
plt.legend();

In [None]:
btag_cut = 0.5
non_btag_cut = 0.2


mask1 = ((trijets.j1.btagDeepB > btag_cut) & (trijets.j2.btagDeepB < non_btag_cut) & (trijets.j3.btagDeepB < non_btag_cut))
mask2 = ((trijets.j1.btagDeepB < non_btag_cut) & (trijets.j2.btagDeepB > btag_cut) & (trijets.j3.btagDeepB < non_btag_cut))
mask3 = ((trijets.j1.btagDeepB < non_btag_cut) & (trijets.j2.btagDeepB < non_btag_cut) & (trijets.j3.btagDeepB > btag_cut))

mask = mask1 | mask2 | mask3

n = 2
for i in range(len(trijets[n].j1)):
    
    print('------------')
    print(trijets[n].j1[i])
    print(trijets[n].j2[i])
    print(trijets[n].j3[i])
    print()
    print(mask1[n])
    print(mask2[n])
    print(mask3[n])
    print(mask[n])

# Reconstructing the Top Quark Using the Neutrino's Missing Momentum

In [None]:
muons[muon_mask],met

In [None]:
muon_mask = (muons.pt > 20)# & (np.abs(muons.eta) < 2.1)
jet_mask = (jets.pt > 20)# & (np.abs(jets.eta) < 2.4)
#met_mask = (met.pt > 20)

cut_muons = muons[muon_mask]
cut_jets = jets[jet_mask]
cut_met = met#[met_mask]

n_muons = ak.num(cut_muons)
n_jets = ak.num(cut_jets)
n_nonbjets = ak.sum(jets.btagDeepB < 0.8767, axis=1)
n_bjets = ak.sum(jets.btagDeepB > 0.8767, axis=1)


event_mask = (n_muons == 1) & (n_jets == 4) & (n_nonbjets == 2)
event_mask2 = (n_muons == 1) & (n_jets == 4) & (n_bjets == 2)


selected_nonbjets = cut_jets[event_mask]
selected_bjets = cut_jets[event_mask2]
selected_muons = cut_muons[event_mask2]
selected_met = cut_met[event_mask2]

nonbjet = ak.zip(
    {"px": selected_nonbjets[:,0].px,
     "py": selected_nonbjets[:,0].py,
     "pz": selected_nonbjets[:,0].pz,
     "E":  selected_nonbjets[:,0].energy},
    with_name="Momentum4D"
)

bjet = ak.zip(
    {"px": selected_bjets[:,0].px,
     "py": selected_bjets[:,0].py,
     "pz": selected_bjets[:,0].pz,
     "E":  selected_bjets[:,0].energy},
    with_name="Momentum4D"
)

mu = ak.zip(
    {"px": selected_muons[:,0].px,
     "py": selected_muons[:,0].py,
     "pz": selected_muons[:,0].pz,
     "E":  selected_muons[:,0].energy},
    with_name="Momentum4D"
)

nu = ak.zip(
    {"px": selected_met.px,
     "py": selected_met.py,
     "pz": np.zeros(len(selected_met.py)),
     "E":  np.zeros(len(selected_met.py))},
    with_name="Momentum4D"
)

In [None]:
mu,nu

In [None]:
MW = 80.4 # GeV
mu_nu_p = mu.px*nu.px + mu.py*nu.py

Dtmp = MW**2 - mu.mass**2 + 2*(mu_nu_p)
Atmp = 4*(mu.energy**2 - mu.pz**2)
Btmp = -4 * Dtmp * mu.pz
Ctmp = 4 * mu.energy**2 * nu.pt**2 - Dtmp**2

disc = Btmp**2 - 4*Atmp*Ctmp

nu_pz1 = (-Btmp + np.sqrt(disc))/(2*Atmp)
nu_pz2 = (-Btmp - np.sqrt(disc))/(2*Atmp)

real_sol = disc >= 0

nu["pz"] = ak.where(real_sol,
    ak.where(abs(nu_pz1) < abs(nu_pz2), nu_pz1, nu_pz2),
                    -Btmp/(2*Atmp)
)

nu["E"] = np.sqrt(nu.px**2 + nu.py**2 + nu.pz**2)
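
The cell above solves a quadratic equation for the neutrino's unknown longitudinal momentum by requiring that the muon and neutrino together have the W boson mass. The scalar sketch below repeats that calculation for a single event so that each step can be checked by hand. It uses the same coefficients as the cell, the function names are ours, and the muon and MET values are invented for illustration.

```python
import math

MW = 80.4  # GeV, W boson mass used as the constraint


def solve_neutrino_pz(mu, met_px, met_py, m_w=MW):
    """Neutrino pz (GeV) for a muon given as (E, px, py, pz) plus the MET components."""
    e_mu, px_mu, py_mu, pz_mu = mu
    m_mu_sq = max(e_mu**2 - px_mu**2 - py_mu**2 - pz_mu**2, 0.0)

    # Same coefficients as Dtmp, Atmp, Btmp, Ctmp in the cell above
    d = m_w**2 - m_mu_sq + 2 * (px_mu * met_px + py_mu * met_py)
    a = 4 * (e_mu**2 - pz_mu**2)
    b = -4 * d * pz_mu
    c = 4 * e_mu**2 * (met_px**2 + met_py**2) - d**2

    disc = b**2 - 4 * a * c
    if disc < 0:
        # No real solution: keep only the real part
        return -b / (2 * a)
    root = math.sqrt(disc)
    pz1 = (-b + root) / (2 * a)
    pz2 = (-b - root) / (2 * a)
    # Two real solutions: keep the one with the smaller magnitude
    return pz1 if abs(pz1) < abs(pz2) else pz2


def w_mass(mu, met_px, met_py, nu_pz):
    """Invariant mass of the muon plus a massless neutrino."""
    e_nu = math.sqrt(met_px**2 + met_py**2 + nu_pz**2)
    energy = mu[0] + e_nu
    px, py, pz = mu[1] + met_px, mu[2] + met_py, mu[3] + nu_pz
    return math.sqrt(max(energy**2 - px**2 - py**2 - pz**2, 0.0))


# Illustrative massless muon and MET, in GeV
muon = (math.sqrt(30.0**2 + 20.0**2), 30.0, 0.0, 20.0)
nu_pz = solve_neutrino_pz(muon, met_px=-35.0, met_py=5.0)
print(f"nu pz = {nu_pz:.2f} GeV, W mass = {w_mass(muon, -35.0, 5.0, nu_pz):.2f} GeV")
```

When the discriminant is non-negative, the reconstructed W mass equals the constraint by construction, up to floating-point rounding. When it is negative, the fallback value no longer satisfies the constraint, so those events end up away from 80.4 GeV in the W mass histogram.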

In [None]:
W_boson = mu + nu
W_boson

In [None]:
plt.figure(figsize=(20,5))

plt.subplot(1,2,1)
plt.title("W Boson Invariant Mass",fontsize=16)
plt.hist(W_boson.mass, bins=100,range=(0,150));
plt.axvline(80.4,color="red",linestyle="--",label="Known Value (80.4 GeV)")

#plt.xticks(np.arange(0,275,25),minor=True)
plt.xlabel("Invariant Mass [GeV]",fontsize=14)
plt.ylabel("Events",fontsize=14)
plt.legend();

In [None]:
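# Comparing floats with == 80.4 rejects events that differ only by rounding,
# so keep W candidates within a small window around MW instead.
# The 0.1 GeV tolerance is an assumed value; adjust it for your analysis.
w_cut = abs(W_boson.mass - MW) < 0.1
top_q = W_boson[w_cut] + bjet[w_cut]
W_boson[w_cut].mass, bjet[w_cut].mass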

In [None]:
plt.figure(figsize=(20,5))

plt.subplot(1,2,1)
plt.title("Top Quark Invariant Mass",fontsize=16)
plt.hist(top_q.mass, bins=100)
plt.axvline(172.52,color="red",linestyle="--",label="Known Value (172.52 GeV)")

#plt.xticks(np.arange(0,275,25),minor=True)
plt.xlabel("Invariant Mass [GeV]",fontsize=14)
plt.ylabel("Events",fontsize=14)
plt.legend();

In [None]:
plt.figure(figsize=(20,5))

plt.subplot(1,2,1)
plt.title("Top Quark Invariant Mass")
plt.hist(top_q.mass, bins=100, range=(50,250))
plt.axvline(172.52,color="red",linestyle="--",label="Known Value (172.52 GeV)")

#plt.xticks(np.arange(0,275,25),minor=True)
plt.xlabel("Invariant Mass [GeV]")
plt.ylabel("Events")
plt.legend();

In [None]:
print("W mass:", ak.to_numpy(W_boson.mass)[:5])
print("bjet pt:", ak.to_numpy(bjet.pt)[:5])
print("top mass:", ak.to_numpy(top_q.mass)[:5])

In [None]:

plt.figure(figsize=(20,5))

plt.subplot(1,2,1)
plt.title("Top Quark Invariant Mass",fontsize=16)
plt.hist(top_q.mass, bins=100, range=(100,200))
plt.axvline(172.52,color="red",linestyle="--",label="Known Value (172.52 GeV)")

#plt.xticks(np.arange(150,200,25),minor=True)
plt.xlabel("Invariant Mass [GeV]",fontsize=14)
plt.ylabel("Events",fontsize=14)
plt.legend();

plt.subplot(1,2,2)
plt.title("W Boson Invariant Mass",fontsize=16)
plt.hist(W_boson.mass, bins=100, range=(0,150));
plt.axvline(80,color="red",linestyle="--",label="Known Value (80 GeV)")

#plt.xticks(np.arange(0,275,25),minor=True)
plt.xlabel("Invariant Mass [GeV]",fontsize=14)
plt.ylabel("Events",fontsize=14)
plt.legend();