## _Evaluation Metrics_

_If the **GNNBuilder** callback was run during training, load the data from `dnn_processed/test`, extract `scores` and the truth labels (`y_pid`), and compute the following metrics._

In [None]:
import sys, os, glob, yaml

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
import pprint
from tqdm import tqdm
import trackml.dataset

In [None]:
import torch
import torchmetrics
import pytorch_lightning as pl
from torch_geometric.data import Data
from torch_geometric.loader import DataLoader
import itertools

In [None]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

In [None]:
# append parent dir
sys.path.append('..')

In [None]:
from src.metric_utils import compute_metrics, plot_metrics
from src.metric_utils import plot_roc, plot_prc, plot_prc_thr, plot_epc, plot_epc_cut, plot_output

### _Evaluation Definitions_

Metrics to evaluate the GNN networks:

- Accuracy/ACC = $(TP+TN)/(TP+TN+FP+FN)$
- sensitivity, recall, hit rate, or true positive rate ($TPR = 1 - FNR$)
- specificity, selectivity or true negative rate ($TNR = 1 - FPR$)
- miss rate or false negative rate ($FNR = 1 - TPR$)
- fall-out or false positive rate ($FPR = 1 - TNR$)
- F1-score = $2 \times (\text{PPV} \times \text{TPR})/(\text{PPV} + \text{TPR})$
- Efficiency/Recall/Sensitivity/Hit Rate: $TPR = TP/(TP+FN)$
- Purity/Precision/Positive Predictive Value: $PPV = TP/(TP+FP)$
- AUC-ROC Curve $\equiv$ FPR ($x$-axis) vs. TPR ($y$-axis) plot
- AUC-PRC Curve $\equiv$ TPR ($x$-axis) vs. PPV ($y$-axis) plot


Use _`tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()`_ to directly access TN, FP, FN and TP using Scikit-learn.
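
As a minimal sketch of the definitions above, the four confusion counts and the derived metrics can be computed with plain NumPy on a toy example. The helper `confusion_counts` and the `toy_*` arrays are hypothetical names introduced only for illustration; they are not part of `src.metric_utils`. The return order `(tn, fp, fn, tp)` is chosen to match the order of `confusion_matrix(y_true, y_pred).ravel()` for binary labels.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for two boolean arrays of equal length."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))    # true edges that pass the cut
    tn = int(np.sum(~y_true & ~y_pred))  # fake edges that are rejected
    fp = int(np.sum(~y_true & y_pred))   # fake edges that pass the cut
    fn = int(np.sum(y_true & ~y_pred))   # true edges that are rejected
    return tn, fp, fn, tp

# toy data (assumed values, for illustration only)
toy_truth = np.array([1, 1, 1, 0, 0, 0, 0, 1])
toy_score = np.array([0.9, 0.8, 0.3, 0.2, 0.6, 0.1, 0.4, 0.7])
toy_pred = toy_score > 0.5

toy_tn, toy_fp, toy_fn, toy_tp = confusion_counts(toy_truth, toy_pred)

toy_acc = (toy_tp + toy_tn) / (toy_tp + toy_tn + toy_fp + toy_fn)
toy_tpr = toy_tp / (toy_tp + toy_fn)   # efficiency / recall
toy_fpr = toy_fp / (toy_fp + toy_tn)   # fall-out
toy_ppv = toy_tp / (toy_tp + toy_fp)   # purity / precision
toy_f1 = 2 * toy_ppv * toy_tpr / (toy_ppv + toy_tpr)

print(toy_tn, toy_fp, toy_fn, toy_tp)
print(toy_acc, toy_tpr, toy_fpr, toy_ppv, toy_f1)
```

The sketch assumes that both classes are present and that at least one edge passes the cut; otherwise some of the ratios divide by zero.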

### _Classifier Evaluation_

In [None]:
# fetch all files
# inputdir = "run_all/gnn_processed/test"
# inputdir = "run_all/dnn_processed_bn/test"
# inputdir = "run_all/dnn_processed_ln/test"

# HypGNN (FWP + Filtering)
# inputdir = "run_all/fwp_gnn_processed_nf/pred"

# HypGNN (FWP + No Filtering)
inputdir = "run_all/fwp_gnn_processed/pred"

In [None]:
test_files = sorted(glob.glob(os.path.join(inputdir, "*")))
print("Number of Files: ", len(test_files))

In [None]:
# Load a single event as a quick check
data = torch.load(test_files[0], map_location=device)

In [None]:
data

### _Append Scores and Truths_
- _Load all `truth` and `scores` from the `testset` from the `DNN` stage_

In [None]:
scoresl, truthsl = [], []

for e in range(len(test_files)):

    # read test events e.g. gnn_processed/test
    graph = torch.load(test_files[e], map_location=device)
    
    # get truths and scores
    truth = graph.y_pid
    score = graph.scores
    score = score[:truth.size(0)]

    # logging
    if e != 0 and e % 1000 == 0:
        print("Processed Batches: ", e)
        
    # append each batch
    truthsl.append(truth)
    scoresl.append(score)

In [None]:
scores = torch.cat(scoresl)
truths = torch.cat(truthsl)

In [None]:
# torch to numpy (move to the CPU first; `.numpy()` fails on CUDA tensors)
scores = scores.detach().cpu().numpy()
truths = truths.detach().cpu().numpy()

In [None]:
# save scores and truths as .npy files (both are already NumPy arrays here)
# np.save("scores.npy", scores)
# np.save("truths.npy", truths)

### _Compute Metrics_

In [None]:
metrics = compute_metrics(scores,truths,threshold=0.5)

In [None]:
print("{:.4f},{:.4f},{:.4f},{:.4f}".format(metrics.accuracy, metrics.precision, metrics.recall, metrics.f1))

### _(a) - Plot Metrics_

In [None]:
outname = "fwp"

In [None]:
# plot_metrics(scores,truths, metrics, name=outname)

In [None]:
# ROC Curve
# plot_roc(metrics, name=outname)

In [None]:
# PR Curve
# plot_prc(metrics, name=outname)

In [None]:
# Built from PRC Curve
# plot_prc_thr(metrics, name=outname)

In [None]:
# EP Curve from ROC
plot_epc(metrics, name=outname)

In [None]:
# Built from ROC Curve
plot_epc_cut(metrics, name=outname)

In [None]:
# Model output: True and False
plot_output(scores, truths, threshold=0.9, name=outname)

### _(b) - S/B Suppression_

The background rejection rate (1/FPR) is given by $1/\epsilon_{bkg}$, where $\epsilon_{bkg}$ is the fraction of fake edges that pass the classification requirement. The signal efficiency $\epsilon_{sig}$ (TPR, i.e. recall) is the number of true edges above a given classification score cut divided by the total number of true edges. In summary:

- Signal Efficiency = $\epsilon_{sig}$ = TPR ~ Recall 
- Background Rejection = $1 - \epsilon_{bkg}$ = 1 - FPR = TNR (specificity)
- Background Rejection Rate = $1/\epsilon_{bkg}$ = 1/FPR


First, apply an edge score cut to binarize the `scores`; call the result `preds`. Then count the number of false and true edges that pass this cut, and compute the background rejection rate and the signal efficiency. To make a plot, these calculations can be done batch by batch over the test dataset.
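
The batch-by-batch procedure can be sketched as follows. This is only an illustration under stated assumptions: `accumulate_counts`, `signal_and_rejection`, and the `toy_*` data are hypothetical names that do not exist in `src.metric_utils`, and each batch is assumed to be a pair of 1D arrays `(scores, truths)` with truth labels in {0, 1}. Counts are summed over batches for several cuts at once, and the ratios are taken only at the end.

```python
import numpy as np

def accumulate_counts(batches, cuts):
    """Sum, per cut, the true and fake edges that pass it over all batches."""
    cuts = np.asarray(cuts, dtype=float)
    tp = np.zeros(len(cuts), dtype=np.int64)
    fp = np.zeros(len(cuts), dtype=np.int64)
    n_true, n_fake = 0, 0
    for batch_scores, batch_truths in batches:
        s = np.asarray(batch_scores, dtype=float)
        t = np.asarray(batch_truths) > 0.5
        # passed[i, j] is True if edge j passes cut i
        passed = s[None, :] > cuts[:, None]
        tp += (passed & t[None, :]).sum(axis=1)
        fp += (passed & ~t[None, :]).sum(axis=1)
        n_true += int(t.sum())
        n_fake += int((~t).sum())
    return tp, fp, n_true, n_fake

def signal_and_rejection(tp, fp, n_true, n_fake):
    """Return (eps_sig, eps_bkg, 1/eps_bkg); the last is inf where eps_bkg == 0."""
    eps_sig = tp / n_true
    eps_bkg = fp / n_fake
    with np.errstate(divide="ignore"):
        rejection_rate = 1.0 / eps_bkg
    return eps_sig, eps_bkg, rejection_rate

# toy batches (assumed values, for illustration only)
toy_batches = [
    (np.array([0.9, 0.2, 0.7, 0.4]), np.array([1, 0, 0, 1])),
    (np.array([0.8, 0.6, 0.1, 0.3]), np.array([1, 1, 0, 0])),
]
toy_cuts = [0.25, 0.5, 0.95]

toy_tp, toy_fp, toy_n_true, toy_n_fake = accumulate_counts(toy_batches, toy_cuts)
toy_sig, toy_bkg, toy_rej = signal_and_rejection(toy_tp, toy_fp, toy_n_true, toy_n_fake)
print(toy_sig, toy_bkg, toy_rej)
```

Accumulating integer counts rather than averaging per-batch rates keeps the result independent of how the edges are split into batches. The sketch assumes that the dataset contains at least one true and one fake edge.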

In [None]:
from sklearn.metrics import confusion_matrix

In [None]:
threshold = 0.5

In [None]:
# Metrics with Threshold
metrics = compute_metrics(scores,truths,threshold)

- _recall/tpr and fpr_

In [None]:
preds, targets = scores, truths
y_pred, y_true = (preds > threshold), (targets > threshold)

In [None]:
# Confusion Matrix
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Compute recall (TPR) and FPR
tpr = tp / (tp + fn)
fpr = fp / (fp + tn)

In [None]:
# signal, bkg, bkg rejection
tpr, fpr, (1/fpr)

- _signal vs background rejection rate_

In [None]:
sig = metrics.roc_tpr
bkg_rejection = 1/metrics.roc_fpr

In [None]:
# keep only points with signal efficiency > 0.65
sig_mask = sig > 0.65

In [None]:
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(8,6))
ax.plot(sig[sig_mask], bkg_rejection[sig_mask], label="Interaction GNN", color="blue")
ax.plot(tpr, 1/fpr, marker="o", markersize=10, markeredgecolor="k", markerfacecolor="k", label="Edge Score = 0.5", color="k")

# Axes Params
ax.set_xlabel("Signal Efficiency", fontsize=16)
ax.set_ylabel("Background Rejection", fontsize=16)
ax.set_yscale('log')
ax.tick_params(axis='both', which='major', labelsize=12)
ax.tick_params(axis='both', which='minor', labelsize=12)
ax.grid(True)
ax.legend(fontsize=14, loc='upper right')
    
# Figure Params
fig.tight_layout()
fig.savefig(outname+"_SB.pdf")

### _(c) - Visualize Model Output_

In [None]:
from src.drawing import detector_layout
from src.utils_math import polar_to_cartesian

In [None]:
# pick an event index (`filter_files` is defined in the last cell; run that cell first)
e = filter_files[2]

In [None]:
# load graph on the CPU, since the plotting code below converts tensors to NumPy
graph = torch.load(test_files[e], map_location="cpu")

# get truths and scores
truth = graph.y_pid
scores = graph.scores[:truth.size(0)]
edges = graph.edge_index
eid = int(graph.event_file[-10:])

In [None]:
eid

In [None]:
truth.shape, scores.shape, edges.shape

In [None]:
preds, labels = scores.numpy(), truth.numpy()

In [None]:
preds.shape, labels.shape

In [None]:
def draw_sample_xy(graph, cut=0.5, figsize=(15,15)):
    """Draw a sample with true and false edges."""
    
    # coordinate transformation
    x = graph.x.detach().numpy()
    r, phi, ir = x.T
    x, y = polar_to_cartesian(r, phi)
    
    
    truth = graph.y_pid
    scores = graph.scores[:truth.size(0)]
    edges = graph.edge_index
    preds, labels = scores.numpy(), truth.numpy()
    
    
    # detector layout
    fig, ax = detector_layout(figsize=figsize)
    
    # True Event
    pids = np.unique(graph.pid)    
    for pid in pids:
        idx = graph.pid == pid
        ax.plot(x[idx], y[idx], 'k-', linewidth=1.5)
        ax.scatter(x[idx], y[idx], label='particle_id: {}'.format(int(pid)))
    
    
    # Draw the segments
    for j in range(labels.shape[0]):
        
        ptx1 = x[edges[0,j]]
        ptx2 = x[edges[1,j]]
        pty1 = y[edges[0,j]]
        pty2 = y[edges[1,j]]
        
        # False Negatives
        if preds[j] < cut and labels[j] > cut:
            # ax.plot([x[edges[0,j]], x[edges[1,j]]], [y[edges[0,j]], y[edges[1,j]]], '--', c='b')
            ax.plot([ptx1, ptx2], [pty1, pty2], '--', color='b', lw=1.5, alpha=0.9)

        # False Positives
        if preds[j] > cut and labels[j] < cut:
            # ax.plot([x[edges[0,j]], x[edges[1,j]]], [y[edges[0,j]], y[edges[1,j]]], '-', c='r', alpha=preds[j])
            ax.plot([ptx1, ptx2], [pty1, pty2], '-', color='r', lw=1.5, alpha=0.15)

        # True Positives
        if preds[j] > cut and labels[j] > cut:
            # ax.plot([x[edges[0,j]], x[edges[1,j]]], [y[edges[0,j]], y[edges[1,j]]], '-', c='k', alpha=preds[j])
            ax.plot([ptx1, ptx2], [pty1, pty2], '-', color='k', lw=1.5, alpha=0.3)

    fig.tight_layout()
    fig.savefig("ambiguous1.pdf")

In [None]:
# draw_sample_xy(graph, cut=0.7);

In [None]:
def draw_sample_xy(graph, lower_cut=0.6, upper_cut=0.8, figsize=(15,15)):
    """Draw a sample with true and false edges."""
    
    # coordinate transformation
    x = graph.x.detach().numpy()
    truth = graph.y_pid
    scores = graph.scores[:truth.size(0)]
    edges = graph.edge_index
    eid = int(graph.event_file[-5:])
    pids = np.unique(graph.pid)
    
    preds, labels = scores.numpy(), truth.numpy()
    
    
    # detector layout
    fig, ax = detector_layout(figsize=figsize)
    
    # True Event
    r, phi, ir = x.T
    x, y = polar_to_cartesian(r, phi)
    
    for pid in pids:
        idx = graph.pid == pid
        ax.plot(x[idx], y[idx], '-', linewidth=1.5)
        ax.scatter(x[idx], y[idx], label='particle_id: {}'.format(int(pid)))
    
    
    # Draw the segments
    for j in range(labels.shape[0]):
        
        ptx1 = x[edges[0,j]]
        ptx2 = x[edges[1,j]]
        pty1 = y[edges[0,j]]
        pty2 = y[edges[1,j]]
        
        # False Negatives
        if preds[j] < lower_cut and labels[j] > upper_cut:
            # ax.plot([x[edges[0,j]], x[edges[1,j]]], [y[edges[0,j]], y[edges[1,j]]], '--', c='b')
            ax.plot([ptx1, ptx2], [pty1, pty2], '--', color='b', lw=1.5, alpha=0.9)

        # False Positives
        if preds[j] > upper_cut and labels[j] < lower_cut:
            # ax.plot([x[edges[0,j]], x[edges[1,j]]], [y[edges[0,j]], y[edges[1,j]]], '-', c='r', alpha=preds[j])
            ax.plot([ptx1, ptx2], [pty1, pty2], '-', color='r', lw=1.5, alpha=0.15)

        # True Positives
        if preds[j] > lower_cut and labels[j] > upper_cut:
            # ax.plot([x[edges[0,j]], x[edges[1,j]]], [y[edges[0,j]], y[edges[1,j]]], '-', c='k', alpha=preds[j])
            ax.plot([ptx1, ptx2], [pty1, pty2], '-', color='k', lw=1.5, alpha=0.3)
    
    ax.set_title('Azimuthal View of STT, EventID # {}'.format(eid))
    fig.tight_layout()
    fig.savefig("ambiguous_{}.png".format(eid))

In [None]:
filter_files = [18,106,109,113,120,122,133,139,147,152,153,158,159,164,1012,1022,1030,1031,1040,1894,1892,1880,1877,1872,1860,1857,1828,1827,1817,1816,1816,1807,1804,1767,1761,
                1751,1750,1749,1743,1734,1722,1721
               ]