# Models Evaluation

## Confusion Matrix
It's a matrix that summarizes all the possible outcomes of a classification. The columns correspond to the actual (*real*) classes, whereas the rows correspond to the *predicted* classes. <br>
For example, let's consider a binary classification problem where we want to distinguish between two classes: *False* (Hf) and *True* (Ht). The resulting confusion matrix might look like this:

|  | Hf (Actual) | Ht (Actual) |
|---|---|---|
| **Hf (Predicted)** | 150 | 25 |
| **Ht (Predicted)** | 10 | 215 |

In this matrix:

* **150** is the number of samples that were actually False and were correctly predicted as False (**TN**, true negatives).
* **25** is the number of samples that were actually True but were incorrectly predicted as False (**FN**, false negatives).
* **10** is the number of samples that were actually False but were incorrectly predicted as True (**FP**, false positives).
* **215** is the number of samples that were actually True and were correctly predicted as True (**TP**, true positives).

This confusion matrix provides a clear view of how many samples were classified correctly and what types of errors the model made. <br>
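
As a quick check of the table above, the four cells can be counted directly from two label arrays. The snippet below is only a minimal sketch: the arrays `actual` and `predicted` are synthetic, built to reproduce the counts in the table, and the helper name `binary_confusion_matrix` is an assumption introduced here, not part of the notebook's modules.

```python
import numpy as np

def binary_confusion_matrix(predicted, actual):
    """Return a 2x2 matrix with predicted classes on rows and actual classes on columns."""
    conf = np.zeros((2, 2), dtype=int)
    for p in (0, 1):
        for a in (0, 1):
            # Count the samples predicted as p whose actual class is a
            conf[p, a] = np.sum((predicted == p) & (actual == a))
    return conf

# Synthetic labels (0 = False, 1 = True), constructed to match the example table
actual = np.concatenate([np.zeros(160, dtype=int), np.ones(240, dtype=int)])
predicted = np.concatenate([
    np.zeros(150, dtype=int), np.ones(10, dtype=int),   # actual False: 150 TN, 10 FP
    np.zeros(25, dtype=int), np.ones(215, dtype=int),   # actual True: 25 FN, 215 TP
])

conf = binary_confusion_matrix(predicted, actual)
tn, fn = conf[0, 0], conf[0, 1]   # first row: predicted False
fp, tp = conf[1, 0], conf[1, 1]   # second row: predicted True
print(conf)
print(f"TN={tn}, FN={fn}, FP={fp}, TP={tp}")
```

With this layout, each row collects everything the model assigned to one class, and each column collects everything that truly belongs to one class.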

Now, let's consider the **Iris** dataset, which has 3 classes. Let's import the *train/validation* split used before and fit the three Gaussian Generative Models:

In [4]:
#Import the train validation split from "./split/iris_split.npz"
import numpy as np

savedSplit = np.load('./split/iris_split.npz')

DTR = savedSplit['DTR']
DVAL = savedSplit['DVAL']
LTR = savedSplit['LTR']
LVAL = savedSplit['LVAL']

print(f"DTR shape: {DTR.shape}")
print(f"DVAL shape: {DVAL.shape}")
print(f"LTR shape: {LTR.shape}")
print(f"LVAL shape: {LVAL.shape}")

DTR shape: (4, 100)
DVAL shape: (4, 50)
LTR shape: (100,)
LVAL shape: (50,)


In [11]:
import sys
MVG_path = './models_finished/MVG'
MVGTC_path = './models_finished/MVG_TiedCov'
MVGNB_path = './models_finished/Naive_Bayes'
if not MVG_path in sys.path:
    sys.path.append(MVG_path)
if not MVGTC_path in sys.path:
    sys.path.append(MVGTC_path)
if not MVGNB_path in sys.path:
    sys.path.append(MVGNB_path)

import MVG
import MVG_TiedCov as MVGTC
import Naive_Bayes as MVGNB

### Iris Dataset, MVG Classifier Confusion Matrix Computation

In [82]:
#MVG Pipeline

def MVG_Pipeline(DTR, LTR, useLog=True):

    ML_params_MVG = MVG.computeParams_ML(DTR, LTR)


    S_LogLikelihoods_MVG = MVG.scoreMatrix_Pdf_GAU(DVAL, ML_params_MVG, useLog=useLog)
    print(f"S_LogLikelihoods_MVG shape, computed from the Validation Set: {S_LogLikelihoods_MVG.shape}")

    SJoint_MVG = MVG.computeSJoint(S_LogLikelihoods_MVG, np.ones((3, )) / 3., useLog=useLog) #compute the joint densities by multiplying the score matrix S with the Priors
    print(f"Joint densities shape: {SJoint_MVG.shape}")

    SPost_MVG = MVG.computePosteriors(SJoint_MVG, useLog=useLog) #compute the posteriors by normalizing the joint densities
    print(f"Posteriors shape: {SPost_MVG.shape}")

    PVAL_MVG = np.argmax(SPost_MVG, axis=0) #axis=0 selects, for each sample (column), the class with the highest posterior probability
    print(f"Predictions shape: {PVAL_MVG.shape}")
    print(f"Predictions: {PVAL_MVG}")

    return PVAL_MVG

In [83]:
PVAL_MVG = MVG_Pipeline(DTR, LTR, useLog=True)

S_LogLikelihoods_MVG shape, computed from the Validation Set: (3, 50)
Joint densities shape: (3, 50)
Posteriors shape: (3, 50)
Predictions shape: (50,)
Predictions: [0 0 1 2 2 0 0 0 1 1 0 0 1 0 2 1 2 1 0 2 0 2 0 0 2 0 2 1 1 1 2 2 2 1 0 1 2
 2 0 1 1 2 1 0 0 0 2 1 2 0]


In [84]:

#Compute confusion matrix
#classes: 0, 1, 2

Pred0_Actual0 = np.sum((PVAL_MVG == 0) & (LVAL == 0))    #True Positives for class 0
Pred0_Actual1 = np.sum((PVAL_MVG == 0) & (LVAL == 1))    #False Positives for class 0 from class 1
Pred0_Actual2 = np.sum((PVAL_MVG == 0) & (LVAL == 2))    #False Positives for class 0 from class 2

Pred1_Actual0 = np.sum((PVAL_MVG == 1) & (LVAL == 0))    #False Positives for class 1 from class 0
Pred1_Actual1 = np.sum((PVAL_MVG == 1) & (LVAL == 1))    #True Positives for class 1
Pred1_Actual2 = np.sum((PVAL_MVG == 1) & (LVAL == 2))    #False Positives for class 1 from class 2

Pred2_Actual0 = np.sum((PVAL_MVG == 2) & (LVAL == 0))    #False Positives for class 2 from class 0
Pred2_Actual1 = np.sum((PVAL_MVG == 2) & (LVAL == 1))    #False Positives for class 2 from class 1
Pred2_Actual2 = np.sum((PVAL_MVG == 2) & (LVAL == 2))    #True Positives for class 2

#confMatrix is populated manually since all the values of the confusion matrix have been computed above
ConfMatrix_MVG_manual = np.array([[Pred0_Actual0, Pred0_Actual1, Pred0_Actual2],
                       [Pred1_Actual0, Pred1_Actual1, Pred1_Actual2],
                       [Pred2_Actual0, Pred2_Actual1, Pred2_Actual2]])

print(f"Confusion Matrix:\n{ConfMatrix_MVG_manual}")

Confusion Matrix:
[[19  0  0]
 [ 0 15  0]
 [ 0  2 14]]


In [89]:
def computeConfMatrix(PVAL, LVAL):
    """
    Compute the confusion matrix for the predicted labels and the actual labels.
    Args:
    - PVAL: Predicted labels
    - LVAL: Actual labels
    Returns:
    - Confusion matrix
    """
    numClasses = np.unique(LVAL).shape[0] #number of classes
    ConfMatrix = np.zeros((numClasses, numClasses)) #initialize the confusion matrix with zeros

    for classPredicted in range(numClasses):
        #for each predicted class, count the true positives and all the false positives

        classRow = np.array([]) #initialize the classRow with an empty array

        for classActual in range(numClasses):
            if classActual == classPredicted: 
                TP = np.sum((PVAL == classPredicted) & (LVAL == classPredicted))
                classRow = np.append(classRow, TP)
                continue

            #count the false positives: samples predicted as classPredicted whose actual class is classActual
            FPi = np.sum((PVAL == classPredicted) & (LVAL == classActual))

            #add FPi to classRow
            classRow = np.append(classRow, FPi)

        #write classRow into the confusion matrix
        ConfMatrix[classPredicted, :] = classRow


    return ConfMatrix

In [72]:
confMatrix_MVG = computeConfMatrix(PVAL_MVG, LVAL)
print(f"Confusion Matrix, MVG Classifier:\n{confMatrix_MVG}")

Confusion Matrix, MVG Classifier:
[[19.  0.  0.]
 [ 0. 15.  0.]
 [ 0.  2. 14.]]
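
The nested loop can also be replaced by a single accumulation over (predicted, actual) pairs. The following is a sketch of an alternative, not the notebook's implementation: `confusion_matrix_vectorized` and its `num_classes` parameter are names introduced here for illustration, and the code assumes integer labels in the range `0..num_classes-1`. The demo arrays are made up.

```python
import numpy as np

def confusion_matrix_vectorized(pval, lval, num_classes):
    """Confusion matrix with predicted classes on rows and actual classes on columns."""
    conf = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when the same (row, col) pair repeats
    np.add.at(conf, (pval, lval), 1)
    return conf

# Small made-up example: 7 samples, 3 classes
pval_demo = np.array([0, 1, 2, 2, 1, 0, 2])
lval_demo = np.array([0, 1, 1, 2, 1, 0, 2])
conf_demo = confusion_matrix_vectorized(pval_demo, lval_demo, 3)
print(conf_demo)
```

Passing the number of classes explicitly, instead of inferring it from the labels, keeps the matrix shape stable even if some class happens to be absent from the validation labels.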


### Iris Dataset, Tied Covariance MVG Classifier Confusion Matrix Computation

In [80]:
#MVGTC Pipeline

def MVTC_Pipeline(DTR, LTR, useLog=True):
    ML_params_MVGTC = MVGTC.computeParams_ML_TiedCov(DTR, LTR, useLDAForTiedCov=True)

    S_LogLikelihoods_MVGTC = MVGTC.scoreMatrix_Pdf_GAU(DVAL, ML_params_MVGTC, useLog=useLog)
    print(f"S_LogLikelihoods_MVGTC shape, computed from the Validation Set: {S_LogLikelihoods_MVGTC.shape}")

    SJoint_MVGTC = MVGTC.computeSJoint(S_LogLikelihoods_MVGTC, np.ones((3, )) / 3., useLog=useLog) #compute the joint densities by multiplying the score matrix S with the Priors
    print(f"Joint densities shape: {SJoint_MVGTC.shape}")

    SPost_MVGTC = MVGTC.computePosteriors(SJoint_MVGTC, useLog=useLog) #compute the posteriors by normalizing the joint densities
    print(f"Posteriors shape: {SPost_MVGTC.shape}")

    PVAL_MVGTC = np.argmax(SPost_MVGTC, axis=0) #axis=0 selects, for each sample (column), the class with the highest posterior probability
    print(f"Predictions shape: {PVAL_MVGTC.shape}")
    print(f"Predictions: {PVAL_MVGTC}")

    return PVAL_MVGTC


In [88]:
confMatrix_MVGTC = computeConfMatrix(MVTC_Pipeline(DTR, LTR), LVAL)
print(f"\nConfusion Matrix, Tied Cov MVG Classifier:\n{confMatrix_MVGTC}")

S_LogLikelihoods_MVGTC shape, computed from the Validation Set: (3, 50)
Joint densities shape: (3, 50)
Posteriors shape: (3, 50)
Predictions shape: (50,)
Predictions: [0 0 1 2 2 0 0 0 1 1 0 0 1 0 2 1 2 1 0 2 0 2 0 0 2 0 2 1 1 1 2 2 1 1 0 1 2
 2 0 1 1 2 1 0 0 0 2 1 2 0]

Confusion Matrix, Tied Cov MVG Classifier:
[[19.  0.  0.]
 [ 0. 16.  0.]
 [ 0.  1. 14.]]


### Iris Dataset, Naive Bayes MVG Classifier Confusion Matrix Computation

In [85]:
#Naive Bayes Pipeline

def MVGNB_Pipeline(DTR, LTR, useLog=True):

    ML_params_MVGNB = MVGNB.computeParams_ML_NaiveBayesAssumption(DTR, LTR)

    S_LogLikelihoods_MVGNB = MVGNB.scoreMatrix_Pdf_GAU(DVAL, ML_params_MVGNB, useLog=useLog)
    print(f"S_LogLikelihoods_MVGNB shape, computed from the Validation Set: {S_LogLikelihoods_MVGNB.shape}")

    SJoint_MVGNB = MVGNB.computeSJoint(S_LogLikelihoods_MVGNB, np.ones((3, )) / 3., useLog=useLog) #compute the joint densities by multiplying the score matrix S with the Priors
    print(f"Joint densities shape: {SJoint_MVGNB.shape}")

    SPost_MVGNB = MVGNB.computePosteriors(SJoint_MVGNB, useLog=useLog) #compute the posteriors by normalizing the joint densities
    print(f"Posteriors shape: {SPost_MVGNB.shape}")

    PVAL_MVGNB = np.argmax(SPost_MVGNB, axis=0) #axis=0 selects, for each sample (column), the class with the highest posterior probability
    print(f"Predictions shape: {PVAL_MVGNB.shape}")
    print(f"Predictions: {PVAL_MVGNB}")

    return PVAL_MVGNB

In [87]:
confMatrix_MVGNB = computeConfMatrix(MVGNB_Pipeline(DTR, LTR), LVAL)
print(f"\nConfusion Matrix, Naive Bayes MVG Classifier:\n{confMatrix_MVGNB}")

S_LogLikelihoods_MVGNB shape, computed from the Validation Set: (3, 50)
Joint densities shape: (3, 50)
Posteriors shape: (3, 50)
Predictions shape: (50,)
Predictions: [0 0 1 2 2 0 0 0 1 1 0 0 1 0 2 1 2 1 0 2 0 2 0 0 2 0 2 1 1 1 2 2 1 2 0 1 2
 2 0 1 1 2 1 0 0 0 2 1 2 0]

Confusion Matrix, Naive Bayes MVG Classifier:
[[19.  0.  0.]
 [ 0. 15.  0.]
 [ 0.  2. 14.]]


Given the limited number of errors, a detailed analysis of the IRIS dataset is not very interesting. We
thus turn our attention to a larger evaluation dataset. <br>
We can use the dataset from Lab7, which stores tercets of the *Divina Commedia*. Each tercet is associated with a label that denotes the cantica from which the tercet is extracted ($0$ = *Inferno*, $1$ = *Purgatorio*, $2$ = *Paradiso*):

In [74]:
commedia_ll = np.load("./data/commedia_ll.npy")
commedia_labels = np.load("./data/commedia_labels.npy")

In [79]:
print(f"commedia_ll shape: {commedia_ll.shape}")
print(f"commedia_labels shape: {commedia_labels.shape}")
print(f"First 10 logLikelihoods of Inferno: {commedia_ll[0, :10]}")
print(f"First 10 labels: {commedia_labels[:10]}")

commedia_ll shape: (3, 1204)
commedia_labels shape: (1204,)
First 10 logLikelihoods of Inferno: [-122.72443339 -133.30648701 -134.36987251 -170.65723182 -163.97348133
 -139.39515141 -166.71004347 -174.57737603 -147.62396153 -123.47570192]
First 10 labels: [0 0 0 0 0 0 0 0 0 0]


Let's create a new function that computes the confusion matrix given the log-likelihoods and the Priors. The classification rule is always the same: each sample is assigned to the class with the maximum Posterior probability:

In [92]:
#always import MVG before using this function!

def computeConfMatrixFromLL(LVAL, logLikelihoods, Priors, useLog=True):
    """
    Compute the confusion matrix from class-conditional log-likelihoods and Priors.
    Args:
    - LVAL: actual labels
    - logLikelihoods: matrix of log-likelihoods, one row per class and one column per sample
    - Priors: array of priors for each class, priors are application dependent
    - useLog: if True, use log likelihoods, else use normal likelihoods

    Returns:
    - Confusion matrix
    """

    SJoint = MVG.computeSJoint(logLikelihoods, Priors, useLog=useLog) #compute the joint densities by multiplying the score matrix S with the Priors
    SPost = MVG.computePosteriors(SJoint, useLog=useLog)  #compute the posteriors by normalizing the joint densities
    PVAL = np.argmax(SPost, axis=0) #axis=0 selects, for each sample (column), the class with the highest posterior probability

    #call the computeConfMatrix function to compute the confusion matrix
    return computeConfMatrix(PVAL, LVAL)
    

To compute the Confusion Matrix for the *Commedia* dataset, we assume uniform Priors for each cantica: $P(\text{Inf}) = P(\text{Pur}) = P(\text{Par}) = \frac{1}{3}$

In [94]:
#Compute the confusion matrix for the log likelihoods
#Assume uniform priors for each class

confMatrix_Commedia = computeConfMatrixFromLL(commedia_labels, commedia_ll, np.ones((3, )) / 3., useLog=True)
print(f"\nConfusion Matrix, Commedia Classifier:\n{confMatrix_Commedia}")


Confusion Matrix, Commedia Classifier:
[[210. 113.  61.]
 [137. 191. 111.]
 [ 53.  98. 230.]]
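
A confusion matrix can be condensed into a few summary numbers. The sketch below copies the matrix printed above into a literal array (so it runs without the data files) and derives the overall accuracy and, for each cantica, the fraction of its tercets that were classified correctly. The variable names are assumptions introduced for this illustration.

```python
import numpy as np

# Values copied from the Commedia confusion matrix printed above
# (rows: predicted class, columns: actual class)
conf_commedia = np.array([[210., 113.,  61.],
                          [137., 191., 111.],
                          [ 53.,  98., 230.]])

n_samples = conf_commedia.sum()                      # total number of tercets
accuracy = np.trace(conf_commedia) / n_samples       # correct predictions lie on the diagonal
error_rate = 1.0 - accuracy

# Per-class fraction of correctly classified samples: diagonal over column totals
per_class_correct = np.diag(conf_commedia) / conf_commedia.sum(axis=0)

print(f"Samples: {int(n_samples)}")
print(f"Accuracy: {accuracy:.3f}, error rate: {error_rate:.3f}")
print(f"Per-class correct fraction (Inferno, Purgatorio, Paradiso): {per_class_correct}")
```

Because columns correspond to actual classes, the column sums give the number of tercets per cantica, while the row sums give how many tercets were assigned to each cantica.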


## Optimal Bayes decision
The goal of a classifier is to allow us to choose an action $a$ to perform among
a set of actions $\mathcal{A}$. In the context of classification, an action can be simply "Classify sample $x_t$ with label $k$", although we can also have more complex types of actions. <br> 
We can associate to each action a **cost** $C(a \mid k)$ that we have to pay when we choose action $a$ and the sample belongs to class $k$. This can be seen as a misclassification cost, which depends on both the actual and the predicted class. <br>
Unfortunately, at evaluation time we don't know the actual classes of the samples (what would the point of classification be, otherwise?), but we have access to the Priors and we can calculate the Posteriors. These are what we need to estimate the costs. <br>
For a $K$-class problem (where classes are numbered from $0$ up to $K-1$), let's denote the Priors as:
$$
\pi = \begin{bmatrix} \pi_0 \\ \vdots \\ \pi_{K-1} \end{bmatrix}
$$
We can compute the class Posteriors, conditioned on the **Recognizer** $\mathcal{R}$ that was used to compute them (the Recognizer is our Classifier) by applying Bayes' theorem. This involves expressing the Joint probability in terms of the Likelihood of the data given the class and the Prior probability of the class. The Posterior probability is then obtained by normalizing these terms by the sum over all possible classes:
$$
P(C = c | x, \mathcal{R}) = \frac{f_{X|C,\mathcal{R}}(x|c)\pi_c}{\sum_{k=0}^{K-1} f_{X|C,\mathcal{R}}(x|k)\pi_k}
$$
These class Posteriors are conditioned on the Recognizer because they represent the **beliefs** that the Recognizer has about each sample belonging to a class $k$. We can compute the **Expected Cost** that we'll pay according to these beliefs (which are the best of our knowledge at evaluation time):
$$
C_{x, \mathcal{R}}(a) = \mathop{\mathbb{E}} \left[ C(a \mid k) \mid x, \mathcal{R}\right] = \sum_{k=0}^{K-1} P(C = k \mid x, \mathcal{R}) C(a \mid k)
$$
where:
- $a$ is the action
- $k$ is the class
- $x$ is the test sample
- $\mathcal{R}$ is the Recognizer, so the Classifier

So in practice we calculate the Posteriors according to the Recognizer, and then we multiply each of them by the Cost of taking action $a$ given class $k$. The Expected Cost is obtained by summing these products over all classes, from $0$ to $K-1$. <br>
Regarding the costs $C(a \mid k)$, we can define the **cost matrix** as: 
$$
\mathbf{C} = \begin{bmatrix}
0 & C_{0,1} & \cdots & C_{0,K-1} \\
C_{1,0} & 0 & \cdots & C_{1,K-1} \\
\vdots & \vdots & \ddots & \vdots \\
C_{K-1,0} & C_{K-1,1} & \cdots & 0
\end{bmatrix}
$$
where $C_{i,j}$ represents the cost of predicting class $i$ when the actual class is $j$. <br>
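
With predicted classes on the rows of $\mathbf{C}$ and Posteriors stored as one column per sample, the expected cost of every action for every sample reduces to a single matrix product. The sketch below illustrates this under stated assumptions: the function names, the 3-class cost matrix, and the Posterior values are all made up for the example and do not come from the notebook's modules or datasets.

```python
import numpy as np

def expected_costs(cost_matrix, posteriors):
    """
    cost_matrix[a, k]: cost of choosing action (predicted class) a when the actual class is k.
    posteriors: array of shape (K, N), one column of class Posteriors per sample.
    Returns an array of shape (K, N) with the expected cost of each action for each sample.
    """
    return cost_matrix @ posteriors

def min_expected_cost_decisions(cost_matrix, posteriors):
    """For each sample, pick the action with the lowest expected cost."""
    return np.argmin(expected_costs(cost_matrix, posteriors), axis=0)

# Hypothetical costs: predicting class 0 when the actual class is 2 is very expensive
cost_demo = np.array([[0., 1., 10.],
                      [1., 0., 1.],
                      [1., 1., 0.]])

# Hypothetical Posteriors for two samples (each column sums to 1)
post_demo = np.array([[0.5, 0.1],
                      [0.3, 0.2],
                      [0.2, 0.7]])

costs_demo = expected_costs(cost_demo, post_demo)
decisions_demo = min_expected_cost_decisions(cost_demo, post_demo)
map_demo = np.argmax(post_demo, axis=0)   # maximum-Posterior decisions, for comparison

print(f"Expected costs:\n{costs_demo}")
print(f"Minimum expected cost decisions: {decisions_demo}")
print(f"Maximum Posterior decisions:     {map_demo}")
```

For the first sample the most probable class is 0, but the heavy penalty $C_{0,2}$ makes class 1 the cheaper choice, which shows that cost-aware decisions can differ from maximum-Posterior decisions.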

-------------------------
## Example: Rain vs. Clear Classification

For example, let's consider a binary classification problem where the possible classes are "Rain" and "Clear".

**Confusion Matrix**

The confusion matrix is a table that summarizes the performance of a classification model. For our two classes, it would look like this:

$$
\text{Confusion Matrix} = \begin{bmatrix}
\text{True Positive (TP)} & \text{False Positive (FP)} \\
\text{False Negative (FN)} & \text{True Negative (TN)}
\end{bmatrix}
= \begin{bmatrix}
\text{Predicted Rain | Actual Rain} & \text{Predicted Rain | Actual Clear} \\
\text{Predicted Clear | Actual Rain} & \text{Predicted Clear | Actual Clear}
\end{bmatrix}
$$

Where:

* **TP (True Positive):** The number of times the model correctly predicted "Rain" when it was actually raining.
* **FP (False Positive):** The number of times the model incorrectly predicted "Rain" when it was actually "Clear" (also known as a Type I error).
* **FN (False Negative):** The number of times the model incorrectly predicted "Clear" when it was actually raining (also known as a Type II error).
* **TN (True Negative):** The number of times the model correctly predicted "Clear" when it was actually clear.

**Cost Matrix**

The cost matrix defines the cost associated with each type of prediction outcome. Let's assume the following costs:

* Correctly predicting "Rain" (TP) has no cost: 0
* Incorrectly predicting "Rain" when it's "Clear" (FP) has a cost of 1 (e.g., inconvenience of carrying an umbrella unnecessarily).
* Incorrectly predicting "Clear" when it's raining (FN) has a higher cost of 5 (e.g., getting caught in the rain without an umbrella).
* Correctly predicting "Clear" (TN) has no cost: 0

Based on these costs, the cost matrix would be:

$$
\mathbf{C} = \begin{bmatrix}
C_{\text{Predicted Rain, Actual Rain}} & C_{\text{Predicted Rain, Actual Clear}} \\
C_{\text{Predicted Clear, Actual Rain}} & C_{\text{Predicted Clear, Actual Clear}}
\end{bmatrix}
= \begin{bmatrix}
0 & 1 \\
5 & 0
\end{bmatrix}
$$

Here:

* $C_{0,0} = 0$: Cost of predicting "Rain" (index 0) when the actual class is "Rain" (index 0).
* $C_{0,1} = 1$: Cost of predicting "Rain" (index 0) when the actual class is "Clear" (index 1).
* $C_{1,0} = 5$: Cost of predicting "Clear" (index 1) when the actual class is "Rain" (index 0).
* $C_{1,1} = 0$: Cost of predicting "Clear" (index 1) when the actual class is "Clear" (index 1).
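
To see how these costs affect a decision, consider a purely hypothetical sample $x$ for which the classifier outputs $P(\text{Rain} \mid x, \mathcal{R}) = 0.3$ and $P(\text{Clear} \mid x, \mathcal{R}) = 0.7$ (illustrative values, not taken from any dataset). Weighting each row of the cost matrix by these Posteriors gives the expected cost of each prediction:

$$
\begin{aligned}
C_{x,\mathcal{R}}(\text{Rain}) &= 0 \cdot 0.3 + 1 \cdot 0.7 = 0.7 \\
C_{x,\mathcal{R}}(\text{Clear}) &= 5 \cdot 0.3 + 0 \cdot 0.7 = 1.5
\end{aligned}
$$

Predicting "Rain" is cheaper on average even though "Clear" is the more probable class, because missing an actual rainy day costs five times as much as a false alarm.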

-------------------------


The optimal Bayes decision consists in predicting the class $c^*$ which has minimum expected Bayes cost:
$$
c^* = \arg\min_c \{ C_{x, \mathcal{R}}(c) \}
$$

## Binary task: optimal decisions
Now let's consider **binary tasks**, in which we have two classes that can always be summarized as class *True* ($H_T$) and class *False* ($H_F$). In this case the cost matrix is always:
$$
\mathbf{C} = \begin{bmatrix}
C_{\text{Predicted } H_F \text{, Actual } H_F} & C_{\text{Predicted } H_F \text{, Actual } H_T} \\
C_{\text{Predicted } H_T   \text{, Actual } H_F} & C_{\text{Predicted } H_T \text{, Actual } H_T}
\end{bmatrix}
= \begin{bmatrix}
C(H_F \mid H_F) & C(H_F \mid H_T) \\
C(H_T \mid H_F) & C(H_T \mid H_T)
\end{bmatrix}
= \begin{bmatrix}
C_{tn} & C_{fn} \\
C_{fp} & C_{tp}
\end{bmatrix}
$$
The cost for predicting the right class is of course zero so we can rewrite the matrix as:
$$
\mathbf{C} = \begin{bmatrix}
0 & C_{fn} \\
C_{fp} & 0
\end{bmatrix}
$$
So, applying the formula, the expected Bayes costs for predicting either of the two
classes are:
$$
C_{x,\mathcal{R}}(H_F) = C (H_F \mid H_F) P(C = H_F | x, \mathcal{R}) + C (H_F \mid H_T) P(C = H_T | x, \mathcal{R}) = C_{fn}P(C = H_T | x, \mathcal{R}) \\
C_{x,\mathcal{R}}(H_T) = C (H_T \mid H_F) P(C = H_F | x, \mathcal{R}) + C (H_T \mid H_T) P(C = H_T | x, \mathcal{R}) = C_{fp}P(C = H_F | x, \mathcal{R})
$$
We predict the class $c^*$ that has minimum cost: $c^* = \arg\min_c \{ C_{x, \mathcal{R}}(c) \}$ <br>
For the binary task, we can also
express $c^*$ by comparing the logarithm of the ratio of the two costs with zero:
$$
c^* =
\begin{cases}
    H_T & \text{if } \log \frac{C_{fn} P(C = H_T | x, \mathcal{R})}{C_{fp} P(C = H_F | x, \mathcal{R})} > 0 \\ \\
    H_F & \text{if } \log \frac{C_{fn} P(C = H_T | x, \mathcal{R})}{C_{fp} P(C = H_F | x, \mathcal{R})} \le 0
\end{cases}
$$
If $\mathcal{R}$ is a Generative Model, we can factorize the class Posteriors into Likelihoods and Priors:

$$
c^* =
\begin{cases}
    H_T & \text{if } \log \frac{\pi_{H_T} C_{fn} f_{X|C,\mathcal{R}}(x|H_T)}{(1-\pi_{H_T}) C_{fp} f_{X|C,\mathcal{R}}(x|H_F)} > 0 \\ \\
    H_F & \text{if } \log \frac{\pi_{H_T} C_{fn} f_{X|C,\mathcal{R}}(x|H_T)}{(1-\pi_{H_T}) C_{fp} f_{X|C,\mathcal{R}}(x|H_F)} \le 0
\end{cases}
$$
Where $\pi_{H_T} = P(C = H_T)$ and $1 - \pi_{H_T} = P(C = H_F)$ are the two Priors. <br>
By recalling that the logarithm of the ratio between the two likelihoods is the **log-likelihood ratio** $\text{llr}(x)$, and that the logarithm of the ratio between the Priors is the **Prior log odds**, we can compare $\text{llr}(x)$, acting as a **Score**, with the remaining quantities, which act as a **Threshold**:

$$
c^* =
\begin{cases}
    H_T & \text{if } \text{llr}(x) = \log \frac{f_{X|C,\mathcal{R}}(x|H_T)}{f_{X|C,\mathcal{R}}(x|H_F)} > - \log \frac{\pi_{H_T} C_{fn}}{(1-\pi_{H_T}) C_{fp}} \\ \\
    H_F & \text{if } \text{llr}(x) = \log \frac{f_{X|C,\mathcal{R}}(x|H_T)}{f_{X|C,\mathcal{R}}(x|H_F)} \le - \log \frac{\pi_{H_T} C_{fn}}{(1-\pi_{H_T}) C_{fp}}
\end{cases}
$$
Here the Threshold takes into account both the Priors (defined by $\pi_{H_T}$ and $1-\pi_{H_T}$) and the Costs of Errors ($C_{fp}$ and $C_{fn}$). <br> <br>
The *triplet* $\left( \pi_{H_T}, C_{fp}, C_{fn}\right)$ denotes the **application**: since these three values are specific to the problem we're trying to solve, they define the specific application or context of the decision-making process.
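
Moving the Prior and Cost terms of the generative-model rule to the right-hand side gives the threshold $t = -\log \frac{\pi_{H_T} C_{fn}}{(1-\pi_{H_T}) C_{fp}}$, and the decision is $H_T$ whenever $\text{llr}(x) > t$. The snippet below is a minimal sketch of this rule for a given application triplet. The function names and the llr values are assumptions made up for the illustration; they are not part of the notebook's modules or data.

```python
import numpy as np

def bayes_threshold(prior_true, cost_fn, cost_fp):
    """Threshold on the llr for the application (prior_true, cost_fp, cost_fn)."""
    return -np.log((prior_true * cost_fn) / ((1.0 - prior_true) * cost_fp))

def optimal_binary_decisions(llr, prior_true, cost_fn, cost_fp):
    """Return 1 (H_T) where llr > threshold, 0 (H_F) otherwise."""
    t = bayes_threshold(prior_true, cost_fn, cost_fp)
    return (llr > t).astype(int)

# Made-up log-likelihood ratios for four samples
llr_demo = np.array([-2.0, -0.5, 0.3, 1.5])

# Uniform Priors and equal costs: the threshold is 0
dec_uniform = optimal_binary_decisions(llr_demo, prior_true=0.5, cost_fn=1.0, cost_fp=1.0)

# False negatives ten times more costly: the threshold drops to -log(10), favouring H_T
dec_costly_fn = optimal_binary_decisions(llr_demo, prior_true=0.5, cost_fn=10.0, cost_fp=1.0)

# Rare H_T class (Prior 0.1): the threshold rises to log(9), favouring H_F
dec_rare_true = optimal_binary_decisions(llr_demo, prior_true=0.1, cost_fn=1.0, cost_fp=1.0)

print(dec_uniform, dec_costly_fn, dec_rare_true)
```

The same scores lead to different decisions under different applications, which is why the triplet has to be fixed before a classifier can be evaluated.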



In practice, 