# Q: How can you compute a confusion matrix for binary classification using the dot product?

### Confusion Matrix (Binary Classification) with Dot Product

- **Confusion Matrix**: A 2x2 table that summarizes binary classification performance by counting each combination of actual and predicted label:
  - **True Positive (TP)**: Actual positive, predicted positive
  - **False Positive (FP)**: Actual negative, predicted positive
  - **False Negative (FN)**: Actual positive, predicted negative
  - **True Negative (TN)**: Actual negative, predicted negative

### Dot Product for Computing Confusion Matrix

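- **True Labels Vector**: $\mathbf{y_{\text{true}}}$ (length $n$, binary values: 0 or 1)
- **Predicted Labels Vector**: $\mathbf{y_{\text{pred}}}$ (length $n$, binary values: 0 or 1)
- **Ones Vector**: $\mathbf{1}$ (length $n$, every entry 1), so $\mathbf{1} - \mathbf{y}$ swaps the 0s and 1s of a binary vector $\mathbf{y}$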

1. **True Positives (TP)**: 
   $$\text{TP} = \mathbf{y_{\text{true}}} \cdot \mathbf{y_{\text{pred}}}$$
   (Each elementwise product is 1 only when both true and predicted are 1, so the sum counts those cases)

2. **False Negatives (FN)**: 
   $$\text{FN} = \mathbf{y_{\text{true}}} \cdot (\mathbf{1} - \mathbf{y_{\text{pred}}})$$
   (Cases where true is 1 and predicted is 0)

3. **False Positives (FP)**: 
   $$\text{FP} = (\mathbf{1} - \mathbf{y_{\text{true}}}) \cdot \mathbf{y_{\text{pred}}}$$
   (Cases where true is 0 and predicted is 1)

4. **True Negatives (TN)**: 
   $$\text{TN} = (\mathbf{1} - \mathbf{y_{\text{true}}}) \cdot (\mathbf{1} - \mathbf{y_{\text{pred}}})$$
   (Cases where both true and predicted are 0)
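
The four formulas above can be checked on a small example. The following is a minimal sketch using made-up vectors (the data and the `*_ref` names are illustrative assumptions); it compares each dot product with a count obtained from explicit boolean masks.

```python
import numpy as np

# Small illustrative vectors (hypothetical data chosen for this check).
y_true = np.array([1, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 0, 1, 0, 0])

# Dot-product formulas from the list above.
tp = y_true @ y_pred
fn = y_true @ (1 - y_pred)
fp = (1 - y_true) @ y_pred
tn = (1 - y_true) @ (1 - y_pred)

# Reference counts via explicit boolean masks.
tp_ref = np.sum((y_true == 1) & (y_pred == 1))
fn_ref = np.sum((y_true == 1) & (y_pred == 0))
fp_ref = np.sum((y_true == 0) & (y_pred == 1))
tn_ref = np.sum((y_true == 0) & (y_pred == 0))

assert (tp, fn, fp, tn) == (tp_ref, fn_ref, fp_ref, tn_ref)

# Every sample falls into exactly one of the four cells.
assert tp + fn + fp + tn == len(y_true)
```

The formulas rely on the entries being exactly 0 or 1; labels encoded as, say, -1/+1 would need to be remapped first.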


In [1]:
import numpy as np

labels = [0, 1, 1, 1, 0, 0, 0, 1, 1, 0] * 10000
predictions = [1, 1, 1, 0, 0, 1, 1, 0, 1, 1] * 10000

In [2]:
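def confusion_matrix_dot(t, p):
    # t: true labels, p: predicted labels (equal-length sequences of 0/1)
    t = np.array(t)
    p = np.array(p)
    TP = t @ p              # true 1, predicted 1
    FP = (1 - t) @ p        # true 0, predicted 1
    FN = t @ (1 - p)        # true 1, predicted 0
    TN = (1 - t) @ (1 - p)  # true 0, predicted 0
    return (TP, FP, FN, TN)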

In [3]:
TP, FP, FN, TN = confusion_matrix_dot(labels, predictions)
print(TP, FP, FN, TN)

30000 40000 20000 10000
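
The four dot products can also be collected into a single matrix product that returns the 2x2 matrix directly. This is a sketch, not part of the original cells: the function name `confusion_matrix_matmul` is hypothetical, and the row/column order (rows are actual labels, columns are predicted labels, class 0 first) is one possible convention.

```python
import numpy as np

def confusion_matrix_matmul(t, p):
    """Hypothetical helper: 2x2 confusion matrix from one matrix product."""
    t = np.asarray(t)
    p = np.asarray(p)
    # Shape (2, n): row 0 marks actual negatives, row 1 actual positives.
    T = np.stack([1 - t, t])
    # Shape (n, 2): column 0 marks predicted negatives, column 1 predicted positives.
    P = np.stack([1 - p, p], axis=1)
    # Entry [i, j] counts samples with actual label i and predicted label j,
    # i.e. [[TN, FP], [FN, TP]].
    return T @ P

cm = confusion_matrix_matmul(
    [0, 1, 1, 1, 0, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 0, 1, 1, 0, 1, 1],
)
print(cm)
# [[1 4]
#  [2 3]]
```

Each entry of the result is one of the dot products from the formulas above, so the counts match `confusion_matrix_dot` on the same ten-element pattern.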


In [4]:
%timeit confusion_matrix_dot(labels, predictions)

5.15 ms ± 276 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [5]:
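def confusion_matrix_dot_faster(t, p):
    # One dot product; the other counts follow from the label totals.
    t = np.array(t)
    p = np.array(p)

    TP = t @ p
    P = np.sum(t)   # actual positives
    PP = np.sum(p)  # predicted positives
    N = len(t) - P  # actual negatives

    FP = PP - TP    # predicted positive but actually negative
    FN = P - TP     # actually positive but predicted negative
    TN = N - FP     # actual negatives that were not predicted positive

    return (TP, FP, FN, TN)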
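
The shortcuts in this function follow from expanding the dot-product formulas. As a short derivation, write $P = \mathbf{1} \cdot \mathbf{y_{\text{true}}}$ for the number of actual positives, $PP = \mathbf{1} \cdot \mathbf{y_{\text{pred}}}$ for the number of predicted positives, and $N = n - P$ for the number of actual negatives, where $n = \mathbf{1} \cdot \mathbf{1}$ is the number of samples (these symbols mirror the variable names in the code):

$$\text{FN} = \mathbf{y_{\text{true}}} \cdot (\mathbf{1} - \mathbf{y_{\text{pred}}}) = \mathbf{y_{\text{true}}} \cdot \mathbf{1} - \mathbf{y_{\text{true}}} \cdot \mathbf{y_{\text{pred}}} = P - \text{TP}$$

$$\text{FP} = (\mathbf{1} - \mathbf{y_{\text{true}}}) \cdot \mathbf{y_{\text{pred}}} = \mathbf{1} \cdot \mathbf{y_{\text{pred}}} - \mathbf{y_{\text{true}}} \cdot \mathbf{y_{\text{pred}}} = PP - \text{TP}$$

$$\text{TN} = (\mathbf{1} - \mathbf{y_{\text{true}}}) \cdot (\mathbf{1} - \mathbf{y_{\text{pred}}}) = n - P - PP + \text{TP} = N - \text{FP}$$

So a single dot product plus two sums is enough to recover all four counts.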

In [6]:
TP, FP, FN, TN = confusion_matrix_dot_faster(labels, predictions)
print(TP, FP, FN, TN)

30000 40000 20000 10000


In [7]:
%timeit confusion_matrix_dot_faster(labels, predictions)

4.38 ms ± 50.1 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
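
Both timed functions convert the 100,000-element Python lists to arrays on every call, so the measurements above include that conversion, which may account for a large share of the time. A minimal sketch that separates the two steps is shown below; `labels_arr`, `predictions_arr`, and `confusion_counts` are hypothetical names introduced for illustration.

```python
import numpy as np

labels = [0, 1, 1, 1, 0, 0, 0, 1, 1, 0] * 10000
predictions = [1, 1, 1, 0, 0, 1, 1, 0, 1, 1] * 10000

# Convert once, outside any timed region.
labels_arr = np.array(labels)
predictions_arr = np.array(predictions)

def confusion_counts(t, p):
    """Hypothetical variant: expects NumPy 0/1 arrays and skips conversion."""
    tp = t @ p
    fp = p.sum() - tp            # predicted positives minus true positives
    fn = t.sum() - tp            # actual positives minus true positives
    tn = t.size - tp - fp - fn   # whatever remains is a true negative
    return tp, fp, fn, tn

print(*confusion_counts(labels_arr, predictions_arr))
```

In a notebook, running `%timeit confusion_counts(labels_arr, predictions_arr)` would then measure only the arithmetic, which makes the comparison between the two approaches easier to interpret.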
