# Explanation of my ideas on some complex parts of the code

**1. To get the true and predicted labels - `src.utils.extract_label_img`**
- Because our test set was loaded with `image_dataset_from_directory`, its images and labels come in batches, so we need to iterate over the test set and use the CNN model to collect the true and predicted labels.
- For each batch of `images` and `labels` in `test_set`:
    - True labels: take the batch `labels` and convert them from a TF tensor to a NumPy array.
    - Predicted labels: use `argmax` to convert the model's predicted probabilities into class labels.
    - Use `extend()` to add each element of the NumPy array of true/predicted labels to the list. We don't use `append()` here, as it would add the whole array as a single item instead of adding each of its elements.
    - Convert the test images to NumPy arrays as well, so we can plot them later for error analysis.

In [None]:
import numpy as np

# Lists that collect the results across all batches
y_true = []
y_pred = []
test_images = []

for images, labels in test_set:
    # Append true labels for each batch
    y_true.extend(labels.numpy())
    # Compute predictions for each batch
    preds = model.predict(images)
    # Append predicted labels for each batch
    y_pred.extend(np.argmax(preds, axis=-1))
    # Append the images for each batch
    test_images.extend(images.numpy())
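
The difference between `extend()` and `append()`, and the role of `argmax`, can be seen in a minimal standalone sketch. The probability values below are made up for illustration and do not come from the model; the names `batch_labels`, `extended` and `appended` are hypothetical.

```python
import numpy as np

# Hypothetical softmax outputs for a batch of 3 images and 4 classes
preds = np.array([
    [0.10, 0.70, 0.10, 0.10],
    [0.60, 0.20, 0.10, 0.10],
    [0.05, 0.05, 0.10, 0.80],
])

# argmax along the last axis picks the most probable class for each image
batch_labels = np.argmax(preds, axis=-1)

extended = []
extended.extend(batch_labels)  # adds each label as its own list item

appended = []
appended.append(batch_labels)  # adds the whole array as a single list item

print(batch_labels)   # [1 0 3]
print(len(extended))  # 3
print(len(appended))  # 1
```

With `extend()`, the list stays flat across batches, which is the shape that `confusion_matrix` expects for `y_true` and `y_pred`.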

**2. To generate confusion matrix heatmap - `src.utils.plot_conf_matrix`**
- Use sklearn's `confusion_matrix` function to compute the confusion matrix from `y_true` and `y_pred`.
- Use the seaborn and matplotlib libraries to plot the matrix as a heatmap.

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Calculate the confusion matrix
pred_matrix = confusion_matrix(y_true, y_pred)

# Plot the confusion matrix
sns.heatmap(
    pred_matrix,
    annot=True,
    fmt='d',  # Format annotations as integers
    center=0,
    square=True,
    cmap="RdBu_r",
    xticklabels=class_labels,
    yticklabels=class_labels,
)

plt.title('Confusion Matrix')
plt.xlabel('Predicted labels')
plt.tick_params(axis='x', labelsize=8)
plt.ylabel('True labels')
plt.tick_params(axis='y', labelsize=8)

plt.show()
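
To make the heatmap easier to read, here is a minimal sketch of what the matrix contains, assuming integer class labels. It rebuilds the counts by hand with NumPy instead of calling sklearn, and the labels and the name `cm` are made up for illustration.

```python
import numpy as np

# Hypothetical labels for 6 samples and 3 classes
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

n_classes = 3
cm = np.zeros((n_classes, n_classes), dtype=int)

# Row index = true class, column index = predicted class;
# np.add.at increments one cell per (true, predicted) pair
np.add.at(cm, (y_true, y_pred), 1)

print(cm)
# [[1 1 0]
#  [0 2 0]
#  [1 0 1]]

# Correct predictions sit on the diagonal
print(np.trace(cm))  # 4
```

Entry `cm[i, j]` counts the samples whose true class is `i` and whose predicted class is `j`, which matches the axis labels used in the plot: rows are true labels and columns are predicted labels.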