# **DAV Project --- Results Generation**


This notebook generates the results showcased in the main paper and writes them to text files.

**Note:** Since the project code was written on a virtual machine terminal, which does not allow displaying plots, this notebook solely generates the results discussed in the paper and writes them to TXT files, without displaying them. To display the results, please refer to the provided ```display_results.ipynb``` notebook, with the generated TXT files.

## **Initialization**

In [None]:
# imports
from torchvision import models, datasets, transforms
import torch.nn as nn
from matplotlib import pyplot as plt
from torch.utils.data import Subset
import torch.optim as optim
from scipy.stats import entropy
import torch
import zipfile
from torch.utils.data import Dataset
from abc import ABC
from torchvision.models import resnet50, ResNet50_Weights
from sklearn.cluster import KMeans
from collections import defaultdict
from sklearn.preprocessing import MinMaxScaler
import scipy.special  # explicit submodule import for scipy.special.softmax
import math
import numpy as np
import random

# constants
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
IMG_SIZE = 224

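# positions of the fields in the tuples used by the sampling methods
UNCERTAINTY, SIZE = 0, 1
UNCERTAINTY, DATASET_IDX = 0, 1
# (max prediction probability, dataset index) tuples
MAX_PRED_PROB, DATASET_IDX = 0, 1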

# set seeds for reproducibility
SEED = 123

def set_random_seeds(seed_value):
    torch.manual_seed(seed_value)
    torch.cuda.manual_seed(seed_value)
    torch.cuda.manual_seed_all(seed_value)
    np.random.seed(seed_value)
    random.seed(seed_value)

set_random_seeds(SEED)

## **Data and Model Preparation**

Let us define the model discussed in the main paper, as well as the DataLoader generator that we will utilize throughout the experiments:

In [None]:
# CNN classifier class (based on the pre-trained ResNet50 model)
class Model(nn.Module):
  def __init__(self, conv_output_dim, embedding_dim, num_labels, resnet50):
    super(Model, self).__init__()

    # copy all layers of the pre-trained ResNet50 model except the fully connected layer
    self.resnet50 = nn.Sequential(*list(resnet50.children())[:-1])

    # convert the convolution output into an embedding vector
    self.embedding_layer = nn.Linear(conv_output_dim, embedding_dim)
    # activation function
    self.relu = nn.ReLU()
    # convert the embedding vectors into a vector of 'num_labels' length
    self.classification_layer = nn.Linear(embedding_dim, num_labels)
    self.softmax = nn.Softmax(dim=1)

  def forward(self, x):
    # get convolution output
    conv_output = torch.squeeze(self.resnet50(x), (2, 3))
    # get embeddings
    embeddings = self.embedding_layer(conv_output)
    # apply activation function
    embeddings_activated = self.relu(embeddings)
    # get final layer
    final_layer = self.classification_layer(embeddings_activated)
    # apply softmax to extract prediction probabilities
    pred_probs = self.softmax(final_layer)

    return embeddings, pred_probs

In [None]:
def get_dataloader(indices, dataset, batch_size=32, shuffle=True):
  """
  Gets indices of specific dataset images.
  Returns a dataloader object of the subset dataset (according to the indices).
  """
  sub_dataset = Subset(dataset, indices)
  dataloader = torch.utils.data.DataLoader(sub_dataset, batch_size=batch_size, shuffle=shuffle)

  return dataloader

## **Cluster Uncertainty Sampling**

Now, we will define helper functions that will be used when implementing our suggested sampling method:

In [None]:
def get_embeddings_and_pred_probs(indices, dataset, trained_model):
  """
  Gets indices of specific dataset images, and a trained model.
  Returns the embeddings and prediction probabilities of those instances.
  """
  # create dataloader
  dataloader = get_dataloader(indices, dataset, shuffle=False)

  embeddings_lst = []
  pred_probs = []

  # get the embeddings and prediction probabilities of each batch
  # (inference only, so gradient tracking is disabled)
  with torch.no_grad():
    for batch_X, _ in dataloader:
      batch_X = batch_X.to(DEVICE)
      batch_embeddings, batch_pred_probs = trained_model(batch_X)

      embeddings_lst.append(batch_embeddings)
      pred_probs.append(batch_pred_probs)

  # concatenate all batch results into single tensors
  embeddings_lst = torch.cat(embeddings_lst)
  pred_probs = torch.cat(pred_probs)

  return embeddings_lst.cpu().detach().numpy(), pred_probs.cpu().detach().numpy()


def get_sample_clusters(embeddings, num_clusters):
  """
  Gets image embeddings and clusters them using the KMeans algorithm.
  Returns the cluster labels of the images.
  """
  kmeans = KMeans(n_clusters=num_clusters, n_init='auto', random_state=SEED).fit(embeddings)
  return kmeans.labels_


def get_cluster_sizes(sample_clusters, cluster_num):
  """
  Gets the cluster labels of the sampled images.
  Returns a list of the cluster sizes.
  """
  cluster_sizes = [0] * cluster_num
  for cluster in sample_clusters:
    cluster_sizes[cluster] += 1

  return cluster_sizes


def get_cluster_uncertainty_scores(pred_probs, sample_clusters, cluster_num):
  """
  Gets the samples' prediction probabilities and cluster labels.
  Returns a list of the cluster uncertainty scores.
  """
  # calculate sample uncertainties using entropy
  sample_uncertainties = list(entropy(pred_probs, axis=1))

  # group the sample uncertainties by cluster
  cluster_members_uncertainties = [[] for _ in range(cluster_num)]
  for cluster, uncertainty in zip(sample_clusters, sample_uncertainties):
    cluster_members_uncertainties[cluster].append(uncertainty)

  # calculate each cluster uncertainty score as the mean uncertainty of its members
  cluster_uncertainty_scores = [0] * cluster_num
  for cluster, member_uncertainties in enumerate(cluster_members_uncertainties):
    cluster_uncertainty_scores[cluster] = np.mean(member_uncertainties)

  return cluster_uncertainty_scores


def get_cluster_overall_scores(pred_probs, sample_clusters, cluster_num, alpha):
  """
  Gets the samples' prediction probabilities and cluster labels.
  Returns a list of the cluster overall scores:
  weighting (using alpha) of the cluster uncertainty score and its size.
  """
  # get cluster sizes
  cluster_sizes = get_cluster_sizes(sample_clusters, cluster_num)
  cluster_sizes = np.array(cluster_sizes).reshape(-1, 1)
  # get cluster uncertainty scores
  cluster_uncertainty_scores = get_cluster_uncertainty_scores(pred_probs, sample_clusters, cluster_num)
  cluster_uncertainty_scores = np.array(cluster_uncertainty_scores).reshape(-1, 1)

  # scale both metrics to have the same scale (0-1)
  scaler = MinMaxScaler()
  scaled_cluster_uncertainty_scores = scaler.fit_transform(cluster_uncertainty_scores).reshape(1, -1)[0]
  scaled_cluster_sizes = scaler.fit_transform(cluster_sizes).reshape(1, -1)[0]

  # calculate each cluster score as a weighted sum of the scaled metrics
  cluster_overall_scores = [0] * cluster_num
  for cluster, (size, uncertainty_score) in enumerate(zip(scaled_cluster_sizes, scaled_cluster_uncertainty_scores)):
    cluster_overall_scores[cluster] = alpha * size + (1 - alpha) * uncertainty_score

  return cluster_overall_scores


def assign_remaining_budget(cluster_budgets, cluster_sizes, cluster_overall_scores, remaining_budget):
  """
  Assigns unallocated budget to clusters based on their overall scores (from highest to lowest)
  """
  # sort the cluster labels based on their scores in descending order
  highest_score_clusters = np.argsort(-np.array(cluster_overall_scores))

  for cluster in highest_score_clusters:
    cluster_budget, cluster_size = cluster_budgets[cluster], cluster_sizes[cluster]

    # check if the cluster budget could be increased
    if cluster_budget < cluster_size:

      # calculate the remaining sample amount of the cluster
      cluster_remaining_sample_amount = cluster_size - cluster_budget

      # get the increased budget delta as the minimum of the following:
      # - the remaining sample amount of the cluster
      # - the remaining_budget
      cluster_increased_budget_delta = min(cluster_remaining_sample_amount, remaining_budget)

      # increase the cluster budget
      cluster_budgets[cluster] += cluster_increased_budget_delta

      # update remaining unallocated budget for this iteration
      remaining_budget -= cluster_increased_budget_delta

      # stop the procedure when all the unallocated budget has been assigned
      if remaining_budget == 0:
        break


def get_cluster_budgets(pred_probs, sample_clusters, alpha, n_select):
  """
  Gets the samples' prediction probabilities and cluster labels.
  Returns the sampling budget of each cluster (proportional to 'n_select').
  """
  # calculate the number of clusters
  cluster_num = len(set(sample_clusters))

  # get cluster sizes
  cluster_sizes = get_cluster_sizes(sample_clusters, cluster_num)

  # get cluster overall scores
  cluster_overall_scores = get_cluster_overall_scores(pred_probs, sample_clusters, cluster_num, alpha)
  # convert cluster scores into sampling proportions using softmax
  cluster_relative_scores = scipy.special.softmax(cluster_overall_scores)

  # calculate each cluster sampling budget as the minimum of the following:
  # - its share of the current iteration budget ('n_select'), given by its relative score
  # - its total number of members
  cluster_budgets = []
  for cluster_relative_score, cluster_size in zip(cluster_relative_scores, cluster_sizes):
    cluster_budget = min(math.floor(cluster_relative_score * n_select), cluster_size)
    cluster_budgets.append(cluster_budget)

  # calculate the remaining budget
  remaining_budget = n_select - sum(cluster_budgets)

  # assign remaining budget
  if remaining_budget > 0:
    assign_remaining_budget(cluster_budgets, cluster_sizes, cluster_overall_scores, remaining_budget)

  return cluster_budgets


def get_cluster_members(pred_probs, sample_clusters, dataset_indices):
  """
  Gets the samples' prediction probabilities, cluster labels, and dataset indices.
  Returns a dictionary of the form:
  key - cluster label
  value - list of cluster member tuples of the form (member uncertainty, member dataset index)
  """
  # calculate sample uncertainties using entropy
  uncertainties = list(entropy(pred_probs, axis=1))

  # get each cluster member tuples
  cluster_members_dict = defaultdict(list)
  for cluster, uncertainty, dataset_idx in zip(sample_clusters, uncertainties, dataset_indices):
    cluster_members_dict[cluster].append((uncertainty, dataset_idx))

  return cluster_members_dict
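
To make the scoring and budget steps above concrete, here is a minimal pure-Python sketch on made-up numbers. It mirrors the helpers (min-max scaling, the alpha-weighted sum, softmax, floored proportional budgets, then handing any leftover budget to the highest-scoring clusters) without NumPy or scikit-learn. The function names and the cluster sizes and mean entropies are illustrative assumptions, not values from the experiments.

```python
import math

def min_max(values):
    """Scale values to [0, 1]; a constant list maps to zeros to avoid dividing by zero."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def toy_cluster_budgets(sizes, mean_entropies, alpha, n_select):
    """Hypothetical stand-in for get_cluster_overall_scores + get_cluster_budgets."""
    scaled_sizes = min_max(sizes)
    scaled_uncertainties = min_max(mean_entropies)
    # overall score: alpha weights the size, (1 - alpha) weights the uncertainty
    scores = [alpha * s + (1 - alpha) * u
              for s, u in zip(scaled_sizes, scaled_uncertainties)]
    shares = softmax(scores)
    # floored proportional budget, capped by the cluster size
    budgets = [min(math.floor(p * n_select), size) for p, size in zip(shares, sizes)]
    # hand the leftover budget to clusters in descending score order
    remaining = n_select - sum(budgets)
    for c in sorted(range(len(scores)), key=lambda c: -scores[c]):
        if remaining == 0:
            break
        extra = min(sizes[c] - budgets[c], remaining)
        budgets[c] += extra
        remaining -= extra
    return scores, budgets

sizes = [50, 30, 20]              # made-up cluster sizes
mean_entropies = [0.2, 0.6, 0.4]  # made-up mean entropies per cluster
scores, budgets = toy_cluster_budgets(sizes, mean_entropies, alpha=0.4, n_select=10)
print(budgets)  # [3, 5, 2]
```

Because the overall scores lie in [0, 1], the softmax shares stay fairly close to uniform (here roughly 0.30, 0.42, and 0.27), so the floored budgets sum to 9 and the one leftover label goes to the highest-scoring cluster.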

Now, let us implement the Cluster Uncertainty sampling method below:

In [None]:
def cluster_uncertainty_sampling(available_pool_indices, dataset, n_select, trained_model, num_clusters, alpha):
  """
  Sample indices from the available pool as follows:
  1. cluster the images based on their embeddings
  2. calculate each cluster's score as a weighting of its uncertainty score and size
  3. calculate each cluster's sampling budget as its relative score times 'n_select'
  4. sample from each cluster the instances with the highest uncertainty based on its budget
  """
  # get samples embeddings and prediction probabilities
  embeddings, pred_probs = get_embeddings_and_pred_probs(available_pool_indices, dataset, trained_model)

  # cluster samples based on their embeddings
  sample_clusters = get_sample_clusters(embeddings, num_clusters)

  # get each cluster's members
  cluster_members = get_cluster_members(pred_probs, sample_clusters, available_pool_indices)

  # get each cluster's budget
  cluster_budgets = get_cluster_budgets(pred_probs, sample_clusters, alpha, n_select)

  # sample from each cluster based on its budget
  selected_indices = []
  for cluster, cluster_budget in enumerate(cluster_budgets):
    cluster_members_uncertainties = cluster_members[cluster]

    # sample the instances with the highest uncertainties
    selected_cluster_members = sorted(cluster_members_uncertainties, key=lambda x: x[UNCERTAINTY], reverse=True)[:cluster_budget]
    selected_cluster_indices = [tup[DATASET_IDX] for tup in selected_cluster_members]

    selected_indices += selected_cluster_indices

  return selected_indices
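
As a small illustration of the final selection step, the sketch below groups made-up '(uncertainty, dataset index)' pairs by cluster and takes the most uncertain members of each cluster up to its budget. The helper name 'select_top_uncertain' and all numbers are assumptions for illustration only.

```python
from collections import defaultdict

def select_top_uncertain(members, cluster_labels, budgets):
    """
    members: list of (uncertainty, dataset_idx) tuples
    cluster_labels: cluster label of each member (same order as members)
    budgets: number of samples to draw from each cluster
    """
    by_cluster = defaultdict(list)
    for label, member in zip(cluster_labels, members):
        by_cluster[label].append(member)

    selected = []
    for cluster, budget in enumerate(budgets):
        # most uncertain members first
        ranked = sorted(by_cluster[cluster], key=lambda m: m[0], reverse=True)
        selected += [idx for _, idx in ranked[:budget]]
    return selected

members = [(0.10, 100), (0.65, 101), (0.30, 102),
           (0.69, 103), (0.05, 104), (0.50, 105)]
labels = [0, 0, 0, 1, 1, 1]
picked = select_top_uncertain(members, labels, budgets=[2, 1])
print(picked)  # [101, 102, 103]
```

For comparison, picking the three most uncertain samples globally would return indices 103, 101, and 105, that is, two samples from cluster 1. The per-cluster budgets are what spread the selection across clusters.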

## **Baseline Sampling Methods**


Let us implement the baseline sampling methods discussed in the main paper:

- Random Sampling

- Entropy Uncertainty Sampling

- MinMax Uncertainty Sampling

In [None]:
def random_sampling(available_pool_indices, n_select):
  """
  Randomly sample indices from the available pool.
  """
  # randomly select n_select indices without replacement
  selected_indices = list(np.random.choice(available_pool_indices, n_select, replace=False))

  return selected_indices


def entropy_uncertainty_sampling(available_pool_indices, dataset, n_select, trained_model):
  """
  Sample indices from the available pool with the highest uncertainty.
  The uncertainty under this method is measured as follows:
  1. Calculate each example's entropy of the model's prediction probabilities
  2. Choose the "n_select" examples with the highest entropy
  """

  # get prediction probabilities for unlabeled samples
  _, pred_probs = get_embeddings_and_pred_probs(available_pool_indices, dataset, trained_model)

  # calculate sample uncertainties using entropy
  uncertainties = list(entropy(pred_probs, axis=1))
  uncert_ind_tups = zip(uncertainties, available_pool_indices)

  # select samples with highest uncertainty (entropy)
  selected_uncert_ind_tups = sorted(uncert_ind_tups, key=lambda x: x[UNCERTAINTY], reverse=True)[:n_select]
  selected_indices = [tup[DATASET_IDX] for tup in selected_uncert_ind_tups]

  return selected_indices


def minmax_uncertainty_sampling(available_pool_indices, dataset, n_select, trained_model):
  """
  Sample indices from the available pool with the highest uncertainty.
  The uncertainty under this method is measured as follows:
  1. For each example, find the maximum probability in the model's prediction distribution
  2. Choose the "n_select" examples with the lowest (minimum) maximum probability
  """
  # get prediction probabilities for unlabeled samples
  _, pred_probs = get_embeddings_and_pred_probs(available_pool_indices, dataset, trained_model)

  # calculate sample max prediction probabilities
  max_pred_probs = list(pred_probs.max(axis=1))
  max_pred_prob_ind_tups = zip(max_pred_probs, available_pool_indices)

  # select samples with lowest max prediction probabilities
  # (the probability is the first element of each tuple, the dataset index the second)
  selected_max_pred_prob_ind_tups = sorted(max_pred_prob_ind_tups, key=lambda x: x[0])[:n_select]
  selected_indices = [tup[DATASET_IDX] for tup in selected_max_pred_prob_ind_tups]

  return selected_indices
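
The two uncertainty baselines can rank samples differently once there are more than two classes. The sketch below shows this on made-up three-class probabilities in pure Python; the helper names are hypothetical and the numbers are not taken from the experiments. With only two classes, the two rankings coincide up to ties, since binary entropy decreases monotonically as the larger probability moves from 0.5 towards 1.

```python
import math

def shannon_entropy(probs):
    """Entropy (natural log) of a discrete distribution; zero entries contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def top_by_entropy(pred_probs, n_select):
    """Positions of the n_select rows with the highest entropy."""
    order = sorted(range(len(pred_probs)),
                   key=lambda i: shannon_entropy(pred_probs[i]), reverse=True)
    return order[:n_select]

def top_by_min_max(pred_probs, n_select):
    """Positions of the n_select rows with the lowest maximum probability."""
    order = sorted(range(len(pred_probs)), key=lambda i: max(pred_probs[i]))
    return order[:n_select]

# made-up three-class predictions for four samples
pred_probs = [
    [0.98, 0.01, 0.01],
    [0.50, 0.49, 0.01],
    [0.40, 0.30, 0.30],
    [0.70, 0.20, 0.10],
]

print(top_by_entropy(pred_probs, 2))  # [2, 3]
print(top_by_min_max(pred_probs, 2))  # [2, 1]
```

Sample 1 has a low top probability (0.50) but concentrates almost all mass on two classes, so its entropy is lower than that of sample 3, whose mass is spread over three classes.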

## **Active Learning Pipeline**

Let us define functions that will be used during the AL pipeline run:

In [None]:
def sample_indices(sampling_method, available_pool_indices, dataset, n_select, trained_model, num_clusters, alpha):
  """
  Gets a sampling method and its respective parameters.
  Returns the selected indices accordingly.
  """
  if sampling_method == "random":
    selected_indices = random_sampling(available_pool_indices, n_select)

  elif sampling_method == "entropy uncertainty":
    selected_indices = entropy_uncertainty_sampling(available_pool_indices, dataset, n_select, trained_model)

  elif sampling_method == "minmax uncertainty":
    selected_indices = minmax_uncertainty_sampling(available_pool_indices, dataset, n_select, trained_model)

  elif sampling_method == "cluster uncertainty":
    selected_indices = cluster_uncertainty_sampling(available_pool_indices, dataset, n_select, trained_model, num_clusters, alpha)

  else:
    raise ValueError(f"Unknown sampling method: {sampling_method}")

  return selected_indices

def init_model(input_dim, embedding_dim, output_dim):
  """
  Gets dimensions for the different layers of the model.
  Returns the initialized model.
  """
  # initialize the pre-trained ResNet50 model
  resnet50 = models.resnet50(weights=ResNet50_Weights.DEFAULT).to(DEVICE)

  # initialize the model
  model = Model(input_dim, embedding_dim, output_dim, resnet50).to(DEVICE)

  # freeze convolutional layers parameters
  for param in model.resnet50.parameters():
    param.requires_grad = False

  return model


def train_model(model, train_indices, dataset, epochs):
  """
  Trains the model on the train instances of the dataset.
  """
  # initialize optimizer (the frozen ResNet50 parameters receive no gradients)
  optimizer = optim.Adam(model.parameters(), lr=0.001)
  # the model outputs softmax probabilities, so use the negative log-likelihood
  # of their logarithm (nn.CrossEntropyLoss expects raw logits instead)
  criterion = nn.NLLLoss()
  # create dataloader
  dataloader = get_dataloader(train_indices, dataset)

  # train the model on the train set
  for epoch in range(epochs):
    for train_X, train_y in dataloader:
      train_X, train_y = train_X.to(DEVICE), train_y.to(DEVICE)

      # reset the gradients accumulated in the previous step
      optimizer.zero_grad()
      _, pred_probs = model(train_X)

      loss = criterion(torch.log(pred_probs + 1e-12), train_y)
      loss.backward()
      optimizer.step()

  return model


def evaluate_model_accuracy(trained_model, test_indices, dataset):
  """
  Evaluates the trained model on the test instances of the dataset (using accuracy).
  """
  # set model to evaluation mode
  trained_model.eval()
  # create dataloader
  dataloader = get_dataloader(test_indices, dataset, shuffle=False)

  # calculate the trained model accuracy score on the test set
  num_correct = 0
  with torch.no_grad():
    for test_X, test_y in dataloader:
      test_X, test_y = test_X.to(DEVICE), test_y.to(DEVICE)
      _, pred_probs = trained_model(test_X)
      pred_labels = torch.argmax(pred_probs, dim=1)

      num_correct += torch.sum(pred_labels == test_y).item()

  accuracy = num_correct / len(test_indices)
  return accuracy


def performance_analysis(trained_model, test_indices, dataset, iter, sampling_method):
  """
  Analyzes the trained model's performance on the test set in every iteration,
  based on several evaluation metrics:

  - Confusion matrix items
  - TPR and TNR rates
  - F1 score

  Writes the results to a txt file with the name "performance_analysis - <sampling_method>.txt".
  """
  # set model to evaluation mode
  trained_model.eval()
  # create dataloader
  dataloader = get_dataloader(test_indices, dataset, shuffle=False)

  # count confusion matrix items (TP, FP, TN, FN)
  TP, FP, TN, FN = 0, 0, 0, 0
  for batch_num, (test_X, test_y) in enumerate(dataloader):
    test_X, test_y = test_X.to(DEVICE), test_y.to(DEVICE)
    _, pred_probs = trained_model(test_X)

    pred_probs = pred_probs.to(DEVICE)
    pred_labels = torch.argmax(pred_probs, dim=1)

    TP += torch.sum((pred_labels == 1) & (test_y == 1))
    FP += torch.sum((pred_labels == 1) & (test_y == 0))
    TN += torch.sum((pred_labels == 0) & (test_y == 0))
    FN += torch.sum((pred_labels == 0) & (test_y == 1))

  # calculate metrics and write to txt file
  recall, precision = TP / (TP + FN), TP / (TP + FP)
  f1_score = 2 * (precision * recall) / (precision + recall)

  with open(f"performance_analysis - {sampling_method}.txt", "a") as f:
    f.write(f"Evaluation Metrics --- Iteration {iter}:\n")
    # write confusion matrix items
    f.write(f"TP: {TP} \nFP: {FP} \nTN: {TN} \nFN: {FN} \n")
    # write TPR and TNR rates
    f.write(f"TPR: {TP / (TP + FN)} \nTNR: {TN / (TN + FP)} \n")
    # write the F1 score
    f.write(f"F1 Score: {f1_score:.4f}\n\n")
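
For reference, the metrics written above follow directly from the four confusion-matrix counts. The sketch below computes them in plain Python on made-up counts and guards against empty denominators. The function name 'binary_metrics' and the NaN fallback are assumptions of this sketch, not part of the original pipeline.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return TPR, TNR, precision and F1 from confusion-matrix counts."""
    def safe_div(num, den):
        # return NaN instead of raising when a class is absent
        return num / den if den else float("nan")

    tpr = safe_div(tp, tp + fn)        # recall / sensitivity
    tnr = safe_div(tn, tn + fp)        # specificity
    precision = safe_div(tp, tp + fp)
    # equivalent to 2 * precision * recall / (precision + recall)
    f1 = safe_div(2 * tp, 2 * tp + fp + fn)
    return {"TPR": tpr, "TNR": tnr, "precision": precision, "F1": f1}

metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)  # made-up counts
print({name: round(value, 4) for name, value in metrics.items()})
# {'TPR': 0.8889, 'TNR': 0.8182, 'precision': 0.8, 'F1': 0.8421}
```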

Lastly, let us implement the active learning pipeline:

In [None]:
def AL_pipeline(train_indices, available_pool_indices, test_indices,
                budget_per_iter, total_budget, num_iters, epochs,
                dataset, sampling_method, num_clusters=None, alpha=None,
                conv_output_dim=2048, embedding_dim=1024, num_labels=2,
                analyze_performance=False):
  """
  Gets the AL pipeline parameters, and runs it accordingly.
  Returns the accuracy scores of the model at each iteration.
  """
  accuracy_scores = []
  budget_left = total_budget
  for iter in range(num_iters):
    print()
    print(f"Iteration: {iter}")

    # initialize the model
    model = init_model(conv_output_dim, embedding_dim, num_labels)

    # train the model (on the train set)
    trained_model = train_model(model, train_indices, dataset, epochs)

    # calculate test set accuracy
    accuracy = evaluate_model_accuracy(trained_model, test_indices, dataset)
    accuracy_scores.append(round(accuracy, 4))

    # analyze performance if needed
    if analyze_performance:
      performance_analysis(trained_model, test_indices, dataset, iter, sampling_method)

    # find the number of instances to add to the train set
    # (bounded by the per-iteration budget, the remaining total budget, and the pool size)
    n_select = min(budget_per_iter, budget_left, len(available_pool_indices))

    # stop if the budget or the available pool has been exhausted
    if n_select == 0:
      break
    budget_left -= n_select

    # select instances to label based on the chosen sampling method
    selected_indices = sample_indices(sampling_method, available_pool_indices, dataset, n_select,
                                      trained_model, num_clusters, alpha)

    # add selected indices to the train set
    train_indices += selected_indices

    # remove the newly added train indices from the available pool
    available_pool_indices = list(set(available_pool_indices) - set(selected_indices))

  return accuracy_scores
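
The bookkeeping of the loop above can be traced without any model. The following sketch replaces training and evaluation with a size record and uses a random sampler. The function 'toy_al_loop' and the index ranges are hypothetical and only illustrate how the train set grows while the pool shrinks.

```python
import random

def toy_al_loop(train, pool, budget_per_iter, num_iters, sampler):
    """Track (train size, pool size) at the point where the model would be trained."""
    train, pool = list(train), list(pool)
    history = []
    for _ in range(num_iters):
        # training and evaluation would happen here
        history.append((len(train), len(pool)))

        n_select = min(budget_per_iter, len(pool))
        if n_select == 0:
            break

        chosen = sampler(pool, n_select)
        train += chosen
        chosen_set = set(chosen)
        pool = [i for i in pool if i not in chosen_set]
    return history

rng = random.Random(123)
history = toy_al_loop(train=range(20), pool=range(40, 100),
                      budget_per_iter=6, num_iters=10,
                      sampler=lambda pool, n: rng.sample(pool, n))
print(history[0], history[-1])  # (20, 60) (74, 6)
```

Since each iteration evaluates before it samples, the last recorded model has seen the initial train set plus nine selected batches; the batch chosen in the final iteration is added to the train set but never trained on.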

## **Experiments**

In this section, we will run all experiments discussed in the main paper, generate the results, and write them to TXT files.


First, please upload the ZIP file, which can be found at the following [link](https://drive.google.com/file/d/1fJumk1nYybXVXSbzHVo8oATe4X57NVbs/view?usp=drive_link), **to the same directory as this notebook**, and run the following cell. If you place it elsewhere, please set the ```zip_dir``` variable below to the correct path of the ZIP file.


In [None]:
zip_dir = 'data.zip'

Then, unzip the file:

In [None]:
with zipfile.ZipFile(zip_dir, 'r') as zip_ref:
  zip_ref.extractall()

Now, if the 'data' directory created after unzipping 'data.zip' is not in the same directory as this notebook, please move it there. Alternatively, set the ```data_dir``` variable below to the correct path of the 'data' directory.

In [None]:
data_dir = 'data'

Now that the data is set, let us process it:

In [None]:
# transform the dataset and load into a Dataset object
transformer = transforms.Compose(
              [transforms.Resize((IMG_SIZE, IMG_SIZE)),
                transforms.ToTensor()])

dataset = datasets.ImageFolder(data_dir, transformer)

# randomize the instances
randomized_indices = torch.randperm(len(dataset)).tolist()
dataset = Subset(dataset, randomized_indices)

Now, we will define the following helper function before conducting the experiments:

In [None]:
def split_data_indices(data_size):
  """
  Splits the data into train (20%), test (20%), and available pool (60%) indices,
  as discussed in the main paper.
  """
  train_indices = list(range(0, int(data_size*0.2)))
  test_indices = list(range(int(data_size*0.2), int(data_size*0.4)))
  available_pool_indices = list(range(int(data_size*0.4), int(data_size)))

  return train_indices, test_indices, available_pool_indices
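
As a quick sanity check of the split proportions and the per-iteration budget used in the experiments below, here is a small sketch for a hypothetical dataset of 1000 images. The name 'toy_split' and the dataset size are assumptions. Contiguous ranges are sufficient here because the dataset was already shuffled with 'torch.randperm' above.

```python
import math

def toy_split(data_size):
    """20% train, 20% test, 60% available pool, as contiguous index ranges."""
    train_end, test_end = int(data_size * 0.2), int(data_size * 0.4)
    train = list(range(0, train_end))
    test = list(range(train_end, test_end))
    pool = list(range(test_end, data_size))
    return train, test, pool

train, test, pool = toy_split(1000)
num_iters = 10
budget_per_iter = math.floor(len(pool) / num_iters)
print(len(train), len(test), len(pool), budget_per_iter)  # 200 200 600 60
```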

### **Hyperparameter Tuning**

Let us perform grid search in order to optimize for the best (k, $\alpha$) hyperparameters, with the embedding size $D$ set constant at 1024, as discussed in the paper:

In [None]:
def grid_search(dataset, k_values, alpha_values):
  """
  Performs grid search in order to optimize for the best (k, alpha) hyperparameters.
  Writes the accuracy results to a TXT file named "combination_accuracies.txt".
  """
  for k in k_values:
    for alpha in alpha_values:
      print(f"----- k={k}, alpha={alpha} -----")

      # reset random seed
      set_random_seeds(SEED)

      # split to indices
      train_indices, test_indices, available_pool_indices = split_data_indices(len(dataset))

      # run AL pipeline
      num_iters = 10
      total_budget = len(available_pool_indices)  # total number of available unlabeled instances
      budget_per_iter = math.floor(len(available_pool_indices) / num_iters)  # 10% of the total number

      accuracies = AL_pipeline(train_indices, available_pool_indices, test_indices, budget_per_iter=budget_per_iter,
             total_budget=total_budget, num_iters=num_iters, epochs=3, dataset=dataset, sampling_method="cluster uncertainty",
             num_clusters=k, alpha=alpha, embedding_dim=1024)

      # write results to TXT file
      with open("combination_accuracies.txt", "a") as f:
        f.write(f"({k}, {alpha}): {accuracies}\n")

In [None]:
k_values = [2, 4, 6, 8, 10]
alpha_values = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

grid_search(dataset, k_values, alpha_values)

**Note:** The accuracy results are written to a TXT file named ```combination_accuracies.txt```, and are displayed in the provided ```display_results.ipynb``` notebook.
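
Each line of that file has the form '(k, alpha): [accuracies]', so it can be read back with the standard library. The sketch below parses such lines and picks the combination with the highest mean accuracy. The helper names, the sample lines, and the selection criterion are assumptions for illustration and may differ from what the display notebook and the paper actually use.

```python
import ast

def parse_combination_line(line):
    """Parse '(k, alpha): [acc_0, acc_1, ...]' into ((k, alpha), [acc, ...])."""
    key_part, values_part = line.strip().split(": ", 1)
    k, alpha = ast.literal_eval(key_part)
    accuracies = ast.literal_eval(values_part)
    return (k, alpha), accuracies

def best_combination(lines):
    """Return the (k, alpha) pair with the highest mean accuracy (assumed criterion)."""
    parsed = [parse_combination_line(line) for line in lines if line.strip()]
    return max(parsed, key=lambda item: sum(item[1]) / len(item[1]))[0]

# made-up lines in the same format the grid search writes
sample_lines = [
    "(2, 0.0): [0.8, 0.82, 0.85]\n",
    "(6, 0.4): [0.81, 0.86, 0.9]\n",
]
print(best_combination(sample_lines))  # (6, 0.4)
```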

After obtaining the results, the best hyperparameter values are k = 6 and $\alpha$ = 0.4 (can be seen in the ```display_results.ipynb``` file or in the main paper). Therefore, we will proceed with those tuned values, and optimize for the best performing embedding dimension:

In [None]:
best_k, best_alpha = 6, 0.4

In [None]:
def optimize_embedding_dim(dataset, embedding_dim_lst):
  """
  Optimizes for the best embedding dimension.
  Writes the accuracy results to a TXT file named "embedding_dim_accuracies.txt".
  """
  for embedding_dim in embedding_dim_lst:
    print(f"--- running embedding dim {embedding_dim} ---")
    # reset random seed
    set_random_seeds(SEED)

    # split to indices
    train_indices, test_indices, available_pool_indices = split_data_indices(len(dataset))

    # run AL pipeline
    num_iters = 10
    total_budget = len(available_pool_indices)
    budget_per_iter = math.floor(len(available_pool_indices) / num_iters)

    accuracies = AL_pipeline(train_indices, available_pool_indices, test_indices, budget_per_iter=budget_per_iter,
            total_budget=total_budget, num_iters=num_iters, epochs=3, dataset=dataset, sampling_method="cluster uncertainty",
            num_clusters=best_k, alpha=best_alpha, embedding_dim=embedding_dim)

    # write results to TXT file
    with open("embedding_dim_accuracies.txt", "a") as f:
      f.write(f"embedding dim {embedding_dim}: {accuracies}\n")

In [None]:
embedding_dim_lst = [256, 512, 768]  # embedding dim 1024 was already examined previously
optimize_embedding_dim(dataset, embedding_dim_lst)

**Note:** The accuracy results are written to a TXT file named ```embedding_dim_accuracies.txt```, and are displayed in the provided ```display_results.ipynb``` notebook.

After obtaining the results, the best embedding dimension is $D=1024$ (as can be seen in the ```display_results.ipynb``` file or in the main paper).

In [None]:
best_embedding_dim = 1024

**Hyperparameter Tuning Summary**

Thus, to summarize, the optimal hyperparameters for our sampling method are:

- k = 6

- $\alpha$ = 0.4

- $D$ = 1024

### **Performance Analysis**

In this section, we analyze our method's performance and compare it to baseline sampling methods:

- Random Sampling

- Entropy Uncertainty Sampling

- MinMax Uncertainty Sampling

In [None]:
sampling_methods = ["cluster uncertainty", "random", "entropy uncertainty", "minmax uncertainty"]

When calling the AL pipeline function, we will set ```analyze_performance=True```, in order to calculate the following evaluation metrics and write them to a TXT file:

- Confusion matrix items (TP, FP, TN, FN)

- TPR and TNR rates

- F1 score

- Accuracy

In [None]:
def compare_sampling_methods(dataset, sampling_methods):
  """
  Compares the performance of the AL pipeline with all baseline sampling methods.
  The baselines examined, as well as the evaluation metrics, are detailed in the main paper.
  Writes the evaluation metric results to a TXT file named "performance_analysis - <sampling_method>.txt",
  for each examined sampling method in 'sampling_methods'.
  """
  for sampling_method in sampling_methods:
    print(f"--- running {sampling_method} sampling method ---")

    # reset random seed
    set_random_seeds(SEED)

    # split to indices
    train_indices, test_indices, available_pool_indices = split_data_indices(len(dataset))

    # run AL pipeline
    num_iters = 10
    total_budget = len(available_pool_indices)
    budget_per_iter = math.floor(len(available_pool_indices) / num_iters)

    if sampling_method == "cluster uncertainty":
      num_clusters, alpha = best_k, best_alpha
    else:
      num_clusters, alpha = None, None

    accuracies = AL_pipeline(train_indices, available_pool_indices, test_indices, budget_per_iter=budget_per_iter,
                total_budget=total_budget, num_iters=num_iters, epochs=3, dataset=dataset, sampling_method=sampling_method,
                num_clusters=num_clusters, alpha=alpha, embedding_dim=best_embedding_dim, analyze_performance=True)

    # The performance analysis does not cover accuracies across the iterations.
    # Thus, let us write it to the appropriate performance analysis TXT file manually:
    with open(f"performance_analysis - {sampling_method}.txt", "a") as f:
      f.write(f"Accuracies across all iterations: {accuracies}\n")

In [None]:
compare_sampling_methods(dataset, sampling_methods)

**Note:** The evaluation metric results are written to a TXT file named ```performance_analysis - <sampling_method>.txt``` where ```<sampling_method>``` is the appropriate method that was evaluated. Moreover, the results are displayed in the provided ```display_results.ipynb``` notebook.

## **Generated Results Summary**

To summarize, the following files were generated, containing all experiment results:

- ```combination_accuracies.txt```

- ```embedding_dim_accuracies.txt```

- ```performance_analysis - cluster uncertainty.txt```

- ```performance_analysis - random.txt```

- ```performance_analysis - entropy uncertainty.txt```

- ```performance_analysis - minmax uncertainty.txt```


As mentioned above, in order to display these results, please refer to the provided ```display_results.ipynb``` notebook.