<h1>ALBench project: al_bench</h1>

<p>This Jupyter lab demonstrates use of the al_bench Active Learning Benchmark Tool.</p>

<h2>Install needed Python packages</h2>

<p>If you haven't yet installed these packages, remove the "<code>#</code>" character and run this code block.</p>

In [1]:
# !pip install -e ../../ALBench  # Installs al_bench and dependencies

<h2>Overview</h2>

<p>The tool takes an input dataset, a machine learning model, and an active learning strategy, and outputs information for evaluating how well the strategy performs with that model and dataset. Running the tool multiple times with different inputs allows comparisons across active learning strategies, across models, and across datasets. Researchers can test a proposed active learning strategy in the context of a specific model and dataset, or use multiple models and datasets to get a broader picture of each strategy's effectiveness. Alternatively, runs with different models and datasets can be compared to evaluate how compatible those models and datasets are with a given active learning strategy.</p>

<p>In the present example, we will compare several active learning strategies, each employed on the same model and dataset.  To do this we will fetch a dataset and provide it to a dataset handler, and we will build a model and provide it to a model handler.  These are then used with each of the active learning strategy handlers.</p>

<h2>Find a dataset and create a Dataset Handler</h2>

<p>We fetch a dataset of 4598 feature vectors of length 1280 and the associated label for each feature vector.  The benchmarking tool requires that all examples be labeled, although the labels are not used initially.  The label for a sample is revealed to the machine learning training only when the active learning strategy indicates that the clinician has been asked to label that sample.</p>

In [2]:
import al_bench as alb
import h5py as h5
import numpy as np
import random
from datetime import datetime

filename = "../test/TCGA-A2-A0D0-DX1_xmin68482_ymin39071_MPP-0.2500.h5py"
with h5.File(filename) as ds:
    my_feature_vectors = np.array(ds["features"])
    print(
        f"Read in {my_feature_vectors.shape[0]} feature vectors of length {my_feature_vectors.shape[1]}."
    )
    my_labels = np.array(ds["labels"])
    print(f"Read in {my_labels.shape[0]} labels for the feature vectors.")
my_label_definitions = [
    {
        0: {"description": "other"},
        1: {"description": "tumor"},
        2: {"description": "stroma"},
        3: {"description": "infiltrate"},
    }
]
my_dataset_handler = alb.dataset.GenericDatasetHandler()
my_dataset_handler.set_all_feature_vectors(my_feature_vectors)
my_dataset_handler.set_all_label_definitions(my_label_definitions)
my_dataset_handler.set_all_labels(my_labels)

# Set aside disjoint sets of examples for use in validation and as the initial training set
number_of_validation_indices = my_feature_vectors.shape[0] // 10
number_of_initial_training = 20
random_samples = random.sample(
    range(my_feature_vectors.shape[0]),
    number_of_validation_indices + number_of_initial_training,
)
my_dataset_handler.set_validation_indices(
    np.array(random_samples[:number_of_validation_indices])
)
currently_labeled_examples = np.array(random_samples[number_of_validation_indices:])

Read in 4598 feature vectors of length 1280.
Read in 4598 labels for the feature vectors.
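
<p>The hidden-label simulation described above can be sketched in a few lines of plain NumPy.  This is only an illustration of the idea, not al_bench's implementation: the names <code>hidden_labels</code>, <code>labeled</code>, and <code>revealed</code> are hypothetical, the labels are synthetic, and random selection stands in for a real strategy.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
num_samples = 200
# Every sample has a label, but the simulator keeps them hidden from training.
hidden_labels = rng.integers(0, 4, size=num_samples)

# Start with 20 labeled examples, as in the notebook.
labeled = set(rng.choice(num_samples, size=20, replace=False).tolist())
revealed = {i: int(hidden_labels[i]) for i in labeled}

for query in range(8):
    unlabeled = np.array(sorted(set(range(num_samples)) - labeled))
    # A real strategy would rank `unlabeled` by certainty; here we pick at random.
    chosen = rng.choice(unlabeled, size=10, replace=False).tolist()
    for i in chosen:
        revealed[i] = int(hidden_labels[i])  # the simulated clinician answers
    labeled.update(chosen)
    print(f"After query {query + 1}: {len(labeled)} labeled examples")
```

<p>With 20 initial examples, 8 queries, and 10 samples per query, the loop ends with 100 revealed labels, matching the learning parameters used later in this notebook.</p>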


<h2>Create a model and a Model Handler</h2>

<p>Build a model that we will train.  We will build both a TensorFlow model and a PyTorch model.  We could compare the two as part of the benchmark, but here we will simply choose one of them for use with the active learning strategies.  First we set some variables with common parameters.</p>

In [3]:
number_of_categories = len(my_label_definitions[0])
number_of_features = my_feature_vectors.shape[1]
hidden_units = 128
dropout = 0.3

<h3>Build a TensorFlow model and its Model Handler</h3>

In [4]:
import tensorflow as tf

my_tensorflow_model = tf.keras.models.Sequential(
    [
        tf.keras.Input(shape=(number_of_features,)),
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        tf.keras.layers.Dropout(dropout, noise_shape=None, seed=20220909),
        tf.keras.layers.Dense(number_of_categories, activation="softmax"),
    ],
    name=(
        f"{number_of_categories}_labels_from_{number_of_features}_features_with_"
        f"dropout_{dropout}"
    ),
)
my_tensorflow_model_handler = alb.model.NonBayesianTensorFlowModelHandler()
my_tensorflow_model_handler.set_model(my_tensorflow_model)
print("Tensorflow model handler built")

Tensorflow model handler built



<h3>Build a PyTorch model and its Model Handler</h3>

In [5]:
import torch


class MyTorchModel(torch.nn.Module):
    def __init__(self, number_of_features, number_of_categories):
        super().__init__()
        self.fc1 = torch.nn.Linear(number_of_features, hidden_units)
        self.relu1 = torch.nn.ReLU()
        self.dropout1 = torch.nn.Dropout(p=dropout)
        self.fc2 = torch.nn.Linear(hidden_units, number_of_categories)
        self.softmax1 = torch.nn.Softmax(dim=-1)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu1(x)
        x = self.dropout1(x)
        x = self.fc2(x)
        x = self.softmax1(x)
        return x


my_torch_model = MyTorchModel(number_of_features, number_of_categories)

my_pytorch_model_handler = alb.model.NonBayesianPyTorchModelHandler()
my_pytorch_model_handler.set_model(my_torch_model)
print("PyTorch model handler built")

PyTorch model handler built


<h3>Choose one of the models to proceed with</h3>

<p>The rest of the code is agnostic to whether one is using a TensorFlow or PyTorch model, or some of each.  One proceeds with whichever model handlers one wants to use.</p>

In [6]:
# my_model_handler = my_tensorflow_model_handler
my_model_handler = my_pytorch_model_handler

<h2>Make use of Strategy Handlers for active learning</h2>

<p>Let's run and compare four active learning strategies.  Each strategy looks at the unlabeled samples, ranks them, and then selects the samples whose predictions appear to be the least certain, by one of several evaluation methods.  Key to understanding these evaluation methods is understanding that a machine learning algorithm makes a prediction for a sample by computing a score for each possible label -- the scores are nonnegative and sum to 1.0 -- and then choosing the label that scores highest.</p>

<p>There are different ways to choose which unlabeled samples should be labeled next.  We will demonstrate four:
<ol>
    <li>"Random": Select the next samples randomly.</li>
    <li>"LeastConfidence": A sample's certainty is defined as the predicted label's score.</li>
    <li>"LeastMargin": A sample's certainty is defined by the difference between the predicted label's score and the score of the second-best label.</li>
    <li>"Entropy": A sample's scores are interpreted as a probability distribution and its entropy is computed.  Because high entropy means more uncertainty, a sample's certainty is defined to be the negative of the entropy.</li>
</ol></p>
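
<p>The three score-based certainty measures listed above can be computed directly from a matrix of per-label scores.  The following is a minimal NumPy sketch under the definitions given in the list; the function name <code>certainty_scores</code> and the example scores are made up for illustration, and al_bench's own implementation may differ.</p>

```python
import numpy as np


def certainty_scores(scores):
    """Return per-sample certainty under three measures; lower means less certain."""
    scores = np.asarray(scores, dtype=float)
    # Sort each row ascending so the last two columns hold the top two scores.
    ordered = np.sort(scores, axis=1)
    confidence = ordered[:, -1]
    margin = ordered[:, -1] - ordered[:, -2]
    # Entropy, treating 0 * log(0) as 0 by replacing zeros with 1 inside the log.
    safe = np.where(scores > 0, scores, 1.0)
    entropy = -np.sum(scores * np.log(safe), axis=1)
    return {"confidence": confidence, "margin": margin, "negative_entropy": -entropy}


# Hypothetical scores for four samples over four labels; each row sums to 1.
example_scores = np.array(
    [
        [0.40, 0.35, 0.25, 0.00],
        [0.42, 0.20, 0.19, 0.19],
        [0.90, 0.05, 0.03, 0.02],
        [0.48, 0.47, 0.05, 0.00],
    ]
)
certainty = certainty_scores(example_scores)
# The sample each measure would send for labeling first is its least certain one.
least_certain = {name: int(np.argmin(values)) for name, values in certainty.items()}
print(least_certain)
```

<p>In this example the three measures disagree: the lowest top score belongs to sample 0, the smallest margin to sample 3, and the highest entropy to sample 1.  This is why the strategies can select different samples from the same predictions.</p>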

In [7]:
import shutil
import os

all_logs_dir = "runs-SimpleExample"
# DELETE OLD LOG FILES; ignore_errors=True makes this a no-op if none exist yet
shutil.rmtree(all_logs_dir, ignore_errors=True)

for name, my_strategy_handler in (
    ("Random", alb.strategy.RandomStrategyHandler()),
    ("LeastConfidence", alb.strategy.LeastConfidenceStrategyHandler()),
    ("LeastMargin", alb.strategy.LeastMarginStrategyHandler()),
    ("Entropy", alb.strategy.EntropyStrategyHandler()),
):
    print(f"=== Begin Strategy {repr(name)} at {datetime.now()} ===")
    my_strategy_handler.set_dataset_handler(my_dataset_handler)
    my_strategy_handler.set_model_handler(my_model_handler)
    my_strategy_handler.set_learning_parameters(
        label_of_interest=0,  # We've supplied only one label per feature vector
        maximum_queries=8,
        number_to_select_per_query=10,
    )

    # ################################################################
    # Simulate the strategy.
    my_strategy_handler.run(currently_labeled_examples)
    # ################################################################

    # We will write out collected information to disk.  First say where:
    log_dir = os.path.join(all_logs_dir, name)
    # Write accuracy and loss information during training
    my_strategy_handler.write_train_log_for_tensorboard(log_dir=log_dir)
    # Write confidence statistics during active learning
    my_strategy_handler.write_confidence_log_for_tensorboard(log_dir=log_dir)
print(f"=== Done at {datetime.now()} ===")

=== Begin Strategy 'Random' at 2023-02-13 10:33:29.341913 ===
Training with 20 examples
Predicting for 4598 examples
Training with 30 examples
Predicting for 4598 examples
Training with 40 examples
Predicting for 4598 examples
Training with 50 examples
Predicting for 4598 examples
Training with 60 examples
Predicting for 4598 examples
Training with 70 examples
Predicting for 4598 examples
Training with 80 examples
Predicting for 4598 examples
Training with 90 examples
Predicting for 4598 examples
Training with 100 examples
Predicting for 4598 examples
=== Begin Strategy 'LeastConfidence' at 2023-02-13 10:33:58.095140 ===
Training with 20 examples
Predicting for 4598 examples
Training with 30 examples
Predicting for 4598 examples
Training with 40 examples
Predicting for 4598 examples
Training with 50 examples
Predicting for 4598 examples
Training with 60 examples
Predicting for 4598 examples
Training with 70 examples
Predicting for 4598 examples
Training with 80 examples
Predicting for 

<h2>Use with TensorBoard</h2>

<p>TensorBoard provides a way to graph the information from the log files we have written.  If it is not blocked by a firewall, the TensorBoard graphics will appear in this Jupyter lab.  Otherwise, the TensorBoard output can be made to appear in any web browser by launching "<code>tensorboard --logdir runs-SimpleExample</code>" from a command prompt in this notebook's directory and then asking the web browser to load "<code>http://localhost:6006/</code>".</p>

<p>Because these are randomized simulations you will not see the same output each time you run them.  Clicking on the "Scalars" tab allows one to change the smoothing of the displayed graphics, e.g., to 0.</p>
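
<p>If repeatable runs are wanted, one option is to draw the validation and initial training indices from a seeded generator.  The sketch below is an assumption-laden illustration: <code>draw_split</code> and the seed value are hypothetical, and it fixes only the split.  Model initialization, dropout, and the strategies themselves would still vary unless the frameworks' own generators are seeded too (for example with <code>torch.manual_seed</code> or <code>tf.random.set_seed</code>), and whether al_bench honors those seeds has not been verified here.</p>

```python
import random

import numpy as np


def draw_split(num_examples, num_validation, num_initial, seed):
    """Draw disjoint validation and initial-training indices reproducibly."""
    rng = random.Random(seed)  # private generator; global random state is untouched
    samples = rng.sample(range(num_examples), num_validation + num_initial)
    return np.array(samples[:num_validation]), np.array(samples[num_validation:])


# Same sizes as in this notebook: 4598 examples, 10% validation, 20 initial.
val_a, train_a = draw_split(4598, 4598 // 10, 20, seed=20230213)
val_b, train_b = draw_split(4598, 4598 // 10, 20, seed=20230213)
print(np.array_equal(val_a, val_b), np.array_equal(train_a, train_b))
```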

<p>The Confidence graphs show how the certainty, among samples that are (simulated as) not yet labeled, changes during the active learning process; specifically, as a function of the number of samples that have been labeled so far.  For example, Confidence/margin/10% measures certainty for a sample's prediction as the difference between the scores of the two highest-scoring labels, and the 10% indicates that this is the 10th percentile among all unlabeled samples -- which is among the worst performing of these samples.  Confidence/entropy/50% shows the median value among unlabeled samples of the negative entropy.  Confidence/maximum/5% shows the 5th percentile value -- among the very worst -- for a sample's maximum label score, which is the score for its predicted label.</p>
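
<p>To make the percentile statistics concrete, here is a small NumPy sketch that computes the same kinds of summaries from synthetic scores.  The scores are random draws from a Dirichlet distribution rather than real model output, and the variable names are illustrative; how al_bench computes and logs these values internally may differ.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical scores for 1000 unlabeled samples over 4 labels; rows sum to 1.
scores = rng.dirichlet(alpha=np.ones(4), size=1000)

ordered = np.sort(scores, axis=1)
maximum = ordered[:, -1]                  # score of the predicted label
margin = ordered[:, -1] - ordered[:, -2]  # gap between the top two scores
safe = np.where(scores > 0, scores, 1.0)  # avoid log(0)
negative_entropy = np.sum(scores * np.log(safe), axis=1)

summary = {}
for name, values in (
    ("maximum", maximum),
    ("margin", margin),
    ("entropy", negative_entropy),
):
    # Low percentiles describe the least certain unlabeled samples.
    summary[name] = np.percentile(values, [5, 10, 50])
    p5, p10, p50 = summary[name]
    print(f"Confidence/{name}: 5%={p5:.3f} 10%={p10:.3f} 50%={p50:.3f}")
```

<p>As active learning proceeds, one would hope to see these low-percentile values rise, indicating that even the least certain remaining samples are being predicted with more confidence.</p>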

In [8]:
%load_ext tensorboard
%tensorboard --logdir {all_logs_dir}