# Active Learning on Covid lung scans with VGG16

We use TensorFlow to load the VGG16 model.  
The dataset can be found here: https://www.kaggle.com/datasets/tawsifurrahman/covid19-radiography-database

## Module tf-al
https://pypi.org/project/tf-al/  
https://exleonem.github.io/tf-al/

In [1]:
import os

# Standard scientific Python imports
import matplotlib.pyplot as plt
import numpy as np

# Used for converting 24 bit to 8 bit images
from PIL import Image

# Utility for splitting data into training and test sets
from sklearn.model_selection import train_test_split

from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import categorical_crossentropy

2022-11-24 11:24:09.894311: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-11-24 11:24:12.734234: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2022-11-24 11:24:13.982880: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /apps/gent/RHEL8/cascadelake-ib/software/ZeroMQ/4.3.4-GCCcore-11.2.0/lib:/apps/gent

In [2]:
path = "ActiveLearning_ImageClassification/COVID-19_Radiography_Dataset/"

# Load the pre-split training and test arrays
X_train = np.load(path + "x_train.npy")
X_test = np.load(path + "x_test.npy")
y_train = np.load(path + "y_train.npy")
y_test = np.load(path + "y_test.npy")
print(X_train.shape)

# Map the class labels to integer indices (0 .. n_classes - 1)
y_train = np.unique(y_train, return_inverse=True)[1]
print(y_train)

(2400, 224, 224, 3)
[0 0 0 ... 1 1 0]
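
The `np.unique(..., return_inverse=True)[1]` call above replaces each label with its index in the sorted array of unique labels. A minimal sketch of that behavior, assuming string labels (the class names below are made up for illustration and are not taken from the dataset files):

```python
import numpy as np

# Hypothetical labels for illustration; the real y_train.npy contents may differ.
labels = np.array(["covid", "normal", "covid", "pneumonia", "normal"])

# classes: sorted unique values; encoded: index of each label within classes
classes, encoded = np.unique(labels, return_inverse=True)

print(classes)
print(encoded)

# The original labels can be recovered by indexing classes with the codes
recovered = classes[encoded]
```

Keeping `classes` around is useful later, since it is the only record of which integer corresponds to which class.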


## Start training the model

In [11]:
# weights=None initializes VGG16 randomly (no pretrained weights); classes=3 sets the output size
base_model = VGG16(weights=None, classes=3)
opt = Adam(learning_rate=0.001)

base_model.compile(optimizer=opt, loss=categorical_crossentropy, metrics=['accuracy'])
base_model.summary()

Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_2 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
                                                                 
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 56, 56, 128)       0     
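
The `Param #` column in the summary can be reproduced by hand. The sketch below assumes what VGG16 uses throughout its convolutional blocks: 3x3 kernels with one bias per filter. The helper name `conv2d_params` is hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Trainable parameters of a Conv2D layer: weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

# (input channels, filters) for the first four conv layers listed in the summary
layers = {
    "block1_conv1": (3, 64),
    "block1_conv2": (64, 64),
    "block2_conv1": (64, 128),
    "block2_conv2": (128, 128),
}

params = {name: conv2d_params(3, 3, c_in, f) for name, (c_in, f) in layers.items()}
for name, count in params.items():
    print(name, count)
```

Pooling layers have no trainable parameters, which is why `block1_pool` and `block2_pool` show 0.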

In [4]:
# One-hot encode the integer labels (3 classes), as required by categorical_crossentropy
new_train = np.eye(3)[y_train]
print(new_train)

[[1. 0. 0.]
 [1. 0. 0.]
 [1. 0. 0.]
 ...
 [0. 1. 0.]
 [0. 1. 0.]
 [1. 0. 0.]]
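
A quick sanity check on one-hot targets: taking `argmax` along the class axis should recover the original integer labels, and every row should sum to 1. A minimal sketch with illustrative labels (the helper `one_hot` is hypothetical and not part of the notebook):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float array with 1.0 at each label index."""
    return np.eye(num_classes)[np.asarray(labels)]

y = np.array([0, 0, 1, 2])        # illustrative integer labels
encoded = one_hot(y, num_classes=3)
decoded = encoded.argmax(axis=1)  # back to integer labels

print(encoded)
print(decoded)
```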


In [20]:
from tf_al.wrapper import McDropout
from tf_al import Config
from tensorflow import keras

# Wrap, configure and compile
model_config = Config(
    fit={"epochs": 200, "batch_size": 10},
    query={"sample_size": 25},
    eval={"batch_size": 900, "sample_size": 25}
)
model = McDropout(base_model, config=model_config)
model.compile(
    optimizer="adam", 
    loss="categorical_crossentropy", metrics=['accuracy']
)
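
Monte Carlo dropout generally means running several stochastic forward passes per input and aggregating the resulting predictions; the `sample_size` of 25 above presumably sets the number of passes. The sketch below shows one common way to turn such samples into an uncertainty score (predictive entropy) and pick the most uncertain pool items for labeling. It uses plain NumPy with simulated predictions and is not tf-al's actual implementation; `predictive_entropy` and `select_most_uncertain` are hypothetical names.

```python
import numpy as np

def predictive_entropy(mc_probs):
    """Entropy of the mean prediction.

    mc_probs has shape (sample_size, n_items, n_classes) and holds softmax outputs.
    """
    mean_probs = mc_probs.mean(axis=0)
    # Small constant avoids log(0)
    return -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=1)

def select_most_uncertain(mc_probs, k):
    """Indices of the k items with the highest predictive entropy."""
    entropy = predictive_entropy(mc_probs)
    return np.argsort(entropy)[::-1][:k]

# Simulated predictions: 25 passes, 4 pool items, 3 classes
rng = np.random.default_rng(0)
confident = np.tile([0.98, 0.01, 0.01], (25, 1))          # same answer every pass
uncertain_a = rng.dirichlet([1.0, 1.0, 1.0], size=25)     # answer varies per pass
uncertain_b = rng.dirichlet([1.0, 1.0, 1.0], size=25)
mc_probs = np.stack([confident, uncertain_a, confident, uncertain_b], axis=1)

query_indices = select_most_uncertain(mc_probs, k=2)
print(query_indices)
```

In an active learning loop, the selected items would be labeled, moved from the pool into the training set, and the model retrained.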

In [None]:
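# Fit model to data
# The values passed here differ from the Config above (batch_size=10, epochs=200);
# the log below reports 200 epochs.
model.fit(X_train, new_train, batch_size=25, epochs=100)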

Epoch 1/200