## Class imbalance fixes

We return to our pneumonia dataset, this time with a focus on class imbalance. The goal is to explore several methods for overcoming the imbalance, with the larger aim of improving our precision and sensitivity (recall).

**Note:** You can greatly improve the computation speed in Google Colab by connecting to a GPU. Click the "Runtime" tab in the top ribbon, then "Change runtime type". You can then select "T4 GPU". Note, however, that GPUs are subject to availability; Google has a fixed (and unspecified) amount of resources available at any given time, so a GPU may not be available. Feel free to try again later if you don't succeed at first.

## Libraries to Import

In [None]:
!pip install tensorflow==2.18.0
!pip install keras==3.8.0
import keras
from keras.models import Sequential
from keras.layers import Dense, Conv2D , MaxPool2D , Flatten , Dropout , BatchNormalization
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.metrics import classification_report, confusion_matrix
import cv2
import os
import numpy as np
from collections import Counter
import random
import tensorflow as tf
from imblearn.over_sampling import RandomOverSampler, SMOTE
from imblearn.under_sampling import RandomUnderSampler
import pickle



**Real quick**: make sure tensorflow and keras are versions 2.18.0 and 3.8.0, respectively, by running the cells below.

If either is showing the wrong version, restart the session by clicking the "Runtime" tab up top and selecting "Restart session". After that, run the notebook again from the top.

In [None]:
!pip show tensorflow #Should be version 2.18.0

Name: tensorflow
Version: 2.18.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /usr/local/lib/python3.11/dist-packages
Requires: absl-py, astunparse, flatbuffers, gast, google-pasta, grpcio, h5py, keras, libclang, ml-dtypes, numpy, opt-einsum, packaging, protobuf, requests, setuptools, six, tensorboard, tensorflow-io-gcs-filesystem, termcolor, typing-extensions, wrapt
Required-by: dopamine_rl, tensorflow-text, tensorflow_decision_forests, tf_keras


In [None]:
!pip show keras #Should be version 3.8.0

Name: keras
Version: 3.8.0
Summary: Multi-backend Keras
Home-page: 
Author: 
Author-email: Keras team <keras-users@googlegroups.com>
License: Apache License 2.0
Location: /usr/local/lib/python3.11/dist-packages
Requires: absl-py, h5py, ml-dtypes, namex, numpy, optree, packaging, rich
Required-by: tensorflow


In [None]:
def set_random_seed():
  seed = 18
  # Set random seeds for reproducibility
  random.seed(seed)
  np.random.seed(seed)
  tf.random.set_seed(seed)

  # (For TensorFlow GPU determinism)
  os.environ['CUDA_VISIBLE_DEVICES'] = ''  # hides all GPUs from TensorFlow; remove this line to train on a GPU
  os.environ['TF_DETERMINISTIC_OPS'] = '1'
  os.environ['PYTHONHASHSEED'] = str(1234)

## Part 1: Loading the Image Data

As in HW4, upload the imbalanced X-ray image files as follows:


1.   Upload the files `imbalanced_xray_train.pkl`, `imbalanced_xray_test.pkl`, and `imbalanced_xray_val.pkl` from your computer into Colab's file tree. You can either drag and drop these files from your computer or use the upload button in Colab. Be sure you are uploading the **imbalanced** data!
2.  Run the cells below to populate the train, test, and val variables.


In [None]:
#Run this cell, but DO NOT EDIT
def get_data(pkl_path):
    with open(pkl_path, 'rb') as f:
      # Read the data from the file
      data = pickle.load(f)
    return data

In [None]:
#Run this cell, but DO NOT EDIT

train = get_data("imbalanced_xray_train.pkl")
test = get_data("imbalanced_xray_test.pkl")
val = get_data("imbalanced_xray_val.pkl")

As the name `imbalanced_xray` implies, the key difference between the dataset used in this homework and the one used in the prior assignment is the number of examples in each class. In the imbalanced dataset, there are roughly 4 times as many training instances of Pneumonia lungs as there are of Normal lungs. The goal of this homework is to explore the effect of this imbalance as well as ways to overcome it.
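
To see why accuracy alone is misleading on data like this, consider a classifier that always predicts Pneumonia. The sketch below uses the training class counts that this notebook prints in Part 3a (3875 Pneumonia, 970 Normal); the helper name `majority_baseline` is purely illustrative.

```python
from collections import Counter

def majority_baseline(labels):
    """Return (majority_class, accuracy) for a classifier that always predicts the majority class."""
    counts = Counter(labels)
    majority_class, majority_count = counts.most_common(1)[0]
    return majority_class, majority_count / len(labels)

# Class counts taken from the Counter output in Part 3a of this notebook.
labels = [1] * 3875 + [0] * 970
cls, acc = majority_baseline(labels)
ratio = 3875 / 970

print(f"Always predicting class {cls} gives training accuracy {acc:.3f}")
print(f"Pneumonia-to-Normal ratio: {ratio:.2f}")
```

Such a classifier looks respectable on accuracy while never identifying a single Normal lung, which is why the rest of this homework tracks recall for the Normal class.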

## Part 2: Data Preparation



1.   Separate the image data from their corresponding labels into x_train, x_val, x_test, y_train, y_val, and y_test arrays.
2.   Normalize the x data by dividing by 255.
3.   Reshape x_train, x_val, and x_test such that they are numpy arrays of shape (|x|, 150, 150, 1)
4.   Convert y_train, y_val, and y_test to numpy arrays.



In [None]:
x_train = [train[i][0] for i in range(len(train))]
y_train = [train[i][1] for i in range(len(train))]

x_val = [val[i][0] for i in range(len(val))]
y_val = [val[i][1] for i in range(len(val))]

x_test = [test[i][0] for i in range(len(test))]
y_test = [test[i][1] for i in range(len(test))]

# normalize
x_train = np.array(x_train) / 255
x_val = np.array(x_val) / 255
x_test = np.array(x_test) / 255

# reshape
x_train = x_train.reshape(x_train.shape[0], 150, 150, 1)
x_val = x_val.reshape(x_val.shape[0], 150, 150, 1)
x_test = x_test.reshape(x_test.shape[0], 150, 150, 1)

# convert labels to numpy
y_train = np.array(y_train)
y_val = np.array(y_val)
y_test = np.array(y_test)

## Part 3a: Naive Random Oversampling

Before we train any CNNs, we want to try to address the class imbalance. One way we can do so is with oversampling methods, the most naive of which is to generate new samples by randomly sampling with replacement the available training data.
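
The idea of sampling with replacement can be sketched in a few lines of plain Python. This is only an illustration on toy data, not imblearn's implementation; the function name `random_oversample` and the toy labels are assumptions made for the example.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples (with replacement) until every class matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        extra = rng.choices(pool, k=target - n)  # sampling with replacement
        out_x.extend(extra)
        out_y.extend([cls] * len(extra))
    return out_x, out_y

# Toy data: 8 "pneumonia" (1) samples and 2 "normal" (0) samples.
x_toy = [f"img_{i}" for i in range(10)]
y_toy = [1] * 8 + [0] * 2
x_over, y_over = random_oversample(x_toy, y_toy)

print(Counter(y_over))
print(f"Distinct images after oversampling: {len(set(x_over))} of {len(x_over)}")
```

Note that the number of distinct images does not grow: the minority class is balanced only by repeating images the model has already seen.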

In the cell below, we use imblearn's `RandomOverSampler` with a random state of 42 to create resampled `x_train` and `y_train` datasets for use in a CNN.

In [None]:
random_state = 42

x_resampled, y_resampled = RandomOverSampler(random_state=random_state).fit_resample(x_train.reshape((x_train.shape[0], 150*150)), y_train)

Lastly, we need to reshape our `x` data one more time for use in our CNN. Reshape `x_resampled` such that it is an array of shape (-1, 150, 150, 1).
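
The flatten-then-restore round trip can be checked on a small stand-in array. This is a minimal sketch assuming random data in place of the X-ray images; it relies on the fact that the samplers operate on 2-D `(n_samples, n_features)` arrays.

```python
import numpy as np

# Stand-in for image data: 5 random "images" of shape (150, 150, 1).
rng = np.random.default_rng(0)
images = rng.random((5, 150, 150, 1))

# Flatten each image into one row of 150 * 150 features for the sampler.
flat = images.reshape(images.shape[0], 150 * 150)

# Restore the image layout afterwards; -1 lets NumPy infer the sample count,
# which matters because resampling changes the number of rows.
restored = flat.reshape(-1, 150, 150, 1)

print(flat.shape, restored.shape)
print(np.array_equal(images, restored))
```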

In [None]:
x_resampled = x_resampled.reshape(-1, 150, 150, 1)

In [None]:
print(Counter(y_train))
print(Counter(y_resampled))

Counter({np.int64(1): 3875, np.int64(0): 970})
Counter({np.int64(0): 3875, np.int64(1): 3875})


## Part 3b: Random Oversampled CNN

We will now train a CNN on both the standard data and the resampled data. For the first run, train the network below using the unmodified `x_train` and `y_train`. Run the evaluation cells below to see the class imbalance's effect on the model's recall. Next, train the model again using `x_resampled` and `y_resampled`. Once again, generate evaluation metrics to observe the effect that oversampling had on the model's recall.

In [None]:
set_random_seed()

model_oversample = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 1)),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(128, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])

model_oversample.compile(optimizer='adam', loss='binary_crossentropy', metrics=[keras.metrics.BinaryAccuracy(), keras.metrics.Precision(), keras.metrics.Recall()])

num_epochs = 2
batch_size = 32

# model_oversample.fit(x = x_train, y = y_train, epochs=num_epochs, validation_data = (x_val, y_val), batch_size=batch_size)
model_oversample.fit(x = x_resampled, y = y_resampled, epochs=num_epochs, validation_data = (x_val, y_val), batch_size=batch_size)



Epoch 1/2
243/243 ━━━━━━━━━━━━━━━━━━━━ 10s 29ms/step - binary_accuracy: 0.8233 - loss: 0.3442 - precision_23: 0.8446 - recall_23: 0.7931 - val_binary_accuracy: 0.8750 - val_loss: 0.2091 - val_precision_23: 0.8000 - val_recall_23: 1.0000
Epoch 2/2
243/243 ━━━━━━━━━━━━━━━━━━━━ 9s 26ms/step - binary_accuracy: 0.9706 - loss: 0.0835 - precision_23: 0.9744 - recall_23: 0.9665 - val_binary_accuracy: 0.9375 - val_loss: 0.1752 - val_precision_23: 0.8889 - val_recall_23: 1.0000


<keras.src.callbacks.history.History at 0x7ec1b7a39950>

Run the cells below to generate evaluation metrics for your model. Pay particular attention to the recall metric for the Normal (0) class.
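
Per-class recall is the fraction of samples truly belonging to a class that the model also assigns to that class. The sketch below computes it by hand on toy labels (these are not the notebook's results, and `per_class_recall` is an illustrative helper) to show how accuracy can look acceptable while Normal recall is poor.

```python
def per_class_recall(y_true, y_pred, cls):
    """Fraction of samples whose true label is `cls` that were also predicted as `cls`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    total = sum(1 for t in y_true if t == cls)
    return hits / total

# Toy labels: 4 Normal (0) and 6 Pneumonia (1) samples.
y_true_toy = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
# A model biased toward Pneumonia: it misses 3 of the 4 Normal cases.
y_pred_toy = [0, 1, 1, 1, 1, 1, 1, 1, 1, 1]

recall_normal = per_class_recall(y_true_toy, y_pred_toy, cls=0)
recall_pneumonia = per_class_recall(y_true_toy, y_pred_toy, cls=1)
accuracy = sum(t == p for t, p in zip(y_true_toy, y_pred_toy)) / len(y_true_toy)

print(f"Normal recall: {recall_normal}")
print(f"Pneumonia recall: {recall_pneumonia}")
print(f"Accuracy: {accuracy}")
```

Keep in mind that `keras.metrics.Recall()` treats class 1 as the positive class, so the `Recall:` value printed by `evaluate` corresponds to the Pneumonia row of the classification report, not the Normal row.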

In [None]:
# evaluations for model trained on unmodified x_train and y_train
predictions = model_oversample.predict(x_test)
binary_predictions = np.where(predictions > 0.5, 1, 0)
print(classification_report(y_test, binary_predictions, target_names = ['Normal (Class 0)','Pneumonia (Class 1)']))

20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step
                     precision    recall  f1-score   support

   Normal (Class 0)       0.94      0.41      0.57       234
Pneumonia (Class 1)       0.74      0.98      0.84       390

           accuracy                           0.77       624
          macro avg       0.84      0.70      0.71       624
       weighted avg       0.81      0.77      0.74       624



In [None]:
results = model_oversample.evaluate(x_test,y_test)
print(f"Accuracy: {results[1]}")
print(f"Precision: {results[2]}")
print(f"Recall: {results[3]}")

20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - binary_accuracy: 0.6035 - loss: 4.7761 - precision_2: 0.3738 - recall_2: 0.6596
Accuracy: 0.7692307829856873
Precision: 0.7356321811676025
Recall: 0.9846153855323792


In [None]:
"""
The training set has far fewer Normal samples than Pneumonia samples, so the model
is biased toward predicting Pneumonia. Many Normal test images are therefore
misclassified as Pneumonia (false negatives for the Normal class), so it makes sense
that recall is low for the Normal class and high for the Pneumonia class.

Normal (class 0) recall barely changes when we use the resampled dataset (0.41 vs. 0.42).
It's possible that the model overfit to the duplicated Normal images and still failed
to generalize to new ones.
"""

In [None]:
# evaluations for model trained on resampled data
predictions = model_oversample.predict(x_test)
binary_predictions = np.where(predictions > 0.5, 1, 0)
print(classification_report(y_test, binary_predictions, target_names = ['Normal (Class 0)','Pneumonia (Class 1)']))

20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step
                     precision    recall  f1-score   support

   Normal (Class 0)       0.98      0.42      0.59       234
Pneumonia (Class 1)       0.74      0.99      0.85       390

           accuracy                           0.78       624
          macro avg       0.86      0.71      0.72       624
       weighted avg       0.83      0.78      0.75       624



In [None]:
results = model_oversample.evaluate(x_test,y_test)
print(f"Accuracy: {results[1]}")
print(f"Precision: {results[2]}")
print(f"Recall: {results[3]}")

20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - binary_accuracy: 0.6094 - loss: 1.8689 - precision_23: 0.3777 - recall_23: 0.6633
Accuracy: 0.7804487347602844
Precision: 0.7418738007545471
Recall: 0.9948717951774597


When you're ready, save the model trained on `x_resampled` and `y_resampled`.

In [None]:
#Run this cell to mount your Google Drive (click through pop-up windows to authenticate)
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
path = "drive/MyDrive/EAS5860" #(Ex: "drive/MyDrive/EAS5860/HW5")

model_oversample.save(os.path.join(path, "EAS5860_HW5_Part_3.keras"))

## Part 4a: Random Undersampling

In addition to oversampling the minority class, we can undersample the majority class. In the cell below, implement imblearn's `RandomUnderSampler` to create undersampled versions of `x_train` and `y_train`. You can read more about the undersampling method here: [click me](https://imbalanced-learn.org/stable/under_sampling.html)

As before, be mindful of reshaping your data.
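
Conceptually, random undersampling keeps a random subset of the majority class and throws the rest away. The following is a minimal sketch on toy data, not imblearn's implementation; `random_undersample` and the toy labels are assumptions made for illustration.

```python
import random
from collections import Counter

def random_undersample(samples, labels, seed=0):
    """Keep a random subset (without replacement) of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_x, out_y = [], []
    for cls in counts:
        pool = [x for x, y in zip(samples, labels) if y == cls]
        kept = rng.sample(pool, k=target)  # sampling without replacement
        out_x.extend(kept)
        out_y.extend([cls] * target)
    return out_x, out_y

# Toy data: 8 "pneumonia" (1) samples and 2 "normal" (0) samples.
x_toy = [f"img_{i}" for i in range(10)]
y_toy = [1] * 8 + [0] * 2
x_under, y_under = random_undersample(x_toy, y_toy)

print(Counter(y_under))
print(f"Discarded {len(x_toy) - len(x_under)} of {len(x_toy)} samples")
```

The trade-off is lost data: with the class counts printed in Part 3a, balancing down to 970 images per class leaves 1940 training images and discards 2905 Pneumonia images.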

In [None]:
random_state= 42

x_undersampled, y_undersampled = RandomUnderSampler(random_state=random_state).fit_resample(x_train.reshape((x_train.shape[0], 150*150)), y_train)

In [None]:
x_undersampled = x_undersampled.reshape(-1, 150, 150, 1)

## Part 4b: Random Undersampled CNN

Train the network below using `x_undersampled` and `y_undersampled`.

In [None]:
set_random_seed()

model_undersample = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 1)),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(128, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])

model_undersample.compile(optimizer='adam', loss='binary_crossentropy', metrics=[keras.metrics.BinaryAccuracy(), keras.metrics.Precision(), keras.metrics.Recall()])

num_epochs = 10
batch_size = 16

model_undersample.fit(x = x_undersampled, y = y_undersampled, epochs=num_epochs, validation_data = (x_val, y_val), batch_size=batch_size)



Epoch 1/10
122/122 ━━━━━━━━━━━━━━━━━━━━ 7s 27ms/step - binary_accuracy: 0.7807 - loss: 0.4724 - precision_19: 0.8344 - recall_19: 0.7109 - val_binary_accuracy: 0.7500 - val_loss: 0.6303 - val_precision_19: 0.6667 - val_recall_19: 1.0000
Epoch 2/10
122/122 ━━━━━━━━━━━━━━━━━━━━ 4s 16ms/step - binary_accuracy: 0.9524 - loss: 0.1406 - precision_19: 0.9541 - recall_19: 0.9518 - val_binary_accuracy: 0.6250 - val_loss: 0.9435 - val_precision_19: 0.5714 - val_recall_19: 1.0000
Epoch 3/10
122/122 ━━━━━━━━━━━━━━━━━━━━ 2s 15ms/step - binary_accuracy: 0.9522 - loss: 0.1239 - precision_19: 0.9532 - recall_19: 0.9533 - val_binary_accuracy: 1.0000 - val_loss: 0.1167 - val_precision_19: 1.0000 - val_recall_19: 1.0000
Epoch 4/10
122/122 ━━━━━━━━━━━━━━━━━━━━ 2s 16ms/step - binary_accuracy: 0.9661 - loss: 0.0971 - precision_19: 0.9721 - recall_19: 0.9616 - val_binary_acc

<keras.src.callbacks.history.History at 0x7ec1b9532690>

As before, run the cells below to generate evaluation metrics for your undersampled model. Pay particular attention to the recall metric for the Normal (0) class.

How does this compare to the oversampled model? How about the basic model?

In [None]:
predictions = model_undersample.predict(x_test)
binary_predictions = np.where(predictions > 0.5, 1, 0)
print(classification_report(y_test, binary_predictions, target_names = ['Normal (Class 0)','Pneumonia (Class 1)']))

20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step
                     precision    recall  f1-score   support

   Normal (Class 0)       0.91      0.53      0.67       234
Pneumonia (Class 1)       0.77      0.97      0.86       390

           accuracy                           0.80       624
          macro avg       0.84      0.75      0.76       624
       weighted avg       0.82      0.80      0.79       624



In [None]:
results = model_undersample.evaluate(x_test,y_test)
print(f"Accuracy: {results[1]}")
print(f"Precision: {results[2]}")
print(f"Recall: {results[3]}")

20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - binary_accuracy: 0.6734 - loss: 1.8212 - precision_19: 0.4031 - recall_19: 0.6532
Accuracy: 0.8028846383094788
Precision: 0.7730061411857605
Recall: 0.9692307710647583


In [None]:
"""
The undersampled model performs better than both the oversampled and basic models:
Normal (class 0) recall is 0.53, compared with 0.42 and 0.41. This makes sense,
because the training set has no class imbalance and there is no overfitting to
duplicated samples.
"""

When you're ready, save the model trained on `x_undersampled` and `y_undersampled`.

In [None]:
path = "drive/MyDrive/EAS5860"

model_undersample.save(os.path.join(path, "EAS5860_HW5_Part_4.keras"))

## Part 5a: Synthetic Minority Oversampling Technique

The prior oversampling method simply reused training images multiple times. The Synthetic Minority Oversampling Technique (SMOTE), by comparison, creates new synthetic images that are similar to, but distinct from, the existing training data. You can read more about how SMOTE works here: [click me](https://medium.com/@corymaklin/synthetic-minority-over-sampling-technique-smote-7d419696b88c)
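
The core step of SMOTE is interpolation: pick a minority sample, pick one of its nearest minority-class neighbors, and create a new point somewhere on the line segment between them. Below is a simplified sketch of that step on tiny toy vectors. It uses a single nearest neighbor rather than a random choice among k neighbors, and the helper names are hypothetical, so treat it as an illustration rather than imblearn's algorithm.

```python
import random

def nearest_neighbor(x, others):
    """Nearest point to `x` by squared Euclidean distance."""
    return min(others, key=lambda o: sum((a - b) ** 2 for a, b in zip(x, o)))

def smote_like_sample(x, neighbor, rng):
    """Return a synthetic point on the line segment between `x` and `neighbor`."""
    lam = rng.random()  # interpolation weight in [0, 1)
    return [a + lam * (b - a) for a, b in zip(x, neighbor)]

rng = random.Random(0)

# Toy minority class: three flattened 4-"pixel" images with values in [0, 1].
minority = [
    [0.0, 0.2, 0.4, 0.6],
    [0.1, 0.3, 0.5, 0.7],
    [0.9, 0.8, 0.7, 0.6],
]

base = minority[0]
neighbor = nearest_neighbor(base, minority[1:])
synthetic = smote_like_sample(base, neighbor, rng)

print("neighbor: ", neighbor)
print("synthetic:", synthetic)
```

Because every synthetic pixel is a blend of two real images, the result is a new sample, but it is not guaranteed to look like a plausible X-ray.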

In the cell below, use imblearn's SMOTE to generate `x_smote` and `y_smote` datasets from the original `x_train` and `y_train` (reference imblearn documentation for details). As before, be mindful to reshape your data.

In [None]:
random_state = 42

x_smote, y_smote = SMOTE(random_state=random_state).fit_resample(x_train.reshape((x_train.shape[0], 150*150)), y_train)

In [None]:
x_smote = x_smote.reshape(-1, 150, 150, 1)

## Part 5b: SMOTE CNN

Train the network below using `x_smote` and `y_smote`. You will need to specify the number of epochs and batch size to do so.

In [None]:
set_random_seed()

model_smote = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 1)),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(128, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])

model_smote.compile(optimizer='adam', loss='binary_crossentropy', metrics=[keras.metrics.BinaryAccuracy(), keras.metrics.Precision(), keras.metrics.Recall()])

num_epochs = 7
batch_size = 16

model_smote.fit(x = x_smote, y = y_smote, epochs=num_epochs, validation_data = (x_val, y_val), batch_size=batch_size)



Epoch 1/7
485/485 ━━━━━━━━━━━━━━━━━━━━ 10s 17ms/step - binary_accuracy: 0.8522 - loss: 0.3114 - precision_21: 0.8680 - recall_21: 0.8343 - val_binary_accuracy: 0.8125 - val_loss: 0.4788 - val_precision_21: 1.0000 - val_recall_21: 0.6250
Epoch 2/7
485/485 ━━━━━━━━━━━━━━━━━━━━ 10s 16ms/step - binary_accuracy: 0.9723 - loss: 0.0882 - precision_21: 0.9780 - recall_21: 0.9668 - val_binary_accuracy: 1.0000 - val_loss: 0.0857 - val_precision_21: 1.0000 - val_recall_21: 1.0000
Epoch 3/7
485/485 ━━━━━━━━━━━━━━━━━━━━ 10s 16ms/step - binary_accuracy: 0.9812 - loss: 0.0609 - precision_21: 0.9857 - recall_21: 0.9768 - val_binary_accuracy: 0.9375 - val_loss: 0.2218 - val_precision_21: 0.8889 - val_recall_21: 1.0000
Epoch 4/7
485/485 ━━━━━━━━━━━━━━━━━━━━ 8s 16ms/step - binary_accuracy: 0.9860 - loss: 0.0390 - precision_21: 0.9884 - recall_21: 0.9835 - val_binary_accu

<keras.src.callbacks.history.History at 0x7ec1b8c20c90>

As before, run the cells below to generate evaluation metrics for your SMOTE model. Pay particular attention to the recall metric for the Normal (0) class.

How does this compare to the previous models?

In [None]:
predictions = model_smote.predict(x_test)
binary_predictions = np.where(predictions > 0.5, 1, 0)
print(classification_report(y_test, binary_predictions, target_names = ['Normal (Class 0)','Pneumonia (Class 1)']))

20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step
                     precision    recall  f1-score   support

   Normal (Class 0)       0.94      0.46      0.61       234
Pneumonia (Class 1)       0.75      0.98      0.85       390

           accuracy                           0.79       624
          macro avg       0.84      0.72      0.73       624
       weighted avg       0.82      0.79      0.76       624



In [None]:
results = model_smote.evaluate(x_test,y_test)
print(f"Accuracy: {results[1]}")
print(f"Precision: {results[2]}")
print(f"Recall: {results[3]}")

20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - binary_accuracy: 0.5868 - loss: 4.8470 - precision_11: 0.3682 - recall_11: 0.6576
Accuracy: 0.7612179517745972
Precision: 0.7286527752876282
Recall: 0.9846153855323792


In [None]:
"""
This model performs pretty well, but a bit worse than undersampling. Like oversampling,
SMOTE creates additional samples of the minority class. But it creates synthetic
samples, ones that don't actually exist, and could be unrealistic for what real samples
look like. We've avoided the overfitting of oversampling, but the recall remains pretty
low for the normal class since we don't have a variety of real samples that reflect
what new samples look like.
"""

When you're ready, save the model trained on `x_smote` and `y_smote`.

In [None]:
path = "drive/MyDrive/EAS5860/"

model_smote.save(os.path.join(path, "EAS5860_HW5_Part_5.keras"))

## Part 6: Binary Focal Crossentropy

One final method we will look at to address class imbalances is the Binary Focal Crossentropy loss function.

In Keras, `BinaryFocalCrossentropy` introduces a focusing mechanism that downweights the contribution of examples that are easier to classify (these often come from the majority class) and focuses more on the challenging minority class examples.

It achieves this by introducing two hyperparameters:

1.   **gamma**: A focusing parameter used to compute the focal factor, default is 2.0 as mentioned in the reference [Lin et al., 2018](https://arxiv.org/pdf/1708.02002.pdf).
2.   **alpha**: A weight balancing factor for class 1, default is 0.25 as mentioned in reference [Lin et al., 2018](https://arxiv.org/pdf/1708.02002.pdf). The weight for class 0 is 1.0 - alpha. In Keras, `alpha` is only applied when the loss is created with `apply_class_balancing=True`.
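
The focusing mechanism can be seen by computing the loss for a single example by hand. The sketch below follows the focal loss formula from Lin et al., scaling the usual cross-entropy by `(1 - p_t) ** gamma`, where `p_t` is the probability the model assigns to the true class. It is a plain-Python illustration with hypothetical function names, not the Keras implementation, and class weighting by `alpha` is optional in it.

```python
import math

def binary_focal_loss(y_true, p, gamma=2.0, alpha=None):
    """Focal loss for one example; `p` is the predicted probability of class 1.

    With alpha=None no class weighting is applied; otherwise class 1 is
    weighted by alpha and class 0 by 1 - alpha.
    """
    p_t = p if y_true == 1 else 1.0 - p  # probability given to the true class
    loss = -((1.0 - p_t) ** gamma) * math.log(p_t)
    if alpha is not None:
        loss *= alpha if y_true == 1 else 1.0 - alpha
    return loss

def binary_crossentropy(y_true, p):
    """Ordinary cross-entropy is the special case gamma = 0."""
    return binary_focal_loss(y_true, p, gamma=0.0)

easy = (1, 0.95)  # Pneumonia image classified confidently and correctly
hard = (0, 0.70)  # Normal image that the model leans toward calling Pneumonia

for name, (y, p) in [("easy", easy), ("hard", hard)]:
    bce = binary_crossentropy(y, p)
    focal = binary_focal_loss(y, p, gamma=2.0)
    print(f"{name}: BCE={bce:.4f}  focal={focal:.4f}  focal/BCE={focal / bce:.4f}")
```

Both losses shrink relative to cross-entropy, but the easy example shrinks by a far larger factor, so hard examples account for a bigger share of the total loss. Raising `gamma` strengthens this effect.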

In the cell below, specify a BinaryFocalCrossentropy loss function to be used when the model is compiled. Experiment with different alpha and gamma values and observe the overall effect on the model's evaluation metrics. Feel free to adjust other parameters in the loss function as well.

In [None]:
set_random_seed()

model_bfce = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 1)),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(128, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])

loss_function = tf.keras.losses.BinaryFocalCrossentropy(gamma=3.0, alpha=0.2)

model_bfce.compile(optimizer='adam', loss= loss_function, metrics=[keras.metrics.BinaryAccuracy(), keras.metrics.Precision(), keras.metrics.Recall()])
num_epochs = 15
batch_size = 32

model_bfce.fit(x = x_train, y = y_train, epochs=num_epochs, validation_data = (x_val, y_val), batch_size=batch_size)



Epoch 1/15
152/152 ━━━━━━━━━━━━━━━━━━━━ 8s 29ms/step - binary_accuracy: 0.8345 - loss: 0.0683 - precision_13: 0.8655 - recall_13: 0.9348 - val_binary_accuracy: 0.6250 - val_loss: 0.1228 - val_precision_13: 0.5714 - val_recall_13: 1.0000
Epoch 2/15
152/152 ━━━━━━━━━━━━━━━━━━━━ 4s 26ms/step - binary_accuracy: 0.9499 - loss: 0.0204 - precision_13: 0.9699 - recall_13: 0.9661 - val_binary_accuracy: 0.8125 - val_loss: 0.0601 - val_precision_13: 0.7273 - val_recall_13: 1.0000
Epoch 3/15
152/152 ━━━━━━━━━━━━━━━━━━━━ 4s 27ms/step - binary_accuracy: 0.9691 - loss: 0.0113 - precision_13: 0.9800 - recall_13: 0.9806 - val_binary_accuracy: 0.8750 - val_loss: 0.0280 - val_precision_13: 0.8000 - val_recall_13: 1.0000
Epoch 4/15
152/152 ━━━━━━━━━━━━━━━━━━━━ 5s 26ms/step - binary_accuracy: 0.9719 - loss: 0.0094 - precision_13: 0.9817 - recall_13: 0.9826 - val_binary_acc

<keras.src.callbacks.history.History at 0x7ec1bb2d22d0>

As before, run the cells below to generate evaluation metrics for your `BinaryFocalCrossentropy` model. Pay particular attention to the recall metric for the Normal (0) class. Models with a Normal (0) recall of at least 0.35 will receive full credit.

How does this compare to the previous models?

In [None]:
predictions = model_bfce.predict(x_test)
binary_predictions = np.where(predictions > 0.5, 1, 0)
print(classification_report(y_test, binary_predictions, target_names = ['Normal (Class 0)','Pneumonia (Class 1)']))

20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step
                     precision    recall  f1-score   support

   Normal (Class 0)       0.97      0.41      0.57       234
Pneumonia (Class 1)       0.74      0.99      0.84       390

           accuracy                           0.77       624
          macro avg       0.85      0.70      0.71       624
       weighted avg       0.82      0.77      0.74       624



In [None]:
results = model_bfce.evaluate(x_test,y_test)
print(f"Accuracy: {results[1]}")
print(f"Precision: {results[2]}")
print(f"Recall: {results[3]}")

20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - binary_accuracy: 0.5927 - loss: 1.1175 - precision_13: 0.3732 - recall_13: 0.6618
Accuracy: 0.7724359035491943
Precision: 0.7357414364814758
Recall: 0.9923076629638672


In [None]:
"""
This model performs pretty well too, but a bit worse than undersampling. By using
BinaryFocalCrossentropy, we can tailor the loss function to downweight easy examples
(class 1: pneumonia) and put higher weight onto the more difficult ones for our model.
"""

When you're ready, save the `BinaryFocalCrossentropy` model.

In [None]:
path = "drive/MyDrive/EAS5860/"

model_bfce.save(os.path.join(path, "EAS5860_HW5_Part_6.keras"))