## Train BreastCancerNet Convolutional Neural Network (CNN) in PyTorch

First, we import the necessary packages. 

`matplotlib`: We set matplotlib to use the `"Agg"` backend so that we can save our training plots to disk.

`torch`: We’ll be taking advantage of `DataLoader`, `lr_scheduler`, the `Adagrad` optimizer, the `convert_parameters` utilities for converting a vector to parameters, and the `one_hot` encoder.

`sklearn`: From scikit-learn we’ll need its implementations of `classification_report` and `confusion_matrix`.

`BreastCancerNet`: We import `BreastCancerNet` so that we can train and evaluate it. We’ll also need our `config` to grab the paths to our training, validation, and testing data splits.

`imutils`: We’ll be using the `paths` module to grab the paths to each of our images.

`numpy`: For numerical processing with Python.

Now that we’ve imported the required libraries, let’s define our training parameters, grab our training image paths, and account for class imbalance:

*Line 20* defines the number of training epochs, initial learning rate, and batch size.

From there, we grab our training image paths and determine the total number of images in each of the splits (*Lines 23-26*).

We’ll go ahead and compute the `classWeight` for our training data to account for class imbalance/skew (*Lines 29-32*).
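The weighting idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the tutorial's implementation: the helper name `class_weights` and the toy labels are assumptions made for the example. Each class is weighted by the size of the largest class divided by its own size, so the rarer class counts more during training.

```python
from collections import Counter

def class_weights(labels):
    """Weight each class by (largest class count) / (its own count)."""
    counts = Counter(labels)
    largest = max(counts.values())
    return {cls: largest / n for cls, n in sorted(counts.items())}

# Hypothetical labels: three samples of class 0 and one sample of class 1
labels = [0, 0, 0, 1]
weights = class_weights(labels)
print(weights)  # {0: 1.0, 1: 3.0}
```

Here the minority class receives a weight three times larger than the majority class, which mirrors the `classTotals.max() / classTotals` ratio used in the training code.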

Data augmentation, a form of regularization, is important for nearly all deep learning experiments because it helps the model generalize. The method purposely perturbs training examples, changing their appearance slightly, before passing them into the network for training. This partially alleviates the need to gather more training data, though more training data will rarely hurt your model.

Our data augmentation object, `trainAug`, is initialized on *Lines 35-44*. As you can see, random rotations, zooms, shifts, shears, and flips will be applied to our data as it is generated. Rescaling our image pixel intensities to the range `[0, 1]` is handled by the `trainAug` generator as well as the `valAug` generator defined on *Line 47*.
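To make the idea concrete, here is a small sketch of two of these operations, random flips and rescaling, on a tiny image stored as nested lists. The helper names `rescale` and `random_flip` and the 2x2 image are assumptions for illustration only; the real generator also handles rotations, shifts, shears, and zooms.

```python
import random

def rescale(image):
    """Map 8-bit pixel intensities from [0, 255] to [0, 1]."""
    return [[px / 255.0 for px in row] for row in image]

def random_flip(image, rng):
    """Flip horizontally and/or vertically, each with probability 0.5."""
    if rng.random() < 0.5:
        image = [row[::-1] for row in image]  # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1]                   # vertical flip
    return image

rng = random.Random(42)            # seeded so the example is repeatable
image = [[0, 51], [102, 255]]      # hypothetical 2x2 grayscale image
augmented = rescale(random_flip(image, rng))
print(augmented)
```

Flipping only rearranges pixels, so the augmented image contains the same intensities as the original, just in different positions and scaled to `[0, 1]`.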

Here we initialize the training (*Lines 50-56*), validation (*Lines 59-65*), and testing (*Lines 68-74*) generators. Each generator will provide batches of images on demand, with the batch size set by the `batch_size` parameter.
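The "batches on demand" behavior can be illustrated with an ordinary Python generator. This is only a sketch under simplifying assumptions: the function name `batches` and the file names are hypothetical, and a real image generator would also load and augment each image.

```python
def batches(items, batch_size):
    """Yield consecutive slices of `items`, each holding at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

image_paths = [f"img_{i}.png" for i in range(10)]  # hypothetical file names
sizes = [len(batch) for batch in batches(image_paths, batch_size=4)]
print(sizes)  # [4, 4, 2]
```

Only one batch is materialized at a time, and the final batch is smaller whenever the number of items is not a multiple of the batch size.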

Our model is initialized with the `Adagrad` optimizer on *Lines 77-78*.
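For readers unfamiliar with Adagrad, the following is a minimal sketch of its textbook update rule, not the library implementation used in this tutorial. The function name `adagrad_step`, the `eps` value, and the toy numbers are assumptions. Each parameter keeps a running sum of its squared gradients, and its step is divided by the square root of that sum.

```python
import math

def adagrad_step(params, grads, accum, lr=1e-2, eps=1e-10):
    """Apply one Adagrad update in place and return the updated parameters."""
    for i, g in enumerate(grads):
        accum[i] += g * g                                  # running sum of squared gradients
        params[i] -= lr * g / (math.sqrt(accum[i]) + eps)  # per-parameter scaled step
    return params

params = [1.0, -2.0]   # hypothetical parameter values
accum = [0.0, 0.0]     # accumulator starts at zero
adagrad_step(params, [0.5, -4.0], accum)
print(params)
```

On the very first step each parameter moves by roughly `lr` regardless of the gradient's magnitude; later steps shrink as the accumulator grows.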

We then compile our model with a `"binary_crossentropy"` loss function (since we only have two classes of data) on *Line 79*; learning rate decay is supplied through the optimizer’s `decay` argument.

Making a call to the Keras `fit_generator` method initiates our training process. Using this method, our image data can reside on disk and be yielded in batches rather than having the whole dataset in RAM throughout training. While not 100% necessary for today’s 5.8GB dataset, you can see how useful this would be with a 200GB dataset, for example.
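Both ingredients mentioned above can be written out by hand. The sketch below assumes the standard definition of binary cross-entropy and the time-based decay rule used by legacy Keras optimizers, where the rate shrinks as `1 / (1 + decay * iteration)`. The function names and sample values are illustrative, not part of the tutorial code.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def decayed_lr(init_lr, decay, iteration):
    """Time-based decay: the rate shrinks as 1 / (1 + decay * iteration)."""
    return init_lr / (1.0 + decay * iteration)

INIT_LR, NUM_EPOCHS = 1e-2, 40
decay = INIT_LR / NUM_EPOCHS   # same ratio as in the training code
loss = binary_crossentropy([1, 0], [0.9, 0.2])
lr_later = decayed_lr(INIT_LR, decay, iteration=4000)
print(round(loss, 4), lr_later)
```

With these settings the learning rate has halved after 4,000 update steps, since `decay * 4000` equals 1.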

After training is complete, we’ll evaluate the model on the testing data. *Line 93* makes predictions on all of our testing data (again using a generator object).

The index of the highest prediction is grabbed for each sample (*Line 96*), and then a `classification_report` is conveniently printed to the terminal (*Line 99*).

Then we compute the `confusion_matrix` and derive the accuracy, sensitivity, and specificity from it (*Lines 102-106*). The matrix and each of these values are then printed in our terminal (*Lines 109-112*).

Finally, let’s generate and store our training plot (*Lines 115-126*). Our training history plot consists of training/validation loss and training/validation accuracy. These are plotted over time so that we can spot over/underfitting.
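The evaluation steps above can be traced on a handful of made-up predictions. This sketch uses plain Python instead of scikit-learn; the helper `confusion_matrix_2x2`, the scores, and the labels are all hypothetical. It follows the same indexing as this notebook's evaluation code, with rows as true classes and columns as predicted classes.

```python
def confusion_matrix_2x2(y_true, y_pred):
    """Return [[n00, n01], [n10, n11]] with rows = true class, columns = predicted class."""
    cm = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

# Hypothetical per-class scores for five test images
scores = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4], [0.2, 0.8], [0.55, 0.45]]
y_pred = [row.index(max(row)) for row in scores]  # index of the largest score per sample
y_true = [0, 1, 0, 1, 1]

cm = confusion_matrix_2x2(y_true, y_pred)
total = sum(sum(row) for row in cm)
accuracy = (cm[0][0] + cm[1][1]) / total
sensitivity = cm[0][0] / (cm[0][0] + cm[0][1])
specificity = cm[1][1] / (cm[1][0] + cm[1][1])
print(cm, accuracy, sensitivity, specificity)
```

Which class counts as "positive" depends on how the labels are ordered, so treat the sensitivity and specificity names here as following this tutorial's indexing rather than a universal convention.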

In [5]:
# set the matplotlib backend so figures can be saved in the background
import matplotlib
matplotlib.use("Agg")

# import the necessary packages
from torch.utils.data import DataLoader
from torch.optim import lr_scheduler
from torch.optim import Adagrad
from torch.nn.utils import convert_parameters
from torch.nn.functional import one_hot
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from breastcancernet import BreastCancerNet
from breastcancernet import config
from breastcancernet import loaders
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import os




In [None]:

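# Keras utilities used below (assumes a Keras 2.x install, where fit_generator
# and Adagrad(lr=..., decay=...) are available); note that this Adagrad
# replaces the torch.optim one imported earlier
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import Adagrad

# initialize our number of epochs, initial learning rate, and batch size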
NUM_EPOCHS = 40
INIT_LR = 1e-2
BS = 32

# determine the total number of image paths in training, validation, and testing directories
trainPaths=list(paths.list_images(config.TRAIN_PATH))
lenTrain=len(trainPaths)
lenVal=len(list(paths.list_images(config.VAL_PATH)))
lenTest=len(list(paths.list_images(config.TEST_PATH)))

# account for skew in the labeled data
trainLabels = np.array([int(p.split(os.path.sep)[-2]) for p in trainPaths])
# one-hot encode the integer labels with NumPy
trainLabels = np.eye(trainLabels.max() + 1)[trainLabels]
classTotals = trainLabels.sum(axis=0)
classWeight = classTotals.max() / classTotals

# initialize the training data augmentation object
trainAug = ImageDataGenerator(
  rescale=1/255.0,
  rotation_range=20,
  zoom_range=0.05,
  width_shift_range=0.1,
  height_shift_range=0.1,
  shear_range=0.05,
  horizontal_flip=True,
  vertical_flip=True,
  fill_mode="nearest")

# initialize the validation (and testing) data augmentation object
valAug=ImageDataGenerator(rescale=1 / 255.0)

# initialize the training generator
trainGen = trainAug.flow_from_directory(
  config.TRAIN_PATH,
  class_mode="categorical",
  target_size=(48,48),
  color_mode="rgb",
  shuffle=True,
  batch_size=BS)

# initialize the validation generator
valGen = valAug.flow_from_directory(
  config.VAL_PATH,
  class_mode="categorical",
  target_size=(48,48),
  color_mode="rgb",
  shuffle=False,
  batch_size=BS)

# initialize the testing generator
testGen = valAug.flow_from_directory(
  config.TEST_PATH,
  class_mode="categorical",
  target_size=(48,48),
  color_mode="rgb",
  shuffle=False,
  batch_size=BS)

# initialize our BreastCancerNet model and compile it
model = BreastCancerNet.build(width=48, height=48, depth=3, classes=2)
opt = Adagrad(lr=INIT_LR, decay=INIT_LR / NUM_EPOCHS)
model.compile(loss="binary_crossentropy", optimizer=opt, metrics=["accuracy"])

# fit the model
M=model.fit_generator(
  trainGen,
  steps_per_epoch=lenTrain//BS,
  validation_data=valGen,
  validation_steps=lenVal//BS,
  class_weight=classWeight,
  epochs=NUM_EPOCHS)

# reset the testing generator and then use our trained model to make predictions on the data
print("Now evaluating the model")
testGen.reset()
pred_indices=model.predict_generator(testGen,steps=(lenTest//BS)+1)

# for each image in the testing set we need to find the index of the label with corresponding largest predicted probability
pred_indices=np.argmax(pred_indices,axis=1)

# show a nicely formatted classification report
print(classification_report(testGen.classes, pred_indices, target_names=testGen.class_indices.keys()))

# compute the confusion matrix and use it to derive the raw accuracy, sensitivity, and specificity
cm=confusion_matrix(testGen.classes,pred_indices)
total=sum(sum(cm))
accuracy=(cm[0,0]+cm[1,1])/total
specificity=cm[1,1]/(cm[1,0]+cm[1,1])
sensitivity=cm[0,0]/(cm[0,0]+cm[0,1])

# show the confusion matrix, accuracy, sensitivity, and specificity
print(cm)
print(f'Accuracy: {accuracy}')
print(f'Specificity: {specificity}')
print(f'Sensitivity: {sensitivity}')

# plot the training loss and accuracy
N = NUM_EPOCHS
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0,N), M.history["loss"], label="train_loss")
plt.plot(np.arange(0,N), M.history["val_loss"], label="val_loss")
plt.plot(np.arange(0,N), M.history["acc"], label="train_acc")
plt.plot(np.arange(0,N), M.history["val_acc"], label="val_acc")
plt.title("Training Loss and Accuracy on the IDC Dataset")
plt.xlabel("Epoch No.")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")
plt.savefig('plot.png')