Breast cancer is the second most common cancer overall and the most common cancer in women worldwide. In 2012, it represented about 12 percent of all new cancer cases and 25 percent of all cancers in women.


Breast cancer starts when cells in the breast begin to grow out of control. These cells usually form a tumor that can often be seen on an x-ray or felt as a lump. The tumor is malignant (cancer) if the cells can grow into (invade) surrounding tissues or spread (metastasize) to distant areas of the body.

### Breast Cancer Classification

We need to build a model that detects breast cancer in patients by classifying their biopsy images.

In [3]:
import matplotlib.pyplot as plt
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')

from cancer_detection_model import CancerModel
import project_configuration as conf

from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import LearningRateScheduler
from keras.optimizers import Adam
from keras.utils import np_utils

import numpy as np
import os
# import argparse
from imutils import paths

In [10]:
# Prepare for training of the model

NR_EPOCHS = 40
LEARNING_RATE = 1e-2 # lr = 0.01
BATCH_SIZE = 32

# Let's find the total size of each split
all_training_paths = list(paths.list_images(conf.TRAIN_PATH))
total_training_examples = len(all_training_paths)

all_validation_paths = list(paths.list_images(conf.VAL_PATH))
total_validation_examples = len(all_validation_paths)

all_testing_paths = list(paths.list_images(conf.TEST_PATH))
total_testing_examples = len(all_testing_paths)

print('Total training paths :',total_training_examples)
print('Total testing paths :',total_testing_examples)
print('Total validation paths :',total_validation_examples)



Total training paths : 194267
Total testing paths : 55505
Total validation paths : 27752
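
As a quick sanity check on the counts printed above, the sketch below derives each split's share of the data and the number of complete batches per epoch at a batch size of 32. The helper names `split_fractions` and `full_batches` are illustrative and not part of the project code, and using floor division for the step count (dropping the final partial batch) is an assumption about how training will be run.

```python
# Counts copied from the cell output above
split_sizes = {"train": 194267, "test": 55505, "val": 27752}
BATCH_SIZE = 32

def split_fractions(sizes):
    """Return each split's share of the total number of images."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}

def full_batches(sizes, batch_size):
    """Return the number of complete batches per split (partial batch dropped)."""
    return {name: count // batch_size for name, count in sizes.items()}

fractions = split_fractions(split_sizes)
steps = full_batches(split_sizes, BATCH_SIZE)

for name in split_sizes:
    print(f"{name}: {fractions[name]:.1%} of data, {steps[name]} full batches")
```

The three splits come out at roughly 70, 20, and 10 percent of the images.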


In [28]:
# Let's check the class balance of the labeled data.
# The label is the name of each image's parent directory (0 or 1).
training_labels = [int(p.split(os.path.sep)[-2]) for p in all_training_paths]

# One-hot encode the categorical labels
training_labels = np_utils.to_categorical(training_labels)

# Count examples per class, then weight each class by max_count / count
class_totals = training_labels.sum(axis=0)
class_weight = class_totals.max() / class_totals

class_weight



array([1.       , 2.5151904], dtype=float32)
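
The weights above say that class 0 has about 2.5 times as many training images as class 1, so errors on class 1 should count about 2.5 times as much. The following is a minimal sketch of the same calculation on toy labels, using only NumPy. The function name `compute_class_weight` and the toy label list are made up for illustration. The result is returned as a dictionary because that is the form Keras accepts for the `class_weight` argument of `fit`; how the weights are actually passed to the model in this project is an assumption.

```python
import numpy as np

def compute_class_weight(labels):
    """Weight each class by (largest class count) / (its own count)."""
    labels = np.asarray(labels)
    # Per-class counts; equivalent to summing the one-hot columns
    class_totals = np.bincount(labels)
    weights = class_totals.max() / class_totals
    return {cls: float(w) for cls, w in enumerate(weights)}

# Toy labels: 5 examples of class 0 and 2 of class 1 (not the real dataset)
toy_labels = [0, 0, 0, 0, 0, 1, 1]
weights = compute_class_weight(toy_labels)
print(weights)  # {0: 1.0, 1: 2.5}
```

Note that this sketch assumes every class appears at least once; a class with zero examples would cause a division by zero.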

In [None]:
# Data Augmentation
train_data_aug = ImageDataGenerator(rescale=1/255.0,
                                    rotation_range=20,
                                    zoom_range=0.05,
                                    width_shift_range=0.1,
                                    height_shift_range=0.1,
                                    shear_range=0.05,
                                    horizontal_flip=True,
                                    vertical_flip=True,
                                    fill_mode='nearest')

# Test and validation images are only rescaled, never augmented
test_data_aug = ImageDataGenerator(rescale=1/255.0)
val_data_aug = ImageDataGenerator(rescale=1/255.0)
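
To make the generators above more concrete, here is a simplified NumPy stand-in for two of the training-time transforms: pixel rescaling and random horizontal/vertical flips. This is not how Keras implements `ImageDataGenerator`, and it leaves out rotation, zoom, shift, and shear. The function name `rescale_and_flip` and the 48x48 patch size are hypothetical.

```python
import numpy as np

def rescale_and_flip(image, rng):
    """Scale uint8 pixels to [0, 1], then flip each axis with probability 0.5."""
    out = image.astype(np.float32) / 255.0
    if rng.random() < 0.5:
        out = out[:, ::-1, :]  # horizontal flip: reverse the width axis
    if rng.random() < 0.5:
        out = out[::-1, :, :]  # vertical flip: reverse the height axis
    return out

rng = np.random.default_rng(0)
# Random stand-in for a biopsy patch; the 48x48 RGB shape is an assumption
patch = rng.integers(0, 256, size=(48, 48, 3), dtype=np.uint8)
augmented = rescale_and_flip(patch, rng)
print(augmented.shape, augmented.dtype)
```

Flips suit this kind of data because a tissue patch has no canonical orientation, so a mirrored patch is still a plausible example with the same label. The validation and test generators apply only the rescaling step so that evaluation runs on unaltered images.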

