# Triple Negative Breast Cancer (TNBC) Cell Semantic Segmentation

This notebook applies the [U-Net](https://arxiv.org/abs/1505.04597) convolutional neural network to semantic segmentation of TNBC cell images.

The dataset for the task can be downloaded from [Zenodo](https://zenodo.org/record/1175282#.Xl_4nZMzZQJ).

**Flow of the notebook:**
- Apply U-Net to the standard dataset
- Plot the network's performance
- Show sample test segmentation results
- Apply U-Net to the dataset "overlaid" with Canny edges
- Plot the network's performance
- Show sample test segmentation results
- Compare the network's performance on both datasets

Let's get started!

In [16]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [17]:
import os

path = "/content/drive/MyDrive/U-Net-Breast-Cancer-Image-Segmentation-master"
os.chdir(path)

# Triple Negative Breast Cancer

*Triple-negative breast cancer (TNBC) accounts for about 10-15% of all breast cancers. These cancers tend to be more common in women younger than age 40, or who are African-American.*

*Triple-negative breast cancer differs from other types of invasive breast cancer in that they grow and spread faster, have limited treatment options, and a worse prognosis (outcome).* - **American Cancer Society**

Early-stage detection is therefore essential for giving patients proper treatment and reducing the risk of death, since detecting these cancer cells at later stages leads to more suffering and a higher chance of death. Semantic segmentation of cancer cell images can help improve the analysis and diagnosis of breast cancer. Below is one such attempt.

# U-Net

U-Net is a state-of-the-art CNN architecture for biomedical image segmentation. *The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization.* It is a Fully Convolutional Network (FCN), so in principle it can **work with images of arbitrary size!**
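
One practical caveat to "arbitrary size": the `unet()` function defined later in this notebook uses four 2x2 max-pooling steps and concatenates encoder and decoder feature maps, so heights and widths need to be multiples of 16 for the skip connections to line up. The helper below is a hypothetical illustration (it is not part of `model.py`) that traces one spatial dimension through the network, assuming "same" padding so that convolutions leave the size unchanged.

```python
def unet_spatial_sizes(size, depth=4):
    """Trace one spatial dimension through a U-Net with `depth` pool/upsample steps.

    Assumes 'same' padding (convolutions keep the size), 2x2 max pooling
    (floor division by 2) and 2x upsampling.
    Returns (encoder_sizes, decoder_sizes, compatible).
    """
    encoder = [size]
    for _ in range(depth):
        encoder.append(encoder[-1] // 2)  # pooling floors odd sizes

    decoder = []
    current = encoder[-1]
    compatible = True
    for level in range(depth - 1, -1, -1):
        current *= 2  # upsampling doubles the size
        decoder.append(current)
        # The skip connection concatenates with the encoder map at this level,
        # which only works if both have the same spatial size.
        if current != encoder[level]:
            compatible = False
    return encoder, decoder, compatible


print(unet_spatial_sizes(256))  # ([256, 128, 64, 32, 16], [32, 64, 128, 256], True)
print(unet_spatial_sizes(250)[2])  # False: 250 is not a multiple of 16
```

Under these assumptions, inputs whose sides are not multiples of 16 would need to be padded or resized before being fed to the network.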

<img src="img/U-Net_arch.png">

In [21]:
! pip install tensorflow==1.15.2

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
import tensorflow
print(tensorflow.__version__)

ERROR:root:Internal Python error in the inspect module.
Below is the traceback from this internal error.



Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-15-53968207222e>", line 1, in <module>
    import tensorflow
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/__init__.py", line 99, in <module>
    from tensorflow_core import *
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/__init__.py", line 28, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
ImportError: cannot import name 'pywrap_tensorflow' from 'tensorflow.python' (unknown location)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/IPython/core/ultratb.py", line 1132, in get_records
    return _fixed_getinnerframes(etb, number_of_lines_of_context, tb_offset)
  File "/usr/local/lib/python3.7/di

ImportError: ignored

In [18]:
!pip install tf-nightly-gpu

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [19]:
# To ensure GPU is enabled on Colab

%matplotlib inline
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

ERROR:root:Internal Python error in the inspect module.
Below is the traceback from this internal error.



Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-19-bb582619fdbe>", line 4, in <module>
    import tensorflow as tf
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/__init__.py", line 99, in <module>
    from tensorflow_core import *
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/__init__.py", line 28, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
ImportError: cannot import name 'pywrap_tensorflow' from 'tensorflow.python' (unknown location)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/IPython/core/ultratb.py", line 1132, in get_records
    return _fixed_getinnerframes(etb, number_of_lines_of_context, tb_offset)
  File "/usr/local/lib/python

ImportError: ignored

## 1- Import required modules

In [20]:
from model import *
from augmentation import *
from metrics import *
from plots import *
from utils import *

ERROR:root:Internal Python error in the inspect module.
Below is the traceback from this internal error.



Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-20-942b1f6a67f9>", line 1, in <module>
    from model import *
  File "/content/drive/.shortcut-targets-by-id/1312aga1iOmQ2u72zeDzzFUpEQFd4oIKo/U-Net-Breast-Cancer-Image-Segmentation-master/model.py", line 8, in <module>
    from tensorflow.keras.models import *
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 959, in _find_and_load_unlocked
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/__init__.py", line 50, in __getattr__
    module = self._load()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/__init__.py", line 44, in _load
    module = _importlib.import_module(self.__name__)
  File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_impor

KeyError: ignored

### 1.1- How to arrange Directories for using ImageDataGenerator.flow_from_directory()?

- train
    * images
        * img
    * label
        * img
- test
    * images
        * img
    * label
        * img
        
**train, test, images, label, img** are all directories, where *img* is the innermost directory containing the .png files (input images or segmentation masks).
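
The extra *img* level is needed because `flow_from_directory()` treats each subdirectory of the directory it is pointed at as a class. As a minimal sketch, the layout above could be created as follows; `make_dataset_dirs` is a hypothetical helper name, and the demo runs in a temporary directory rather than in the project folder.

```python
import os
import tempfile


def make_dataset_dirs(root):
    """Create the nested train/test layout described above under `root`."""
    created = []
    for split in ("train", "test"):
        for kind in ("images", "label"):
            # The innermost 'img' directory is the one that holds the .png files.
            path = os.path.join(root, split, kind, "img")
            os.makedirs(path, exist_ok=True)
            created.append(path)
    return created


# Demonstrate in a temporary directory so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    for p in make_dataset_dirs(tmp):
        print(os.path.relpath(p, tmp))
```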

In [None]:
# The 'model.py' module contains the U-Net architecture definition, which is the model we use for semantic segmentation

import os

import numpy as np
import skimage.io as io
import skimage.transform as trans
from tensorflow.keras.models import *
from tensorflow.keras.layers import *
from tensorflow.keras.optimizers import *
from tensorflow.keras.callbacks import ModelCheckpoint, LearningRateScheduler
from tensorflow.keras import backend as keras  # note: `keras` here is bound to the Keras backend module


def unet(pretrained_weights = None,input_size = (256,256,1)):
    """Builds a Keras Model instance. The architecture is similar to the original U-Net
        architecture, except that it uses "same" padding instead of the "valid" padding used by the authors.
        Using "same" padding throughout gives the output segmentation mask the same (height, width) as the input.
        For the detailed U-Net architecture see: https://arxiv.org/abs/1505.04597

    Args:
        pretrained_weights (str): Path to an .hdf5 weights file to load into the model (optional)
        input_size (tuple): Input shape of images to the model

    Returns:
        model (Model): Keras Model instance with the U-Net architecture
    """
    inputs = Input(input_size)
    conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(inputs)
    conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool1)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool2)
    conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool3)
    conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv4)
    drop4 = Dropout(0.5)(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)

    conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool4)
    conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv5)
    drop5 = Dropout(0.5)(conv5)

    up6 = Conv2D(512, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(drop5))
    merge6 = concatenate([drop4,up6], axis = 3)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge6)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv6)

    up7 = Conv2D(256, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv6))
    merge7 = concatenate([conv3,up7], axis = 3)
    conv7 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge7)
    conv7 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv7)

    up8 = Conv2D(128, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv7))
    merge8 = concatenate([conv2,up8], axis = 3)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge8)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv8)

    up9 = Conv2D(64, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv8))
    merge9 = concatenate([conv1,up9], axis = 3)
    conv9 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge9)
    conv9 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
    conv9 = Conv2D(2, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
    conv10 = Conv2D(1, 1, activation = 'sigmoid')(conv9)

    model = Model(inputs = inputs, outputs = conv10)

    # If pre-trained weights are provided, load them
    if pretrained_weights:
        model.load_weights(pretrained_weights)

    return model




In [None]:
# Load and initialise the U-Net network

m=unet()
m.summary()

In [None]:
from tensorflow.keras.utils import plot_model
plot_model(m, to_file=os.path.join('/content/drive/MyDrive/U-Net-Breast-Cancer-Image-Segmentation-master','model_plot.png'), show_shapes=True, show_layer_names=True)

## 2- Model training on Standard Dataset

In [None]:
opt = Adam(learning_rate=1e-6, beta_1=0.9, beta_2=0.999, epsilon=1e-08) # `learning_rate` replaces the deprecated `lr` argument
m.compile(loss=dice_coef_loss, optimizer=opt, metrics=['accuracy', iou, F1, recall, precision]) # Keeping track of these metrics


### 2.1- Why Data Augmentation?

Our training set has **only 33 images**, which is tiny compared to modern datasets like [ImageNet](http://www.image-net.org/), which has over 1M annotated examples. *But this is generally the case in biomedical tasks.* I have therefore used data augmentation extensively to enlarge the dataset.
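
The actual augmentation lives in `augmentation.py`, which is not reproduced here. The sketch below only illustrates the key constraint for segmentation tasks: whatever random transform is applied to an image must be applied identically to its mask. It uses plain nested lists to stay dependency-free, and the helper names are hypothetical.

```python
import random


def hflip(grid):
    """Mirror a 2-D grid (list of rows) left to right."""
    return [row[::-1] for row in grid]


def vflip(grid):
    """Mirror a 2-D grid top to bottom."""
    return grid[::-1]


def augment_pair(image, mask, rng):
    """Apply the same randomly chosen flips to an image and its mask."""
    if rng.random() < 0.5:
        image, mask = hflip(image), hflip(mask)
    if rng.random() < 0.5:
        image, mask = vflip(image), vflip(mask)
    return image, mask


image = [[10, 20], [30, 40]]
mask = [[0, 1], [0, 0]]  # marks the pixel with intensity 20
rng = random.Random(0)
for _ in range(3):
    aug_image, aug_mask = augment_pair(image, mask, rng)
    print(aug_image, aug_mask)
```

In every augmented pair the mask still marks the pixel with intensity 20, which is the property a real augmentation pipeline has to preserve.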

### 2.2- Why haven't I used ImageNet for Transfer Learning?

You might be wondering why I haven't done "transfer learning" from ImageNet or a similar dataset. After all, such pre-training is standard practice in Deep Learning.

ImageNet is a "natural image" dataset, and here I'm tackling a very specific problem whose images are very different from natural images. Such pre-training would therefore provide *little* benefit to the performance. For a detailed insight into this, check [this](https://arxiv.org/abs/1902.07208) wonderful paper, which digs deep into transfer learning for medical tasks.

In [None]:
!apt-cache policy libcudnn8

# Install latest version
!apt install --allow-change-held-packages libcudnn8=8.4.1.50-1+cuda11.6

# Export env variables (`!export` only affects a throwaway subshell, so set them through os.environ instead)
import os
os.environ['PATH'] = '/usr/local/cuda-11.4/bin' + os.pathsep + os.environ.get('PATH', '')
os.environ['LD_LIBRARY_PATH'] = os.pathsep.join([
    '/usr/local/cuda-11.4/include',
    '/usr/local/cuda-11.4/lib64',
    os.environ.get('LD_LIBRARY_PATH', ''),
    '/usr/local/cuda/extras/CUPTI/lib64',
])

# Install tensorflow
!pip install tflite-model-maker==0.4.0
!pip uninstall -y tensorflow && pip install -q tensorflow==2.9.1
!pip install pycocotools==2.0.4
!pip install opencv-python-headless==4.6.0.66

In [None]:
checkpoint = ModelCheckpoint('unet_weights.hdf5', monitor='loss', 
                             verbose=1, save_best_only=True, mode='min') # Checkpoint to store "only" the best weights during training
                                                                         # Weights will be saved in file named 'unet_weights.hdf5'
train_generator=train_data_aug() # Performs real-time Data Augmentation on the Training dataset. See augmentation.py for more details
results = m.fit(train_generator, epochs=50, steps_per_epoch=16, callbacks=[checkpoint]) # Model.fit accepts generators in TF 2.x; fit_generator is deprecated

Found 35 images belonging to 1 classes.
Found 35 images belonging to 1 classes.


AttributeError: ignored

## 3- Plotting model's training history

### 3.1- Why is the Learning Curve such?

I have trained this model many times before, and in earlier runs the learning curve looked as a good learning curve should, i.e. noisy but "trending" downward for the loss function.

In [None]:
from plots import *

training_history_plot(results) # Plots "training curve" for the network/model for metrics listed above. See plots.py for more details

## 4- Model's Performance on various Metrics

In [None]:
titles = ['Dice Loss','Accuracy','IOU','F1','Recall','Precision']
test_generator=test_data_aug() # Performs real-time Data Augmentation (here only re-scaling and converting to grayscale) on the Test/Validation dataset. See augmentation.py for more details
performance=m.evaluate(test_generator, verbose=1, steps=5) # Model.evaluate accepts generators in TF 2.x; evaluate_generator is deprecated

for title, value in zip(titles, performance):
  print("%s = %f" % (title, value))
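
The functions `dice_coef_loss`, `iou`, `F1`, `recall` and `precision` come from `metrics.py`, which is not reproduced here. As a reference for how such metrics are conventionally defined on binary masks, here is a dependency-free sketch. The function name, the absence of a smoothing term and the choice of `1 - dice` as the loss are assumptions and may differ from the notebook's implementation.

```python
def binary_mask_metrics(y_true, y_pred):
    """Compute overlap metrics for two flat binary masks of equal length."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a mask is empty.
        return num / den if den else 0.0

    dice = ratio(2 * tp, 2 * tp + fp + fn)  # equals F1 for binary masks
    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "dice": dice,
        "dice_loss": 1.0 - dice,
        "iou": ratio(tp, tp + fp + fn),
    }


truth = [1, 1, 0, 0]
pred = [1, 0, 1, 0]
print(binary_mask_metrics(truth, pred))
```

For this toy pair there is one true positive, one false positive and one false negative, so precision, recall and Dice are all 0.5 while IoU is 1/3.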

### 4.1- Structure of test2 directory

- test2
    * 0
        * 0
            * 0.png
    * 1
        * 1
            * 1.png
            
This odd file structure is needed because each image must sit inside **"two" nested directories** within the container directory (test2).

In [None]:
results=np.zeros(shape=(5,256,256,1))
for i in range(5): # Predict on 5 test images
  results[i,:,:,:]=predict(i, m) # Predicts the segmentation labels on images in test2 directory. See utils.py for more details

## 5- Sample Results

Starting from the left:

- The first image is the original test image converted to grayscale
- The second is the predicted segmentation for that image
- The third is a binary mask (pixel values of only 0 and 1) obtained by thresholding the predicted segmentation; with the threshold of 0.2 used below, every pixel with a predicted value greater than 0.2 becomes 1 and all others become 0
- The rightmost is the ground-truth segmentation label for this test image
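
The thresholding step described in the list above can be sketched as follows. This is an illustration on a tiny hand-made grid, not the code from `plots.py`; `binarize` is a hypothetical helper name.

```python
def binarize(prediction, threshold=0.2):
    """Turn a 2-D grid of predicted values into a 0/1 mask."""
    return [[1 if value > threshold else 0 for value in row] for row in prediction]


predicted = [[0.05, 0.20, 0.21],
             [0.90, 0.10, 0.50]]
print(binarize(predicted))  # [[0, 0, 1], [1, 0, 1]]
```

Note that the comparison is strict, so a pixel exactly equal to the threshold is mapped to 0.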
    
Below we see that the segmentation results are very good, considering that we had only 33 images in our training dataset, which is very limited!

In [None]:
results.shape


In [None]:
model_prediction_plot(results, t=0.2) # See plots.py for more details

## 6- Model training on Canny Dataset

## 7- Plotting model's training history on Canny dataset

### 7.1- Why is the Learning Curve such?

I have trained this model many times before, and in earlier runs the learning curve looked as a good learning curve should, i.e. noisy but "trending" downward for the loss function.

## 8- Model's Performance on various Metrics trained on Canny dataset

## 9- Activation Map on Canny dataset

Below is the visualisation of Activation Maps of the model trained on the Canny dataset. These visuals are the *activations or the output* of a given layer and channel of the U-Net CNN. They tell us **what the model has learnt**, or more specifically what the convolutional filters have learnt! They also give a sense of the **other Biological/Medical features** in the image.

Starting from the left:

- The first image is the original test image
- The second is the Activation Map for the provided layer and channel
- The third is the transparent "overlay" of the Activation Map on the test image

For more information see plots.py
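
The exact overlay code is in `plots.py` and is not shown here. One common way to produce such a transparent overlay is alpha blending, sketched below for single-channel values in [0, 1]. The function name and the default `alpha` are illustrative assumptions, not the notebook's actual settings.

```python
def alpha_blend(image, heatmap, alpha=0.5):
    """Blend two equally sized 2-D grids: alpha * heatmap + (1 - alpha) * image."""
    return [
        [alpha * h + (1.0 - alpha) * i for i, h in zip(img_row, heat_row)]
        for img_row, heat_row in zip(image, heatmap)
    ]


image = [[0.0, 1.0], [0.5, 0.5]]
heatmap = [[1.0, 1.0], [0.0, 1.0]]
print(alpha_blend(image, heatmap, alpha=0.5))  # [[0.5, 1.0], [0.25, 0.75]]
```

With `alpha=0` only the test image is visible and with `alpha=1` only the activation map, so intermediate values control how transparent the overlay appears.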

**Note:** I'm using the **'jet' cmap**, so in the Activation Map red corresponds to high activation values, blue to low ones, and green/yellow to values in between.

The last layer's only filter has learnt to **segment Cancer cells**, marked in red in the second image! When overlaid on the original image, it clearly distinguishes Cancer and non-Cancer cells.

As clearly seen above, the corresponding filter has learnt to identify **empty regions** (see the red region in the centre image; remember that red corresponds to high activation in the Activation Map). Also note that Cancer cells have low activations here, marked in blue, so in a way *this filter is ignoring those cells.*

Above is the visual for a middle Conv layer, so it sees neither the entire image as it is nor its segmentation mask! The light blue marks in the Activation Map show that this filter is *tending to learn to segment Cancer cells.*

## 12- References

1. [U-Net: Convolutional Networks for Biomedical Image Segmentation](https://arxiv.org/abs/1505.04597)
2. [Triple Negative Breast Cancer- American Cancer Society](https://www.cancer.org/cancer/breast-cancer/understanding-a-breast-cancer-diagnosis/types-of-breast-cancer/triple-negative.html)
3. [Deep Learning for Cancer Cell Detection and Segmentation: A Survey](https://www.researchgate.net/publication/334080872_Deep_Learning_for_Cancer_Cell_Detection_and_Segmentation_A_Survey)
4. [Transfusion: Understanding Transfer Learning for Medical Imaging](https://arxiv.org/abs/1902.07208)
5. [Dataset](https://zenodo.org/record/1175282#.Xl_4nZMzZQJ)

**Note:** Not an exhaustive list