# Module 2: U-Net for Cell Segmentation
---

Welcome to the practical session of [Data-Driven Life Sciences course module 2](https://ddls.aicell.io/course/ddls-2025/module-2/lab/), created by [Estibaliz Gómez de Mariscal](https://esgomezm.github.io/), with modifications by Professor Wei Ouyang and teaching assistants Songtao Cheng and Nils Mechtel.

## Introduction
This notebook shows how to design and train a [U-Net](https://en.wikipedia.org/wiki/U-Net)-like network to segment cells in phase contrast microscopy images using Keras and TensorFlow. The aim is to train the network using original phase contrast microscopy images as input, and one label image per category (background, foreground and contours) as output.

<figure>
<center>
<img src="https://drive.google.com/uc?id=18-cP68ms6vg42V2EHzheTatuJLQ0sJJz" width="750">
</figure>

This notebook is based on previous work by the following authors for the [NEUBIAS text book 2021](https://github.com/NEUBIAS/neubias-springer-book-2021).

**Authors**: [Estibaliz Gómez-de-Mariscal](https://henriqueslab.github.io/team/2021-10-01-EGdM/), [Daniel Franco-Barranco](https://danifranco.github.io), [Arrate Muñoz-Barrutia](https://image.hggm.es/es/arrate-munoz) and [Ignacio Arganda-Carreras](https://sites.google.com/site/iargandacarreras/).

## Data
The image data used in the notebook was provided by Dr. T. Becker, Fraunhofer Institution for Marine Biotechnology, Lübeck, Germany, to the [Cell Tracking Challenge](http://celltrackingchallenge.net/). The acquisition details can be found [here](http://celltrackingchallenge.net/2d-datasets/#bg-showmore-hidden-5ea69464435099020168051).
The training data set consists of 2D images of pancreatic stem cells on a polystyrene substrate. The field of view measures approximately 1152 x 922 microns, with a resolution of 1.6 x 1.6 um/pixel. In this notebook we will use the gold reference tracking annotations (ST) of the training data (frames 150-250 of both videos). The annotations were binarized for this use case. The data adapted for this notebook can be found [here](https://github.com/esgomezm/NEUBIAS_chapter_DL_2020/releases/download/1/data4notebooks.zip).

Frames of sequence 01 were used as training data, and frames of sequence 02 were used for validation (frames 150, 160, 170, ..., 250) and as test data (frames 151, 152, ..., 248, 249).

The data is organized in different folders as follows:

```
./
    |-- train_input
    |    |      t150.tif
    |    |      ...
    |-- train_binary_masks
    |    |      man_seg150.tif
    |    |        ...
    |-- train_contours
    |    |      man_seg150.png
    |    |        ...
    |-- validation_input
    |    |      t150.tif
    |    |      ...
    |-- validation_binary_masks
    |    |      man_seg150.tif
    |    |        ...
    |-- validation_contours
    |    |      man_seg150.png
    |    |        ...
    |-- test_input
    |    |      t151.tif
    |    |      ...
    |-- test_binary_masks
    |    |      man_seg151.tif
    |    |        ...
    |-- test_contours
    |    |      man_seg151.png
    |    |        ...
```


## Network architecture

We will train the standard encoder-decoder called U-Net.
<figure>
<center>
<img src="https://drive.google.com/uc?id=14zaw3eomx_2F__8STpzPaedi_r-gbZYR" width="750">
</figure>

This network was introduced by

- U-Net: Convolutional Networks for Biomedical Image Segmentation by Ronneberger et al. published on arXiv in 2015 (https://arxiv.org/abs/1505.04597)

and

- U-Net: deep learning for cell counting, detection, and morphometry by Thorsten Falk et al. in Nature Methods 2019 (https://www.nature.com/articles/s41592-018-0261-2)

Source code can be found at https://github.com/zhixuhao/unet (by zhixuhao).

## Getting started

### Familiarize yourself with the topic:

- Ensure you're comfortable with bioimage analysis, data augmentation, and U-Nets. You can create a personalized prompt in ChatGPT to help guide you through these topics.
- Remember to upload your ChatGPT conversation history in the submission form.

### Important Note for This Lab Notebook:

- **🌞 Tasks Introduction:** Sections marked with a 🌞 symbol introduce an exercise or question. Please read these sections carefully to understand the concepts and tasks involved.

- **⭐ Your Answer Here:** Cells marked with a ⭐ symbol indicate where you need to write your answer. Please provide your code or answer there.

### Now, let's start:


### Download the ZIP file with the image data and unzip it in the Google Colab content folder.

In [None]:
import zipfile

# Download file with image data
!wget 'https://github.com/esgomezm/NEUBIAS_chapter_DL_2020/releases/download/1/data4notebooks.zip'
path2zip= 'data4notebooks.zip'

# Extract locally
with zipfile.ZipFile(path2zip, 'r') as zip_ref:
    zip_ref.extractall('dataset/')

We should be able to read the list of **101 training images**, together with their corresponding binary masks and cell contours.

In [None]:
import os
# Path to the training images
train_input_path = 'dataset/train_input'
train_masks_path = 'dataset/train_binary_masks'
train_contours_path = 'dataset/train_contours'
# Read the list of file names and sort them to have a match between images and masks
train_input_filenames = [x for x in os.listdir( train_input_path ) if x.endswith(".tif")]
train_input_filenames.sort()
train_masks_filenames = [x for x in os.listdir( train_masks_path ) if x.endswith(".tif")]
train_masks_filenames.sort()
train_contours_filenames = [x for x in os.listdir( train_contours_path ) if x.endswith(".png")]
train_contours_filenames.sort()

print( 'Number of training input images: ' + str( len(train_input_filenames)) )
print( 'Number of training binary mask images: ' + str( len(train_masks_filenames)) )
print( 'Number of training contour images: ' + str( len(train_contours_filenames)) )
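
Sorting the three lists only pairs each image with its labels if every frame has a file in each folder. The following is a minimal sketch of an explicit check, assuming the naming pattern shown in the folder listing above (`t150.tif`, `man_seg150.tif`). The helpers `frame_number` and `check_pairing` and the toy file names are hypothetical and not part of the original notebook.

```python
import re

def frame_number(filename):
    """Return the trailing frame number of a file name, e.g. 't150.tif' -> 150."""
    match = re.search(r'(\d+)\.\w+$', filename)
    if match is None:
        raise ValueError('No frame number found in ' + filename)
    return int(match.group(1))

def check_pairing(input_names, label_names):
    """Return True if both lists refer to the same frames in the same order."""
    return [frame_number(f) for f in input_names] == [frame_number(f) for f in label_names]

# Toy file names following the pattern of the folder listing (illustrative only)
toy_inputs = sorted(['t151.tif', 't150.tif', 't152.tif'])
toy_masks = sorted(['man_seg150.tif', 'man_seg152.tif', 'man_seg151.tif'])
print(check_pairing(toy_inputs, toy_masks))
```

In the notebook itself, the same check could be applied to `train_input_filenames` and `train_masks_filenames` before reading the images.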

Next, we read all those images into memory and display one with its corresponding labels (masks and contours).

In [None]:
import cv2
from matplotlib import pyplot as plt

# Read training images (input, mask and contours)
train_img = [cv2.imread(os.path.join(train_input_path, x),
                        cv2.IMREAD_ANYDEPTH) for x in train_input_filenames ]
train_masks = [cv2.imread(os.path.join(train_masks_path, x),
                          cv2.IMREAD_ANYDEPTH)>0 for x in train_masks_filenames ]
train_contours = [cv2.imread(os.path.join(train_contours_path, x),
                             cv2.IMREAD_ANYDEPTH)>0 for x in train_contours_filenames ]

# display the image
plt.figure(figsize=(10,5))
plt.subplot(1, 3, 1)
plt.imshow( train_img[0], 'gray' )
plt.axis('off')
plt.title( 'Full-size training image' )
# its "mask"
plt.subplot(1, 3, 2)
plt.imshow( train_masks[0], 'gray' )
plt.axis('off')
plt.title( 'Binary mask' )
# and cell contours
plt.subplot(1, 3, 3)
plt.imshow( train_contours[0], 'gray' )
plt.axis('off')
plt.title( 'Object contour' )


To facilitate their processing, we stack the binary mask and the contours of each image into a single array, so that all training labels are stored together.

In [None]:
import numpy as np
train_output = [np.transpose(np.array([train_masks[i], train_contours[i]]),
                             [1,2,0]) for i in range(len(train_masks))]
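
The transpose in the cell above moves the label axis to the last position, so each entry has shape `(height, width, 2)`. As a small sketch on toy arrays (the values are made up for illustration), the same result can be obtained in one step with `np.stack`:

```python
import numpy as np

# Toy 4x5 boolean mask and contour (illustrative values, not real data)
mask = np.zeros((4, 5), dtype=bool)
mask[1:3, 1:4] = True
contour = np.zeros((4, 5), dtype=bool)
contour[0, 1:4] = True

# Idiom used in the notebook: stack on a new first axis, then move it to the end
via_transpose = np.transpose(np.array([mask, contour]), [1, 2, 0])
# Equivalent one-step alternative
via_stack = np.stack([mask, contour], axis=-1)

print(via_stack.shape)
print(np.array_equal(via_transpose, via_stack))
```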

Next, we read images that will be used for validation and inspect some of them visually.

In [None]:
# Path to the validation images
val_input_path = 'dataset/validation_input'
val_masks_path ='dataset/validation_binary_masks'
val_contours_path = 'dataset/validation_contours'

# Read the list of file names and sort them to have a match between images and masks
val_input_filenames = [x for x in os.listdir(val_input_path ) if x.endswith(".tif")]
val_input_filenames.sort()
val_masks_filenames = [x for x in os.listdir(val_masks_path ) if x.endswith(".tif")]
val_masks_filenames.sort()
val_contours_filenames = [x for x in os.listdir(val_contours_path ) if x.endswith(".png")]
val_contours_filenames.sort()

print( 'Images loaded: ' + str( len(val_input_filenames)) )

# Read validation images (input, mask and contours)
val_img = [cv2.imread(os.path.join(val_input_path, x), cv2.IMREAD_ANYDEPTH) for x in val_input_filenames ]
val_masks = [cv2.imread(os.path.join(val_masks_path, x), cv2.IMREAD_ANYDEPTH)>0 for x in val_masks_filenames ]
val_contours = [cv2.imread(os.path.join(val_contours_path, x), cv2.IMREAD_ANYDEPTH)>0 for x in val_contours_filenames ]

# Display the image
plt.figure(figsize=(10,5))
plt.subplot(1, 3, 1)
plt.imshow(val_img[0], 'gray' )
plt.axis('off')
plt.title( 'Full-size validation image' )
# Its "mask"
plt.subplot(1, 3, 2)
plt.imshow(val_masks[0], 'gray' )
plt.axis('off')
plt.title( 'Binary mask' )
# And cell contours
plt.subplot(1, 3, 3)
plt.imshow(val_contours[0], 'gray' )
plt.axis('off')
plt.title( 'Object contour' )

Again, to facilitate their processing, we stack the binary mask and the contours of each image into a single array, so that all validation labels are stored together.

In [None]:
# concatenate binary masks and contours
val_output = [np.transpose(np.array([val_masks[i],val_contours[i]]),
                           [1,2,0]) for i in range(len(val_masks))]

## Preparing the training data
Now, we are going to create the training set by randomly cropping the input images (and their corresponding labels) into **patches of 256 x 256 pixels**.

All images entering the network need to be normalized so that their intensities lie in the range [0,1]. For this, we will create a function that normalizes the intensity values.

🌞 <font color='orange'>**Exercises**:</font>
1. Program a function that takes an input image `x` with its corresponding ground truth `y` and crops random patches of a given shape.

2. Program a function that performs the normalization of the intensity values of an input image `x` using percentiles with `np.percentile`.

Notice that the ground truth also has its intensities scaled between 0.0 and 1.0.



In [None]:
import numpy as np

# Function to create random patches
def create_random_patches(imgs, masks, num_patches, shape):
    '''Create a list of image patches out of a list of images.
    Args:
        imgs (list): input images.
        masks (list): binary masks (output images) corresponding to imgs.
        num_patches (int): number of patches for each image.
        shape (2D array): size of the patches. Example: [256, 256].

    Returns:
        list of image patches and patches of corresponding labels (background, foreground and contours)
    '''
    input_patches = []
    output_patches = []

    height, width = shape

    for img, mask in zip(imgs, masks):
        img_height, img_width = img.shape[:2]

        for _ in range(num_patches):
            # Randomly select the top-left corner for cropping
            # (the upper bound of randint is exclusive, hence the + 1)
            x = np.random.randint(0, img_width - width + 1)
            y = np.random.randint(0, img_height - height + 1)

            # Extract the patch from both the image and its corresponding mask
            input_patch = img[y:y+height, x:x+width]
            output_patch = mask[y:y+height, x:x+width]

            input_patches.append(input_patch)
            output_patches.append(output_patch)

    return input_patches, output_patches

# Use the method to create six 256x256 pixel-sized patches per image
train_input_patches, train_output_patches = create_random_patches(train_img, train_output, 6, [256, 256])

# Normalization function using percentiles
def normalizePercentile(x, pmin=1, pmax=99.8, axis=None, eps=1e-20):
    """Percentile-based image normalization."""
    mi = np.percentile(x, pmin, axis=axis, keepdims=True)
    ma = np.percentile(x, pmax, axis=axis, keepdims=True)

    # Normalize the image and ensure the values fall between 0 and 1
    x = (x - mi) / (ma - mi + eps)
    return np.clip(x, 0, 1)  # Clip values to be between 0 and 1

# Normalize the training input patches
X_train = [normalizePercentile(x) for x in train_input_patches]  # Normalize between 0 and 1
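
To see why percentiles are used instead of the plain minimum and maximum, the sketch below compares both on a synthetic image with a single saturated pixel. The image, the random seed and the helper names `minmax_normalize` and `percentile_normalize` are assumptions made for this illustration; `percentile_normalize` restates the idea of `normalizePercentile` so that the block is self-contained.

```python
import numpy as np

def minmax_normalize(x):
    """Plain min-max scaling to [0, 1] (for comparison only)."""
    x = x.astype(np.float64)
    return (x - x.min()) / (x.max() - x.min())

def percentile_normalize(x, pmin=1, pmax=99.8, eps=1e-20):
    """Percentile-based scaling to [0, 1], clipping values outside the percentiles."""
    mi = np.percentile(x, pmin)
    ma = np.percentile(x, pmax)
    return np.clip((x - mi) / (ma - mi + eps), 0, 1)

# Synthetic 16-bit "image": intensities around 1000 plus one saturated pixel
rng = np.random.default_rng(0)
img = rng.normal(1000, 50, size=(64, 64)).clip(0, 65535).astype(np.uint16)
img[0, 0] = 65535  # hot pixel

print('median after min-max:    ', np.median(minmax_normalize(img)))
print('median after percentiles:', np.median(percentile_normalize(img)))
```

With min-max scaling the single hot pixel compresses almost all intensities towards 0, whereas the percentile version keeps them spread over the [0, 1] range.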



To follow TensorFlow conventions, the input of the network needs an explicit channel dimension, so each patch is reshaped to 256 x 256 x 1. The array containing the input images should therefore have shape `[n, 256, 256, 1]`, where `n` is the number of patches used for training.


In [None]:
# In X_train we will store the input images
X_train = np.expand_dims(X_train, axis=-1)
print('There are {} patches to train the network'.format(len(X_train)))

We want to predict three labels (background, foreground and cell contours). One way to provide the data is with one-hot encoding, for example:

```
[1, 2, 3, 2] --> [[1, 0, 0, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]]
```

Create the ground truth array `Y_train` with a **one hot encoding format**. Note that you will need as many matrices as labels (i.e., 3) for each image.
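
As a small illustration of the target format, the sketch below one-hot encodes a toy label image by indexing an identity matrix. It assumes integer labels 0 (background), 1 (foreground) and 2 (contours), and it places the class axis last, as in the `[n, 256, 256, 3]` array used for training. The toy values are made up.

```python
import numpy as np

# Toy 2x3 label image: 0 = background, 1 = foreground, 2 = contour (assumed coding)
labels = np.array([[0, 1, 2],
                   [1, 1, 0]])

# Row k of the identity matrix is the one-hot vector of class k
one_hot = np.eye(3, dtype=np.uint8)[labels]

print(one_hot.shape)                 # (height, width, num_classes)
print(one_hot[0, 2])                 # one-hot vector of the top-right pixel
print(np.argmax(one_hot, axis=-1))   # recovers the original labels
```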

In [None]:
def one_hot_encode(masks, num_classes=3):
    '''
    Convert label masks to one-hot encoding format.

    Args:
        masks (list or array): list of label masks (each mask should have pixel values corresponding to 3 classes).
        num_classes (int): number of classes (background, foreground, and contours).

    Returns:
        numpy array: One-hot encoded labels of shape (num_images, height, width, num_classes)
    '''
    # Check if masks are 2D or 3D and handle accordingly
    if len(masks[0].shape) == 2:
        height, width = masks[0].shape
    elif len(masks[0].shape) == 3:
        height, width, _ = masks[0].shape  # Ignore the channel dimension if present

    # Create a placeholder for the one-hot encoded labels
    Y_train = np.zeros((len(masks), height, width, num_classes), dtype=np.uint8)

    for i, mask in enumerate(masks):
        # A 3D mask stores one channel per label
        if mask.ndim == 3:
            if mask.shape[-1] == num_classes:
                # Already one channel per class: copy as is
                Y_train[i] = mask
            else:
                # Two channels (binary mask, contours): derive the background channel
                Y_train[i, :, :, 1] = mask[:, :, 0]  # Foreground (cells)
                Y_train[i, :, :, 2] = mask[:, :, 1]  # Contours
                Y_train[i, :, :, 0] = 1 - (mask[:, :, 0] + mask[:, :, 1])  # Background
        else:
            # 2D label image: assign each class to its corresponding channel
            Y_train[i, :, :, 0] = (mask == 0)  # Background
            Y_train[i, :, :, 1] = (mask == 1)  # Foreground (cells)
            Y_train[i, :, :, 2] = (mask == 2)  # Contours

    return Y_train

# Use the function to create the one-hot encoded ground truth
Y_train = one_hot_encode(train_output_patches, num_classes=3)


Display the results

In [None]:
# Display one patch
plt.figure(figsize=(25,5))
plt.subplot(1, 5, 1)
plt.imshow( X_train[0,:,:,0], 'gray' )
plt.axis('off')
plt.title( 'Training patch' )
# Background class
plt.subplot(1, 5, 2)
plt.imshow( Y_train[0,:,:,0], 'gray' )
plt.axis('off')
plt.title( 'Background patch (ground truth)' )
# Foreground class
plt.subplot(1, 5, 3)
plt.imshow( Y_train[0,:,:,1], 'gray' )
plt.axis('off')
plt.title( 'Foreground patch (ground truth)' )
# Object contours
plt.subplot(1, 5, 4)
plt.imshow( Y_train[0,:,:,2], 'gray' )
plt.axis('off')
plt.title( 'Object contours patch (ground truth)' )
# Reversed one hot representation
plt.subplot(1, 5, 5)
plt.imshow( np.argmax(Y_train[0], axis = -1), 'CMRmap', interpolation='nearest' )
plt.axis('off')
plt.title( 'Unique labelling of each pixel (ground truth)' )

And now we do the same for the validation images:

In [None]:
# We first create the validation patches
val_input_patches, val_output_patches = create_random_patches( val_img, val_output,
                                                              6, [256,256])

# In X_val we will store the input images
X_val = [normalizePercentile(x) for x in val_input_patches] # normalize between 0 and 1
X_val = np.expand_dims(X_val, axis=-1)

print('There are {} patches to validate the network'.format(len(X_val)))

# In Y_val we will store the one-hot representation of the labels
Y_val = [np.stack([1 - x[:,:,0] - x[:,:,1], x[:,:,0], x[:,:,1]],
                  axis=-1) for x in val_output_patches ]
Y_val = np.asarray(Y_val)

# Display one patch
plt.figure(figsize=(25,5))
plt.subplot(1, 5, 1)
plt.imshow( X_val[0,:,:,0], 'gray' )
plt.axis('off')
plt.title( 'Validation patch' )
# Background class
plt.subplot(1, 5, 2)
plt.imshow( Y_val[0,:,:,0], 'gray' )
plt.axis('off')
plt.title( 'Background patch (ground truth)' )
# Foreground class
plt.subplot(1, 5, 3)
plt.imshow( Y_val[0,:,:,1], 'gray' )
plt.axis('off')
plt.title( 'Foreground patch (ground truth)' )
# Object contours
plt.subplot(1, 5, 4)
plt.imshow( Y_val[0,:,:,2], 'gray' )
plt.axis('off')
plt.title( 'Object contours patch (ground truth)' )
# Reversed one hot representation
plt.subplot(1, 5, 5)
plt.imshow( np.argmax(Y_val[0], axis = -1), 'CMRmap', interpolation='nearest' )
plt.axis('off')
plt.title( 'Unique labelling of each pixel (ground truth)' )

🌞 <font color='orange'>**Questions for discussion:**</font>
- In this exercise we are loading the entire dataset in memory. Do you think this is an optimal solution for a real application, especially when we have more than 100 images?
- Why do you think that we need to crop patches?
- How do you decide the size of the patches?
- Explain in your own words what one-hot encoding is and why it is used in this notebook. How is it different from a simple binary representation of the labels?
- Why do we need to normalize the intensity values of the input image? How would you do it for a different image modality (e.g., RGB) or for whole slide images? Moreover, how can this affect the performance of the network on new images that have a different size or that were not cropped into patches?



---
⭐ Double click to write down your observation here


```
Answer:




```
---

## Custom segmentation metric: Jaccard index
We also define the [Jaccard index](https://en.wikipedia.org/wiki/Jaccard_index) (also known as Intersection over Union, or IoU) to monitor the segmentation performance.

**Note**: by default we skip the background label in the calculation since most of the pixels are background and the metric value would be artificially high otherwise.
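
To make the effect of the background label concrete, here is a minimal NumPy sketch on a toy example. It is not the metric used for training (that is the TensorFlow function in the next cell); `jaccard_numpy` is a hypothetical helper that, like the TensorFlow version, pools true positives, false positives and false negatives over the selected classes.

```python
import numpy as np

def jaccard_numpy(y_true, y_pred, skip_background=True):
    """IoU between two integer label images, pooled over the selected classes."""
    num_classes = int(max(y_true.max(), y_pred.max())) + 1
    # One-hot encode both label images: shape (H, W, num_classes)
    t = np.eye(num_classes, dtype=bool)[y_true]
    p = np.eye(num_classes, dtype=bool)[y_pred]
    if skip_background:
        t, p = t[..., 1:], p[..., 1:]
    tp = np.count_nonzero(p & t)    # predicted and true
    fp = np.count_nonzero(p & ~t)   # predicted but not true
    fn = np.count_nonzero(~p & t)   # true but not predicted
    total = tp + fp + fn
    return tp / total if total > 0 else 0.0

# Toy 4x4 example: one 2x2 cell, prediction shifted by one column
y_true = np.zeros((4, 4), dtype=int)
y_true[1:3, 1:3] = 1
y_pred = np.zeros((4, 4), dtype=int)
y_pred[1:3, 2:4] = 1

print(jaccard_numpy(y_true, y_pred, skip_background=True))
print(jaccard_numpy(y_true, y_pred, skip_background=False))
```

Only half of the cell is segmented correctly, yet including the background raises the score from 1/3 to 0.6, because most background pixels are trivially correct.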

In [None]:
import tensorflow as tf

def jaccard_index( y_true, y_pred, skip_background=True ):
    ''' Define Jaccard index for multiple labels.
        Args:
            y_true (tensor): ground truth masks.
            y_pred (tensor): predicted masks.
            skip_background (bool, optional): skip 0-label from calculation.
        Return:
            jac (tensor): Jaccard index value
    '''
    # We read the number of classes from the last dimension of the true labels
    num_classes = tf.shape(y_true)[-1]
    # One_hot representation of predicted segmentation after argmax
    y_pred_ = tf.one_hot(tf.math.argmax(y_pred, axis=-1), num_classes)
    y_pred_ = tf.cast(y_pred_, dtype=tf.int32)
    # y_true is already one-hot encoded
    y_true_ = tf.cast(y_true, dtype=tf.int32)
    # Skip background pixels from the Jaccard index calculation
    if skip_background:
      y_true_ = y_true_[...,1:]
      y_pred_ = y_pred_[...,1:]

    TP = tf.math.count_nonzero(y_pred_ * y_true_)
    FP = tf.math.count_nonzero(y_pred_ * (y_true_ - 1))
    FN = tf.math.count_nonzero((y_pred_ - 1) * y_true_)

    jac = tf.cond(tf.greater((TP + FP + FN), 0), lambda: TP / (TP + FP + FN),
                  lambda: tf.cast(0.000, dtype='float64'))

    return jac

## Network definition
Next, we define our U-Net-like network, with 3 resolution levels in the contracting path, a bottleneck, and 3 resolution levels in the expanding path.



In [None]:
# Create U-Net for segmentation
from tensorflow import keras
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, UpSampling2D
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Conv2D, Conv2DTranspose
from tensorflow.keras.layers import AveragePooling2D
from tensorflow.keras.layers import concatenate
from tensorflow.keras.optimizers import Adam
# We leave the height and width of the input image as "None" so the network can
# later be used on images of any size.
inputs = Input((None, None, 1))

# Contracting path
c1 = Conv2D(16, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (inputs)
c1 = Dropout(0.1) (c1)
c1 = Conv2D(16, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c1)
p1 = AveragePooling2D((2, 2)) (c1)

c2 = Conv2D(32, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (p1)
c2 = Dropout(0.2) (c2)
c2 = Conv2D(32, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c2)
p2 = AveragePooling2D((2, 2)) (c2)

c3 = Conv2D(64, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (p2)
c3 = Dropout(0.3) (c3)
c3 = Conv2D(64, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c3)
p3 = AveragePooling2D((2, 2)) (c3)

# Bottleneck
c4 = Conv2D(128, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (p3)
c4 = Dropout(0.4) (c4)
c4 = Conv2D(128, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c4)

# Expanding path
u5 = Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same') (c4)
u5 = concatenate([u5, c3])
c5 = Conv2D(64, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (u5)
c5 = Dropout(0.3) (c5)
c5 = Conv2D(64, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c5)

u6 = Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same') (c5)
u6 = concatenate([u6, c2])
c6 = Conv2D(32, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (u6)
c6 = Dropout(0.2) (c6)
c6 = Conv2D(32, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c6)

u7 = Conv2DTranspose(16, (2, 2), strides=(2, 2), padding='same') (c6)
u7 = concatenate([u7, c1], axis=3)
c7 = Conv2D(16, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (u7)
c7 = Dropout(0.1) (c7)
c7 = Conv2D(16, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c7)

# The output will consist of 3 neurons (one per class) with softmax activation
# so they represent probabilities
outputs = Conv2D(3, (1, 1), activation='softmax') (c7)

model = Model(inputs=[inputs], outputs=[outputs])

model.summary()
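
When reading the output of `model.summary()`, it can help to recompute a few parameter counts by hand. The sketch below assumes standard convolutions with a bias term, which is the Keras `Conv2D` default; `conv2d_params` is a hypothetical helper written for this illustration.

```python
def conv2d_params(in_channels, out_channels, kernel=(3, 3), bias=True):
    """Number of trainable parameters of a standard 2D convolution layer."""
    weights = kernel[0] * kernel[1] * in_channels * out_channels
    return weights + (out_channels if bias else 0)

# First block of the contracting path: 1 input channel -> 16 -> 16 filters
print(conv2d_params(1, 16))    # first Conv2D of c1
print(conv2d_params(16, 16))   # second Conv2D of c1
# Final 1x1 convolution: 16 channels -> 3 classes
print(conv2d_params(16, 3, kernel=(1, 1)))
```

Dropout and pooling layers add no trainable parameters, so they should appear with a count of 0 in the summary.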

## Training the network
Now we are almost ready to train our network! Some important training parameters to take into account:
*   `Epochs`: the maximum number of epochs the model will be trained for. The suggested maximum is 100, although the cell below sets it to 5.
*   `Patience`: the number of epochs without improvement in the monitored quantity (validation Jaccard index) after which training is stopped. The initially suggested value is 50, although the cell below sets it to 5.
*   `Batch size`:  the number of training examples in one forward/backward pass. Initially set to 10.
*   `Learning rate`:  the parameter that determines the step size to update the weights of the network at each iteration while training the model.
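
As a quick worked example of how these parameters interact, the number of weight updates per epoch follows from the number of patches and the batch size. The values below are the ones used in this notebook (101 training images, 6 patches per image, batch size 10); the variable names are chosen for this sketch only.

```python
import math

num_images = 101        # training images reported above
patches_per_image = 6   # patches cropped per image in this notebook
batch_size = 10         # value used in the training cell

num_patches = num_images * patches_per_image
# The last batch of an epoch is smaller when the data set is not a multiple of the batch size
steps_per_epoch = math.ceil(num_patches / batch_size)
print(num_patches, steps_per_epoch)
```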


Since we have more than 2 output classes, we use the categorical cross-entropy (CCE) between the expected and the predicted pixel values as the loss function, and we also include the Jaccard index as a control metric. The Jaccard index obtained on the validation set during training is used to define an early stopping criterion. This is important when you want to control the training time. Otherwise, you can wait until the network finishes training and store the weights of the checkpoint that performed best on the validation data set.
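
The patience mechanism can be sketched in plain Python as follows. This is a simplified illustration of the idea, not the Keras implementation (which offers further options such as `min_delta`); the function name and the validation values are made up for the example.

```python
def early_stopping_epoch(metric_values, patience):
    """Return (stop_epoch, best_epoch), both 0-based, for a metric to maximize.

    Training stops once `patience` consecutive epochs have passed
    without improving on the best value seen so far.
    """
    best_value = float('-inf')
    best_epoch = -1
    wait = 0
    for epoch, value in enumerate(metric_values):
        if value > best_value:
            best_value, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran through all epochs without triggering the stop
    return len(metric_values) - 1, best_epoch

# Hypothetical validation Jaccard values, one per epoch (made-up numbers)
val_jaccard = [0.40, 0.55, 0.61, 0.60, 0.59, 0.62, 0.61, 0.60, 0.58]
print(early_stopping_epoch(val_jaccard, patience=3))
print(early_stopping_epoch(val_jaccard, patience=2))
```

With a patience of 2, training stops before the later improvement at epoch 5 is reached, which shows why a patience that is too small can end training prematurely. Restoring the weights of `best_epoch` corresponds to `restore_best_weights=True` in the training cell.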


In [None]:
from tensorflow.keras.callbacks import EarlyStopping

# Training parameters
numEpochs = 5 # suggested maximum number of epochs to train is 100
patience = 5   # number of epochs to wait before stopping if no improvement
batchSize = 10  # number of samples per batch
lr = 0.0003 # learning rate
# Define early stopper to finish the training when the network does not improve
earlystopper = EarlyStopping(patience=patience, verbose=1, restore_best_weights=True,
                             monitor='val_jaccard_index', mode='max')


# Finally compile the model with Adam as optimizer, CCE as loss function and IoU
# as metric
opt = Adam(learning_rate=lr)  # Adam with specified learning rate
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=[jaccard_index])

# Train!
history = model.fit( X_train, Y_train, validation_data = (X_val, Y_val),
                    batch_size = batchSize, epochs=numEpochs,
                    callbacks=[earlystopper])

# Save the model weights to a HDF5 file
model.save_weights('unet_pancreatic_cell_segmentation_best.weights.h5')

# Save the model to a HDF5 file
model.save('unet_pancreatic_cell_segmentation_model.h5')


We can now plot the loss and metric curves for the training and validation sets.


In [None]:
plt.figure(figsize=(14,5))

# Summarize history for loss
plt.subplot(1, 2, 1)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')

# Summarize history for Jaccard index
plt.subplot(1, 2, 2)
plt.plot(history.history['jaccard_index'])
plt.plot(history.history['val_jaccard_index'])
plt.title('model Jaccard index')
plt.ylabel('Jaccard index')
plt.xlabel('epoch')
plt.legend(['train_jacc', 'val_jacc'], loc='lower right')
plt.show()

🌞 <font color='orange'>**Exercise**:</font> Play with the batch size and the learning rate. For example, check whether the network learns anything when the learning rate is too high (e.g., 0.01) or whether batch size has any effect on the final result. For this you can reduce the number of epochs so you do not need to wait for long training times.


---
⭐ Double click to write down your observation here


```
Answer:




```
---

## Check performance in the test set
Finally, we load the test images and evaluate the network on them.

🌞 <font color='orange'>**Exercises:**</font>
1. Load all the test images as was done for the training and validation sets. Note that you will need to preprocess them as well. Call the variables `X_test` and `Y_test`. Also note that the original images have a size that is not divisible by 8, so the decoder side of the U-Net cannot produce an output of the same shape as the input. Do you think that the network can process an input whose shape is different from [256, 256]? How did we make this possible in the architecture?

2. On paper, compute the image shape at each level of the U-Net when down-sampling. You should see that if an image has input shape [234,234], for example, there will be incompatibilities with the skip connections.
Generally one would create a tiling strategy to reconstruct the entire images.
To make it simpler, we will crop a patch as large as possible that is still divisible by 8.
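
The shape arithmetic of the second exercise can be checked with a few lines of plain Python. The sketch assumes three 2x2 poolings that floor odd sizes (the default `'valid'` padding of the pooling layers) and three stride-2 transposed convolutions that double the size, as in the network defined above; `trace_unet_sizes` and `largest_multiple` are hypothetical helpers.

```python
def trace_unet_sizes(size, levels=3):
    """Follow one spatial dimension through `levels` poolings and upsamplings."""
    down = [size]
    for _ in range(levels):
        down.append(down[-1] // 2)   # 2x2 pooling floors odd sizes
    up = [down[-1]]
    for _ in range(levels):
        up.append(up[-1] * 2)        # stride-2 transposed convolution doubles the size
    return down, up

def largest_multiple(size, factor=8):
    """Largest value <= size that is divisible by `factor`."""
    return (size // factor) * factor

print(trace_unet_sizes(234))
print(trace_unet_sizes(largest_multiple(234)))
```

For an input of 234 the encoder produces 234, 117, 58, 29 while the decoder produces 29, 58, 116, 232, so the skip connection at 117 versus 116 cannot be concatenated. After cropping to 232 both paths agree at every level.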

We can evaluate the network performance on the test set using both the CCE and the Jaccard index.


In [None]:
# Evaluate trained network on test images

# ⭐ Write your code here

# Normalize input images
X_test =
# One-hot label representation
Y_test =

# Evaluate the model on the test data using `evaluate`
results = model.evaluate(X_test, Y_test, batch_size=1)
print('test loss CCE: {0}, Jaccard index: {1}'.format(results[0], results[1]))



And also display some patches for qualitative evaluation.

In [None]:
print('\n# Generate predictions for the first test sample')
predictions = model.predict(X_test[:1], batch_size=1)
masks_prediction = np.array((predictions[0,:,:,0], predictions[0,:,:,1]))
contours_prediction = predictions[0,:,:,2]
print('predictions shape:', predictions.shape)

# Display corresponding first 3 patches
plt.figure(figsize=(48,15))
plt.subplot(3, 7, 1)
plt.imshow( X_test[0], 'gray' )
plt.axis('off')
plt.title( 'Test patch at low resolution' )
# Side by side with its "ground truth"
plt.subplot(3, 7, 2)
plt.imshow( np.argmax(Y_test[0], axis = -1), 'CMRmap', interpolation='nearest' )
plt.axis('off')
plt.title( 'Ground truth' )
# One hot final segmentation with argmax
plt.subplot(3, 7, 3)
plt.imshow(np.argmax(predictions[0], axis = -1), 'CMRmap', interpolation='nearest' )
plt.axis('off')
plt.title( 'Final segmentation' )
# Foreground prediction
plt.subplot(3, 7, 4)
plt.imshow( predictions[0,:,:,1], 'CMRmap')
plt.axis('off')
plt.title( 'Foreground prediction' )
# Contours predictions
plt.subplot(3, 7, 5)
plt.imshow( predictions[0,:,:,2], 'CMRmap')
plt.axis('off')
plt.title( 'Contours Prediction' )
# ZOOM test ground truth
plt.subplot(3, 7, 6)
plt.imshow( np.argmax(Y_test[0,100:200, 100:200], axis=-1), 'CMRmap', interpolation='nearest' )
plt.axis('off')
plt.title( 'Zoomed ground truth' )
# ZOOM prediction
plt.subplot(3, 7, 7)
plt.imshow( np.argmax(predictions[0][100:200,100:200], axis = -1), 'CMRmap', interpolation='nearest')
plt.axis('off')
plt.title( 'Zoomed prediction' )


predictions = model.predict(np.expand_dims(X_test[50], axis = 0), batch_size=1)

plt.subplot(3, 7, 8)
plt.imshow( X_test[50], 'gray' )
plt.axis('off')
plt.title( 'Test patch at low resolution' )
# Side by side with its "ground truth"
plt.subplot(3, 7, 9)
plt.imshow( np.argmax(Y_test[50], axis = -1), 'CMRmap', interpolation='nearest' )
plt.axis('off')
plt.title( 'Ground truth' )
# and one hot final segmentation with argmax
plt.subplot(3, 7, 10)
plt.imshow(np.argmax(predictions[0], axis = -1), 'CMRmap', interpolation='nearest' )
plt.axis('off')
plt.title( 'Final segmentation' )
# foreground prediction
plt.subplot(3, 7, 11)
plt.imshow( predictions[0,:,:,1], 'CMRmap')
plt.axis('off')
plt.title( 'Foreground prediction' )
# contours predictions
plt.subplot(3, 7, 12)
plt.imshow( predictions[0,:,:,2], 'CMRmap')
plt.axis('off')
plt.title( 'Contours Prediction' )
# ZOOM test ground truth
plt.subplot(3, 7, 13)
plt.imshow( np.argmax(Y_test[50,100:200, 100:200], axis=-1), 'CMRmap', interpolation='nearest'  )
plt.axis('off')
plt.title( 'Zoomed ground truth' )
# ZOOM prediction
plt.subplot(3, 7, 14)
plt.imshow( np.argmax(predictions[0][100:200,100:200], axis = -1), 'CMRmap', interpolation='nearest')
plt.axis('off')
plt.title( 'Zoomed prediction' )

predictions = model.predict(np.expand_dims(X_test[80], axis = 0), batch_size=1)

plt.subplot(3, 7, 15)
plt.imshow( X_test[80], 'gray' )
plt.axis('off')
plt.title( 'Test patch at low resolution' )
# Side by side with its "ground truth"
plt.subplot(3, 7, 16)
plt.imshow( np.argmax(Y_test[80], axis = -1), 'CMRmap', interpolation='nearest' )
plt.axis('off')
plt.title( 'Ground truth' )
# and one hot final segmentation with argmax
plt.subplot(3, 7, 17)
plt.imshow(np.argmax(predictions[0], axis = -1), 'CMRmap', interpolation='nearest' )
plt.axis('off')
plt.title( 'Final segmentation' )
# foreground prediction
plt.subplot(3, 7, 18)
plt.imshow( predictions[0,:,:,1], 'CMRmap')
plt.axis('off')
plt.title( 'Foreground prediction' )
# contours predictions
plt.subplot(3, 7, 19)
plt.imshow( predictions[0,:,:,2], 'CMRmap')
plt.axis('off')
plt.title( 'Contours Prediction' )
# ZOOM test ground truth
plt.subplot(3, 7, 20)
plt.imshow( np.argmax(Y_test[80,100:200, 100:200], axis=-1), 'CMRmap', interpolation='nearest')
plt.axis('off')
plt.title( 'Zoomed ground truth' )
# ZOOM prediction
plt.subplot(3, 7, 21)
plt.imshow( np.argmax(predictions[0][100:200,100:200], axis = -1), 'CMRmap', interpolation='nearest')
plt.axis('off')
plt.title( 'Zoomed prediction' )

## Build a Web App to Deploy Your Trained U-Net

You’ve trained a U-Net for image segmentation. In this lab, you’ll package it as a simple web app so colleagues without programming experience can try it in a browser: upload a TIFF → see the segmentation result.

### Get prepared
You can learn the basics by discussing them with your favourite chat assistant, such as ChatGPT, GitHub Copilot, or Gemini CLI. For example:
 - Write a prompt that explains what you are trying to do, e.g., you want to learn the basics of web app development in Python with the aim of creating your own image segmentation app based on U-Net.
 - Ask what a web application is and how it is normally set up.
 - Ask what the frontend and the backend are.
 - Familiarize yourself with a popular web app server framework such as `FastAPI`.
 - For the frontend, familiarize yourself with the basics of HTML, CSS, and JavaScript.

### Create Instructions for your AI Agent

Once you have a basic idea of how a web app works, create a `GEMINI.md` file under the `Module2` folder. The goal is to prepare instructions for your Gemini CLI agent so that it can help you build the web app. This is a crucial context engineering step. Please be patient here: you need to sit down and write very detailed instructions that give the AI enough context to help you.

There are many existing examples to learn from. For example, https://github.com/dontriskit/awesome-ai-system-prompts collects many real-world system prompts; browse through it and you will find examples such as the [Lovable prompt](https://github.com/dontriskit/awesome-ai-system-prompts/blob/main/Loveable/Prompt.md). You don't need to understand everything, but reading the text will give you a basic sense of what a production-level prompt looks like.


Now follow the [core principles](https://github.com/dontriskit/awesome-ai-system-prompts?tab=readme-ov-file#the-foundation-core-principles-of-agentic-prompts) outlined in the README of that repository. Try to include the following details in your `GEMINI.md`, written in Markdown syntax:
 - Set a role for the AI agent.
 - Describe the context: explain clearly what you have done and what you want to achieve.
 - **Importantly:** include the full file path to the trained U-Net model file, and tell the AI agent that the framework is TensorFlow and what the input image format is. Also mention that you have a notebook file which produced the model.
 - Tell it to use `fastapi` for the backend server.
 - For the frontend, say that you want a single-page web app styled with the CSS framework Tailwind CSS.
 - Use an AI assistant to polish your `GEMINI.md` file.


### Start building

Now move on with building the web app by starting the `gemini` CLI in your VS Code terminal.

Tell it to create a web app that serves the U-Net model, and explain that you want to be able to launch the server later, upload a TIFF image, and see the segmentation in your web browser.

After building the app, ask Gemini for the command that starts the server. You might also need **Port Forwarding** (see below) to actually see the app in your browser.

To test the app, download [this example image](https://raw.githubusercontent.com/aicell-lab/ddls-course/main/static/uploads/ddls_2025_lab2_example_image.tif) and select it in your U-Net app to check whether it works.

If it doesn't work, tell the agent what the issue is and what you want to change. Sometimes you will need the browser developer tools to read errors in the browser console and debug the frontend code, e.g. [in Chrome](https://developer.chrome.com/docs/devtools/open).
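
When the page shows nothing useful, it can help to test the backend on its own, without the frontend. The sketch below uploads a TIFF from Python using only the standard library. It is a minimal example under assumptions: the route `/predict` and the form field name `file` are hypothetical placeholders, so check the code your agent generated (or ask it) for the actual route and field name.

```python
import os
import urllib.request
import uuid


def build_multipart(field_name, filename, payload, content_type="image/tiff"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def upload_tiff(url, path, field_name="file"):
    """POST the file at `path` to `url`; returns (HTTP status, response bytes).

    `field_name` is an assumption: it must match the upload parameter
    that your backend actually declares.
    """
    with open(path, "rb") as fh:
        payload = fh.read()
    body, header = build_multipart(field_name, os.path.basename(path), payload)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.status, response.read()


# Example call (hypothetical route; run it only while your server is up):
# status, data = upload_tiff("http://localhost:8000/predict",
#                            "ddls_2025_lab2_example_image.tif")
# print(status, len(data))
```

If the request raises an `HTTPError`, the status code and the server log in your terminal usually point to the backend problem, which you can then paste back to the agent.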

## Note on Port Forwarding

Since you’re running your FastAPI server **on a remote machine** (e.g., a university server or a cloud VM), it’s not automatically visible on the public internet. By default, it can only be reached from inside that machine. To open your web app in the browser on your own laptop, and to share it with others, you need to create a **tunnel**. This process is called **port forwarding**.

### What is port forwarding?

* Your FastAPI server runs on a specific port (e.g., `http://localhost:8000`).
* But “localhost” on the remote server is not the same as “localhost” on your laptop.
* Port forwarding creates a secure bridge: VS Code relays traffic between the port on the remote machine and a special URL that you (and optionally others) can open in a browser.

In short: **without port forwarding, your app stays hidden; with it, you get a shareable link.**
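
Before forwarding a port, it is worth confirming that the server is actually listening on the remote machine. The snippet below is a small sketch using only Python's standard library; the helper name `is_port_open` is our own, and port `8000` is just the example port used in this note. Run it in a terminal on the remote machine, not on your laptop.

```python
import socket


def is_port_open(host="127.0.0.1", port=8000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


# Example: check the port your server is supposed to use.
# print(is_port_open(port=8000))
```

If this returns `False`, the server is not running (or uses a different port), and forwarding the port will not help until that is fixed.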

### How to set it up in VS Code

1. Start your FastAPI server inside the VS Code terminal:

   ```bash
   uvicorn backend.app:app --reload --port 8000
   ```
2. In VS Code, look at the bottom panel tabs (“PROBLEMS”, “OUTPUT”, “TERMINAL”, etc.).
3. Click on the **PORTS** tab.
4. You should see port `8000` listed (if not, click ➕ and add it manually).
5. In the PORTS panel, you’ll see a globe icon 🌐. Click it to open the forwarded URL in your browser.
6. By default, the link is **Private** (only you). To share:

   * Right-click the port entry → **Change Port Visibility** → set to **Public**.
   * Now you’ll have a globally accessible URL to share with colleagues.

⚠️ **Important:** Be mindful that making the app public means anyone with the link can use it. Do this only for demo purposes.