# Image Segmentation with U-Net on the Oxford-IIIT Pet Dataset

## Introduction

In this notebook, we will implement an image segmentation model using the U-Net architecture on the Oxford-IIIT Pet Dataset. The notebook includes detailed explanations and comments in the code to facilitate understanding. We will:

- Load and preprocess the dataset.
- Visualize sample images and their corresponding masks.
- Build a U-Net model with detailed comments.
- Train the model.
- Evaluate the model's performance and visualize the predictions.
- Explore further improvements like data augmentation and additional metrics.

---

## Import Libraries

First, we import the necessary libraries.

In [1]:
# Import TensorFlow and TensorFlow Datasets
import tensorflow as tf
import tensorflow_datasets as tfds

# Import Matplotlib for visualization
import matplotlib.pyplot as plt

# Import Keras layers and models for building the U-Net
from tensorflow.keras import layers, models

---
## Load the Oxford-IIIT Pet Dataset
We will use TensorFlow Datasets (TFDS) to load the Oxford-IIIT Pet Dataset, which includes images of pets and their corresponding segmentation masks.

In [2]:
# Load the dataset with info
dataset, info = tfds.load('oxford_iiit_pet:3.*.*', with_info=True, data_dir="./data/")

- data_dir="./data/" specifies the directory where the dataset will be stored.
- with_info=True returns the dataset info, which contains metadata like the number of examples.

---
## Explore the Raw Mask Data
Before preprocessing, let's examine the raw segmentation masks to understand their structure and unique values.

In [3]:
# Access the training split of the dataset
raw_train_dataset = dataset['train']

# Iterate over one example to inspect the mask values
for datapoint in raw_train_dataset.take(1):
    # Extract the segmentation mask
    raw_mask = datapoint['segmentation_mask']
    # Get the unique values in the mask
    unique_values_raw = tf.unique(tf.reshape(raw_mask, [-1])).y.numpy()
    print("Unique values in the raw mask:", unique_values_raw)


Unique values in the raw mask: [2 3 1]


Explanation:

- The raw masks contain pixel values ranging from 0 to 3:
    - 0: Background
    - 1: Pet (foreground)
    - 2: Border/outline
    - 3: Not used in our case

Our goal is to simplify the masks to have only two classes: background (0) and pet (1).

---
## Set Image Size
We define the image size to which all images and masks will be resized. Adjust IMG_SIZE based on your computational resources.

In [4]:
# Define image size for resizing
IMG_SIZE = 128  # You can adjust this value (e.g., 128, 256)


---
## Define Preprocessing Functions
### Normalize Function
The normalize function performs the following tasks:

Converts input images to float values in the range [0, 1].
Adjusts the masks to have only two values: 0 (background) and 1 (pet).

In [5]:
def normalize(input_image, input_mask):
    """Normalize images to [0,1] and adjust masks to have values 0 and 1."""
    # Convert image to float32 and scale from [0, 255] to [0, 1]
    input_image = tf.cast(input_image, tf.float32) / 255.0
    # Ensure mask is of type int32
    input_mask = tf.cast(input_mask, tf.int32)
    
    # Map label 2 (border/outline) and label 3 to label 1 (foreground)
    input_mask = tf.where(input_mask == 2, 1, input_mask)
    input_mask = tf.where(input_mask == 3, 1, input_mask)
    
    # Ensure that background remains 0, and foreground is 1
    input_mask = tf.where(input_mask != 1, 0, input_mask)
    
    return input_image, input_mask


### Load and Preprocess Training Images
The load_image_train function:

Resizes images and masks to the defined IMG_SIZE.
Applies random data augmentation (horizontal flip).
Normalizes the images and masks.

In [9]:
def load_image_train(datapoint):
    """Preprocess training images and masks with augmentation."""
    # Resize the input image and mask
    input_image = tf.image.resize(datapoint['image'], (IMG_SIZE, IMG_SIZE))
    input_mask = tf.image.resize(
        datapoint['segmentation_mask'], (IMG_SIZE, IMG_SIZE), method='nearest')

    # Data augmentation: Random horizontal flip
    if tf.random.uniform(()) > 0.5:
        # Flip image and mask horizontally
        input_image = tf.image.flip_left_right(input_image)
        input_mask = tf.image.flip_left_right(input_mask)

    # Normalize the image and adjust the mask labels
    input_image, input_mask = normalize(input_image, input_mask)
    return input_image, input_mask

### Load and Preprocess Test Images
The load_image_test function:

Resizes images and masks to the defined IMG_SIZE.
Normalizes the images and masks (without augmentation).

In [8]:
def load_image_test(datapoint):
    """Preprocess test images and masks without augmentation."""
    # Resize the input image and mask
    input_image = tf.image.resize(datapoint['image'], (IMG_SIZE, IMG_SIZE))
    input_mask = tf.image.resize(
        datapoint['segmentation_mask'], (IMG_SIZE, IMG_SIZE), method='nearest')

    # Normalize the image and adjust the mask labels
    input_image, input_mask = normalize(input_image, input_mask)
    return input_image, input_mask

---
## Prepare the Dataset for Training and Testing
We now apply the preprocessing functions to the training and test datasets and prepare them for batching and shuffling.

In [10]:
# Dataset parameters
TRAIN_LENGTH = info.splits['train'].num_examples  # Number of training examples
BATCH_SIZE = 16                                   # Batch size
BUFFER_SIZE = 1000                                # Buffer size for shuffling

# Prepare training dataset
train_dataset = dataset['train'].map(
    load_image_train, num_parallel_calls=tf.data.AUTOTUNE)
train_dataset = train_dataset.cache()             # Cache the dataset for performance
train_dataset = train_dataset.shuffle(BUFFER_SIZE)
train_dataset = train_dataset.batch(BATCH_SIZE)
train_dataset = train_dataset.repeat()            # Repeat the dataset indefinitely
train_dataset = train_dataset.prefetch(buffer_size=tf.data.AUTOTUNE)

# Prepare test dataset
test_dataset = dataset['test'].map(load_image_test)
test_dataset = test_dataset.batch(BATCH_SIZE)


---
## Verify the Data Preprocessing
Let's check the unique values in the masks after normalization to ensure that they only contain 0 and 1.

In [12]:
# Check unique values in the masks after preprocessing
for image, mask in train_dataset.take(1):
    unique_values_after = tf.unique(tf.reshape(mask, [-1])).y.numpy()
    print("Unique values in the mask after normalization:", unique_values_after)


Unique values in the mask after normalization: [1]


Also, let's check the shapes and data types of images and masks.

In [13]:
# Inspect shapes and data types
for image, mask in train_dataset.take(1):
    print("Image shape:", image.shape)        # Should be (BATCH_SIZE, IMG_SIZE, IMG_SIZE, 3)
    print("Image dtype:", image.dtype)        # Should be float32
    print("Mask shape:", mask.shape)          # Should be (BATCH_SIZE, IMG_SIZE, IMG_SIZE, 1)
    print("Mask dtype:", mask.dtype)          # Should be int32


Image shape: (16, 128, 128, 3)
Image dtype: <dtype: 'float32'>
Mask shape: (16, 128, 128, 1)
Mask dtype: <dtype: 'int32'>


---
## Visualize Sample Images and Masks
Let's visualize a sample image and its corresponding mask to ensure that the preprocessing steps are correct.

In [15]:
def display_sample(display_list):
    """Display image and mask side by side."""
    plt.figure(figsize=(15, 5))

    title = ['Input Image', 'True Mask']

    for i in range(len(display_list)):
        plt.subplot(1, len(display_list), i+1)
        plt.title(title[i])
        # Display the image
        plt.imshow(tf.keras.utils.array_to_img(display_list[i]))
        plt.axis('off')
    plt.show()
