<a href="https://colab.research.google.com/github/kidais-lab/Mad-Libs-Generator/blob/master/Image_Segmentation_using_Deep_Learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Image Segmentation using Deep Learning**

**In this project, we implemented an image segmentation model using a convolutional neural network (CNN) with TensorFlow and Keras. The goal is to segment images from the Oxford-IIIT Pet dataset by assigning every pixel to one of three classes: pet (foreground), background, or the unclassified border region around the pet.**

This project demonstrates the application of deep learning techniques to image segmentation. Following the steps below, we build and train a CNN that segments images from the Oxford-IIIT Pet dataset. The model is evaluated qualitatively by visualizing its predicted masks on validation images.

# Step 1: Downloading and Extracting the Dataset

This step downloads the Oxford-IIIT Pet dataset, which consists of images of pets and their annotations for segmentation tasks. The dataset is downloaded in compressed tar.gz format and extracted for further use.

In [None]:
# Download the images and annotations datasets from the Oxford Pets dataset
!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz

# Extract the downloaded tar.gz files
!tar -xf images.tar.gz
!tar -xf annotations.tar.gz

--2024-05-15 10:45:07--  http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
Resolving www.robots.ox.ac.uk (www.robots.ox.ac.uk)... 129.67.94.2
Connecting to www.robots.ox.ac.uk (www.robots.ox.ac.uk)|129.67.94.2|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz [following]
--2024-05-15 10:45:08--  https://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
Connecting to www.robots.ox.ac.uk (www.robots.ox.ac.uk)|129.67.94.2|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://thor.robots.ox.ac.uk/~vgg/data/pets/images.tar.gz [following]
--2024-05-15 10:45:08--  https://thor.robots.ox.ac.uk/~vgg/data/pets/images.tar.gz
Resolving thor.robots.ox.ac.uk (thor.robots.ox.ac.uk)... 129.67.95.98
Connecting to thor.robots.ox.ac.uk (thor.robots.ox.ac.uk)|129.67.95.98|:443... connected.
HTTP request sent, awaiting response... 301 Moved Perman
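
Outside a notebook, where `!` shell escapes are unavailable, the extraction step can be reproduced with Python's standard `tarfile` module. The following is a minimal sketch rather than part of the original notebook; the helper name `extract_archive` is hypothetical, and it assumes the archive has already been downloaded.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.gz archive into dest_dir and return its top-level names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        members = archive.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters; "data" is the safe choice
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)
    # Top-level entries, e.g. "images" or "annotations" for the Pet archives
    return sorted({name.split("/")[0] for name in members})
```

Calling `extract_archive("images.tar.gz")` and `extract_archive("annotations.tar.gz")` would mirror the two `tar -xf` commands above.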

# Step 2: Preparing the Data

This step sets up the directories and loads the file paths for the input images and their corresponding target annotation images. It helps in organizing the data for easy access and further processing.

In [None]:
import os

# Define the directories for input images and target annotations
input_dir = "images/"
target_dir = "annotations/trimaps/"

# Get sorted lists of all input image file paths and target annotation file paths
input_img_paths = sorted(
    [os.path.join(input_dir, fname) for fname in os.listdir(input_dir) if fname.endswith(".jpg")]
)
target_paths = sorted(
    [os.path.join(target_dir, fname) for fname in os.listdir(target_dir) if fname.endswith(".png") and not fname.startswith(".")]
)

import matplotlib.pyplot as plt
from tensorflow.keras.utils import load_img, img_to_array

# Display an example input image
plt.axis("off")
plt.imshow(load_img(input_img_paths[9]))
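
Because the two sorted lists are matched purely by position, it can be worth confirming that every image has a mask with the same base name. The sketch below is one possible check; `find_unpaired` and the example file names are hypothetical and only illustrate the idea.

```python
import os


def find_unpaired(input_paths, target_paths):
    """Return base names that appear in only one of the two path lists."""
    input_stems = {os.path.splitext(os.path.basename(p))[0] for p in input_paths}
    target_stems = {os.path.splitext(os.path.basename(p))[0] for p in target_paths}
    # Symmetric difference: names missing from either side
    return sorted(input_stems ^ target_stems)


# Hypothetical file names for illustration only
example_inputs = ["images/Abyssinian_1.jpg", "images/beagle_2.jpg"]
example_targets = ["annotations/trimaps/Abyssinian_1.png"]
print(find_unpaired(example_inputs, example_targets))  # ['beagle_2']
```

An empty result from `find_unpaired(input_img_paths, target_paths)` would indicate that the positional pairing is safe.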

# Step 3: Displaying Target Images

This step includes a function to visualize the target annotation images, which are the ground truth segmentation masks. Visualizing these masks helps in understanding the segmentation labels.

In [None]:
# Function to display target images
def display_target(target_array):
    # Normalize the target array values for visualization
    normalized_array = (target_array.astype("uint8") - 1) * 127
    plt.axis("off")
    plt.imshow(normalized_array[:, :, 0])

# Load and display an example target image
img = img_to_array(load_img(target_paths[9], color_mode="grayscale"))
display_target(img)
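
To see what the arithmetic in `display_target` does, the sketch below applies it to a tiny hand-made trimap. The array is hypothetical; it only assumes the raw label values 1, 2 and 3 used by the dataset's trimaps.

```python
import numpy as np

# A tiny hypothetical trimap using the raw label values 1, 2 and 3
trimap = np.array([[1, 1, 2],
                   [1, 3, 2],
                   [3, 2, 2]], dtype="uint8")

# Same arithmetic as display_target: shift labels to 0..2, then spread over 0..254
display_values = (trimap - 1) * 127
print(display_values)
```

The three labels end up at 0, 127 and 254, which gives them clearly distinct shades when plotted.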

# Step 4: Processing and Shuffling the Data

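In this step, the image and mask paths are first shuffled with the same fixed seed, so each image stays paired with its mask. The images and masks are then loaded, resized to 200×200 pixels, and stored in NumPy arrays, with the mask labels shifted from 1, 2, 3 to 0, 1, 2 so they can serve as class indices. Shuffling matters because the sorted paths are grouped by breed, and the validation set in the next step is taken from the end of the arrays.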

In [None]:
import numpy as np
import random

# Define image size and number of images
img_size = (200, 200)
num_imgs = len(input_img_paths)

# Shuffle both path lists with the same seed so image/mask pairs stay aligned
random.Random(1337).shuffle(input_img_paths)
random.Random(1337).shuffle(target_paths)

# Function to load and resize input images
def path_to_input_image(path):
    return img_to_array(load_img(path, target_size=img_size))

# Function to load and resize target images
def path_to_target(path):
    img = img_to_array(load_img(path, target_size=img_size, color_mode="grayscale"))
    img = img.astype("uint8") - 1  # shift labels 1, 2, 3 to class indices 0, 1, 2
    return img

# Initialize arrays to hold input images and target masks
input_imgs = np.zeros((num_imgs,) + img_size + (3,), dtype="float32")
targets = np.zeros((num_imgs,) + img_size + (1,), dtype="uint8")

# Load images and targets into the arrays
for i in range(num_imgs):
    input_imgs[i] = path_to_input_image(input_img_paths[i])
    targets[i] = path_to_target(target_paths[i])
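
The shuffling above relies on two separate `random.Random` instances with the same seed producing the same permutation for lists of equal length. The sketch below illustrates this with hypothetical path names; the `stem` helper is introduced only for the check.

```python
import random

# Hypothetical, already-sorted path lists where index i of each list belongs together
inputs = [f"images/pet_{i}.jpg" for i in range(6)]
targets = [f"annotations/trimaps/pet_{i}.png" for i in range(6)]

# Two separate Random instances with the same seed apply the same permutation
random.Random(1337).shuffle(inputs)
random.Random(1337).shuffle(targets)


def stem(path):
    """Strip the directory and extension, e.g. 'images/pet_3.jpg' -> 'pet_3'."""
    return path.rsplit("/", 1)[-1].rsplit(".", 1)[0]


still_aligned = all(stem(a) == stem(b) for a, b in zip(inputs, targets))
print(still_aligned)  # True, provided both lists have the same length
```

If the two lists had different lengths, the permutations would differ and the pairing would silently break.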

# Step 5: Splitting the Data into Training and Validation Sets

This step splits the dataset into training and validation sets. The training set is used to train the model, while the validation set is used to evaluate the model's performance during training.

In [None]:
# Define the number of validation samples
num_val_samples = 1000

# Split the data into training and validation sets
train_input_imgs = input_imgs[:-num_val_samples]
train_targets = targets[:-num_val_samples]
val_input_imgs = input_imgs[-num_val_samples:]
val_targets = targets[-num_val_samples:]
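
The negative-index slicing can be checked on a small stand-in array. This is a toy sketch with hypothetical names (`toy_inputs`, `toy_targets`, `toy_val_samples`), not the real data.

```python
import numpy as np

# Toy stand-ins for input_imgs and targets: 10 samples of 4x4 pixels
toy_inputs = np.zeros((10, 4, 4, 3), dtype="float32")
toy_targets = np.zeros((10, 4, 4, 1), dtype="uint8")
toy_val_samples = 3

# Everything except the last 3 samples is used for training
train_x, val_x = toy_inputs[:-toy_val_samples], toy_inputs[-toy_val_samples:]
train_y, val_y = toy_targets[:-toy_val_samples], toy_targets[-toy_val_samples:]

print(train_x.shape, val_x.shape)  # (7, 4, 4, 3) (3, 4, 4, 3)
```

Note that this slicing pattern assumes at least one validation sample: with a count of 0, `[:-0]` evaluates to `[:0]` and the training set would be empty.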

# Step 6: Building the Model

This step defines a convolutional neural network (CNN) for image segmentation. The encoder downsamples the input with three stride-2 convolutions, and the decoder upsamples it back to the input resolution with three stride-2 transposed convolutions. A final convolution with softmax activation outputs a probability for each of the three classes at every pixel.

In [None]:
from tensorflow import keras
from tensorflow.keras import layers

# Function to build the model
def get_model(img_size, num_classes):
    inputs = keras.Input(shape=img_size + (3,))
    x = layers.Rescaling(1./255)(inputs)

    # Encoder: Convolutional layers to downsample the input image
    x = layers.Conv2D(64, 3, strides=2, activation="relu", padding="same")(x)
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
    x = layers.Conv2D(128, 3, strides=2, activation="relu", padding="same")(x)
    x = layers.Conv2D(128, 3, activation="relu", padding="same")(x)
    x = layers.Conv2D(256, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2D(256, 3, activation="relu", padding="same")(x)

    # Decoder: Transpose convolutional layers to upsample back to original image size
    x = layers.Conv2DTranspose(256, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(256, 3, activation="relu", padding="same", strides=2)(x)
    x = layers.Conv2DTranspose(128, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(128, 3, activation="relu", padding="same", strides=2)(x)
    x = layers.Conv2DTranspose(64, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(64, 3, activation="relu", padding="same", strides=2)(x)

    # Output layer with softmax activation for multi-class segmentation
    outputs = layers.Conv2D(num_classes, 3, activation="softmax", padding="same")(x)

    # Create the model
    model = keras.Model(inputs, outputs)
    return model

# Build the model with the specified image size and number of classes
model = get_model(img_size=img_size, num_classes=3)
model.summary()

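
The encoder and decoder only return to the input resolution when the image size survives three halvings without rounding. The sketch below traces one spatial dimension through the network in plain Python; `trace_spatial_size` is a hypothetical helper, and it assumes the `padding="same"` behaviour used in `get_model`.

```python
import math


def trace_spatial_size(size, num_stages=3):
    """Track one spatial dimension through the encoder and decoder.

    Assumes padding="same": a stride-2 Conv2D yields ceil(n / 2) and a
    stride-2 Conv2DTranspose yields n * 2. Stride-1 layers keep the size.
    """
    sizes = [size]
    for _ in range(num_stages):  # three stride-2 convolutions
        sizes.append(math.ceil(sizes[-1] / 2))
    for _ in range(num_stages):  # three stride-2 transposed convolutions
        sizes.append(sizes[-1] * 2)
    return sizes


print(trace_spatial_size(200))  # [200, 100, 50, 25, 50, 100, 200]
print(trace_spatial_size(100))  # [100, 50, 25, 13, 26, 52, 104]
```

With `img_size = (200, 200)` the output matches the mask size, whereas a size such as 100 would produce a 104-pixel output that no longer lines up with the targets.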

# Step 7: Compiling and Training the Model

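This step compiles the model with the RMSprop optimizer and sparse categorical cross-entropy loss, which accepts the integer class labels in the masks directly. The model is then trained for 50 epochs on the training data and evaluated on the validation data after each epoch. The `ModelCheckpoint` callback with `save_best_only=True` overwrites the saved file only when the monitored quantity improves, which by default is the validation loss.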

In [None]:
# Compile the model with RMSprop optimizer and sparse categorical cross-entropy loss
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")

# Define callbacks for the training process
callbacks = [
    keras.callbacks.ModelCheckpoint("oxford_segmentation.keras", save_best_only=True)
]

# Train the model with the training data and validate with the validation data
history = model.fit(
    train_input_imgs, train_targets,
    epochs=50,
    callbacks=callbacks,
    batch_size=64,
    validation_data=(val_input_imgs, val_targets)
)
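
To make the loss concrete, the sketch below computes a per-pixel sparse categorical cross-entropy in NumPy. It is an illustrative reimplementation under simplifying assumptions (a single image, a fixed clipping constant `eps`), not the Keras code itself.

```python
import numpy as np


def sparse_categorical_crossentropy(targets, probs, eps=1e-7):
    """Mean per-pixel cross-entropy.

    targets: integer labels, shape (H, W, 1)
    probs:   softmax outputs, shape (H, W, num_classes)
    """
    labels = targets[..., 0].astype("int64")
    # Pick the predicted probability of the true class at every pixel
    true_class_probs = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    return float(-np.log(np.clip(true_class_probs, eps, 1.0)).mean())


# Toy 2x2 mask and a model that predicts all three classes as equally likely
toy_targets = np.array([[[0], [2]], [[1], [1]]], dtype="uint8")
uniform_probs = np.full((2, 2, 3), 1.0 / 3.0)
print(sparse_categorical_crossentropy(toy_targets, uniform_probs))  # about 1.0986, i.e. ln(3)
```

A loss near ln(3) therefore corresponds to a three-class model that has learned nothing yet, which is a useful reference point when reading the training log.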

# Step 8: Plotting Training and Validation Loss

This step plots the training and validation loss over the epochs to visualize how training progressed. A validation loss that stops improving or rises while the training loss keeps falling suggests overfitting, whereas both losses remaining high suggests underfitting.

In [None]:
# Plot training and validation loss over epochs
epochs = range(1, len(history.history["loss"]) + 1)
loss = history.history["loss"]
val_loss = history.history["val_loss"]
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()
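
Alongside the plot, the epoch with the lowest validation loss can be read directly from the history dictionary. The sketch below uses made-up loss values shaped like `history.history`; the numbers and the `best_epoch` helper are hypothetical.

```python
# Hypothetical loss values shaped like history.history from model.fit
example_history = {
    "loss": [0.95, 0.70, 0.55, 0.45, 0.38],
    "val_loss": [0.90, 0.72, 0.60, 0.63, 0.69],
}


def best_epoch(history_dict):
    """Return (1-based epoch, value) for the lowest validation loss."""
    val_loss = history_dict["val_loss"]
    index = min(range(len(val_loss)), key=val_loss.__getitem__)
    return index + 1, val_loss[index]


epoch, value = best_epoch(example_history)
print(epoch, value)  # 3 0.6
```

In this made-up run the training loss keeps falling after epoch 3 while the validation loss rises, which is the overfitting pattern the plot is meant to reveal.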

# Step 9: Loading the Model and Visualizing Predictions

This step loads the best model saved during training and uses it to make predictions on the validation data. It then visualizes the predictions to evaluate the model's segmentation performance.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from tensorflow import keras
from tensorflow.keras.utils import array_to_img

# Load the best model saved during training
model = keras.models.load_model("oxford_segmentation.keras")

# Display a validation image and its predicted segmentation mask
i = 4
test_image = val_input_imgs[i]
plt.axis("off")
plt.imshow(array_to_img(test_image))

# Predict the mask for the test image
mask = model.predict(np.expand_dims(test_image, 0))[0]

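# Function to display the predicted mask in its own figure
def display_mask(pred):
    # Pick the most probable class per pixel, then scale 0, 1, 2 to 0, 127, 254
    mask = np.argmax(pred, axis=-1)
    mask *= 127
    plt.figure()  # avoid drawing over the test image shown above
    plt.axis("off")
    plt.imshow(mask)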

display_mask(mask)

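
Visual inspection can be complemented with simple numeric scores. The sketch below computes pixel accuracy and mean intersection-over-union in NumPy on a toy example; both helper functions are hypothetical additions, not part of the original notebook.

```python
import numpy as np


def pixel_accuracy(pred_probs, target):
    """Fraction of pixels whose argmax class matches the target label."""
    pred_labels = np.argmax(pred_probs, axis=-1)
    return float((pred_labels == target[..., 0]).mean())


def mean_iou(pred_probs, target, num_classes=3):
    """Average intersection-over-union across classes present in either mask."""
    pred_labels = np.argmax(pred_probs, axis=-1)
    true_labels = target[..., 0]
    ious = []
    for c in range(num_classes):
        pred_c, true_c = pred_labels == c, true_labels == c
        union = np.logical_or(pred_c, true_c).sum()
        if union:  # skip classes absent from both masks
            ious.append(np.logical_and(pred_c, true_c).sum() / union)
    return float(np.mean(ious))


# Toy 2x2 example: three of four pixels are predicted correctly
toy_target = np.array([[[0], [1]], [[2], [1]]], dtype="uint8")
toy_probs = np.array([[[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]],
                      [[0.1, 0.1, 0.8], [0.7, 0.2, 0.1]]])
print(pixel_accuracy(toy_probs, toy_target))  # 0.75
```

Applied to the notebook's variables, `pixel_accuracy(mask, val_targets[i])` would score the single prediction shown above, since `mask` holds the per-class probabilities for that image.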