# Building and Training a UNet Model on Moon Data
---

### About the Dataset

This dataset contains 9766 realistic renders of lunar landscapes and their segmentation masks, which label three classes (sky, small rocks, bigger rocks) on top of the ground background. A CSV file of bounding boxes and cleaned versions of the ground-truth masks are also provided.

An interesting feature of this dataset is that the images are synthetic; they were created using Planetside Software's Terragen. This is not obvious at first because the renderings are highly realistic, but it makes sense given how scarce real space imagery is.

Acknowledgment: Romain Pessia and Genya Ishigami of the Space Robotics Group, Keio University, Japan. You can find the dataset at https://www.kaggle.com/romainpessia/artificial-lunar-rocky-landscape-dataset

### Reminder: turn on the GPU accelerator from the right-hand side of your Kaggle notebook, under Settings.

### Importing libraries



In [None]:
import os
import cv2
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import tensorflow as tf
import keras

## Data Preprocessing

In [None]:
# Save the render and clean paths for img_dir and mask_dir respectively
img_dir =
mask_dir =


# Create lists of images and masks present in the respective directories
images =
masks =
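
One possible way to fill in the two cells above, as a minimal sketch. It assumes the render and clean folders sit under the Kaggle input path that appears in the prediction cell later in this notebook; the helper name `list_image_paths` is hypothetical. Sorting the file names matters because images and masks are paired by index.

```python
import os

def list_image_paths(directory, extension=".png"):
    """Return the sorted full paths of all files in `directory` ending with `extension`."""
    names = sorted(f for f in os.listdir(directory) if f.lower().endswith(extension))
    return [os.path.join(directory, name) for name in names]

# Assumption: same dataset location as in the prediction cell of this notebook.
img_dir = "../input/artificial-lunar-rocky-landscape-dataset/images/render"
mask_dir = "../input/artificial-lunar-rocky-landscape-dataset/images/clean"

# Only list the folders when they exist (e.g. inside a Kaggle notebook).
if os.path.isdir(img_dir) and os.path.isdir(mask_dir):
    images = list_image_paths(img_dir)
    masks = list_image_paths(mask_dir)
    print(images[:5])
    print(masks[:5])
```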

In [None]:
# Check if the output is as expected before moving further

## Check first 5 elements of images and masks lists


## For this session, we will use just the first 2000 images and masks as our dataset

In [None]:
# Use slicing to keep only the first 2000 images and masks
images =
masks =

In [None]:
# Check the first image in images list --> Visualize it


In [None]:
# Check the respective mask --> Visualize it


In [None]:
# Check the shape of the image


### Originally, our images are of size (720, 480), but we will reduce the size for better and faster processing. Since we are focusing on the clean masks, this will not affect the results much.

### Ground masks are more detailed and contain a lot of noise, so we'll keep things easy for our lecture and use the clean masks. However, feel free to use the ground masks and play around to explore more.

In [None]:
# Create H and W constants to save the height and the width of the image to pass to model and set it to 256



# Empty list to store preprocessed images and masks
X_img = []
y_mask = []

# Loop through each image and mask and implement the preprocessing steps
'''
Preprocessing Steps:-

For image:-
1. Resize the image to 256 x 256
2. Normalize the image
3. Keep the data type as float (values between 0 to 1)
For mask:-
1. Resize the mask
2. Keep the data type as integer (Values between 0 - 255)
'''
for x, y in (zip(images, masks)):
    # preprocess image


    # preprocess mask


    # append the image and mask to respective list



In [None]:
# Convert X_img and y_mask lists to numpy array


# 1600 datapoints as training dataset and 400 for validation dataset using slicing
X_train =
X_valid =

y_train =
y_valid =
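
A minimal sketch of the 1600/400 split by slicing. The arrays below are zero-filled placeholders with a tiny spatial size (an assumption made only to keep the demo light); in the notebook they would be the arrays built from `X_img` and `y_mask`.

```python
import numpy as np

# Placeholder arrays standing in for the 2000 preprocessed samples.
X = np.zeros((2000, 8, 8, 3), dtype=np.float32)
y = np.zeros((2000, 8, 8), dtype=np.int32)

n_train = 1600  # first 1600 samples for training, remaining 400 for validation
X_train, X_valid = X[:n_train], X[n_train:]
y_train, y_valid = y[:n_train], y[n_train:]

print(X_train.shape, X_valid.shape)
print(y_train.shape, y_valid.shape)
```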


In [None]:
# Check shape of X_train


In [None]:
# Create a subplot to visualize image in X_train and respective y_train to see if everything is working fine


Check this article to learn more about building an optimized data pipeline with tf.data:
https://www.tensorflow.org/guide/data_performance

# Data Pipeline

### One hot encoding

![](https://i.imgur.com/mtimFxh.png)

#### Similarly, we'll one-hot encode our labels into 4 channels, one for each of the four classes

In [None]:
batch_size =
num_classes =

'''Here the from_tensor_slices function is called to make dataset objects of our training and validation sets'''
# calling tf_dataset
train_dataset =
valid_dataset =
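
One way the one-hot encoding and `from_tensor_slices` steps could fit together, as a sketch. The body of `tf_dataset` is not shown in this notebook, so the version below is hypothetical, as is the helper `one_hot_mask`. It also assumes that each mask already holds integer class ids in the range 0 to `num_classes - 1`; the toy arrays are random placeholders.

```python
import numpy as np
import tensorflow as tf

num_classes = 4

def one_hot_mask(image, mask):
    """Turn an (H, W) integer mask into an (H, W, num_classes) float32 tensor."""
    mask = tf.one_hot(tf.cast(mask, tf.int32), depth=num_classes, dtype=tf.float32)
    return image, mask

def tf_dataset(x, y):
    """Build a dataset of (image, one-hot mask) pairs from in-memory arrays."""
    dataset = tf.data.Dataset.from_tensor_slices((x, y))
    return dataset.map(one_hot_mask, num_parallel_calls=tf.data.AUTOTUNE)

# Toy data: 6 images of size 8x8 with class ids 0..3.
x_demo = np.random.rand(6, 8, 8, 3).astype(np.float32)
y_demo = np.random.randint(0, num_classes, size=(6, 8, 8)).astype(np.int32)

demo_dataset = tf_dataset(x_demo, y_demo)
image, mask = next(iter(demo_dataset))
print(image.shape, mask.shape)
```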

Read more about prefetching and AUTOTUNE here: https://www.tensorflow.org/guide/data_performance#optimize_performance

## Naive Approach
![](https://www.tensorflow.org/guide/images/data_performance/naive.svg)


## After prefetching

![](https://www.tensorflow.org/guide/images/data_performance/prefetched.svg)

In [None]:
# Batch the data and prefetch it
train_dataset =
valid_dataset =
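
A sketch of batching and prefetching on a toy dataset. The batch size and array shapes are assumptions for the demo. The `repeat()` call is an addition suggested here: since `model.fit` is later called with `steps_per_epoch`, a dataset that does not repeat may run out of batches after the first epoch.

```python
import numpy as np
import tensorflow as tf

batch_size = 2  # assumption for this demo only

# Placeholder images and one-hot masks.
x_demo = np.zeros((6, 8, 8, 3), dtype=np.float32)
y_demo = np.zeros((6, 8, 8, 4), dtype=np.float32)

demo_dataset = (
    tf.data.Dataset.from_tensor_slices((x_demo, y_demo))
    .batch(batch_size)           # group samples into batches
    .repeat()                    # loop over the data indefinitely
    .prefetch(tf.data.AUTOTUNE)  # prepare the next batch while the current one is in use
)

xb, yb = next(iter(demo_dataset))
print(xb.shape, yb.shape)
```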

In [None]:
sample = iter(valid_dataset)
data = next(sample)
data[0].shape
# batch size, height, width, channels

In [None]:
data[1].shape
# batch size, height, width, channels/classes

## Creating U-net Architecture

**For Contracting Path:** the **conv_block** function is called four times with pooling (pool=True), creating four blocks that each halve the spatial resolution.

**For Bridge:** the **conv_block** function is called one time without pooling (pool=False).

**For Expansive Path:** **UpSampling2D** is used to expand the size of the feature maps. Each upsampled feature map is concatenated with the corresponding feature map from the contracting path, which combines information from the earlier layers to get a more precise prediction. The **conv_block** function is then called without pooling (pool=False). The process is repeated 3 more times.

The last step is to map the feature maps to per-pixel class predictions. The last layer is a 1x1 convolution with one filter per class (num_classes filters) and a softmax activation, so that its output matches the one-hot encoded masks.

In [None]:
from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, Activation, MaxPool2D, UpSampling2D, Concatenate
from tensorflow.keras.models import Model

'''conv_block creates one block with two convolution layers
followed by BatchNormalization and a ReLU activation.
If pooling is required, MaxPool2D is applied and its output is returned; otherwise it is not.'''
# function to create convolution block
def conv_block(inputs, filters, pool=True):
  pass

'''build_unet creates the U-net architecture.'''
# function to build U-net
def build_unet(shape, num_classes):
    """ Input """

    """ Encoder """


    """ Bridge """

    """ Decoder """
    # Reference for UpSampling2D: https://www.tensorflow.org/api_docs/python/tf/keras/layers/UpSampling2D

    """ Output layer """

    pass
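
One possible way to fill in the two functions, sketched under assumptions rather than given as the reference solution. The filter counts (16, 32, 64, 128 and 256 for the bridge), the bilinear upsampling and the placement of BatchNormalization after each convolution are choices made here, not values stated in this notebook.

```python
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization, Activation,
                                     MaxPool2D, UpSampling2D, Concatenate)
from tensorflow.keras.models import Model

def conv_block(inputs, filters, pool=True):
    """Two Conv-BN-ReLU stages; also returns the pooled output when pool=True."""
    x = inputs
    for _ in range(2):
        x = Conv2D(filters, 3, padding="same")(x)
        x = BatchNormalization()(x)
        x = Activation("relu")(x)
    if pool:
        return x, MaxPool2D((2, 2))(x)  # (skip connection, downsampled output)
    return x

def build_unet(shape, num_classes):
    inputs = Input(shape)

    # Encoder: four blocks with pooling; keep the pre-pooling outputs as skips.
    skips = []
    x = inputs
    for filters in (16, 32, 64, 128):  # assumed filter counts
        skip, x = conv_block(x, filters, pool=True)
        skips.append(skip)

    # Bridge: one block without pooling.
    x = conv_block(x, 256, pool=False)

    # Decoder: upsample, concatenate with the matching skip, then convolve.
    for skip, filters in zip(reversed(skips), (128, 64, 32, 16)):
        x = UpSampling2D((2, 2), interpolation="bilinear")(x)
        x = Concatenate()([x, skip])
        x = conv_block(x, filters, pool=False)

    # Output: 1x1 convolution with one channel per class.
    outputs = Conv2D(num_classes, 1, padding="same", activation="softmax")(x)
    return Model(inputs, outputs)

demo_model = build_unet((64, 64, 3), 4)
print(demo_model.output_shape)
```

Because the encoder pools four times, the input height and width should be divisible by 16 so that the upsampled feature maps line up with the skip connections; 256 satisfies this.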

In [None]:
# Calling build_unet function
model = build_unet((H, W, 3), num_classes)

# Get the model summary


## Load model and compile

In [None]:
# install segmentation_models to get iou_score from metrics
!pip install segmentation_models

In [22]:
import os
import keras

os.environ["SM_FRAMEWORK"] = "tf.keras"

In [23]:
import segmentation_models as sm
from segmentation_models.metrics import iou_score

sm.set_framework('tf.keras')
keras.backend.set_image_data_format('channels_last')

In [None]:
""" Hyperparameters """
lr = 1e-4
epochs = 5

"""Model"""
model.compile(loss="categorical_crossentropy",       # you can also try Jaccard loss or dice loss
              optimizer=tf.keras.optimizers.Adam(lr),
              metrics=[iou_score])


train_steps = len(X_train)//batch_size
valid_steps = len(X_valid)//batch_size

In [None]:
print(train_steps)
print(valid_steps)
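
To make the IoU metric used above more concrete, here is a small NumPy sketch of mean intersection-over-union on hard label maps. It is an illustration only and not how `segmentation_models` computes `iou_score`, which works on the predicted probabilities; the function name `mean_iou` is hypothetical.

```python
import numpy as np

def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union over the classes present in either label map."""
    ious = []
    for c in range(num_classes):
        true_c = (y_true == c)
        pred_c = (y_pred == c)
        union = np.logical_or(true_c, pred_c).sum()
        if union == 0:
            continue  # class absent from both maps, skip it
        intersection = np.logical_and(true_c, pred_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))

# Toy 2x2 label maps: one pixel of class 0 is predicted as class 1.
y_true = np.array([[0, 0], [1, 1]])
y_pred = np.array([[0, 1], [1, 1]])
score = mean_iou(y_true, y_pred, num_classes=4)
print(score)
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.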

## Train model

In [None]:
'''model.fit is used to train the model'''
model_history = model.fit(train_dataset,
        steps_per_epoch=train_steps,
        validation_data=valid_dataset,
        validation_steps=valid_steps,
        epochs=epochs
    )

## Predict from model

In [None]:
# function to predict result
def predict_image(img_path, mask_path, model):
    H = 256
    W = 256
    num_classes = 4

    img = cv2.imread(img_path, cv2.IMREAD_COLOR)
    img = cv2.resize(img, (W, H))
    img = img / 255.0
    img = img.astype(np.float32)

    ## Read mask
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    mask = cv2.resize(mask, (W, H))   ## (256, 256)
    mask = np.expand_dims(mask, axis=-1) ## (256, 256, 1)
    mask = mask * (255/num_classes)
    mask = mask.astype(np.int32)
    mask = np.concatenate([mask, mask, mask], axis=2)

    ## Prediction
    pred_mask = model.predict(np.expand_dims(img, axis=0))[0] # (256, 256, num_classes) class probabilities
    pred_mask = np.argmax(pred_mask, axis=-1) # (256, 256) class index per pixel
    pred_mask = np.expand_dims(pred_mask, axis=-1)
    pred_mask = pred_mask * (255/num_classes)
    pred_mask = pred_mask.astype(np.int32)
    pred_mask = np.concatenate([pred_mask, pred_mask, pred_mask], axis=2)

    return img, mask, pred_mask

In [None]:
# function to display result
def display(display_list):
  plt.figure(figsize=(12, 10))

  title = ['Input Image', 'True Mask', 'Predicted Mask', 'Mask On Image']

  for i in range(len(display_list)):
    plt.subplot(1, len(display_list), i+1)
    plt.title(title[i])
    plt.imshow(tf.keras.preprocessing.image.array_to_img(display_list[i]))
    plt.axis('off')
  plt.show()

In [None]:
img_path = '../input/artificial-lunar-rocky-landscape-dataset/images/render/render0041.png'
mask_path = '../input/artificial-lunar-rocky-landscape-dataset/images/clean/clean0041.png'

img, mask, pred_mask = predict_image(img_path, mask_path, model)

display([img, mask, pred_mask])

## A practical note: different backbones in modern U-Nets

So far, you have looked at how the U-Net architecture was implemented in the original work by Ronneberger et al. Over the years, many people have experimented with different setups for U-Nets, including pretraining on e.g. ImageNet and then fine-tuning on their specific image segmentation tasks.

This means that today, you will likely use a U-Net that no longer utilizes the original architecture as proposed above - but it's still a good starting point, because the contractive path, expansive path and the skip connections remain the same.

**Common backbones for U-Net architectures these days are ResNet, ResNeXt, EfficientNet and DenseNet architectures. Often, these have been pretrained on the ImageNet dataset, so that many common features have already been learned. By using these backbone U-Nets, initialized with pretrained weights, it's likely that you can reach convergence on your segmentation problem much faster.**

That's it! You now have a high-level understanding of U-Net and its components.

## In the next module, we will learn how to use segmentation_models with transfer learning to build a UNet architecture with different pretrained models as the backbone.