# Introduction:

Background separation is one of the most widely used camera features today, and it is usually powered by AI. So, today, we will build a background remover using semantic segmentation.

We will use the Tiramisu model architecture ([https://arxiv.org/abs/1611.09326](https://arxiv.org/abs/1611.09326)) and train it on the Matting Human dataset provided by AISegment.com ([https://www.kaggle.com/laurentmih/aisegmentcom-matting-human-datasets](https://www.kaggle.com/laurentmih/aisegmentcom-matting-human-datasets)).

Now, let's start.

In [0]:
%tensorflow_version 2.x

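Import all the required libraries.

In [0]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import layers

In [0]:
# Optional: load a previously saved model from a mounted Google Drive.
# This needs `tf` to be imported first, and the `model` built later in this notebook replaces it.
model = tf.keras.models.load_model('/content/drive/My Drive/model.h5')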

In [0]:
from tensorflow.keras.layers import *
from tensorflow.keras.regularizers import l2
from tensorflow.keras.models import Model

In [0]:
import os
import zipfile

Creating the Tiramisu model. The figure below shows the architecture that the code implements. ![alt text](https://d3i71xaburhd42.cloudfront.net/1d9df46f672b1e22b6f210343be8684f88c0ccca/1-Figure1-1.png)

In [0]:
#Creating dense block
def block(x, num_layers):
  for i in range(num_layers):
    t = x
    # Normalize over the channel axis (the last axis for channels-last inputs)
    x = layers.BatchNormalization(axis=-1, beta_regularizer=tf.keras.regularizers.l2(0.0001), gamma_regularizer=tf.keras.regularizers.l2(0.0001))(x)
    x = layers.Activation("relu")(x)
    x = layers.Conv2D(16, (3, 3), padding="same", kernel_initializer="he_uniform")(x)
    x = layers.Dropout(0.2)(x)
    x = layers.concatenate([x, t])
  return x

#Creating Transition down
def transition_down(x, num_features):
  # Normalize over the channel axis (the last axis for channels-last inputs)
  x = layers.BatchNormalization(axis=-1, beta_regularizer=tf.keras.regularizers.l2(0.0001), gamma_regularizer=tf.keras.regularizers.l2(0.0001))(x)
  x = layers.Activation("relu")(x)
  x = layers.Conv2D(num_features, (1, 1), padding="same", kernel_initializer="he_uniform")(x)
  x = layers.Dropout(0.2)(x)
  x = layers.MaxPooling2D((2, 2), strides=2, padding="same")(x)
  return x

# Creating Transition up
def transition_up(x, num_features):
  x = layers.Conv2DTranspose(num_features, strides=2, kernel_size=(3, 3), padding="same")(x)
  return x

#The function to create Tiramisu model.
def create_tiramisu(n_outputs,inputs):
  n_pool = 5
  growth_rate = 16
  num_features = 48
  layer_per_block = [4, 5, 7, 10, 12, 15, 12, 10, 7, 5, 4]
  
  x = layers.Conv2D(48, (3, 3), padding="same")(inputs)
  skip_connections = []
  for i in range(n_pool):
    x = block(x, layer_per_block[i])
    skip_connections.append(x)
    num_features += growth_rate * layer_per_block[i]
    x = transition_down(x, num_features)

  x = block(x, layer_per_block[n_pool])
  skip_connections = skip_connections[::-1]

  for i in range(n_pool):
    num_features = growth_rate * layer_per_block[n_pool + i]
    x = transition_up(x, num_features)
    x = layers.concatenate([x, skip_connections[i]])
    x = block(x, layer_per_block[n_pool+i+1])

  x = layers.Conv2D(n_outputs, kernel_size=(1, 1), padding='same', kernel_initializer="he_uniform")(x)
  return layers.Activation('softmax')(x)
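
To make the channel bookkeeping in `create_tiramisu` easier to follow, here is a small sketch in plain Python that traces how many feature maps each stage produces. It only mirrors the arithmetic of the code above (each dense block concatenates `growth_rate` new maps per layer onto its input); the helper name `trace_channels` is made up for this illustration.

```python
growth_rate = 16
layer_per_block = [4, 5, 7, 10, 12, 15, 12, 10, 7, 5, 4]
n_pool = 5

def trace_channels(initial=48):
    """Return a list of (stage, index, channels) following create_tiramisu."""
    channels = initial
    skips = []
    trace = []
    # Downsampling path: each dense block adds growth_rate maps per layer.
    for i in range(n_pool):
        channels += growth_rate * layer_per_block[i]
        skips.append(channels)
        trace.append(("down", i, channels))  # transition_down keeps this count
    # Bottleneck block
    channels += growth_rate * layer_per_block[n_pool]
    trace.append(("bottleneck", n_pool, channels))
    # Upsampling path: transition_up output is concatenated with the skip,
    # then the next dense block adds its own maps.
    for i, skip in enumerate(reversed(skips)):
        up = growth_rate * layer_per_block[n_pool + i]
        channels = up + skip
        channels += growth_rate * layer_per_block[n_pool + i + 1]
        trace.append(("up", i, channels))
    return trace

for stage, index, channels in trace_channels():
    print(stage, index, channels)
```

The last dense block ends with 256 feature maps, which the final 1x1 convolution reduces to `n_outputs` class scores per pixel.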

Instantiating the Tiramisu model.

In [0]:
inputs = layers.Input((224, 224, 3)) #Shape of input image
outputs = create_tiramisu(2,inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
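
The input size of 224 is convenient because the network pools five times and then upsamples five times, and each upsampled tensor has to match its skip connection before the two can be concatenated. The sketch below checks this with plain arithmetic, assuming "same" padding halves the size with rounding up and the strided transposed convolution doubles it; `spatial_sizes` and `is_compatible` are hypothetical helper names.

```python
import math

def spatial_sizes(size, n_pool=5):
    """Return (skip sizes in decoder order, sizes after each upsampling)."""
    skips = []
    for _ in range(n_pool):
        skips.append(size)          # size stored in skip_connections
        size = math.ceil(size / 2)  # MaxPooling2D with strides=2, padding="same"
    ups = []
    for _ in range(n_pool):
        size = size * 2             # Conv2DTranspose with strides=2, padding="same"
        ups.append(size)
    return skips[::-1], ups

def is_compatible(size, n_pool=5):
    """True when every upsampled size matches its skip connection."""
    skips, ups = spatial_sizes(size, n_pool)
    return skips == ups

print(is_compatible(224))  # 224 is divisible by 2**5
print(is_compatible(200))  # 200 is not, so a concatenation would fail
```

In short, any height and width divisible by 32 should work with this architecture.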

Now, we have to download the dataset from Kaggle.

In [0]:
os.environ['KAGGLE_USERNAME'] = "ENTER YOUR USERNAME"
os.environ['KAGGLE_KEY'] = "ENTER YOUR KEY"
!kaggle datasets download -d laurentmih/aisegmentcom-matting-human-datasets

Extracting the zip file.

In [0]:
with zipfile.ZipFile("/content/aisegmentcom-matting-human-datasets.zip","r") as zip_ref:
    zip_ref.extractall("/content/")

In [0]:
masks = []
images = []

Now, let's read the image files, resize them and append them to a list. In an image segmentation task, the input is the actual image and the label is its mask image. So, we will create the list of masks in the next block and the list of actual images in the block after that.

Note: We will not use all the images provided by the dataset due to memory limitations.

In [0]:
path = "/content/matting_human_half/matting/1803151818/"

for folder in sorted(os.listdir(path)):
  for filename in sorted(os.listdir(os.path.join(path, folder))):
    img = plt.imread(os.path.join(path, folder, filename))
    img = tf.image.resize(img, (224, 224))
    img = img[:,:,3]
    masks.append(np.round(img))
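
The loop above turns the alpha channel of each matting image into class labels by rounding it. Here is a tiny illustration with NumPy on a made-up 2x2 RGBA array with values in [0, 1], which is the range `plt.imread` returns for PNG files.

```python
import numpy as np

# Hypothetical 2x2 RGBA image; the last channel is the alpha (opacity) value.
rgba = np.array([
    [[0.9, 0.8, 0.7, 1.0], [0.1, 0.1, 0.1, 0.0]],
    [[0.5, 0.4, 0.3, 0.7], [0.2, 0.2, 0.2, 0.3]],
], dtype=np.float32)

alpha = rgba[:, :, 3]     # keep only the alpha channel
labels = np.round(alpha)  # 1 = person (mostly opaque), 0 = background

print(labels)
```

Pixels that are mostly opaque become class 1 and the rest become class 0, which is the integer label format that sparse categorical cross-entropy expects.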

In [0]:
path = "/content/matting_human_half/clip_img/1803151818/"

for folder in sorted(os.listdir(path)):
  for filename in sorted(os.listdir(os.path.join(path, folder))):
    img = plt.imread(os.path.join(path, folder, filename))
    img = tf.image.resize(img, (224, 224))/255
    images.append(img)
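
The two loops rely on the sorted file listings lining up, so that the i-th image goes with the i-th mask. A quick sanity check can catch a missing file before training. The sketch below assumes that an image and its mask share the same file name apart from the extension, which is an assumption about the dataset layout; the helper `pair_by_stem` and the file names are made up for illustration.

```python
import os

def pair_by_stem(image_files, mask_files):
    """Match files by name without extension.

    Returns (pairs, unmatched), where unmatched lists the stems of
    images that have no mask with the same stem.
    """
    masks_by_stem = {os.path.splitext(f)[0]: f for f in mask_files}
    pairs, unmatched = [], []
    for f in sorted(image_files):
        stem = os.path.splitext(f)[0]
        if stem in masks_by_stem:
            pairs.append((f, masks_by_stem[stem]))
        else:
            unmatched.append(stem)
    return pairs, unmatched

# Hypothetical file names, for illustration only
pairs, unmatched = pair_by_stem(["a.jpg", "b.jpg", "c.jpg"], ["a.png", "c.png"])
print(pairs)
print(unmatched)
```

If `unmatched` is not empty, the `images` and `masks` lists would drift out of step from that point on.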

Creating the dataset in a format accepted by TensorFlow for training.

In [0]:
train_data = tf.data.Dataset.from_tensor_slices((images, masks))

Batching the dataset with a batch size of 8.

In [0]:
data = train_data.batch(8).prefetch(1)
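
As a rough guide to how many training steps one epoch takes, the sketch below computes the number of batches for a given number of samples. It assumes the default behaviour of `batch`, where a smaller final batch is kept rather than dropped; the function names and sample counts are only examples.

```python
import math

def steps_per_epoch(num_samples, batch_size=8):
    """Number of batches when the last, smaller batch is kept."""
    return math.ceil(num_samples / batch_size)

def last_batch_size(num_samples, batch_size=8):
    """Size of the final batch in an epoch."""
    remainder = num_samples % batch_size
    return remainder if remainder else batch_size

print(steps_per_epoch(1003), last_batch_size(1003))
```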

Now, let's compile the model created above with sparse categorical cross-entropy as the loss function and RMSprop as the optimizer.

In [0]:
model.compile(loss='sparse_categorical_crossentropy', optimizer=tf.keras.optimizers.RMSprop(1e-3, decay=1-0.99995), metrics=["accuracy"])
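
To see what sparse categorical cross-entropy measures here, the following NumPy sketch computes it by hand for two pixels. It follows the textbook definition (the mean negative log-probability of the true class) and is not meant to reproduce Keras's exact implementation; the function and the example values are made up for illustration.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    """labels: int array (H, W); probs: softmax outputs of shape (H, W, C)."""
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    # Pick the predicted probability of the true class at every pixel.
    picked = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    return float(-np.log(picked).mean())

labels = np.array([[1, 0]])                   # true classes of two pixels
probs = np.array([[[0.2, 0.8], [0.9, 0.1]]])  # predicted class probabilities
loss = sparse_categorical_crossentropy(labels, probs)
print(round(loss, 4))
```

The loss is small when the model gives high probability to the correct class of each pixel, and grows as those probabilities drop.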

Here comes the exciting part. Let's train the model. We will train it for only a few epochs, as training takes a lot of time.

In [0]:
his = model.fit(data,epochs=3)

## Testing

Finally, let's test the performance of our model. Load an arbitrary image and resize it to our requirements.

In [0]:
img_path = "/content/img.jpg"
img = plt.imread(img_path)
plt.imshow(img)
img = tf.image.resize(img, (224, 224)) / 255  # scale to [0, 1], as was done for the training images
test = tf.expand_dims(img,0)

In [0]:
# Load a 4-channel (RGBA) PNG; its alpha channel is replaced by the predicted mask below.
img_path = "/content/maks.png"
mask = plt.imread(img_path)
mask = tf.image.resize(mask, (224, 224))

Run the model on the test image.

In [0]:
pred = model.predict(test)

Showing the final image.

In [0]:
temp = tf.argmax(pred[0], axis=-1)
mask = mask.numpy()
mask[:,:,3] = temp
plt.imshow(mask)
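
The cell above picks the most likely class for each pixel and writes it into the alpha channel, so background pixels become transparent. The same idea on a made-up 2x2 prediction, using only NumPy, looks like this (class 1 is the person, matching the masks used for training):

```python
import numpy as np

# Hypothetical softmax output for a 2x2 image with 2 classes
pred = np.array([
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.4, 0.6], [0.7, 0.3]],
], dtype=np.float32)

rgba = np.ones((2, 2, 4), dtype=np.float32)  # opaque white placeholder image
rgba[:, :, 3] = np.argmax(pred, axis=-1)     # 1 keeps the pixel, 0 hides it

print(rgba[:, :, 3])
```

Because `argmax` gives a hard 0/1 decision, the cut-out has sharp edges with no partially transparent pixels.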

In [0]:
plt.imsave('output.png',mask)

# Conclusion:

As you can see, the performance of the trained model is average. This is because we trained on only a small part of the dataset and for only a few epochs. Training on more of the data for more epochs, in an environment with more memory, should give better results.