# HydraNet for Self-Driving Cars
In this notebook, you're going to learn how to build a Neural Network that has:
* Input: **a monocular RGB Image**
* Output: **a Depth Map**, and **a Segmentation Map**

A single model, two different outputs. For that, our model will need to use a principle called Multi-Task Learning.<p>

# 1 - Imports

In [1]:
!pip install -U tensorflow

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting tensorflow
  Downloading tensorflow-2.11.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (588.3 MB)
Collecting tensorflow-estimator<2.12,>=2.11.0
  Downloading tensorflow_estimator-2.11.0-py2.py3-none-any.whl (439 kB)
Collecting keras<2.12,>=2.11.0
  Downloading keras-2.11.0-py2.py3-none-any.whl (1.7 MB)
Collecting tensorboard<2.12,>=2.11
  Downloading tensorboard-2.11.0-py3-none-any.whl (6.0 MB)
Collecting flatbuffers>=2.0
  Downloading flatbuffers-22.12.6-py2.py3-none-any.whl (26 kB)
Installing collected packages: tensorflow-estimator, tensorboard, keras, flatbuffers, tensorflow
  Attempting uninstall: tensorflow-estima

In [2]:
!wget https://hydranets-data.s3.eu-west-3.amazonaws.com/hydranets-data.zip && unzip -q hydranets-data.zip && mv hydranets-data/* . && rm hydranets-data.zip && rm -rf hydranets-data

--2022-12-23 00:57:58--  https://hydranets-data.s3.eu-west-3.amazonaws.com/hydranets-data.zip
Resolving hydranets-data.s3.eu-west-3.amazonaws.com (hydranets-data.s3.eu-west-3.amazonaws.com)... 52.95.156.44
Connecting to hydranets-data.s3.eu-west-3.amazonaws.com (hydranets-data.s3.eu-west-3.amazonaws.com)|52.95.156.44|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 110752264 (106M) [application/zip]
Saving to: ‘hydranets-data.zip’


2022-12-23 00:58:04 (22.2 MB/s) - ‘hydranets-data.zip’ saved [110752264/110752264]



In [3]:
# Install the tensorflow-addons and onnx-tensorflow
!git clone https://github.com/onnx/onnx-tensorflow.git && cd onnx-tensorflow && pip install -e .
!pip install tensorflow-addons

Cloning into 'onnx-tensorflow'...
remote: Enumerating objects: 6516, done.
remote: Counting objects: 100% (465/465), done.
remote: Compressing objects: 100% (202/202), done.
remote: Total 6516 (delta 323), reused 380 (delta 259), pack-reused 6051
Receiving objects: 100% (6516/6516), 1.98 MiB | 15.12 MiB/s, done.
Resolving deltas: 100% (5050/5050), done.
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Obtaining file:///content/onnx-tensorflow
Collecting onnx>=1.10.2
  Downloading onnx-1.13.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.5 MB)
Collecting tensorflow_addons
  Downloading tensorflow_addons-0.19.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
Collecting protobuf<4,>=3.20.2
  Downloading protobuf-3.20.3-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.

In [4]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


In [5]:
%matplotlib inline
import sys
sys.path.append("./onnx-tensorflow")
import matplotlib.pyplot as plt
from PIL import Image
import numpy as np
import cv2
import tensorflow as tf
import tensorflow_addons as tfa
import math
import onnx
from onnx_tf.backend import prepare

# 2 — Creating the HydraNet
We now have 2 DataLoaders: one for training, and one for validation/test. <p>

In the next step, we're going to define our model, following the paper [Real-Time Joint Semantic Segmentation and Depth Estimation Using Asymmetric Annotations](https://arxiv.org/pdf/1809.04766.pdf). If you haven't read it yet, now is the time.<p>

A note: this notebook has been adapted from one by DrSleep, a researcher named Vladimir, who authorized me to adapt it for educational purposes. [Here's the notebook I'm referring to](https://github.com/DrSleep/multi-task-refinenet/blob/master/src/notebooks/ExpNYUDKITTI_joint.ipynb/).

<p>

> ![](https://d3i71xaburhd42.cloudfront.net/435d4b5c30f10753d277848a17baddebd98d3c31/2-Figure1-1.png)

Our model takes an input RGB image, passes it through an encoder and a Lightweight RefineNet decoder, and then splits into 2 heads, one for each task.<p>
Things to note:
* The only **convolutions** we'll need will be 3x3 and 1x1
* We also need a **MaxPooling 5x5**
* **CRP-Blocks** are implemented as Skip-Connection Operations
* **Each Head is made of a 1x1 convolution followed by a 3x3 convolution**, only the data and the loss change there


## 2.1 — Create a HydraNet class

```
class HydraNet(tf.keras.Model):
    def __init__(self):        
        super(HydraNet, self).__init__() # Python 3
        self.num_tasks = 2
        self.num_classes = 6
```

```
net = HydraNet()
```

```
Layer(1) S1
    conv2d(32, k=3, s=2, padding=1, bias=False)
    batchnorm(eps=1e-05, momentum=0.1)
    relu(6) 
    
Layer(2) IRB
    conv2d(32, k=1, s=1, bias=False)
    batchnorm(eps=1e-05, momentum=0.1)
    relu(6) 
    
    
```    

## 2.2 — Defining the Encoder: A MobileNetv2
![](https://iq.opengenus.org/content/images/2020/11/conv_mobilenet_v2.jpg)

In [6]:
def conv3x3(filters, stride=1, bias=False, dilation=1, groups=1):
    # 3x3 convolution
    return tf.keras.layers.Conv2D(filters, kernel_size=3, strides=stride,
                     padding='same', dilation_rate=dilation, use_bias=bias, groups=groups)

In [7]:
# Test conv3x3
conv3x3(filters=32)

<keras.layers.convolutional.conv2d.Conv2D at 0x7f811e0535e0>

In [8]:
def conv1x1(filters, stride=1, bias=False, groups=1):
    # 1x1 convolution
    return tf.keras.layers.Conv2D(filters, kernel_size=1, strides=stride,
                     padding='valid', use_bias=bias, groups=groups)

In [9]:
# Test conv1x1
conv1x1(filters=32)

<keras.layers.convolutional.conv2d.Conv2D at 0x7f80a4a38850>

In [10]:
def batchnorm():
    # batch norm 2d
    # Keras momentum weights the old moving average, so PyTorch's momentum=0.1 becomes 0.9 here
    batch_norm = tf.keras.layers.BatchNormalization(epsilon=1e-5, momentum=0.9)
    batch_norm.trainable = True
    return batch_norm
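
Keras and PyTorch define the batch-norm `momentum` in opposite directions, which matters when porting a spec written as `momentum=0.1`. The sketch below applies both update rules to the same numbers in plain Python. The helper names are hypothetical, and it assumes the 0.1 in the layer listing above follows the PyTorch convention.

```python
def torch_style_update(running, batch_stat, momentum):
    # PyTorch convention: momentum weights the NEW batch statistic
    return (1.0 - momentum) * running + momentum * batch_stat


def keras_style_update(moving, batch_stat, momentum):
    # Keras convention: momentum weights the OLD moving statistic
    return momentum * moving + (1.0 - momentum) * batch_stat


running, batch_stat = 2.0, 4.0
torch_result = torch_style_update(running, batch_stat, momentum=0.1)
keras_same = keras_style_update(running, batch_stat, momentum=0.9)   # 1 - 0.1
keras_naive = keras_style_update(running, batch_stat, momentum=0.1)  # copied as-is
print(torch_result, keras_same, keras_naive)
```

Copying the value unchanged makes the running statistics follow each batch almost entirely, which tends to make inference-time normalization noisy.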

In [11]:
# Test batchnorm
batchnorm()

<keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f80a4158dc0>

In [12]:
def convbnrelu(filters, kernel_size, stride=1, groups=1, act=True):
    # conv -> batchnorm -> optional ReLU6
    # 'same' keeps the spatial size at stride 1 for kernels larger than 1x1; a 1x1 kernel needs no padding
    padding = 'same' if kernel_size > 1 else 'valid'
    layers = [tf.keras.layers.Conv2D(filters, kernel_size, strides=stride, padding=padding,
                                     groups=groups, use_bias=False),
              batchnorm()]
    if act:
        layers.append(tf.keras.layers.ReLU(max_value=6))
    return tf.keras.Sequential(layers)

In [13]:
# Test convbnrelu
display(convbnrelu(32,3,1,1,True).layers)
print()
display(convbnrelu(32,3,1,1,False).layers)

[<keras.layers.convolutional.conv2d.Conv2D at 0x7f811e053dc0>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f811e076ac0>,
 <keras.layers.activation.relu.ReLU at 0x7f80a4b1bac0>]




[<keras.layers.convolutional.conv2d.Conv2D at 0x7f811e053dc0>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x7f811e076d90>]

In [14]:
class InvertedResidualBlock(tf.keras.Model) :
    def __init__(self,in_planes, filters, expansion_factor, stride) :
        super(InvertedResidualBlock, self).__init__()
        intermed_planes = in_planes * expansion_factor
        self.residual = (in_planes == filters) and (stride == 1) # Boolean/Condition
        # Expand with a 1x1 conv, filter with a 3x3 depthwise conv (the only strided layer),
        # then project back down with a linear 1x1 conv
        self.IBR = tf.keras.Sequential([convbnrelu(intermed_planes, kernel_size=1, act=True),
                               convbnrelu(intermed_planes, kernel_size=3,
                                          stride=stride, groups=intermed_planes, act=True),
                               convbnrelu(filters, kernel_size=1, act=False)])
        
    def call(self, inputs) :
        x = self.IBR(inputs)
        if self.residual :
            return (x + inputs)
        else :
            return x
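
To make the expand, depthwise, project pattern of an inverted residual block concrete, here is a small pure-Python sketch that computes the intermediate width and the number of convolution weights for one block. The helper name is hypothetical, and the counts assume bias-free convolutions and ignore batch-norm parameters.

```python
def inverted_residual_summary(in_planes, out_planes, expansion_factor, stride):
    """Channel widths and conv weight counts for one inverted residual block."""
    intermed = in_planes * expansion_factor
    expand = in_planes * intermed       # 1x1 pointwise expansion
    depthwise = 3 * 3 * intermed        # 3x3 depthwise: one filter per channel
    project = intermed * out_planes     # 1x1 linear projection (no activation)
    return {
        "intermed_planes": intermed,
        "conv_weights": expand + depthwise + project,
        "uses_residual": in_planes == out_planes and stride == 1,
    }


# First and second block of the [6, 24, 2, 2] stage
first = inverted_residual_summary(16, 24, expansion_factor=6, stride=2)
repeat = inverted_residual_summary(24, 24, expansion_factor=6, stride=1)
print(first)
print(repeat)
```

Only the second block keeps the skip connection, because the first one changes both the channel count and the resolution.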

In [15]:
# class MobileNetV2(tf.keras.Model) :
#     def __init__(self):        
#         super(MobileNetV2, self).__init__()

#         self.LAYERS=[]
#         mobilenet_config = [[1, 16, 1, 1], # expansion rate, output channels, number of repeats, stride
#                         [6, 24, 2, 2],
#                         [6, 32, 3, 2],
#                         [6, 64, 4, 2],
#                         [6, 96, 3, 1],
#                         [6, 160, 3, 2],
#                         [6, 320, 1, 1],
#                         ]
#         self.in_channels = 32 # number of input channels
#         self.num_layers = len(mobilenet_config)
#         self.layer1 = convbnrelu(filters=32, kernel_size=3, stride=2) # This is the first layer of the first 
#         layer1_model = tf.keras.Sequential(self.layer1)
#         layer1_model._name = 'layer1'
#         self.LAYERS.append(layer1_model)

#         c_layer = 2
#         for t,c,n,s in (mobilenet_config):
#             layers = []

#             for idx in range(n):
#                 layers.append(InvertedResidualBlock(self.in_channels, c, expansion_factor=t, stride=s if idx == 0 else 1))
#                 self.in_channels = c
#             self.LAYERS.append(tf.keras.Sequential(layers))
#             c_layer += 1

# #         for model, i in zip(LAYERS[1:], range(2,9)):
# #             model._name = f'layer{i}'

#     def call(self, inputs) :
#         l1 = self.LAYERS[0](inputs)
#         l2 = self.LAYERS[1](l1)
#         l3 = self.LAYERS[2](l2)
#         l4 = self.LAYERS[3](l3)
#         l5 = self.LAYERS[4](l4)
#         l6 = self.LAYERS[5](l5)
#         l7 = self.LAYERS[6](l6)
#         l8 = self.LAYERS[7](l7)
        
#         return l3, l4, l5, l6, l7, l8

class MobileNetv2(tf.keras.Model):
    def __init__(self, return_idx=(1, 2, 3, 4, 5, 6)):
        super().__init__()
        # expansion rate, output channels, number of repeats, stride
        self.mobilenet_config = [
        [1, 16, 1, 1],
        [6, 24, 2, 2],
        [6, 32, 3, 2],
        [6, 64, 4, 2],
        [6, 96, 3, 1],
        [6, 160, 3, 2],
        [6, 320, 1, 1],
        ]
        self.in_channels = 32  # number of input channels
        self.num_layers = len(self.mobilenet_config)
        self.layer1 = convbnrelu(self.in_channels, kernel_size=3, stride=2)

        # Indices into the per-stage outputs collected in call(); the default returns layer3..layer8
        self.return_idx = list(return_idx)

        c_layer = 2
        for t, c, n, s in self.mobilenet_config:
            layers = []
            for idx in range(n):
                layers.append(InvertedResidualBlock(self.in_channels,c,expansion_factor=t,stride=s if idx == 0 else 1,))
                self.in_channels = c
            setattr(self, "layer{}".format(c_layer), tf.keras.Sequential(layers))
            c_layer += 1

        self._out_c = [self.mobilenet_config[idx][1] for idx in self.return_idx] # Output: [24, 32, 64, 96, 160, 320]

    def call(self, x):
        outs = []
        x = self.layer1(x)
        outs.append(self.layer2(x))  # 16, x / 2
        outs.append(self.layer3(outs[-1]))  # 24, x / 4
        outs.append(self.layer4(outs[-1]))  # 32, x / 8
        outs.append(self.layer5(outs[-1]))  # 64, x / 16
        outs.append(self.layer6(outs[-1]))  # 96, x / 16
        outs.append(self.layer7(outs[-1]))  # 160, x / 32
        outs.append(self.layer8(outs[-1]))  # 320, x / 32
        return [outs[idx] for idx in self.return_idx]
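
The channel and resolution comments in `call` can be checked by walking the configuration table. The sketch below is plain Python with a hypothetical helper name; it assumes, as in the class above, that `layer1` is a stride-2 convolution and that only the first block of each stage applies the stage's stride.

```python
mobilenet_config = [
    # expansion rate, output channels, number of repeats, stride
    [1, 16, 1, 1],
    [6, 24, 2, 2],
    [6, 32, 3, 2],
    [6, 64, 4, 2],
    [6, 96, 3, 1],
    [6, 160, 3, 2],
    [6, 320, 1, 1],
]


def stage_summary(config, stem_stride=2):
    """Return (layer name, output channels, cumulative downsampling) per stage."""
    total_stride = stem_stride
    rows = []
    for layer_idx, (_, channels, _, stride) in enumerate(config, start=2):
        total_stride *= stride
        rows.append((f"layer{layer_idx}", channels, total_stride))
    return rows


rows = stage_summary(mobilenet_config)
for name, channels, factor in rows:
    print(f"{name}: {channels} channels, input / {factor}")
```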
     

## 2.3 — Defining the Decoder - A Multi-Task Lightweight RefineNet
Paper: https://arxiv.org/pdf/1810.03272.pdf

![](https://d3i71xaburhd42.cloudfront.net/4d653b19ce1c7cba79fc2f11271fb90f7744c95c/4-Figure1-1.png)

In [16]:
class CRPBlock(tf.keras.Model):
    """CRP definition"""
    def __init__(self, in_planes, out_planes, n_stages, groups=False):
        super(CRPBlock, self).__init__() #Python 3
        for i in range(n_stages):
            setattr(self, '{}_{}'.format(i + 1, 'outvar_dimred'),
                    conv1x1(out_planes, stride=1,
                            bias=False, groups=in_planes if groups else 1)) #setattr(object, name, value)

        self.stride = 1
        self.n_stages = n_stages
        self.maxpool = tf.keras.layers.MaxPool2D(pool_size=5, strides=1, padding='same')

    def call(self, inputs):
        top = inputs
        for i in range(self.n_stages):
            top = self.maxpool(top)
            top = getattr(self, '{}_{}'.format(i + 1, 'outvar_dimred'))(top)#getattr(object, name[, default])
            inputs = top + inputs
        return inputs
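
To see what chained residual pooling does to a feature map, here is a NumPy-only sketch of the same loop: pool, mix channels, add to a running sum. It is an illustration under assumptions, not the Keras layer: the function names are hypothetical, the input is a single `(H, W, C)` array, and a random `(C, C)` matrix stands in for each learned 1x1 convolution.

```python
import numpy as np


def maxpool5x5_same(x):
    """5x5 max pooling, stride 1, 'same' padding, on an (H, W, C) float array."""
    padded = np.pad(x, ((2, 2), (2, 2), (0, 0)), constant_values=-np.inf)
    windows = np.lib.stride_tricks.sliding_window_view(padded, (5, 5), axis=(0, 1))
    return windows.max(axis=(-2, -1))  # the two window axes are appended last


def crp_block(x, weights):
    """Each stage pools the previous stage's output, mixes channels, and adds to the sum."""
    top, out = x, x
    for w in weights:
        top = maxpool5x5_same(top)
        top = top @ w      # a 1x1 convolution is a per-pixel matrix product
        out = out + top
    return out


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 10, 4))
weights = [rng.normal(scale=0.1, size=(4, 4)) for _ in range(4)]
y = crp_block(x, weights)
print(x.shape, y.shape)
```

Because the pooling uses stride 1 with padding, the block widens the receptive field without changing the feature map's shape.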

In [17]:
class LightweightRefineNet(tf.keras.Model):
    def __init__(self, num_tasks, num_classes) :
        super(LightweightRefineNet, self).__init__() 
        
        self.num_tasks = num_tasks
        self.num_classes = num_classes
        
        self.conv8 = conv1x1(256, bias=False)
        self.conv7 = conv1x1(256, bias=False)
        self.conv6 = conv1x1(256, bias=False)
        self.conv5 = conv1x1(256, bias=False)
        self.conv4 = conv1x1(256, bias=False)
        self.conv3 = conv1x1(256, bias=False)
        self.crp4 = self._make_crp(256, 256, 4, groups=False)
        self.crp3 = self._make_crp(256, 256, 4, groups=False)
        self.crp2 = self._make_crp(256, 256, 4, groups=False)
        self.crp1 = self._make_crp(256, 256, 4, groups=True)

        self.conv_adapt4 = conv1x1(256, bias=False)
        self.conv_adapt3 = conv1x1(256, bias=False)
        self.conv_adapt2 = conv1x1(256, bias=False)

        self.pre_depth = conv1x1(256, groups=256, bias=False)
        self.depth = conv3x3(1, bias=True)
        self.pre_segm = conv1x1(256, groups=256, bias=False)
        self.segm = conv3x3(self.num_classes, bias=True)
        self.relu = tf.keras.layers.ReLU(6)

        if self.num_tasks == 3:
            self.pre_normal = conv1x1(256, groups=256, bias=False)
            self.normal = conv3x3(3, bias=True)
                                 
                                 
    def _make_crp(self, in_planes, out_planes, stages, groups=False):
        layers = [CRPBlock(in_planes, out_planes,stages, groups=groups)]
        return tf.keras.Sequential(layers)
    
    
    def call(self, l3, l4, l5, l6, l7, l8) :
        l8 = self.conv8(l8)
        l7 = self.conv7(l7)
        l7 = self.relu(l8+l7)
        l7 = self.crp4(l7)
        l7 = self.conv_adapt4(l7)
        l7 = tf.image.resize(l7, tf.shape(l6)[1:3], method='bilinear')  # upsample to l6's height and width (NHWC)

        l6 = self.conv6(l6)
        l5 = self.conv5(l5)
        l5 = self.relu(l5+l6+l7)
        l5 = self.crp3(l5)
        l5 = self.conv_adapt3(l5)
        l5 = tf.image.resize(l5, tf.shape(l4)[1:3], method='bilinear')  # upsample to l4's height and width (NHWC)
        l4 = self.conv4(l4)
        l4 = self.relu(l5+l4)
        l4 = self.crp2(l4)
        l4 = self.conv_adapt2(l4)
        l4 = tf.image.resize(l4, tf.shape(l3)[1:3], method='bilinear')  # upsample to l3's height and width (NHWC)

        l3 = self.conv3(l3)
        l3 = self.relu(l3+l4)
        l3 = self.crp1(l3)

        out_segm = self.pre_segm(l3)
        out_segm = self.relu(out_segm)
        out_segm = self.segm(out_segm)

        out_depth = self.pre_depth(l3)
        out_depth = self.relu(out_depth)
        out_depth = self.depth(out_depth)
        
        if self.num_tasks == 3:
            out_n = self.pre_normal(l3)
            out_n = self.relu(out_n)
            out_n = self.normal(out_n)
            return out_segm, out_depth, out_n
        
        else:
            return out_segm, out_depth


## 2.4 — Define the HydraNet Forward Function

> ![](https://d3i71xaburhd42.cloudfront.net/435d4b5c30f10753d277848a17baddebd98d3c31/2-Figure1-1.png)

# 3 — Run the Model

## 3.1 — Load the Model Weights

In [18]:
# if torch.cuda.is_available():
#     _ = hydranet.cuda()
# _ = hydranet.eval()

In [19]:
class HydraNet(tf.keras.Model):
    def __init__(self, num_tasks, num_classes) :
        super(HydraNet, self).__init__()
        
        self.num_tasks = num_tasks
        self.num_classes = num_classes

        self.encoder = MobileNetv2()
        self.decoder = LightweightRefineNet(num_tasks,num_classes)
        
    def call(self, inputs) :
        l3, l4, l5, l6, l7, l8 = self.encoder(inputs)
        # The decoder returns segmentation first, then depth (and normals when num_tasks == 3)
        if self.num_tasks == 3 :
            out_segm, out_depth, out_n = self.decoder(l3, l4, l5, l6, l7, l8)
            return out_segm, out_depth, out_n
        else :
            out_segm, out_depth = self.decoder(l3, l4, l5, l6, l7, l8)
            return out_segm, out_depth

In [24]:
hydranet = HydraNet(num_tasks=2, num_classes=6)

In [40]:
hydranet.compile(optimizer=[tf.keras.optimizers.SGD(learning_rate=1e-2, momentum=0.9, weight_decay=1e-5), 
                            tf.keras.optimizers.SGD(learning_rate=1e-3, momentum=0.9, weight_decay=1e-5)],
                 loss=[tf.keras.losses.SparseCategoricalCrossentropy(), tf.keras.losses.Huber()])
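
When `compile` receives a list of losses, Keras applies one loss per output and, unless `loss_weights` is given, adds them with equal weight. The NumPy sketch below reproduces that arithmetic on toy numbers. The function names and values are illustrative; it assumes the segmentation scores are already probabilities (the default `from_logits=False`) and a Huber `delta` of 1.0.

```python
import numpy as np


def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-likelihood; probs is (N, C), labels is (N,) of class ids."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())


def huber(pred, target, delta=1.0):
    """Quadratic for small errors, linear for large ones."""
    err = np.abs(pred - target)
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return float(np.where(err <= delta, quadratic, linear).mean())


probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])  # two pixels, three classes
labels = np.array([0, 1])
depth_pred = np.array([10.0, 20.5])
depth_true = np.array([12.0, 20.0])

segm_loss = sparse_categorical_crossentropy(probs, labels)
depth_loss = huber(depth_pred, depth_true)
total_loss = segm_loss + depth_loss  # unweighted sum of the two task losses
print(segm_loss, depth_loss, total_loss)
```

The 2 m depth error falls in the linear branch, which is why Huber is less sensitive to outlier depth values than a squared error.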


In [None]:
hydranet.fit(train)

In [None]:
# import onnx
# from onnx_tf.backend import prepare
 
# onnx_model = onnx.load("/content/drive/MyDrive/Colab Notebooks/HydraNets/KITTI/TensorFlow/ExpKITTI_joint.onnx")
# tf_rep = prepare(onnx_model)
# tf_rep.export_graph("model.pb")

In [21]:
# img = np.asarray(Image.open('/content/data/0000000000.png'), dtype=np.float32).transpose(2, 0, 1)[None]
# tf_rep.run(img)

In [None]:
# model = tf.saved_model.load('/content/model.pb')
# infer = model.signatures['serving_default']

# img_var = tf.convert_to_tensor(prepare_img(img).transpose(2, 0, 1)[None])
# img_var = tf.cast(img_var, tf.float32)

In [None]:
# outputs = list(infer.structured_outputs)
# y = infer(img_var)[outputs[0]].numpy()
# # print(y)
# outputs

In [None]:
# !rm -rf '/content/model'#, '/content/model.pb'

## 3.2 — Preprocess Images

In [None]:
IMG_SCALE  = 1./255
IMG_MEAN = np.array([0.485, 0.456, 0.406]).reshape((1, 1, 3))
IMG_STD = np.array([0.229, 0.224, 0.225]).reshape((1, 1, 3))

def prepare_img(img):
    return (img * IMG_SCALE - IMG_MEAN) / IMG_STD
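
A quick check of the normalization on a tiny all-white image shows the broadcasting at work: the `(1, 1, 3)` mean and standard deviation apply per channel to an `(H, W, 3)` array. The 2x3 image is a made-up stand-in for a real frame.

```python
import numpy as np

IMG_SCALE = 1. / 255
IMG_MEAN = np.array([0.485, 0.456, 0.406]).reshape((1, 1, 3))
IMG_STD = np.array([0.229, 0.224, 0.225]).reshape((1, 1, 3))


def prepare_img(img):
    return (img * IMG_SCALE - IMG_MEAN) / IMG_STD


white = np.full((2, 3, 3), 255, dtype=np.uint8)  # tiny stand-in for an RGB image
out = prepare_img(white)
print(out.shape, out.dtype)
print(out[0, 0])
```

The result is `float64` even for a `uint8` input, so a cast to `float32` is usually needed before the array is passed to the network.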

## 3.3 — Load and Run an Image

In [None]:
# Pre-processing and post-processing constants #
CMAP = np.load('cmap_kitti.npy')
NUM_CLASSES = 6

In [None]:
print(CMAP)

In [None]:
import glob
images_files = glob.glob('data/*.png')
idx = np.random.randint(0, len(images_files))

img_path = images_files[idx]
img = np.array(Image.open(img_path))
plt.imshow(img)
plt.show()

In [None]:
def pipeline(img):
    # Normalize and add a batch axis; Keras layers expect NHWC, so no channel transpose is needed
    img_var = tf.cast(prepare_img(img)[None], tf.float32)
    segm, depth = hydranet(img_var, training=False)
    # Resize both predictions back to the input resolution (cv2 expects (width, height))
    segm = cv2.resize(segm[0, ..., :NUM_CLASSES].numpy(),
                    img.shape[:2][::-1],
                    interpolation=cv2.INTER_CUBIC)
    depth = cv2.resize(depth[0, ..., 0].numpy(),
                    img.shape[:2][::-1],
                    interpolation=cv2.INTER_CUBIC)
    segm = CMAP[segm.argmax(axis=2)].astype(np.uint8)
    depth = np.abs(depth)
    return depth, segm
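
The `CMAP[segm.argmax(axis=2)]` step turns per-pixel class scores into a color image in a single indexing operation. Here is a minimal sketch on a 2x2 example; the three-class color map is hypothetical, whereas the notebook loads the real one from `cmap_kitti.npy`.

```python
import numpy as np

# Hypothetical 3-class color map, one RGB triple per class id
toy_cmap = np.array([[128, 64, 128], [0, 0, 142], [70, 130, 180]], dtype=np.uint8)

# Per-pixel class scores, shape (H, W, classes)
scores = np.zeros((2, 2, 3), dtype=np.float32)
scores[0, 0, 0] = 1.0
scores[0, 1, 1] = 1.0
scores[1, 0, 2] = 1.0
scores[1, 1, 1] = 1.0

class_ids = scores.argmax(axis=2)   # (H, W) winning class per pixel
color_mask = toy_cmap[class_ids]    # integer-array indexing maps each id to its RGB triple
print(class_ids)
print(color_mask.shape)
```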

In [None]:
depth, segm = pipeline(img)

In [None]:
f, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(30,20))
ax1.imshow(img)
ax1.set_title('Original', fontsize=30)
ax2.imshow(segm)
ax2.set_title('Predicted Segmentation', fontsize=30)
ax3.imshow(depth, cmap="plasma", vmin=0, vmax=80)
ax3.set_title("Predicted Depth", fontsize=30)
plt.show()

## 3.4 — Run on a Video

In [None]:
print(img.shape)
print(depth.shape)
print(segm.shape)

In [None]:
import matplotlib.cm as cm
import matplotlib.colors as co

def depth_to_rgb(depth):
    normalizer = co.Normalize(vmin=0, vmax=80)
    mapper = cm.ScalarMappable(norm=normalizer, cmap='plasma')
    colormapped_im = (mapper.to_rgba(depth)[:, :, :3] * 255).astype(np.uint8)
    return colormapped_im

depth_rgb = depth_to_rgb(depth)
print(depth_rgb.shape)
plt.imshow(depth_rgb)
plt.show()

In [None]:
print(img.shape)
print(depth_rgb.shape)
print(segm.shape)
new_img = np.vstack((img, segm, depth_rgb))
plt.imshow(new_img)
plt.show()

In [None]:
video_files = sorted(glob.glob("data/*.png"))

# Reuse the Keras `hydranet` instance built in section 3.1.
# The PyTorch-style setup below does not apply to the Keras model
# (torch is never imported in this notebook), so it stays commented out.
# hydranet = HydraNet()
# hydranet.define_mobilenet()
# hydranet.define_lightweight_refinenet()
# hydranet._initialize_weights()

# if torch.cuda.is_available():
#     _ = hydranet.cuda()
# _ = hydranet.eval()

# ckpt = torch.load('ExpKITTI_joint.ckpt')
# hydranet.load_state_dict(ckpt['state_dict'])

# Run the pipeline
result_video = []
for idx, img_path in enumerate(video_files):
    image = np.array(Image.open(img_path))
    h, w, _ = image.shape 
    depth, seg = pipeline(image)
    result_video.append(cv2.cvtColor(cv2.vconcat([image, seg, depth_to_rgb(depth)]), cv2.COLOR_BGR2RGB))

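import os
os.makedirs('output', exist_ok=True)  # VideoWriter does not raise if the target directory is missing
out = cv2.VideoWriter('output/out.mp4', cv2.VideoWriter_fourcc(*'mp4v'), 15, (w, 3*h))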

for i in range(len(result_video)):
    out.write(result_video[i])
out.release()

In [None]:
from IPython.display import HTML
from base64 import b64encode
mp4 = open('output/out.mp4','rb').read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML("""
<video width=800 controls>
      <source src="%s" type="video/mp4">
</video>
""" % data_url)

## 3D Segmentation

Did you ever wonder... How is segmentation used in self-driving cars? Like, **once you have the map, what do you do with it**?
<p>
Let's see something called 3D Segmentation — Fusing a Depth Map with a Segmentation Map!
<p>

In my course [MASTER STEREO VISION](https://courses.thinkautonomous.ai/stereo-vision), I teach how to do something called **3D Reconstruction** from a Depth Map and Calibration Parameters. <p>
In this course, we're going to see how to do it with Open3D, my go-to library for Point Clouds, and we'll see how to build 3D Segmentation Algorithms by fusing the Depth Map (3D) with the Segmentation Map.

In [None]:
!pip install open3d==0.14.1

In [None]:
import open3d as o3d

In [None]:
o3d.__version__

### RGBD - Fuse the RGB Image and the Depth Map

The first thing we'll implement is to create an RGBD Image by fusing the RGB Image with the Depth Map. For that, we'll use [Open3D's Class RGBD Image](http://www.open3d.org/docs/release/python_api/open3d.geometry.RGBDImage.html) and the function create_from_color_and_depth(color, depth).<p>
It looks pretty straightforward; we just need to make sure that the images are loaded as [Open3D Images](http://www.open3d.org/docs/release/python_api/open3d.geometry.Geometry.html?highlight=image#open3d.geometry.Geometry.Image).

In [None]:
rgbd = #TODO: Call the Function

Next, we'll use the function create_from_rgbd_image to build a Point Cloud based on this. For that, we'll need the camera's intrinsic parameters. <p>
If you'd like to learn more about this, I invite you to take my course on [Stereo Vision](https://courses.thinkautonomous.ai/stereo-vision). In this course, I'm just going to give them to you.

In [None]:
o3d.camera.PinholeCameraIntrinsic??

In [None]:
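# cy must fall inside the image height (375); 172.85 is assumed here from the KITTI rectified-camera calibration
intrinsics = o3d.camera.PinholeCameraIntrinsic(width = 1242, height = 375, fx = 721., fy = 721., cx = 609., cy = 172.85)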
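
To show what the intrinsics are used for, here is a NumPy sketch of pinhole back-projection: each pixel `(u, v)` with depth `Z` maps to `X = (u - cx) * Z / fx` and `Y = (v - cy) * Z / fy`. This illustrates the geometry only and is not Open3D's implementation; the function name, the 4x6 depth map, and the toy intrinsics are all hypothetical.

```python
import numpy as np


def backproject(depth, fx, fy, cx, cy):
    """Turn an (H, W) depth map into an (H*W, 3) array of camera-frame points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # u: column index, v: row index
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


depth = np.full((4, 6), 10.0)  # a flat wall 10 units away
points = backproject(depth, fx=5.0, fy=5.0, cx=3.0, cy=2.0)
print(points.shape)
print(points[0], points[15])
```

The pixel at the principal point (row 2, column 3) lands on the optical axis, so its X and Y are zero.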

In [None]:
point_cloud = #TODO: Create A Point Cloud
o3d.io.write_point_cloud("test.pcd", point_cloud)

### 3D Segmentation — Fuse the Segmentation Map with the Depth Map
From here on, the process is exactly the same. But instead of building the RGBD Image from the normal RGB Image, we'll build it from the Segmentation Map, so that every 3D point carries the color of its class.
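
The idea of fusing the two maps can be sketched in NumPy before reaching for Open3D: back-project the depth, attach each pixel's class color, and then select points by class. Everything here is a toy assumption (the 2x3 maps, the two-class color map, and the intrinsics); it shows the data layout rather than the Open3D calls.

```python
import numpy as np

h, w = 2, 3
depth = np.array([[5.0, 5.0, 8.0], [5.0, 8.0, 8.0]])
class_ids = np.array([[0, 0, 1], [0, 1, 1]])        # hypothetical per-pixel classes
toy_cmap = np.array([[128, 64, 128], [0, 0, 142]])  # hypothetical class colors

fx, fy, cx, cy = 2.0, 2.0, 1.0, 0.5                 # toy intrinsics
u, v = np.meshgrid(np.arange(w), np.arange(h))
xyz = np.stack([(u - cx) * depth / fx, (v - cy) * depth / fy, depth], axis=-1)

colors = toy_cmap[class_ids] / 255.0                # per-point color in [0, 1]
cloud = np.concatenate([xyz, colors], axis=-1).reshape(-1, 6)  # x, y, z, r, g, b

# Keep only the 3D points that belong to class 1
class1_points = cloud[class_ids.reshape(-1) == 1, :3]
print(cloud.shape, class1_points.shape)
```

Once each point carries a class, per-object queries such as the distance to the nearest point of a given class reduce to simple array filters.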

In [None]:
rgbd = #TODO: Call the Function

In [None]:
point_cloud = #TODO: Create A Point Cloud

In [None]:
o3d.io.write_point_cloud("test_segm.pcd", point_cloud)