# "Multi Task Learning"

> "Can we use a single model to perform multiple tasks?"

- toc: true
- branch: master
- badges: false
- comments: true
- categories: [Computer Vision]
- hide: false
- search_exclude: false
- image: images/post-thumbnails/mtl.png
- metadata_key1: notes
- metadata_key2: 

# Definition


- **Multi-Task Learning**: Adapting a single neural network model to perform multiple tasks, such as segmentation and depth estimation.
- **Segmentation**: Per-pixel classification of an image to identify objects such as trees, roads, and people.
- **Depth estimation**: Estimating the distance from the camera to each point in the scene.

We will be producing an output like the one below.

> youtube: https://youtu.be/iXpnkIxrbq0


# Explanation

## What tasks can be combined?

Based on this [paper](https://arxiv.org/pdf/1905.07553.pdf), the following observations were made:

- Segmentation + any other task: segmentation gets better.
- Any other task + segmentation: the other task gets worse.
- Any other task + normals: the other task gets better.


## Architecture

Two recent and popular multi-task architectures are:

- [Google Pathways](https://blog.google/technology/ai/introducing-pathways-next-generation-ai-architecture/)
- [Tesla's Hydranets](https://www.youtube.com/watch?v=IHH47nZ7FZU&t=248s)


## Our Implementation

We will focus on Tesla's HydraNets in this blog post. We will recreate a modified version of the HydraNets based on the paper [Real-Time Joint Semantic Segmentation and Depth Estimation Using Asymmetric Annotations](https://arxiv.org/pdf/1809.04766.pdf).

![](https://abhisheksreesaila.github.io/blog/images/mtl/ModifiedHydranets.png "Architecture")


### Code Walkthrough

It's an encoder-decoder network. We use **[MobileNetv2](https://arxiv.org/abs/1801.04381)** as the encoder and **[RefineNet](https://arxiv.org/pdf/1810.03272.pdf)** as the decoder.
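
To make the "one model, several tasks" idea concrete, here is a minimal sketch in plain Python, not the notebook's actual code. It assumes the common pattern of a shared backbone whose features feed one small head per task. `MultiTaskModel`, `toy_backbone`, and both toy heads are hypothetical names invented for illustration, and a "picture" is just a list of numbers.

```python
from typing import Callable, Dict, List

Features = List[float]


class MultiTaskModel:
    """Toy stand-in for a shared encoder-decoder with one head per task."""

    def __init__(self,
                 backbone: Callable[[List[float]], Features],
                 heads: Dict[str, Callable[[Features], list]]):
        self.backbone = backbone
        self.heads = heads
        self.backbone_calls = 0  # counts how often the shared part runs

    def __call__(self, image: List[float]) -> Dict[str, list]:
        # The expensive shared computation happens once per image...
        self.backbone_calls += 1
        features = self.backbone(image)
        # ...and every task-specific head reuses the same features.
        return {task: head(features) for task, head in self.heads.items()}


def toy_backbone(image: List[float]) -> Features:
    # Hypothetical "feature extractor": scale every pixel.
    return [0.5 * pixel for pixel in image]


def toy_segmentation_head(features: Features) -> List[int]:
    # Hypothetical per-pixel classifier: threshold into class 0 or 1.
    return [1 if f > 1.0 else 0 for f in features]


def toy_depth_head(features: Features) -> List[float]:
    # Hypothetical per-pixel regressor: one real value per pixel.
    return [10.0 - f for f in features]


model = MultiTaskModel(
    toy_backbone,
    {"segmentation": toy_segmentation_head, "depth": toy_depth_head},
)
outputs = model([1.0, 2.0, 3.0, 4.0])
print(outputs["segmentation"])  # class id per pixel
print(outputs["depth"])         # depth value per pixel
print(model.backbone_calls)     # the backbone ran once for both tasks
```

In a real network the backbone would be the MobileNetv2 encoder plus the RefineNet decoder, and each head would be a small stack of convolutions, but the control flow is the same: one forward pass through the shared part, then one cheap pass per task.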


#### Encoder

![](https://abhisheksreesaila.github.io/blog/images/mtl/mobilenet_v2.jpeg "MobileNetv2")

There are two important concepts:

- Depthwise Separable Convolutions

![](https://abhisheksreesaila.github.io/blog/images/mtl/DepthWise.jpg "Depth Wise")

![](https://abhisheksreesaila.github.io/blog/images/mtl/NormalConv.jpg "Standard Convolutions")

![](https://abhisheksreesaila.github.io/blog/images/mtl/ReducedParams.jpg "Reduction in parameters")
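
The reduction in parameters can be checked with simple arithmetic. The sketch below assumes square kernels and ignores bias and batch-norm parameters. The function names and the channel sizes (32 in, 64 out) are chosen only for illustration.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    # Every output channel has its own kernel x kernel filter over all input channels.
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    # Depthwise step: one kernel x kernel filter per input channel.
    depthwise = kernel * kernel * c_in
    # Pointwise step: a 1x1 convolution that mixes channels.
    pointwise = c_in * c_out
    return depthwise + pointwise


standard = standard_conv_params(kernel=3, c_in=32, c_out=64)
separable = depthwise_separable_params(kernel=3, c_in=32, c_out=64)

print(standard)                        # 18432
print(separable)                       # 2336
print(round(standard / separable, 2))  # 7.89
```

For a 3x3 kernel the saving approaches 9x as the number of output channels grows, because the ratio of separable to standard parameters is `1 / c_out + 1 / kernel**2`.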

- Residual Networks & Inverted Residual Networks

![](https://abhisheksreesaila.github.io/blog/images/mtl/ResidualvsNonResidual.png "Residual vs NonResidual Block")

![](https://abhisheksreesaila.github.io/blog/images/mtl/ResidualNetwork.jpg "Residual Block")
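
As a rough sketch of how an inverted residual block is laid out, the helper below lists the three convolutions described in the MobileNetv2 paper (1x1 expansion, 3x3 depthwise, 1x1 linear projection) with their channel counts and weight counts. `inverted_residual_plan` is a hypothetical helper, not part of any library. It counts convolution weights only and assumes the skip connection is used when the stride is 1 and the input and output channel counts match.

```python
from typing import List, Tuple

Layer = Tuple[str, int, int, int]  # (name, channels in, channels out, weights)


def inverted_residual_plan(c_in: int, c_out: int, expansion: int = 6,
                           stride: int = 1, kernel: int = 3
                           ) -> Tuple[List[Layer], bool]:
    hidden = c_in * expansion  # the block is wide in the middle, narrow at the ends
    layers = [
        ("expand 1x1", c_in, hidden, c_in * hidden),
        ("depthwise 3x3", hidden, hidden, kernel * kernel * hidden),
        ("project 1x1 (linear)", hidden, c_out, hidden * c_out),
    ]
    # The shortcut joins the two narrow ends, so shapes must match exactly.
    uses_skip = stride == 1 and c_in == c_out
    return layers, uses_skip


layers, uses_skip = inverted_residual_plan(c_in=24, c_out=24)
for name, cin, cout, weights in layers:
    print(f"{name:22s} {cin:4d} -> {cout:4d}  weights={weights}")
total_weights = sum(weights for _, _, _, weights in layers)
print(total_weights, uses_skip)

_, downsampling_skip = inverted_residual_plan(c_in=24, c_out=32, stride=2)
print(downsampling_skip)
```

A classic residual block does the opposite: it is wide at the ends and narrow in the middle, and its shortcut connects the wide layers.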

#### Decoder

![](https://abhisheksreesaila.github.io/blog/images/mtl/RefineNet.png "RefineNet")

##### Highlights of the network

- RCU (Residual Conv Unit)
  - A simplified version of the convolution unit in the original ResNet, with the batch-normalization layers removed.

- Multi-Resolution Fusion
  - All path inputs are fused into a high-resolution feature map by the multi-resolution fusion block.

- Chained Residual Pooling
  - Chained residual pooling aims to capture background context from a large image region. It efficiently pools features with multiple window sizes and fuses them together using learnable weights. The component is built as a chain of multiple pooling blocks, each consisting of one max-pooling layer and one convolution layer. Each pooling block takes the output of the previous pooling block as input, so it can re-use the result of the previous pooling operation and access features from a large region without using a large pooling window.
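
The chaining idea is easier to see on a tiny example. Below is a one-dimensional, pure-Python sketch, not the notebook's implementation: `max_pool_1d` and `chained_residual_pooling` are hypothetical helpers, a single scalar weight stands in for each learnable convolution, and the pooling window of 5 with stride 1 is an assumption made for illustration.

```python
from typing import List


def max_pool_1d(values: List[float], window: int = 5) -> List[float]:
    """Stride-1 max pooling that keeps the input length (edges use a clipped window)."""
    half = window // 2
    n = len(values)
    return [max(values[max(0, i - half): min(n, i + half + 1)]) for i in range(n)]


def chained_residual_pooling(features: List[float],
                             conv_weights: List[float],
                             window: int = 5) -> List[float]:
    output = list(features)  # residual path starts from the input itself
    path = list(features)
    for weight in conv_weights:
        path = max_pool_1d(path, window)    # pool the previous block's output
        path = [weight * v for v in path]   # scalar stand-in for the convolution
        output = [o + p for o, p in zip(output, path)]  # fuse by summation
    return output


# A single active position in the middle of an 11-element feature row.
impulse = [0.0] * 5 + [1.0] + [0.0] * 5
result = chained_residual_pooling(impulse, conv_weights=[1.0, 1.0])
print(result)
# [0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 2.0, 2.0, 1.0, 1.0, 0.0]
```

After two chained blocks the single active position influences 9 neighbouring positions, even though each pooling window only covers 5. That is the sense in which the chain reaches a large region without a large pooling window.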

Here is the [code](https://colab.research.google.com/drive/1Q8Oi37D-Qf2d5Oemo5aXuq1kPC5vePCZ?usp=sharing)

##### Output
![](https://abhisheksreesaila.github.io/blog/images/mtl/HydranetOutput.png "Output")

> youtube: https://youtu.be/iXpnkIxrbq0

# References

[Implementation Notebook](https://github.com/DrSleep/multi-task-refinenet/blob/master/src/notebooks/ExpNYUDKITTI_joint.ipynb/)

[Architecture Images From This Paper](https://arxiv.org/pdf/1801.04381.pdf)