## 💻 UnpackAI DL201 Bootcamp - Week 3 - CV tasks

### 📕 Learning Objectives
* Explore working examples of using pre-trained models for common CV tasks
* Get tips and insights for adapting the examples to the final project

### 📖 Concepts map
* image classification
* object detection
* image segmentation

### Code preparation

In [None]:
# Install packages (comment if not required)
!pip install -Uqq  ipywidgets fastai fastbook

# Import dependencies for all sample AI applications (again, to test the environment)
import os
import numpy as np
import tensorflow as tf
import cv2
import matplotlib.pyplot as plt
import pandas
import torch
from fastai.vision.all import *
from fastai.text.all import *
from fastai.collab import *
from fastai.tabular.all import *
import ipywidgets as widgets
from IPython.display import Image
import urllib.request
import requests
from huggingface_hub import from_pretrained_keras

In [None]:
!pip install --upgrade git+https://github.com/rbrtwlz/fastai_object_detection -qq
from fastai_object_detection.all import *

### Image classification example: Dogs vs Cats

In [None]:
"""
AI application sample: collect images from the PETS dataset to train a
ResNet-based dogs vs cats classifier
"""

# Download images, navigate to the folder and display some of the images
image_path = untar_data(URLs.PETS)/'images'
os.chdir(image_path)
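# Keep only the JPEG images (the folder may also contain non-image files)
filenames = [f for f in os.listdir('.') if f.lower().endswith('.jpg')]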

def slider_callback(position):
    image_object = Image(filename=filenames[position], width=600)
    display(image_object)

# The slider's max value is inclusive, so stop at the last valid index
widgets.interact(slider_callback, position=widgets.IntSlider(min=0, max=len(filenames) - 1, step=1))

In [None]:
# In this dataset, the filenames of cat images begin with an uppercase letter
print(filenames[:11])

# Define a function that uses that property to select if a filename is a cat
def is_cat(filename):
    return filename[0].isupper()

# Create a dataloader
data_loader = ImageDataLoaders.from_name_func(
    path=image_path, fnames=get_image_files(image_path), label_func=is_cat, valid_pct=0.2, seed=42,
    item_tfms=Resize(224)
)
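
To see how the filename-based labelling rule behaves, here is a minimal sketch that applies `is_cat` to a few file names. The sample names are illustrative assumptions that follow the dataset's naming pattern (breed name, underscore, index); they are not read from disk.

```python
def is_cat(filename):
    # In the PETS dataset, cat breed names start with an uppercase letter
    return filename[0].isupper()

# Hypothetical file names following the dataset's naming pattern
sample_names = ["Abyssinian_1.jpg", "beagle_32.jpg", "Persian_7.jpg", "pug_105.jpg"]

# Map each file name to its boolean label (True = cat, False = dog)
labels = {name: is_cat(name) for name in sample_names}
for name, label in labels.items():
    print(f"{name:20s} -> {'cat' if label else 'dog'}")
```

Because the label is derived from the file name alone, no separate annotation file is needed for this task.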

In [None]:
# Feed the data to the model and fine-tune it for 1 epoch
"""
Note: more epochs are usually required to achieve a good result,
but given the quality of the dataset and of the pre-trained model, one epoch is enough here.
"""

image_learner = cnn_learner(data_loader, resnet34, metrics=error_rate)
image_learner.fine_tune(1)
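
The `error_rate` metric reported during training is the fraction of validation images that were classified incorrectly (that is, 1 minus the accuracy). The sketch below reproduces that arithmetic in plain Python; the function name `simple_error_rate` and the prediction lists are made up for illustration and are not part of fastai.

```python
def simple_error_rate(predictions, targets):
    """Fraction of predictions that differ from the targets (1 - accuracy)."""
    wrong = sum(p != t for p, t in zip(predictions, targets))
    return wrong / len(targets)

# Hypothetical validation results: True = cat, False = dog
targets     = [True, False, True, True,  False, False, True, False]
predictions = [True, False, True, False, False, False, True, True]

rate = simple_error_rate(predictions, targets)
print(f"error rate: {rate:.3f}, accuracy: {1 - rate:.3f}")
```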

### Object detection example: Cats vs Dogs 
source: https://rbrtwlz.github.io/fastai_object_detection/

The `fastai_object_detection` package comes with a fastai DataLoaders class for object detection, ready-to-use models, and metrics for evaluating the generated bounding boxes (mAP). This lets you train an object detection model in the usual fastai way with one of the included Learner classes.

All you need is a pandas DataFrame containing the data for each object in the images. With the default settings, the following columns are required:

For the image, which contains the object(s):

    image_id
    image_path

The object's bounding box:

    x_min
    y_min
    x_max
    y_max

The object's class/label:

    class_name

If you want to use a model for instance segmentation, the following column is additionally required:

    mask_path (path to the binary mask, which represents the object in the image)

There are helper functions available, for example for adding the image_path by image_id or to change the bbox format from xywh to x1y1x2y2.
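
As a rough illustration of the expected table layout and of the xywh to x1y1x2y2 conversion, the sketch below builds the rows by hand in plain Python. The function `xywh_to_xyxy`, the file paths, and the box values are assumptions made for this example; they are not the package's own helpers.

```python
def xywh_to_xyxy(x, y, w, h):
    """Convert a box given as top-left corner plus width/height
    into corner format (x_min, y_min, x_max, y_max)."""
    return x, y, x + w, y + h

# Hypothetical annotations with boxes in xywh format
annotations = [
    {"image_id": 1, "image_path": "images/000001.jpg", "class_name": "cat", "bbox": (10, 20, 100, 50)},
    {"image_id": 1, "image_path": "images/000001.jpg", "class_name": "dog", "bbox": (150, 40, 80, 120)},
]

# One row per object, using the column names listed above
rows = []
for ann in annotations:
    x_min, y_min, x_max, y_max = xywh_to_xyxy(*ann["bbox"])
    rows.append({
        "image_id": ann["image_id"],
        "image_path": ann["image_path"],
        "x_min": x_min, "y_min": y_min, "x_max": x_max, "y_max": y_max,
        "class_name": ann["class_name"],
    })

# With pandas installed, pandas.DataFrame(rows) turns these rows into a table
for row in rows:
    print(row)
```

Note that an image with several objects appears on several rows, one per bounding box.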

In [None]:
# Get dataset
path, df = CocoData.create(ds_name="coco-cats-and-dogs", cat_list=["cat", "dog"], max_images=2000, with_mask=False)

The Microsoft COCO dataset contains 328,000 annotated images of 91 object categories, so you can pick the categories you want and download just the associated images.

Then you can build the DataLoaders using its from_df factory method:

In [None]:
dls = ObjectDetectionDataLoaders.from_df(df, bs=2, 
                                         item_tfms=[Resize(800, method="pad", pad_mode="zeros")], 
                                         batch_tfms=[Normalize.from_stats(*imagenet_stats)])
dls.show_batch(figsize=(10,10))
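
The `Resize(800, method="pad", pad_mode="zeros")` transform brings every image to 800x800 without distorting it. The sketch below shows only the underlying arithmetic, assuming the image is scaled until its longer side fits the target and the remainder is filled with zeros; it is a conceptual illustration, not fastai's implementation, and `pad_resize_geometry` is a hypothetical name.

```python
def pad_resize_geometry(width, height, target=800):
    """Scale so the longer side equals `target`, then report how much
    padding is needed to reach a target x target square."""
    scale = target / max(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_w, pad_h = target - new_w, target - new_h
    return scale, (new_w, new_h), (pad_w, pad_h)

# Hypothetical 640x480 input image
scale, new_size, padding = pad_resize_geometry(640, 480)
print(f"scale={scale}, resized to {new_size}, total padding (w, h)={padding}")
```

Bounding box coordinates have to follow the same scaling (and padding offset), which is why the transform is applied through the DataLoaders rather than to the images alone.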

Now you are ready to create your fasterrcnn_learner to train a Faster R-CNN model (with a ResNet-50 backbone). To validate your model's predictions, you can use metrics like mAP_at_IoU60.

In [None]:
learn = fasterrcnn_learner(dls, fasterrcnn_resnet50, 
                           opt_func=SGD, lr=0.005, wd=0.0005, train_bn=False,
                           metrics=[mAP_at_IoU40, mAP_at_IoU60])
learn.lr_find()
learn.fit_one_cycle(10, 1e-04)
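
The mAP metrics rely on Intersection over Union (IoU) to decide whether a predicted box matches a ground-truth box. Here is a minimal sketch of the IoU computation, assuming that the names `mAP_at_IoU40` and `mAP_at_IoU60` refer to IoU thresholds of 0.4 and 0.6. The boxes are made-up values in (x_min, y_min, x_max, y_max) format.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes in (x_min, y_min, x_max, y_max) format."""
    # Corners of the overlapping rectangle
    ix_min, iy_min = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix_max, iy_max = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Width/height are clamped at 0 when the boxes do not overlap
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

ground_truth = (0, 0, 100, 100)   # hypothetical annotated box
prediction   = (30, 0, 130, 100)  # hypothetical predicted box

score = iou(ground_truth, prediction)
print(f"IoU = {score:.3f}")
print("match at threshold 0.4:", score >= 0.4)
print("match at threshold 0.6:", score >= 0.6)
```

The same prediction can count as a hit under a loose threshold and as a miss under a strict one, so reporting both metrics gives a better picture of localization quality.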

### Image segmentation example: Localize common objects in images
Creating a model that can recognize the content of every individual pixel in an image is called *segmentation*. Here is how we can train a segmentation model with fastai, using a subset of the [*Camvid* dataset](http://www0.cs.ucl.ac.uk/staff/G.Brostow/papers/Brostow_2009-PRL.pdf) from the paper "Semantic Object Classes in Video: A High-Definition Ground Truth Database" by Gabriel J. Brostow, Julien Fauqueur, and Roberto Cipolla:

In [None]:
path = untar_data(URLs.CAMVID_TINY)
dls = SegmentationDataLoaders.from_label_func(
    path, bs=8, fnames = get_image_files(path/"images"),
    label_func = lambda o: path/'labels'/f'{o.stem}_P{o.suffix}',
    codes = np.loadtxt(path/'codes.txt', dtype=str)
)

learn = unet_learner(dls, resnet34)
learn.fine_tune(8)
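
The `label_func` lambda above tells fastai where to find the mask for each image: same file name with a `_P` suffix, inside the `labels` folder. The sketch below reproduces that mapping with `pathlib` only. The dataset root and the file name are hypothetical placeholders, and `label_path_for` is a name introduced for this example.

```python
from pathlib import Path

def label_path_for(image_file, root):
    """Mirror the notebook's label_func:
    <root>/images/<name>.png -> <root>/labels/<name>_P.png"""
    return root / "labels" / f"{image_file.stem}_P{image_file.suffix}"

root = Path("camvid_tiny")                          # hypothetical dataset root
image_file = root / "images" / "0001TP_006750.png"  # hypothetical image file

mask_file = label_path_for(image_file, root)
print(mask_file)
```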

We can visualize how well it achieved its task, by asking the model to color-code each pixel of an image. As you can see, it nearly perfectly classifies every pixel in every object. For instance, notice that all of the cars are overlaid with the same color and all of the trees are overlaid with the same color (in each pair of images, the lefthand image is the ground truth label and the right is the prediction from the model):

In [None]:
learn.show_results(max_n=6, figsize=(12, 15))

### Another image segmentation example

Code Initialization

In [None]:
# common part
import numpy as np
import os
import tensorflow as tf
import cv2
import matplotlib.pyplot as plt

# to get the pre-trained model
from huggingface_hub import from_pretrained_keras

# to re-build the model from scratch
from glob import glob
from scipy.io import loadmat
from tensorflow import keras
from tensorflow.keras import layers

Download a pre-trained image segmentation model

In [None]:
model = from_pretrained_keras("keras-io/deeplabv3p-resnet50")

In [None]:
model

In [None]:
# summary is a method: call it to print the layer-by-layer description
model.summary()

Define helper functions to preprocess the data, call the model, and display its results

In [None]:
colormap = np.array([[0,0,0], [31,119,180], [44,160,44], [44, 127, 125], [52, 225, 143],
                    [217, 222, 163], [254, 128, 37], [130, 162, 128], [121, 7, 166], [136, 183, 248],
                    [85, 1, 76], [22, 23, 62], [159, 50, 15], [101, 93, 152], [252, 229, 92],
                    [167, 173, 17], [218, 252, 252], [238, 126, 197], [116, 157, 140], [214, 220, 252]], dtype=np.uint8)

img_size = 512
                    
def read_image(image):
    image = tf.convert_to_tensor(image)
    image.set_shape([None, None, 3])
    image = tf.image.resize(images=image, size=[img_size, img_size])
    image = image / 127.5 - 1
    return image

def infer(model, image_tensor):
    predictions = model.predict(np.expand_dims((image_tensor), axis=0))
    predictions = np.squeeze(predictions)
    predictions = np.argmax(predictions, axis=2)
    return predictions

def decode_segmentation_masks(mask, colormap, n_classes):
    r = np.zeros_like(mask).astype(np.uint8)
    g = np.zeros_like(mask).astype(np.uint8)
    b = np.zeros_like(mask).astype(np.uint8)
    for l in range(0, n_classes):
        idx = mask == l
        r[idx] = colormap[l, 0]
        g[idx] = colormap[l, 1]
        b[idx] = colormap[l, 2]
    rgb = np.stack([r, g, b], axis=2)
    return rgb

def get_overlay(image, colored_mask):
    image = tf.keras.preprocessing.image.array_to_img(image)
    image = np.array(image).astype(np.uint8)
    overlay = cv2.addWeighted(image, 0.35, colored_mask, 0.65, 0)
    return overlay

def segmentation(input_image):
    image_tensor = read_image(input_image)
    prediction_mask = infer(image_tensor=image_tensor, model=model)
    prediction_colormap = decode_segmentation_masks(prediction_mask, colormap, 20)
    overlay = get_overlay(image_tensor, prediction_colormap)
    return (overlay, prediction_colormap)

def plot_samples_matplotlib(display_list, figsize=(5, 3)):
    _, axes = plt.subplots(nrows=1, ncols=len(display_list), figsize=figsize)
    for i in range(len(display_list)):
        if display_list[i].shape[-1] == 3:
            axes[i].imshow(tf.keras.preprocessing.image.array_to_img(display_list[i]))
        else:
            axes[i].imshow(display_list[i])
    plt.show()

def plot_predictions(images_list, colormap, model):
    for image_file in images_list:
        image_tensor = read_image(image_file)
        prediction_mask = infer(image_tensor=image_tensor, model=model)
        prediction_colormap = decode_segmentation_masks(prediction_mask, colormap, 20)
        overlay = get_overlay(image_tensor, prediction_colormap)
        plot_samples_matplotlib(
            [image_tensor, overlay, prediction_colormap], figsize=(18, 14)
        )
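
The helper functions above chain four steps: normalize the pixels, take the most likely class per pixel, map each class to a color, and blend the colored mask with the image. The sketch below walks through the same arithmetic on a tiny 2x2 example in plain Python. The class scores and the gray pixel are invented for illustration, and the helpers `normalize`, `argmax` and `blend` are simplified stand-ins, not the TensorFlow, NumPy or OpenCV calls themselves.

```python
def normalize(pixel_value):
    """Map an 8-bit intensity from [0, 255] to [-1, 1], as read_image does."""
    return pixel_value / 127.5 - 1

def argmax(scores):
    """Index of the largest score, i.e. the predicted class for one pixel."""
    return max(range(len(scores)), key=scores.__getitem__)

def blend(image_rgb, mask_rgb, image_weight=0.35, mask_weight=0.65):
    """Per-channel weighted sum, the arithmetic behind the overlay."""
    return tuple(round(image_weight * i + mask_weight * m)
                 for i, m in zip(image_rgb, mask_rgb))

# First three entries of the notebook's colormap (classes 0, 1 and 2)
colormap = [(0, 0, 0), (31, 119, 180), (44, 160, 44)]

# Hypothetical per-pixel class scores for a 2x2 image with 3 classes
scores = [
    [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]],
    [[0.2, 0.3, 0.5],   [0.6, 0.3, 0.1]],
]

# Step 1: class index per pixel (what infer returns)
class_mask = [[argmax(px) for px in row] for row in scores]
# Step 2: color per pixel (what decode_segmentation_masks returns)
color_mask = [[colormap[c] for c in row] for row in class_mask]
# Step 3: overlay of a hypothetical gray pixel with the mask color at (0, 1)
overlay_pixel = blend((200, 200, 200), color_mask[0][1])

print("normalized range:", normalize(0), normalize(255))
print("class mask:", class_mask)
print("overlay pixel:", overlay_pixel)
```

With weights of 0.35 and 0.65, the mask color dominates the overlay while the original image remains visible underneath.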

Prepare input data to test the model

Suggested reading: this tutorial on inserting an image into your Kaggle notebook:
https://www.kaggle.com/code/michaelshoemaker/adding-images-from-your-pc/notebook

Then use the following code to check where your picture was stored:

In [None]:
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

In [None]:
picture_path = "/kaggle/input/picture1/Three_people.jfif"
os.path.isfile(picture_path)

Observe results

In [None]:
img = cv2.cvtColor(cv2.imread(picture_path), cv2.COLOR_BGR2RGB)
img_array = np.array(img)
plt.imshow(img_array)
plt.show()

In [None]:
picture_list = [img_array]

In [None]:
plot_predictions(picture_list, colormap, model)