# Human-like Visual Search Application with small data
<img src="assets/App.jpg" width="1600">

# Workshop materials

[GitHub repository](https://github.com/EzheZhezhe/Small_data_visual_search_app)

"**README.MD**" and  "**environment.yml**"

# Workshop structure

0. Introduction
1. Building blocks of Siamese Mask R-CNN
2. Siamese Mask R-CNN single deployment with FastAPI
3. Known limitations and possible improvements
4. Conclusions and Next steps

# 0. Introduction

## About me
**Alyona Galyeva**: [Principal Data Solutions Engineer at LINKIT](https://www.linkit.nl/en) and [Organiser at PyLadies Amsterdam](https://amsterdam.pyladies.com/)

Former Machine Learning Engineer

<img src="assets/PyLadies1.jpg" width="800">  <img src="assets/PyLadies2.jpg" width="800">  <img src="assets/PyLadies3.jpg" width="800">  <img src="assets/PyLadies4.png" width="800">

Feel free to contact me via LinkedIn: https://www.linkedin.com/in/alyonagalyeva/


Before diving into the depths of Siamese Mask R-CNN, let's briefly recap the common Computer Vision tasks.

<img src="assets/ComputerVisionTasks.png" width="1000">

So, what is a Siamese neural network then?

<img src="assets/siamese.png" width="1000">
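
As a rough illustration of the idea in the figure: a Siamese network passes two inputs through the same encoder (shared weights) and compares the resulting embeddings. The sketch below uses a random linear map as a stand-in for a trained encoder and an L1 distance for the comparison. The names `embed` and `l1_distance` are hypothetical and are not part of this repository.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a trained encoder: ONE weight matrix shared by both inputs.
W = rng.normal(size=(8, 4))

def embed(x):
    """Map an 8-dim input to a 4-dim embedding using the shared weights."""
    return np.tanh(x @ W)

def l1_distance(a, b):
    """Compare two inputs by the L1 distance between their embeddings."""
    return float(np.abs(embed(a) - embed(b)).sum())

x = rng.normal(size=8)
print(l1_distance(x, x))  # identical inputs give 0.0
```

Because both inputs go through the same weights, identical inputs always land on the same embedding; training then shapes the weights so that similar inputs end up close and dissimilar ones far apart.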




Wow, and what about Meta-learning?

<img src="assets/Meta.jpg" width="1000">

# 1. Building blocks of Siamese Mask R-CNN



In [None]:
# Inevitable step: load the notebook extensions and import all required libraries

%load_ext autoreload
%autoreload 2
%matplotlib inline
%load_ext dotenv
%dotenv

import os
import random
import sys

import imgaug
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
tf.logging.set_verbosity(tf.logging.ERROR)
sess_config = tf.ConfigProto()

COCO_DATA = os.getenv("COCO_DATA")
MASK_RCNN_MODEL_PATH = os.getenv("MASK_RCNN_MODEL_PATH")
MODEL_DIR = os.getenv("MODEL_DIR")

if MASK_RCNN_MODEL_PATH not in sys.path:
    sys.path.append(MASK_RCNN_MODEL_PATH)
    
from samples.coco import coco

from mrcnn import utils
from mrcnn import model as modellib
from mrcnn import visualize
    
from lib import utils as siamese_utils
from lib import model as siamese_model
from lib import config as siamese_config

### MSCOCO Dataset

**What is MSCOCO?**

COCO is a large-scale object detection, segmentation, and captioning dataset. COCO has several features:

* Object segmentation
* Recognition in context
* Superpixel stuff segmentation
* 330K images (>200K labeled)
* 1.5 million object instances
* 80 object categories
* 91 stuff categories
* 5 captions per image
* 250,000 people with keypoints

In [None]:
# define categories that belong to one_shot_classes:
one_shot_classes = np.array([4*i + 1 for i in range(20)])

<img src="assets/CategoriesSplit.png" width="1400">
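
The list comprehension in the cell above picks every fourth class id, starting at 1. A small sketch of what it produces, together with the complementary set of classes. It assumes class ids run from 1 to 80 (the 80 object categories, with 0 reserved for background), and `train_classes` is a name introduced here for illustration only.

```python
import numpy as np

one_shot_classes = np.array([4 * i + 1 for i in range(20)])

# Assumption: class ids are 1..80; everything not in one_shot_classes
# forms the complementary set.
all_classes = np.arange(1, 81)
train_classes = np.setdiff1d(all_classes, one_shot_classes)

print(one_shot_classes[:5].tolist())               # [1, 5, 9, 13, 17]
print(len(one_shot_classes), len(train_classes))   # 20 60
```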

In [None]:
# Index COCO/val dataset
coco_val = siamese_utils.IndexedCocoDataset()
coco_object = coco_val.load_coco(COCO_DATA, subset="val", year="2017", return_coco=True)
coco_val.prepare()
coco_val.build_indices()
coco_val.ACTIVE_CLASSES = one_shot_classes
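
The internals of `build_indices()` are in `lib/utils.py` and are not shown here. As a hedged sketch of the general idea, the snippet below builds a category-to-image lookup from COCO-style annotation records (each with an `image_id` and a `category_id`). The function `build_category_image_index` and the toy records are hypothetical, not the repository's implementation.

```python
from collections import defaultdict

def build_category_image_index(annotations):
    """Map each category id to the sorted list of image ids that contain it."""
    index = defaultdict(set)
    for ann in annotations:
        index[ann["category_id"]].add(ann["image_id"])
    return {cat: sorted(imgs) for cat, imgs in index.items()}

# Toy COCO-style annotation records (made up for illustration).
toy_annotations = [
    {"image_id": 10, "category_id": 1},
    {"image_id": 11, "category_id": 1},
    {"image_id": 11, "category_id": 5},
    {"image_id": 10, "category_id": 1},  # second instance in the same image
]
toy_index = build_category_image_index(toy_annotations)
print(toy_index)  # {1: [10, 11], 5: [11]}
```

With such a lookup, picking a random image that contains a given category reduces to a single `np.random.choice` over one list.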

### Model

<img src="assets/siamese-mask-rcnn.png" width="1000">

In [None]:
class SmallEvalConfig(siamese_config.Config):
    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1
    NUM_CLASSES = 1 + 1
    NAME = 'coco'
    EXPERIMENT = 'evaluation'
    CHECKPOINT_DIR = 'checkpoints/'
    NUM_TARGETS = 1

class LargeEvalConfig(siamese_config.Config):
    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1
    NUM_CLASSES = 1 + 1
    NAME = 'coco'
    EXPERIMENT = 'evaluation'
    CHECKPOINT_DIR = 'checkpoints/'
    NUM_TARGETS = 1
    
    # Large image sizes
    TARGET_MAX_DIM = 192
    TARGET_MIN_DIM = 150
    IMAGE_MIN_DIM = 800
    IMAGE_MAX_DIM = 1024
    # Large model size
    FPN_CLASSIF_FC_LAYERS_SIZE = 1024
    FPN_FEATUREMAPS = 256
    # Large number of rois at all stages
    RPN_ANCHOR_STRIDE = 1
    RPN_TRAIN_ANCHORS_PER_IMAGE = 256
    POST_NMS_ROIS_TRAINING = 2000
    POST_NMS_ROIS_INFERENCE = 1000
    TRAIN_ROIS_PER_IMAGE = 200
    DETECTION_MAX_INSTANCES = 100
    MAX_GT_INSTANCES = 100
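
Both classes override only the settings they need and inherit the rest from the base config. A minimal sketch of that pattern, including a batch size derived from `GPU_COUNT * IMAGES_PER_GPU` as the comments describe. `BaseConfig`, `EvalConfig` and `to_dict` are illustrative stand-ins, not the actual `siamese_config.Config`.

```python
class BaseConfig:
    """Illustrative stand-in for a config base class."""
    GPU_COUNT = 1
    IMAGES_PER_GPU = 2
    NUM_TARGETS = 1

    def __init__(self):
        # Derived value, computed after subclass overrides are in place.
        self.BATCH_SIZE = self.GPU_COUNT * self.IMAGES_PER_GPU

    def to_dict(self):
        """Collect all upper-case settings, including inherited ones."""
        return {k: getattr(self, k) for k in dir(self) if k.isupper()}

class EvalConfig(BaseConfig):
    IMAGES_PER_GPU = 1  # one image at a time for inference

sketch_config = EvalConfig()
print(sketch_config.BATCH_SIZE)  # 1
print(sketch_config.to_dict())
```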

In [None]:
# Let's pick a model size

# model_size = 'large'
model_size = 'small'

In [None]:
# Let's create a config based on the chosen model size
if model_size == 'small':
    config = SmallEvalConfig()
elif model_size == 'large':
    config = LargeEvalConfig()
    
config.display()

In [None]:
# Select the checkpoint that matches the chosen model size
if model_size == 'small':
    checkpoint = 'checkpoints/small_siamese_mrcnn_0160.h5'
elif model_size == 'large':
    checkpoint = 'checkpoints/large_siamese_mrcnn_0320.h5'
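
If `model_size` is set to anything other than `'small'` or `'large'`, the `if`/`elif` branches in the cells above leave `config` and `checkpoint` undefined, and the error only surfaces later. One possible alternative, sketched here, is a lookup table that fails early. `select_checkpoint` is a hypothetical helper; the paths are the ones used in this notebook.

```python
CHECKPOINTS = {
    "small": "checkpoints/small_siamese_mrcnn_0160.h5",
    "large": "checkpoints/large_siamese_mrcnn_0320.h5",
}

def select_checkpoint(model_size):
    """Return the checkpoint path for a model size, or raise on unknown sizes."""
    try:
        return CHECKPOINTS[model_size]
    except KeyError:
        raise ValueError(
            f"Unknown model_size {model_size!r}; expected one of {sorted(CHECKPOINTS)}"
        ) from None

print(select_checkpoint("small"))  # checkpoints/small_siamese_mrcnn_0160.h5
```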

In [None]:
# Let's see what is under the hood

## Home assignment

Read the [original article](https://arxiv.org/pdf/1811.11507.pdf) and the source code in the `/lib` folder, then think about what could be improved or optimized for better performance.

# 2. Siamese Mask R-CNN single deployment


### Visualization

In [None]:
# Create model object in inference mode.
model = siamese_model.SiameseMaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)
model.load_checkpoint(checkpoint)

In [None]:
# Select category
category = 15
image_id = np.random.choice(coco_val.category_image_index[category])
# Load target
target = siamese_utils.get_one_target(category, coco_val, config)
# Load image
image = coco_val.load_image(image_id)

# Run detection
results = model.detect([[target]], [image], verbose=0)
r = results[0]
# Display results
siamese_utils.display_results(target, image, r['rois'], r['masks'], r['class_ids'], r['scores'])
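
`model.detect` returns one dictionary per image; the cell above reads its `rois`, `masks`, `class_ids` and `scores` entries. The sketch below filters such a dictionary by confidence score. It assumes the Matterport Mask R-CNN layout (`rois` as `[N, 4]`, `masks` as `[H, W, N]`) and uses made-up arrays in place of real detections; `filter_detections` is a hypothetical helper.

```python
import numpy as np

def filter_detections(result, min_score=0.5):
    """Keep only detections whose score is at least min_score."""
    keep = result["scores"] >= min_score
    return {
        "rois": result["rois"][keep],
        "masks": result["masks"][:, :, keep],  # instances are on the last axis
        "class_ids": result["class_ids"][keep],
        "scores": result["scores"][keep],
    }

# Made-up result with three detections on a 4x4 image.
fake_result = {
    "rois": np.array([[0, 0, 2, 2], [1, 1, 3, 3], [0, 1, 4, 4]]),
    "masks": np.zeros((4, 4, 3), dtype=bool),
    "class_ids": np.array([1, 1, 1]),
    "scores": np.array([0.9, 0.3, 0.7]),
}
filtered = filter_detections(fake_result, min_score=0.5)
print(filtered["scores"].tolist())  # [0.9, 0.7]
print(filtered["masks"].shape)      # (4, 4, 2)
```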

## Home assignment

lorem ipsum

# 3. Known limitations and possible improvements

**Known limitations:**
- easy to implement
- can keep the model structure
- can learn to ignore many new types of noise
- can combine samples from different 

**Possible improvements:**
- very different types may still succeed
- the amount of training data needed is unclear
- when overdone, may hurt 
- is hard to measure

## Home assignment

1. lorem ipsum
2. lorem ipsum

*Bonus:* on step 2 lorem ipsum

*Future tip:* 

# 4. Conclusions and Next steps

- lorem ipsum
- lorem ipsum

# Thank you for your attention!

