# WHALE MATCHING USING LoFTR AND KORNIA

What is LoFTR?

LoFTR (Local Feature TRansformer) is a detector-free method for local feature matching with Transformers. It is described in detail in [LoFTR: Detector-Free Local Feature Matching with Transformers](https://arxiv.org/pdf/2104.00680.pdf) [1].

Source [1]:
We present a novel method for local image feature matching. Instead of performing image feature detection, description, and matching sequentially, we propose to first establish pixel-wise dense matches at a coarse level and later refine the good matches at a fine level. In contrast to dense methods that use a cost volume to search correspondences, we use self and cross attention layers in Transformer to obtain feature descriptors that are conditioned on both images. The global receptive field provided by Transformer enables our method to produce dense matches in low-texture areas, where feature detectors usually struggle to produce repeatable interest points. The experiments on indoor and outdoor datasets show that LoFTR outperforms state-of-the-art methods by a large margin. LoFTR also ranks first on two public benchmarks of visual localization among the published methods. Code is available at the project page: https://zju3dv.github.io/loftr/
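
To make the coarse-level matching idea more concrete, here is a minimal NumPy sketch of dual-softmax matching with a mutual-nearest-neighbour check, in the spirit of what the paper [1] describes. It is only a toy illustration and not the LoFTR or Kornia implementation: the helper name `dual_softmax_matches`, the temperature, and the threshold are assumptions, and random unit vectors stand in for the Transformer features.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def dual_softmax_matches(desc0, desc1, temperature=0.1, threshold=0.2):
    """Toy coarse matcher (hypothetical helper, not the Kornia API).

    desc0: (N, D) and desc1: (M, D) L2-normalised descriptors.
    Returns an array of (i, j) index pairs and their confidences.
    """
    scores = desc0 @ desc1.T / temperature  # (N, M) similarity matrix
    conf = softmax(scores, axis=1) * softmax(scores, axis=0)

    # Keep entries that are the maximum of both their row and their column
    # (mutual nearest neighbours) and that exceed the confidence threshold.
    row_best = conf == conf.max(axis=1, keepdims=True)
    col_best = conf == conf.max(axis=0, keepdims=True)
    mask = row_best & col_best & (conf > threshold)

    return np.argwhere(mask), conf[mask]


rng = np.random.default_rng(0)
desc0 = rng.normal(size=(5, 64))
desc0 /= np.linalg.norm(desc0, axis=1, keepdims=True)

# desc1 is a shuffled copy of desc0, so the true matching is the permutation:
# row j of desc1 corresponds to row perm[j] of desc0.
perm = np.array([2, 0, 4, 1, 3])
desc1 = desc0[perm]

pairs, confidence = dual_softmax_matches(desc0, desc1)
print(pairs)
print(confidence.round(3))
```

Each returned pair `(i, j)` links descriptor `i` of the first set to descriptor `j` of the second. In LoFTR these coarse matches are then refined at a finer level, which this sketch does not attempt.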


<div align = "center"><img src="https://i.ibb.co/mygCR9n/LoFTR.png"/></div>


In this notebook I will show you how to use LoFTR through Kornia to match whale features. Kornia is a differentiable library that allows classical computer vision to be integrated into deep learning models.

It consists of a set of routines and differentiable modules to solve generic computer vision problems. At its core, the package uses PyTorch as its main backend both for efficiency and to take advantage of the reverse-mode auto-differentiation to define and compute the gradient of complex functions.

Why Kornia?
Kornia bridges the gap between classical and deep computer vision by implementing standard and advanced vision algorithms for AI:

* Computer Vision: Kornia fills the gap between classical and deep computer vision.
* Differentiable: We leverage the Computer Vision 2.0 paradigm.
* Open Source: Our libraries and initiatives are always aligned with the community's needs.
* PyTorch: At our core we use PyTorch and its Autograd engine for its efficiency.

<div align ="center"><img src="https://github.com/kornia/data/raw/main/kornia_banner_pixie.png"/></div>

Update (2022.03.06)

**Today I did a SURF and LoFTR comparison. Two observations:**
* SURF and SIFT are not in the Kaggle distribution - the algorithms are patented and excluded from this configuration
* In my opinion SURF works worse than LoFTR on this dataset - it was difficult to find a match above 75% even on very similar fins (belonging to the same individual)

See the comparison:
<div align = "center"> <img src="https://i.ibb.co/smrJdDf/loftr01.jpg"/></div>
<div align = "center"> <img src="https://i.ibb.co/VQ7qhGr/surf01.jpg"/></div>

In [None]:
%%capture 

!pip install git+https://github.com/kornia/kornia
!pip install kornia_moons

In [None]:
import cv2
import kornia as K
import kornia.feature as KF
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch

from kornia_moons.feature import *
from PIL import Image

For this demo I will use cropped images from the dataset provided by @  [cropped&resized(512x512) dataset using detic](https://www.kaggle.com/c/happy-whale-and-dolphin/discussion/305503). Thank you for contributing to this competition.

In [None]:
im_path =  '../input/whale2-cropped-dataset/cropped_train_images/cropped_train_images/' 
wh_dataset = pd.read_csv("../input/happy-whale-and-dolphin/train.csv")
wh_dataset.species.value_counts().head(8)

In [None]:
# Pick one species (here: spinner dolphin) and list its most frequently photographed individuals
wh_dataset.query("species=='spinner_dolphin'").individual_id.value_counts().head(5)

In [None]:
def plot_whales(im_path, images=None):
    # Draw up to nine images in a 3x3 grid; falls back to the global img_to_draw list
    images = img_to_draw if images is None else images
    fig, axes = plt.subplots(3, 3, figsize=(20, 20))

    for ax, img in zip(axes.ravel(), images):
        ax.imshow(Image.open(im_path + img))
        ax.set_title(img)

    plt.subplots_adjust(wspace=0, hspace=.2)
    plt.show()

In [None]:
img_to_draw = [file for file in wh_dataset.query("individual_id == '9ab8c57f10bc'").sample(9).image]

plot_whales(im_path)

In [None]:
# This is demo only - I removed background only for whale ID = 9ab8c57f10bc and put images in separate dataset. 
# There is no model and solution provided yet - still working on improving background separation.

plot_whales('../input/whalebackground/')

In [None]:
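def load_torch_image(fname):
    img = cv2.imread(fname)  # BGR, HxWxC, uint8; None if the file cannot be read
    if img is None:
        raise FileNotFoundError(f'Could not read image: {fname}')
    # 1xCxHxW float tensor in [0, 1], then BGR -> RGB
    img = K.image_to_tensor(img, False).float() / 255.
    img = K.color.bgr_to_rgb(img)
    return img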
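
For readers unfamiliar with the tensor layout, the sketch below reproduces in plain NumPy what `load_torch_image` does to the pixel array: scaling to [0, 1], reordering BGR to RGB, and moving from height x width x channels to batch x channels x height x width. The function name `bgr_hwc_to_rgb_bchw` is hypothetical and the 2x2 image is synthetic; this only illustrates the layout change and is not a replacement for the Kornia calls.

```python
import numpy as np


def bgr_hwc_to_rgb_bchw(img_bgr):
    """Convert an OpenCV-style (H, W, 3) uint8 BGR image into a
    (1, 3, H, W) float32 RGB array scaled to [0, 1]."""
    img = img_bgr.astype(np.float32) / 255.0   # scale to [0, 1]
    img = img[..., ::-1]                       # BGR -> RGB (reverse channels)
    img = np.transpose(img, (2, 0, 1))         # HWC -> CHW
    return np.ascontiguousarray(img[None])     # add the batch dimension


# Synthetic 2x2 image that is pure blue in BGR channel order
blue_bgr = np.zeros((2, 2, 3), dtype=np.uint8)
blue_bgr[..., 0] = 255

out = bgr_hwc_to_rgb_bchw(blue_bgr)
print(out.shape, out.dtype)
print(out[0, :, 0, 0])  # channel values of the top-left pixel, in RGB order
```

After the conversion the blue intensity sits in the last channel, which is why the `bgr_to_rgb` step matters before displaying images with matplotlib.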

In this experiment we use the state-of-the-art (SoTA) feature matcher LoFTR; however, Kornia offers a wide range of feature extractors, and they can utilize the GPU through PyTorch: https://kornia.readthedocs.io/en/latest/feature.html

In [None]:
def match_and_draw(im_path, img_in1, img_in2):
    img1 = load_torch_image(im_path + img_in1)
    img2 = load_torch_image(im_path + img_in2)
    matcher = KF.LoFTR(pretrained='outdoor')
    
    input_dict = {"image0": K.color.rgb_to_grayscale(img1), 
                  "image1": K.color.rgb_to_grayscale(img2)}
    
    with torch.no_grad():
        correspondences = matcher(input_dict)
    
    mkpts0 = correspondences['keypoints0'].cpu().numpy()
    mkpts1 = correspondences['keypoints1'].cpu().numpy()
    # Estimate the fundamental matrix with MAGSAC++ and keep the inlier mask
    F, inliers = cv2.findFundamentalMat(mkpts0, mkpts1, cv2.USAC_MAGSAC, 0.5, 0.999, 100000)
    inliers = inliers > 0
    
    draw_LAF_matches(
        KF.laf_from_center_scale_ori(torch.from_numpy(mkpts0).view(1, -1, 2),
                                     torch.ones(mkpts0.shape[0]).view(1, -1, 1, 1),
                                     torch.ones(mkpts0.shape[0]).view(1, -1, 1)),
        KF.laf_from_center_scale_ori(torch.from_numpy(mkpts1).view(1, -1, 2),
                                     torch.ones(mkpts1.shape[0]).view(1, -1, 1, 1),
                                     torch.ones(mkpts1.shape[0]).view(1, -1, 1)),
        torch.arange(mkpts0.shape[0]).view(-1, 1).repeat(1, 2),
        K.tensor_to_image(img1),
        K.tensor_to_image(img2),
        inliers,
        draw_dict={'inlier_color': (0.2, 1, 0.2),
                   'tentative_color': None,
                   'feature_color': (0.2, 0.5, 1),
                   'vertical': False})
    return correspondences
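
One possible way to turn the MAGSAC inlier mask into a single number per image pair is the inlier count or the inlier ratio. The sketch below assumes that reading. `match_score` is a hypothetical helper that is not part of this notebook, and the mask is a made-up example in the (N, 1) shape that `cv2.findFundamentalMat` returns.

```python
import numpy as np


def match_score(inlier_mask):
    """Return (number of inliers, inlier ratio) for a RANSAC-style mask.

    inlier_mask: array of shape (N, 1) or (N,), or None when the
    geometric estimation failed.
    """
    if inlier_mask is None or len(inlier_mask) == 0:
        return 0, 0.0
    mask = np.asarray(inlier_mask).ravel() > 0
    return int(mask.sum()), float(mask.mean())


# Made-up mask: 3 of 4 tentative matches survive the geometric check
mask = np.array([[1], [0], [1], [1]], dtype=np.uint8)
n_inliers, ratio = match_score(mask)
print(n_inliers, ratio)
```

To use something like this with `match_and_draw`, the function would also need to return its `inliers` mask, which it currently does not.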

In [None]:
img_to_draw = [file for file in wh_dataset.query("individual_id == '9ab8c57f10bc'").image]
random_samples = np.random.randint(len(img_to_draw), size=(2, 4))

for i in range(random_samples.shape[1]):
    whale_1 = img_to_draw[random_samples[0][i]]
    whale_2 = img_to_draw[random_samples[1][i]]
    print(f'Matching: {whale_1} to {whale_2}')
    correspondences = match_and_draw(im_path, whale_1, whale_2)
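
Note that `np.random.randint` draws the two index rows independently, so an image can end up paired with itself. If that is unwanted, a sketch like the following samples distinct pairs instead. `sample_pairs` and the file names are hypothetical, and enumerating all combinations is only sensible for small image lists such as one individual's photos.

```python
import itertools
import random


def sample_pairs(items, n_pairs, seed=None):
    """Sample up to n_pairs unordered pairs of distinct items, without repeats."""
    all_pairs = list(itertools.combinations(items, 2))
    rng = random.Random(seed)
    return rng.sample(all_pairs, min(n_pairs, len(all_pairs)))


# Hypothetical file names standing in for img_to_draw
images = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
pairs = sample_pairs(images, n_pairs=4, seed=0)
for whale_1, whale_2 in pairs:
    print(f"Matching: {whale_1} to {whale_2}")
```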

In [None]:
# Let's look at the structure of correspondences

for k, v in correspondences.items():
    print(k)

In [None]:
# Keypoint coordinates for the last prediction - only to show the structure
print("Coordinates of each matched feature - X and Y")
print(correspondences['keypoints0'].cpu().numpy().T)

In [None]:
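# Keypoint confidence
# In the plots above, blue marks every matched keypoint and green marks matches kept as MAGSAC inliers;
# the values printed here are LoFTR's own per-match confidence scores.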
print("Scores for each feature:")
print(correspondences['confidence'].cpu().numpy())
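
If you want to keep only confident matches before the geometric check, thresholding the `confidence` array is one simple option. The sketch below uses mock arrays with the same keys as the LoFTR output. The helper name `filter_by_confidence` and the 0.5 threshold are assumptions and not values used in this notebook.

```python
import numpy as np


def filter_by_confidence(keypoints0, keypoints1, confidence, threshold=0.5):
    """Keep only matches whose confidence exceeds the threshold."""
    keep = confidence > threshold
    return keypoints0[keep], keypoints1[keep], confidence[keep]


# Mock data mimicking the arrays obtained via .cpu().numpy()
mock = {
    "keypoints0": np.array([[10.0, 12.0], [40.0, 8.0], [25.0, 30.0]]),
    "keypoints1": np.array([[11.0, 13.0], [90.0, 70.0], [26.0, 29.0]]),
    "confidence": np.array([0.91, 0.12, 0.67]),
}

kp0, kp1, conf = filter_by_confidence(
    mock["keypoints0"], mock["keypoints1"], mock["confidence"]
)
print(kp0.shape, conf)
```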

This is experimental code - a demo only, to show you one way of dealing with the problem. I hope you like it.
I will certainly investigate this approach further. It looks promising.

In [None]:
# This is demo only - I removed background only for whale ID = 9ab8c57f10bc and put images in separate dataset. 
# There is no model and solution provided yet - still working on improving background separation.

for i in range(random_samples.shape[1]):
    whale_1 = img_to_draw[random_samples[0][i]]
    whale_2 = img_to_draw[random_samples[1][i]]
    print(f'Matching: {whale_1} to {whale_2}')
    match_and_draw('../input/whalebackground/', whale_1, whale_2)