# Feature Matching

First, let's load the required modules.

In [1]:
from icepy4d.utils import initialization
from icepy4d.classes import Image
from icepy4d.matching import SuperGlueMatcher, LOFTRMatcher, Quality, TileSelection, GeometricVerification

Jupyter environment detected. Enabling Open3D WebVisualizer.
[Open3D INFO] WebRTC GUI backend enabled.
[Open3D INFO] WebRTCWindowSystem: HTTP handshake server disabled.


Although this step is not mandatory, it is recommended to set up a logger to see the output of the matching process. If no logger is set up, the output of the process is suppressed.
The logger can be set up as follows:

In [2]:
initialization.setup_logger()

The first step is to load the images as numpy arrays.  
We will use the Image class implemented in ICEpy4D, which allows for creating an Image instance by passing the path to the image file as `Image('path_to_image')`.  
Creating the Image instance reads the EXIF data of the image and stores it in the Image object. The actual image values are read only when the `Image.value` property is accessed.
Alternatively, one can use the OpenCV `imread` function to read the image as a numpy array (pay attention to the channel order: it should be RGB, while OpenCV uses BGR).
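As a minimal sketch of the channel-order caveat, the snippet below reverses the last axis of a BGR array to obtain RGB. The helper name `bgr_to_rgb` and the synthetic test image are assumptions made for illustration; with OpenCV installed, the same result can be obtained with `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)`.

```python
import numpy as np


def bgr_to_rgb(image_bgr: np.ndarray) -> np.ndarray:
    """Return a copy of an (H, W, 3) image with the channel order reversed."""
    return np.ascontiguousarray(image_bgr[..., ::-1])


# Synthetic 2x3 image that is pure blue in BGR order (hypothetical data).
# With OpenCV, the array would instead come from cv2.imread("path_to_image").
fake_bgr = np.zeros((2, 3, 3), dtype=np.uint8)
fake_bgr[..., 0] = 255

rgb = bgr_to_rgb(fake_bgr)
print(rgb[0, 0])  # blue is now in the last channel
```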

In [3]:
image0 = Image('../data/img/p1/IMG_2637.jpg').value
image1 = Image('../data/img/p2/IMG_1112.jpg').value

print(type(image0))
print(image0.shape)
print(image1.shape)

<class 'numpy.ndarray'>
(4008, 6012, 3)
(4008, 6012, 3)


## SuperGlue matching

For running the matching with SuperGlue, a new SuperGlueMatcher object must be initialized with the parameters for SuperGlue matching (see the documentation of the class for more details). The parameters are given as a dictionary.

The configuration dictionary may contain the following keys:
- "weights": defines the type of the weights used for SuperGlue inference. It can be either "indoor" or "outdoor". The default value is "outdoor".
- "keypoint_threshold": threshold for the SuperPoint keypoint detector. The default value is 0.001.
- "max_keypoints": maximum number of keypoints to be detected by SuperPoint. If -1, no limit to keypoint detection is set. The default value is -1.
- "match_threshold": threshold for the SuperGlue feature matcher. Default value is 0.3.
- "force_cpu": if True, SuperGlue will run on CPU. Default value is False.
- "nms_radius": radius for non-maximum suppression. Default value is 3.
- "sinkhorn_iterations": number of iterations for the Sinkhorn algorithm. Default value is 20.

If the configuration dictionary is not given, the default values are used.
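The fallback to default values can be pictured as a dictionary merge. The sketch below is not ICEpy4D's internal code: `DEFAULT_CFG` and `resolve_config` are hypothetical names, and the defaults are simply the ones listed above.

```python
# Default values as documented in the list above (hypothetical container).
DEFAULT_CFG = {
    "weights": "outdoor",
    "keypoint_threshold": 0.001,
    "max_keypoints": -1,
    "match_threshold": 0.3,
    "force_cpu": False,
    "nms_radius": 3,
    "sinkhorn_iterations": 20,
}


def resolve_config(user_cfg=None):
    """Merge user-supplied keys over the defaults, rejecting unknown keys."""
    cfg = dict(DEFAULT_CFG)
    if user_cfg:
        unknown = set(user_cfg) - set(DEFAULT_CFG)
        if unknown:
            raise KeyError(f"Unknown configuration keys: {sorted(unknown)}")
        cfg.update(user_cfg)
    return cfg


cfg = resolve_config({"keypoint_threshold": 0.0001, "max_keypoints": 8192})
print(cfg)
```

Keys that are not overridden, such as "nms_radius", keep their default value.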

When running the matching, additional parameters can be given as arguments to the `match` method to define the matching behavior. The parameters are the following:
- image0: the first image to be matched.
- image1: the second image to be matched.
- quality: defines the resize factor for the input images. Possible values are "highest", "high", "medium" and "low". With "high", images are matched at full resolution. With "highest", images are up-sampled by a factor of 2. With "medium" and "low", images are down-sampled by a factor of 2 and 4, respectively. The default value is "high".
- tile_selection: tile selection approach. Possible values are `TileSelection.NONE`, `TileSelection.EXHAUSTIVE`, `TileSelection.GRID` or `TileSelection.PRESELECTION`. Refer to the "Tile Selection" section below for more information. The default value is `TileSelection.PRESELECTION`.
- grid: if tile_selection is not `TileSelection.NONE`, this parameter defines the grid size.
- overlap: if tile_selection is not `TileSelection.NONE`, this parameter defines the overlap between tiles.
- do_viz_matches: if True, the matches are visualized. Default value is False.
- do_viz_tiles: if True, the tiles are visualized. Default value is False.
- save_dir: if not None, the matches are saved in the given directory. Default value is None.
- geometric_verification: defines the geometric verification approach.
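To make the effect of the quality setting concrete, the sketch below computes the image size that results from each resize factor described above, using the 4008 x 6012 px images loaded earlier. The mapping `QUALITY_SCALE` and the helper `resized_shape` are assumptions for illustration and are not part of the ICEpy4D API.

```python
# Resize factors as described in the parameter list (assumed mapping).
QUALITY_SCALE = {"highest": 2.0, "high": 1.0, "medium": 0.5, "low": 0.25}


def resized_shape(height, width, quality):
    """Return the (height, width) of an image after applying the resize factor."""
    scale = QUALITY_SCALE[quality]
    return round(height * scale), round(width * scale)


shapes = {q: resized_shape(4008, 6012, q) for q in QUALITY_SCALE}
for quality, shape in shapes.items():
    print(f"{quality:>8}: {shape}")
```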


#### Tile Selection

To guarantee the highest collimation accuracy, by default the matching is performed on full resolution images.
However, due to limited memory capacity in mid-class GPUs, high-resolution images captured by DSLR cameras may not fit into GPU memory. To overcome this limitation, ICEpy4D divides the images into smaller regular tiles with maximum dimension of 2000 px, computed over a regular grid.
The tile selection can be performed in four different ways:

1. `TileSelection.NONE`  
   Images are matched as a whole in just one step. No tiling is performed.
2. `TileSelection.EXHAUSTIVE`  
   All the tiles in the first image are matched with all the tiles in the second image. This approach is very computationally demanding, as the tile pairs are all the possible combinations of tiles from the two images, and the total number of pairs rises quickly with the number of tiles. Additionally, several spurious matches may be found in tiles that do not overlap in the two images.
3. `TileSelection.GRID`  
   Tile pairs are selected only based on the position of each tile in the grid, i.e., tile 1 in imageA is matched with tile 1 in imageB, tile 2 in imageA is matched with tile 2 in imageB, and so on. This approach is less computationally demanding than the exhaustive one, but it is suitable only for images that are well aligned along a stripe with regular viewing geometry.
4. `TileSelection.PRESELECTION`  
   This is the only actual 'preselection' of the tiles, as the process is carried out in two steps.
   First, a matching is performed on downsampled images. Subsequently, the full-resolution images are subdivided into regular tiles, and only the tiles that have corresponding features in the low-resolution images are selected as candidates for a second matching step.
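The difference in cost between the exhaustive and the grid approach can be quantified by counting the tile pairs each one generates. The sketch below does this for a grid of 4 columns and 3 rows; the function names are hypothetical and the tile indexing is only an assumption.

```python
from itertools import product


def exhaustive_pairs(n_tiles):
    """Every tile of the first image paired with every tile of the second."""
    return list(product(range(n_tiles), repeat=2))


def grid_pairs(n_tiles):
    """Only tiles that share the same position in the grid are paired."""
    return [(i, i) for i in range(n_tiles)]


cols, rows = 4, 3
n_tiles = cols * rows

n_exhaustive = len(exhaustive_pairs(n_tiles))
n_grid = len(grid_pairs(n_tiles))
print(f"Exhaustive: {n_exhaustive} pairs, grid: {n_grid} pairs")
```

The number of exhaustive pairs grows with the square of the number of tiles, which is why preselection is attractive for large images.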

When a tile pre-selection approach is chosen, the tile grid must be defined by the `grid` argument. This is a list of integers that defines the number of tiles along the x and y direction (i.e., number of columns and number of rows). For example, `grid=[3,2]` defines a grid with 3 columns and 2 rows.
Additionally, the overlap between different tiles can be defined by the `overlap` argument. This is an integer that defines the number of pixels of overlap between adjacent tiles. For example, `overlap=200` defines an overlap of 200 pixels between adjacent tiles. The overlap helps to avoid missing matches at the tile boundaries.
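A minimal sketch of how a grid and an overlap translate into tile limits is given below. It assumes that each tile is extended by half of the overlap on every interior side, so that adjacent tiles share exactly `overlap` pixels; ICEpy4D may compute the limits differently, and `tile_bounds` is a hypothetical helper.

```python
def tile_bounds(width, height, grid, overlap):
    """Return (x0, y0, x1, y1) limits for each tile, in row-major order."""
    cols, rows = grid
    half = overlap // 2
    tiles = []
    for r in range(rows):
        for c in range(cols):
            # Extend each tile by half the overlap, clipped to the image extent.
            x0 = max(c * width // cols - half, 0)
            x1 = min((c + 1) * width // cols + half, width)
            y0 = max(r * height // rows - half, 0)
            y1 = min((r + 1) * height // rows + half, height)
            tiles.append((x0, y0, x1, y1))
    return tiles


# 6012 x 4008 px image, 3 columns and 2 rows, 200 px overlap
tiles = tile_bounds(6012, 4008, grid=[3, 2], overlap=200)
for i, tile in enumerate(tiles):
    print(i, tile)
```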

The following figure shows the tile preselection process. The tiles selected for the second matching step are highlighted in green.

![title](notebook_figs/tile_preselection.png)
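The second step of the preselection can be sketched as follows: each low-resolution match, rescaled to full-resolution coordinates, is assigned to a tile in each image, and the unique pairs of tile indices become the candidates for full-resolution matching. The point coordinates, the row-major tile numbering and the helper names are assumptions for illustration, and the overlap is ignored for simplicity.

```python
def tile_index(x, y, width, height, grid):
    """Row-major index of the tile containing point (x, y)."""
    cols, rows = grid
    col = min(int(x / (width / cols)), cols - 1)
    row = min(int(y / (height / rows)), rows - 1)
    return row * cols + col


def preselect_pairs(pts0, pts1, width, height, grid):
    """Unique (tile in image0, tile in image1) pairs that share a match."""
    pairs = {
        (tile_index(x0, y0, width, height, grid),
         tile_index(x1, y1, width, height, grid))
        for (x0, y0), (x1, y1) in zip(pts0, pts1)
    }
    return sorted(pairs)


# Hypothetical matches from the low-resolution step, in full-resolution pixels
pts0 = [(100, 1400), (120, 1500), (3100, 2800)]
pts1 = [(5300, 100), (5250, 250), (1600, 2700)]

pairs = preselect_pairs(pts0, pts1, width=6012, height=4008, grid=[4, 3])
print(pairs)
```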

#### Geometric Verification
Geometric verification of the matches is performed with Pydegensac (Mishkin et al., 2015), which robustly estimates the fundamental matrix.
The maximum re-projection error to accept a match is set to 1.5 px by default, but it can be changed by the user. 
The successfully matched features, together with their descriptors and scores, are saved as a Features object for each camera and stored into the current Epoch object.
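To illustrate what the pixel threshold means, the sketch below takes a fundamental matrix as given and keeps only the matches whose distance from the corresponding epipolar line is within the threshold. This is not the Pydegensac estimation itself, which also has to find the matrix robustly, and the error measure it uses may differ; the function name, the matrix and the points are assumptions chosen so that the result is easy to check by hand.

```python
import numpy as np


def epipolar_inliers(F, pts0, pts1, threshold=1.5):
    """Boolean mask of matches within `threshold` px of their epipolar line."""
    ones = np.ones((len(pts0), 1))
    x0 = np.hstack([pts0, ones])  # homogeneous points in image0
    x1 = np.hstack([pts1, ones])  # homogeneous points in image1
    lines1 = x0 @ F.T             # epipolar lines in image1 (l' = F x0)
    num = np.abs(np.sum(lines1 * x1, axis=1))
    den = np.hypot(lines1[:, 0], lines1[:, 1])
    return num / den <= threshold


# Fundamental matrix of a pure horizontal translation: matching points must
# lie on the same image row (hypothetical, chosen for easy verification).
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
pts0 = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])
pts1 = np.array([[15.0, 20.5], [35.0, 41.0], [55.0, 70.0]])

mask = epipolar_inliers(F, pts0, pts1, threshold=1.5)
print(mask)  # the last match is 10 px off its epipolar line and is rejected
```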

In [4]:
matching_cfg = {
    "weights": "outdoor",
    "keypoint_threshold": 0.0001,
    "max_keypoints": 8192,
    "match_threshold": 0.2,
    "force_cpu": False,
}

matcher = SuperGlueMatcher(matching_cfg)
matcher.match(
    image0,
    image1,
    quality=Quality.HIGH,
    tile_selection=TileSelection.PRESELECTION,
    grid=[4, 3],
    overlap=200,
    do_viz_matches=True,
    do_viz_tiles=False,
    save_dir="./matches/superglue_matches",
    geometric_verification=GeometricVerification.PYDEGENSAC,
    threshold=1.5,
)

2023-09-12 10:06:01 | [INFO    ] Running inference on device cuda
Loaded SuperPoint model
Loaded SuperGlue model ("outdoor" weights)
2023-09-12 10:06:01 | [INFO    ] Matching by tiles...
2023-09-12 10:06:01 | [INFO    ] Matching tiles by preselection tile selection
2023-09-12 10:06:02 | [INFO    ] Matching completed.
2023-09-12 10:06:02 | [INFO    ]  - Matching tile pair (3, 2)
2023-09-12 10:06:05 | [INFO    ]  - Matching tile pair (4, 7)
2023-09-12 10:06:07 | [INFO    ]  - Matching tile pair (5, 7)
2023-09-12 10:06:10 | [INFO    ]  - Matching tile pair (5, 8)
2023-09-12 10:06:13 | [INFO    ]  - Matching tile pair (6, 6)
2023-09-12 10:06:15 | [INFO    ]  - Matching tile pair (6, 9)
2023-09-12 10:06:18 | [INFO    ]  - Matching tile pair (7, 6)
2023-09-12 10:06:20 | [INFO    ]  - Matching tile pair (7, 7)
2023-09-12 10:06:23 | [INFO    ]  - Matching t

True

The matches with their descriptors and scores are saved in the matcher object.
All the results are saved as numpy arrays with float32 dtype.
They can be accessed as follows:

In [14]:
# Get matched keypoints
mkpts0 = matcher.mkpts0
mkpts1 = matcher.mkpts1

print(f"Number of matches: {len(mkpts0)}")
print(f"Matches on image0 (first 5):\n{mkpts0[0:5]}")
print(f"Matches on image1 (first 5):\n{mkpts1[0:5]}")

# Get descriptors
descs0 = matcher.descriptors0
descs1 = matcher.descriptors1
print(f"Descriptors shape: {descs0.shape}") 

# Get scores of each matched keypoint
scores0 = matcher.scores0
scores1 = matcher.scores1
print(f"Scores shape: {scores0.shape}")

# Matching confidence
confidence = matcher.mconf
print(f"Confidence shape: {confidence.shape}")
print(f"Confidence (first 5): {confidence[0:5]}")

Number of matches: 2585
Matches on image0 (first 5):
[[   9. 1373.]
 [  10. 1348.]
 [  10. 1439.]
 [  11. 1411.]
 [  18. 1426.]]
Matches on image1 (first 5):
[[5342.  126.]
 [5350.   91.]
 [5249.  244.]
 [5267.  204.]
 [5268.  224.]]
Descriptors shape: (256, 2585)
Scores shape: (2585,)
Confidence shape: (2585,)
Confidence (first 5): [0.38059255 0.1878048  0.17584147 0.06356245 0.16298756]
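Since the results are plain numpy arrays, they can be filtered with boolean masks. The sketch below keeps only the matches above a confidence threshold, using the first five matches printed above as sample data; the threshold value of 0.17 is an arbitrary assumption.

```python
import numpy as np

# First five matches and confidences copied from the output above
mkpts0 = np.array(
    [[9, 1373], [10, 1348], [10, 1439], [11, 1411], [18, 1426]], dtype=np.float32
)
mkpts1 = np.array(
    [[5342, 126], [5350, 91], [5249, 244], [5267, 204], [5268, 224]], dtype=np.float32
)
mconf = np.array(
    [0.38059255, 0.1878048, 0.17584147, 0.06356245, 0.16298756], dtype=np.float32
)

min_conf = 0.17  # hypothetical threshold
keep = mconf >= min_conf

good0, good1 = mkpts0[keep], mkpts1[keep]
print(f"Kept {keep.sum()} of {len(keep)} matches")
```

Because the descriptors have shape (256, N), the same mask would be applied along their second axis, e.g. `descs0[:, keep]`.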


## LOFTR matching

The LOFTR matcher shares the same interface as the SuperGlue matcher, therefore the same parameters can be used for the `match` method. 
The only difference is in the matcher initialization, which takes no parameters, as default values are defined from Kornia (see the documentation of the class for more details).

The matched points can be retrieved as before, but the descriptors are not saved in the matcher object, as they are not computed by LOFTR.

In [18]:
matcher = LOFTRMatcher()
matcher.match(
    image0,
    image1,
    quality=Quality.HIGH,
    tile_selection=TileSelection.PRESELECTION,
    grid=[5, 4],
    overlap=50,
    save_dir="./matches/LOFTR_matches",
    geometric_verification=GeometricVerification.PYDEGENSAC,
    threshold=1.5,
)

mkpts0 = matcher.mkpts0
mkpts1 = matcher.mkpts1

print(f"Number of matches: {len(mkpts0)}")

2023-09-12 10:37:28 | [INFO    ] Running inference on device cuda
2023-09-12 10:37:28 | [INFO    ] Matching by tiles...
2023-09-12 10:37:28 | [INFO    ] Matching tiles by preselection tile selection
2023-09-12 10:37:28 | [INFO    ]  - Matching tile pair (0, 1)


OutOfMemoryError: CUDA out of memory. Tried to allocate 1.61 GiB (GPU 0; 11.75 GiB total capacity; 7.67 GiB already allocated; 1.19 GiB free; 9.66 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

In [15]:
# Clean up result folders

import os
import shutil

if os.path.exists("./matches"):
    shutil.rmtree("./matches")
if os.path.exists("./logs"):
    shutil.rmtree("./logs")
