# OmniGlue - Feature Matching with Foundation Model Guidance

This notebook will help you set up and test the OmniGlue library from Google Research, which is designed for generalizable image feature matching using foundation model guidance.

OmniGlue was introduced in a CVPR 2024 paper as a solution for image matching that can better generalize to novel image domains not seen during training.

## 1. Environment Setup

First, let's set up the environment: clone the repository and install OmniGlue with its dependencies in editable mode. If you want an isolated environment, create and activate a conda environment before running this cell.

In [1]:
# Install OmniGlue and its dependencies
!git clone https://github.com/google-research/omniglue.git
%cd omniglue
!pip install -e .

fatal: destination path 'omniglue' already exists and is not an empty directory.
/mnt/sagemaker-nvme/omniglue-triton/omniglue
Obtaining file:///mnt/sagemaker-nvme/omniglue-triton/omniglue
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: omniglue
  Building editable for omniglue (pyproject.toml) ... done
  Created wheel for omniglue: filename=omniglue-0.0.0-0.editable-py3-none-any.whl size=13387 sha256=f9b6d8e322971295786d61ba08acf3b8d0c59c73933d1903ceeec5029d24bcc1
  Stored in directory: /tmp/pip-ephem-wheel-cache-z4cf8jx2/wheels/ed/50/5f/43084c321c2a6b5983758a39681b835d591449232d87e44b8b
Successfully built omniglue
Installing collected packages: omniglue
  Attempting uninstall: omniglue
    Found existing installation: omniglue 0.0.0
    

## 2. Download Required Models

OmniGlue requires multiple pre-trained models to work properly:
1. SuperPoint - For keypoint detection
2. DINOv2 - A vision foundation model (vit-b14)
3. OmniGlue weights - The trained OmniGlue model

In [2]:
# Create models directory
!mkdir -p models
%cd models

/mnt/sagemaker-nvme/omniglue-triton/omniglue/models


In [3]:
# Download SuperPoint
!git clone https://github.com/rpautrat/SuperPoint.git
!mv SuperPoint/pretrained_models/sp_v6.tgz .
!rm -rf SuperPoint
!tar zxvf sp_v6.tgz
!rm sp_v6.tgz

Cloning into 'SuperPoint'...
remote: Enumerating objects: 1611, done.
remote: Counting objects: 100% (36/36), done.
remote: Compressing objects: 100% (29/29), done.
remote: Total 1611 (delta 13), reused 18 (delta 7), pack-reused 1575 (from 1)
Receiving objects: 100% (1611/1611), 549.39 MiB | 49.70 MiB/s, done.
Resolving deltas: 100% (1078/1078), done.
Updating files: 100% (83/83), done.
sp_v6/
sp_v6/saved_model.pb
sp_v6/variables/
sp_v6/variables/variables.index
sp_v6/variables/variables.data-00000-of-00001


In [4]:
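# Download DINOv2 (vit-b14)
# Note: if the file already exists, wget saves a numbered copy (e.g. `.pth.2`)
# instead of overwriting it; the code below expects `dinov2_vitb14_pretrain.pth`.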
!wget https://dl.fbaipublicfiles.com/dinov2/dinov2_vitb14/dinov2_vitb14_pretrain.pth

--2025-03-25 20:20:18--  https://dl.fbaipublicfiles.com/dinov2/dinov2_vitb14/dinov2_vitb14_pretrain.pth
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 3.163.24.87, 3.163.24.51, 3.163.24.93, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|3.163.24.87|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 346378731 (330M) [binary/octet-stream]
Saving to: ‘dinov2_vitb14_pretrain.pth.2’


2025-03-25 20:20:20 (173 MB/s) - ‘dinov2_vitb14_pretrain.pth.2’ saved [346378731/346378731]



In [5]:
# Download OmniGlue weights
!wget https://storage.googleapis.com/omniglue/og_export.zip
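# If og_export/ already exists, unzip asks before overwriting and the cell waits for input;
# `unzip -o` overwrites existing files without prompting.
!unzip og_export.zip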
!rm og_export.zip

--2025-03-25 20:20:21--  https://storage.googleapis.com/omniglue/og_export.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 142.250.217.123, 142.251.215.251, 172.217.14.251, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|142.250.217.123|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 45860170 (44M) [application/zip]
Saving to: ‘og_export.zip’


2025-03-25 20:20:22 (49.0 MB/s) - ‘og_export.zip’ saved [45860170/45860170]

Archive:  og_export.zip
replace og_export/.DS_Store? [y]es, [n]o, [A]ll, [N]one, [r]ename: ^C


In [6]:
# Go back to the main directory
%cd ..

/mnt/sagemaker-nvme/omniglue-triton/omniglue
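
Because the download cells can be interrupted or rerun, it may help to confirm that all three models ended up where the later cells expect them. The following is a minimal sketch, assuming the paths used in this notebook (`models/og_export`, `models/sp_v6`, `models/dinov2_vitb14_pretrain.pth`); the helper name `missing_model_paths` is made up for illustration and is not part of OmniGlue.

```python
from pathlib import Path

# Paths taken from the download cells above, relative to the repository root.
EXPECTED_MODEL_PATHS = (
    "models/og_export",
    "models/sp_v6",
    "models/dinov2_vitb14_pretrain.pth",
)


def missing_model_paths(repo_root="."):
    """Return the expected model paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_MODEL_PATHS if not (root / p).exists()]


# Hypothetical usage from the repository root.
missing = missing_model_paths(".")
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("All model files are in place.")
```

The check only tests for existence, so a partially extracted archive would still pass; it is meant as a quick sanity check rather than an integrity check.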


## 3. Import Libraries

Now let's import the required libraries for testing OmniGlue.

In [7]:
# First, let's check if we can import omniglue and debug any issues
import os
import sys
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
import cv2
import time

# Print Python path to help with debugging
print("Python path:")
for path in sys.path:
    print(f"  - {path}")

# Try to import omniglue with error handling
try:
    import omniglue
    from omniglue import utils
    print("Successfully imported omniglue!")
except ImportError as e:
    print(f"Error importing omniglue: {e}")
    print("\nTroubleshooting steps:")
    print("1. Make sure you're in the right directory")
    print("2. Check if the package was installed correctly")
    print("3. Try reinstalling:")
    print("   !pip install -e . --verbose")
    print("4. Check if the package is in your path:")
    print("   !pip list | grep omni")
    # Add the repository directory to path as a fallback
    current_dir = os.getcwd()
    if current_dir not in sys.path:
        print(f"Adding current directory to Python path: {current_dir}")
        sys.path.append(current_dir)
        try:
            import omniglue
            from omniglue import utils
            print("Successfully imported omniglue after adding current directory to path!")
        except ImportError as e:
            print(f"Still cannot import omniglue: {e}")
            print("You may need to restart the kernel after installation.")

Python path:
  - /opt/conda/lib/python311.zip
  - /opt/conda/lib/python3.11
  - /opt/conda/lib/python3.11/lib-dynload
  - 
  - /opt/conda/lib/python3.11/site-packages


2025-03-25 20:20:28.740696: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-03-25 20:20:28.756642: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-03-25 20:20:28.761697: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-03-25 20:20:28.773355: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


Successfully imported omniglue!
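
If the import fails, it can be useful to check which packages Python can locate without actually importing them (importing TensorFlow or PyTorch is slow and prints a lot of log output). A small sketch using the standard library follows; the helper name `is_importable` is an assumption for illustration, and the module names are the ones that appear in this notebook.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report on the packages this notebook relies on.
for name in ("omniglue", "torch", "tensorflow"):
    status = "found" if is_importable(name) else "NOT found"
    print(f"{name}: {status}")
```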


## 4. Download Test Images

The repository ships demo images in `res/`, which we'll use below. We'll also create a `test_images` directory for your own images.

In [8]:
# Check if demo images exist in the repo
!ls -la res/

total 6364
drwxr-xr-x  2 sagemaker-user users      85 Mar 25 18:45 .
drwxr-xr-x 10 sagemaker-user users     309 Mar 25 19:04 ..
-rw-r--r--  1 sagemaker-user users   85333 Mar 25 18:45 demo1.jpg
-rw-r--r--  1 sagemaker-user users  113899 Mar 25 18:45 demo2.jpg
-rw-r--r--  1 sagemaker-user users 1486901 Mar 25 18:45 demo_output.png
-rw-r--r--  1 sagemaker-user users 4821703 Mar 25 18:45 og_diagram.png


In [9]:
# Create a directory for our own test images if needed
!mkdir -p test_images

## 5. Define OmniGlue Matching Function

Let's create a function that performs image matching using OmniGlue based on the demo script.

In [10]:
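import torch

# Cap PyTorch at 90% of GPU memory (only applies when a CUDA device is present)
if torch.cuda.is_available():
    torch.cuda.set_per_process_memory_fraction(0.9)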

def match_images(image0_path, image1_path, visualize=True):
    """Perform OmniGlue matching between two images.
    
    Args:
        image0_path: Path to the first image
        image1_path: Path to the second image
        visualize: Whether to visualize the matches
        
    Returns:
        matches: Indices of the matched keypoint pairs (0..num_matches-1)
        visualization: Visualization image if visualize=True, otherwise None
    """
    # Load the images
    image0 = np.array(Image.open(image0_path).convert('RGB'))
    image1 = np.array(Image.open(image1_path).convert('RGB'))
    
    # Create the matcher from the downloaded model exports
    og = omniglue.OmniGlue(
        og_export="./models/og_export",
        sp_export="./models/sp_v6",
        dino_export="./models/dinov2_vitb14_pretrain.pth",
    )
    
    # Match the images
    start_time = time.time()
    match_kp0, match_kp1, match_confidences = og.FindMatches(image0, image1)
    end_time = time.time()
    
    # Get match info
    num_matches = match_kp0.shape[0]
    matches = np.arange(num_matches)  # match_kp0[i] corresponds to match_kp1[i]
    print(f"Number of matches: {num_matches}")
    print(f"Time taken: {end_time - start_time:.2f} seconds")
    
    if visualize:
        # Create the visualization
        visualization = utils.visualize_matches(
            image0, image1, match_kp0, match_kp1, 
            np.eye(num_matches),  # Identity matrix for matches
            show_keypoints=True,
            highlight_unmatched=True,
            title=f"{num_matches} matches",
            line_width=2,
        )
        
        # Display the visualization
        plt.figure(figsize=(15, 10))
        plt.imshow(visualization)
        plt.axis('off')
        plt.title(f"OmniGlue matches between {os.path.basename(image0_path)} and {os.path.basename(image1_path)}")
        plt.show()
        
        return matches, visualization
    
    return matches, None
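
`match_images` builds a new `omniglue.OmniGlue` object on every call, so the models are reloaded each time. One way to avoid that is to build the matcher once and reuse it. The sketch below is an assumption about how this could be arranged, not something the OmniGlue repository prescribes; `cached`, `build_matcher`, and `get_matcher` are hypothetical names.

```python
from functools import lru_cache


def cached(build):
    """Wrap a zero-argument builder so it runs once and its result is reused."""
    return lru_cache(maxsize=1)(build)


def build_matcher():
    # Imported lazily so this sketch can be defined without omniglue installed.
    import omniglue

    return omniglue.OmniGlue(
        og_export="./models/og_export",
        sp_export="./models/sp_v6",
        dino_export="./models/dinov2_vitb14_pretrain.pth",
    )


# The first call to get_matcher() loads the models; later calls return the same object.
get_matcher = cached(build_matcher)
```

The trade-off is that the models stay resident in memory (including GPU memory) between calls, which may not be desirable on a memory-constrained or shared GPU.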

In [11]:
def match_images_optimized(image0_path, image1_path, visualize=True, max_size=1024):
    """Perform OmniGlue matching between two images with memory optimization.
    
    Args:
        image0_path: Path to the first image
        image1_path: Path to the second image
        visualize: Whether to visualize the matches
        max_size: Maximum dimension for images (resizes if larger)
        
    Returns:
        matches: Indices of the matched keypoint pairs (0..num_matches-1)
        visualization: Visualization image if visualize=True, otherwise None
    """
    import os
    import gc
    import time
    import numpy as np
    import torch
    from PIL import Image
    import omniglue
    from omniglue import utils
    import matplotlib.pyplot as plt
    
    # Clear cache before starting
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    
    # Set PyTorch to use limited GPU memory
    if torch.cuda.is_available():
        torch.cuda.set_per_process_memory_fraction(0.8)  # Use at most 80% of available memory
    
    # Load and resize images if needed
    def load_and_resize_image(path):
        img = Image.open(path).convert('RGB')
        # Resize if too large (preserving aspect ratio)
        if max(img.size) > max_size:
            scale = max_size / max(img.size)
            new_size = (int(img.size[0] * scale), int(img.size[1] * scale))
            img = img.resize(new_size, Image.LANCZOS)
        return np.array(img)
    
    # Load images
    print("> Loading and resizing images...")
    image0 = load_and_resize_image(image0_path)
    image1 = load_and_resize_image(image1_path)
    
    print(f"> Image dimensions: {image0.shape} and {image1.shape}")
    
    try:
        # Create the matcher (loads SuperPoint, DINOv2, and the OmniGlue weights)
        print("> Loading OmniGlue (and its submodules)...")
        start_time = time.time()
        og = omniglue.OmniGlue(
            og_export="./models/og_export",
            sp_export="./models/sp_v6",
            dino_export="./models/dinov2_vitb14_pretrain.pth",
        )
        print(f"> \tTook {time.time() - start_time:.2f} seconds to load models.")
        
        # Match the images; the finally block below handles cleanup
        print("> Finding matches...")
        start_time = time.time()
        match_kp0, match_kp1, match_confidences = og.FindMatches(image0, image1)
        match_time = time.time() - start_time
        
        # Get match info
        num_matches = match_kp0.shape[0]
        matches = np.arange(num_matches)  # match_kp0[i] corresponds to match_kp1[i]
        print(f"> \tFound {num_matches} matches.")
        print(f"> \tTook {match_time:.2f} seconds.")
        
        # Create visualization if requested
        if visualize:
            print("> Creating visualization...")
            viz_start = time.time()
            
            # Draw the matches; the identity matrix pairs match_kp0[i] with match_kp1[i]
            visualization = utils.visualize_matches(
                image0, image1, match_kp0, match_kp1, 
                np.eye(num_matches),  # Identity matrix for matches
                show_keypoints=True,
                highlight_unmatched=True,
                title=f"{num_matches} matches",
                line_width=2,
            )
            
            print(f"> \tVisualization took {time.time() - viz_start:.2f} seconds.")
            
            # Display the visualization
            plt.figure(figsize=(12, 8))  # Reduced figure size to save memory
            plt.imshow(visualization)
            plt.axis('off')
            plt.title(f"OmniGlue: {os.path.basename(image0_path)} ↔ {os.path.basename(image1_path)}")
            plt.tight_layout()
            plt.show()
            
            result = (matches, visualization)
        else:
            result = (matches, None)
        
        return result
        
    finally:
        # Clean up to release memory
        if 'og' in locals():
            del og
        if 'visualization' in locals():
            del visualization
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        print("> Memory cleaned up.")
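
Since `match_images_optimized` may shrink the images before matching, any keypoint coordinates computed on the resized images refer to the smaller image, not the original. The sketch below illustrates the resize arithmetic and how coordinates could be mapped back. It assumes keypoints are `(x, y)` pixel positions in the resized image; `scaled_size` and `to_original_coords` are hypothetical helpers, and `round()` is used here where the function above truncates with `int()`.

```python
def scaled_size(width, height, max_size=1024):
    """Return (new_width, new_height, scale) so the longer side is at most max_size."""
    longest = max(width, height)
    if longest <= max_size:
        return width, height, 1.0  # already small enough, no resize needed
    scale = max_size / longest
    return max(1, round(width * scale)), max(1, round(height * scale)), scale


def to_original_coords(keypoints, scale):
    """Map (x, y) keypoints from the resized image back to the original image."""
    return [(x / scale, y / scale) for x, y in keypoints]


# A 2000x1000 image is halved (roughly) to fit within 1024 pixels.
new_w, new_h, scale = scaled_size(2000, 1000, max_size=1024)
print(new_w, new_h, scale)
```

The demo images (1000 x 667) are below the default `max_size`, so they pass through unchanged with a scale of 1.0.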

## 6. Test OmniGlue with Demo Images

Now let's test OmniGlue with the demo images provided in the repository.

In [12]:
# Test with demo images
matches, visualization = match_images_optimized('./res/demo1.jpg', './res/demo2.jpg')

> Loading and resizing images...
> Image dimensions: (667, 1000, 3) and (667, 1000, 3)
> Loading OmniGlue (and its submodules)...


I0000 00:00:1742934032.081882   16893 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355

Instructions for updating:
Use `tf.saved_model.load` instead.
INFO:tensorflow:Restoring parameters from ./models/sp_v6/variables/variables


I0000 00:00:1742934046.186478   16893 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355

> 	Took 16.36 seconds to load models.
> Finding matches...


2025-03-25 20:20:47.877615: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:531] Loaded cuDNN version 90701
W0000 00:00:1742934047.998469   17113 gpu_timer.cc:114] Skipping the delay kernel, measurement accuracy will be reduced

> Memory cleaned up.


OutOfMemoryError: CUDA out of memory. Tried to allocate 84.00 MiB. GPU 0 has a total capacity of 21.98 GiB of which 56.44 MiB is free. Process 23677 has 21.91 GiB memory in use. Of the allocated memory 1.29 GiB is allocated by PyTorch, and 84.78 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
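
The error reports 21.91 GiB in use on the GPU while PyTorch itself accounts for only about 1.3 GiB. One plausible explanation, which the log does not confirm, is that TensorFlow (used here to load the SuperPoint model) reserved most of the device up front, leaving too little for the PyTorch side. The sketch below sets two environment variables that may help: `TF_FORCE_GPU_ALLOW_GROWTH` asks TensorFlow to allocate GPU memory on demand, and `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` is the setting suggested in the error message itself. The helper name `configure_gpu_memory_env` is hypothetical.

```python
import os


def configure_gpu_memory_env(env=None):
    """Set allocator-related environment variables if they are not already set.

    This must run before TensorFlow and PyTorch first touch the GPU, which in a
    notebook usually means right after a kernel restart.
    """
    env = os.environ if env is None else env
    # Ask TensorFlow to grow its GPU allocation on demand instead of
    # reserving most of the device at startup.
    env.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")
    # Suggested by the PyTorch error message above to reduce fragmentation.
    env.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")
    return env


configure_gpu_memory_env()
```

If another process is holding the GPU memory, these settings will not help; checking `nvidia-smi` for other running processes is a reasonable first step.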

## 7. Test OmniGlue with Custom Images

You can also test OmniGlue with your own images by uploading them to the `test_images` directory.

In [None]:
# Upload your own images if you want to test with them
# Then match them
# matches, visualization = match_images('test_images/your_image1.jpg', 'test_images/your_image2.jpg')

## 8. Advanced: Test OmniGlue with Custom Parameters

`FindMatches` returns a confidence score for each match. Let's create a function that filters matches by a confidence threshold so we can experiment with different values.

In [None]:
def match_images_custom(image0_path, image1_path, visualize=True, match_threshold=0.02):
    """Perform OmniGlue matching between two images with custom parameters.
    
    Args:
        image0_path: Path to the first image
        image1_path: Path to the second image
        visualize: Whether to visualize the matches
        match_threshold: Threshold for confident matches
        
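    Returns:
        filtered_kp0: Keypoints in the first image that pass the threshold
        filtered_kp1: Corresponding keypoints in the second image
        filtered_confidences: Confidence of each retained match
        visualization: Visualization image if visualize=True, otherwise None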
    """
    # Load the images
    image0 = np.array(Image.open(image0_path).convert('RGB'))
    image1 = np.array(Image.open(image1_path).convert('RGB'))
    
    # Create the matcher (the threshold is applied to its output below)
    og = omniglue.OmniGlue(
        og_export="./models/og_export",
        sp_export="./models/sp_v6",
        dino_export="./models/dinov2_vitb14_pretrain.pth",
    )
    
    # Match the images
    start_time = time.time()
    match_kp0, match_kp1, match_confidences = og.FindMatches(image0, image1)
    end_time = time.time()
    
    # Filter matches by confidence using a boolean mask
    keep_idx = np.asarray(match_confidences) > match_threshold
    
    # Apply filtering
    filtered_kp0 = match_kp0[keep_idx]
    filtered_kp1 = match_kp1[keep_idx]
    filtered_confidences = match_confidences[keep_idx]
    
    # Get stats
    total_matches = match_kp0.shape[0]
    filtered_matches = len(filtered_kp0)
    
    print(f"Total matches found: {total_matches}")
    print(f"Matches with confidence > {match_threshold}: {filtered_matches}")
    print(f"Time taken: {end_time - start_time:.2f} seconds")
    
    if visualize:
        # Create the visualization for filtered matches
        visualization = utils.visualize_matches(
            image0, image1, 
            filtered_kp0, filtered_kp1,
            np.eye(filtered_matches),  # Identity matrix for matches
            show_keypoints=True,
            highlight_unmatched=True,
            title=f"{filtered_matches} filtered matches (threshold: {match_threshold})",
            line_width=2,
        )
        
        # Display the visualization
        plt.figure(figsize=(15, 10))
        plt.imshow(visualization)
        plt.axis('off')
        plt.title(f"OmniGlue filtered matches (threshold: {match_threshold})")
        plt.show()
        
        return filtered_kp0, filtered_kp1, filtered_confidences, visualization
    
    return filtered_kp0, filtered_kp1, filtered_confidences, None

In [None]:
# Test with custom parameters
filtered_kp0, filtered_kp1, filtered_confidences, visualization = match_images_custom(
    './res/demo1.jpg', './res/demo2.jpg', match_threshold=0.03)
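
To see what the threshold does in isolation, here is a small pure-Python sketch of the same filtering step on made-up data. The keypoints and confidences are invented for illustration and are not OmniGlue output; `filter_matches` is a hypothetical helper.

```python
def filter_matches(kp0, kp1, confidences, threshold):
    """Keep only the matches whose confidence is strictly above threshold."""
    kept = [i for i, c in enumerate(confidences) if c > threshold]
    return (
        [kp0[i] for i in kept],
        [kp1[i] for i in kept],
        [confidences[i] for i in kept],
    )


# Toy data (made up for illustration): three candidate matches.
kp0 = [(10, 12), (40, 44), (80, 15)]
kp1 = [(11, 13), (42, 47), (79, 90)]
conf = [0.91, 0.025, 0.01]

for threshold in (0.02, 0.03):
    kept0, _, _ = filter_matches(kp0, kp1, conf, threshold)
    print(f"threshold={threshold}: kept {len(kept0)} of {len(kp0)}")
```

Raising the threshold from 0.02 to 0.03 drops the borderline match with confidence 0.025, which is the kind of effect to look for when tuning `match_threshold` on real images.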

## 9. Compare OmniGlue with Traditional Methods

Let's compare OmniGlue with a traditional method like SIFT to see the difference in matching quality.

In [None]:
def match_images_sift(image0_path, image1_path, visualize=True, max_keypoints=1024):
    """Perform SIFT matching between two images.
    
    Args:
        image0_path: Path to the first image
        image1_path: Path to the second image
        visualize: Whether to visualize the matches
        max_keypoints: Maximum number of keypoints to detect
        
    Returns:
        matches: List of cv2.DMatch objects that pass the ratio test
        visualization: Visualization image if visualize=True, otherwise None
    """
    # Load the images
    image0 = cv2.imread(image0_path)
    image1 = cv2.imread(image1_path)
    
    # Convert to grayscale
    gray0 = cv2.cvtColor(image0, cv2.COLOR_BGR2GRAY)
    gray1 = cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY)
    
    # Create SIFT detector
    sift = cv2.SIFT_create(nfeatures=max_keypoints)
    
    # Detect keypoints and compute descriptors
    start_time = time.time()
    kp0, desc0 = sift.detectAndCompute(gray0, None)
    kp1, desc1 = sift.detectAndCompute(gray1, None)
    
    # Match descriptors
    bf = cv2.BFMatcher()
    matches = bf.knnMatch(desc0, desc1, k=2)
    end_time = time.time()
    
    # Apply Lowe's ratio test (skip entries with fewer than two neighbors)
    good_matches = []
    for pair in matches:
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            good_matches.append(pair[0])
    
    print(f"Number of keypoints in image 1: {len(kp0)}")
    print(f"Number of keypoints in image 2: {len(kp1)}")
    print(f"Number of matches: {len(good_matches)}")
    print(f"Time taken: {end_time - start_time:.2f} seconds")
    
    if visualize:
        # Create the visualization
        visualization = cv2.drawMatches(image0, kp0, image1, kp1, good_matches, None,
                                      flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
        
        # Convert BGR to RGB for display
        visualization = cv2.cvtColor(visualization, cv2.COLOR_BGR2RGB)
        
        # Display the visualization
        plt.figure(figsize=(15, 10))
        plt.imshow(visualization)
        plt.axis('off')
        plt.title(f"SIFT matches between {os.path.basename(image0_path)} and {os.path.basename(image1_path)}")
        plt.show()
        
        return good_matches, visualization
    
    return good_matches, None

In [None]:
# Test with SIFT
sift_matches, sift_visualization = match_images_sift('./res/demo1.jpg', './res/demo2.jpg', max_keypoints=1024)
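
The ratio test used in `match_images_sift` keeps a match only when its best neighbor is clearly closer than the second-best one. The sketch below shows that rule on plain numbers, without OpenCV; the distances are invented for illustration and `ratio_test` is a hypothetical helper.

```python
def ratio_test(neighbor_distances, ratio=0.75):
    """Return indices of query descriptors that pass the ratio test.

    neighbor_distances: one (best, second_best) distance pair per query descriptor.
    """
    return [
        i
        for i, (best, second) in enumerate(neighbor_distances)
        if best < ratio * second
    ]


# Toy distances (made up): the first match is distinctive, the second is ambiguous.
print(ratio_test([(100.0, 300.0), (200.0, 210.0)]))
```

A lower `ratio` makes the test stricter and keeps fewer, more distinctive matches; 0.75 is the value used in the function above.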

## 10. Conclusion

In this notebook, we've explored OmniGlue, a generalizable image feature matching library from Google Research that leverages foundation models to improve matching across different domains.

We've:
1. Set up the environment and installed OmniGlue
2. Downloaded the required models
3. Ran OmniGlue on the demo images (the recorded run ended in a CUDA out-of-memory error)
4. Created functions for customized matching
5. Set up a SIFT baseline for comparison with OmniGlue

OmniGlue demonstrates how foundation models can improve traditional computer vision tasks by providing generalizable guidance for feature matching.