
# 🌟 Image Matching Challenge 2024 - Hexathlon 🌟

## Overview
Welcome, brave adventurers, to the thrilling Image Matching Challenge 2024 - Hexathlon! Here, amidst the vast expanse of digital landscapes, your mission, should you choose to accept it, is to reconstruct 3D scenes from 2D images across six distinct domains. From ancient ruins to bustling city streets, from serene forests to the twinkling night sky, each domain presents its own set of challenges to overcome.

Last year's Image Matching Challenge was but a prelude to this grand spectacle. This year, the stakes are raised, weaving together an intricate tapestry of diverse scenarios into a single competition. Let's test our mettle and push the boundaries of computer vision.

## 🛠️ Setup Environment
Prepare yourselves, intrepid explorers, for the journey ahead requires meticulous preparation. Fear not, for we shall guide you through the arcane rituals of environment setup with clarity and wit:

1. **Install Necessary Libraries**: 📚
   To equip your arsenal with the tools needed for this quest, execute the following commands:

   ```bash
   !pip install -r /kaggle/input/check-image-orientation/requirements.txt
   !pip install --no-index /kaggle/input/imc2024-packages-lightglue-rerun-kornia/* --no-deps
   ```

2. **Setup Cache and Checkpoints**: 🔒
   Fortify your cache and checkpoints with the resilience of ancient guardians:

   ```bash
   !mkdir -p /root/.cache/torch/hub/checkpoints
   !cp /kaggle/input/aliked/pytorch/aliked-n16/1/* /root/.cache/torch/hub/checkpoints/
   !cp /kaggle/input/lightglue/pytorch/aliked/1/* /root/.cache/torch/hub/checkpoints/
   !cp /kaggle/input/lightglue/pytorch/aliked/1/aliked_lightglue.pth /root/.cache/torch/hub/checkpoints/aliked_lightglue_v0-1_arxiv-pth
   !cp /kaggle/input/check-image-orientation/2020-11-16_resnext50_32x4d.zip /root/.cache/torch/hub/checkpoints/
   ```

3. **Import Essential Libraries**: 📦
   Arm yourselves with the knowledge and power of the ancients with these sacred incantations:

   ```python
   # Import libraries
   from pathlib import Path
   from copy import deepcopy
   import numpy as np
   import math
   import pandas as pd
   import pandas.api.types
   from itertools import combinations
   import sys, torch, h5py, pycolmap, datetime
   from PIL import Image
   from pathlib import Path
   import torch.nn.functional as F
   import torchvision.transforms.functional as TF
   import kornia as K
   import kornia.feature as KF
   from lightglue.utils import load_image
   from lightglue import LightGlue, ALIKED, match_pair
   from transformers import AutoImageProcessor, AutoModel
   from check_orientation.pre_trained_models import create_model
   sys.path.append("/kaggle/input/colmap-db-import")
   import sqlite3
   import os, argparse, h5py, warnings
   import numpy as np
   from tqdm import tqdm
   from PIL import Image, ExifTags
   from database import COLMAPDatabase, image_ids_to_pair_id
   from h5_to_db import *
   ```

Now, adventurers, with your environment fortified and your libraries in hand, you stand poised at the precipice of discovery, ready to delve into the depths of image matching and registration!

## 🧩 Concepts Explored
Behold, noble souls, the sacred knowledge that shall guide you on your quest:

1. **Feature Matching**: 🌟
   Traverse the realm of feature matching, where keypoints align and images harmonize through the magic of computer vision.

2. **RANSAC (Random Sample Consensus)**: 🎲
   Embrace the randomness of RANSAC, a robust method that triumphs over outliers and guides you on the path to accurate image registration.

3. **Sparse Reconstruction**: 🌌
   Witness the reconstruction of 3D scenes from sparse image correspondences, a feat achieved through the enigmatic algorithms of sparse reconstruction.

4. **Mean Average Accuracy (mAA)**: 🎯
   Gauge the accuracy of your endeavors with the noble metric of mAA, measuring the alignment of images with a precision fit for champions.

5. **Homogeneous Transformation Matrix**: 🔄
   Let the homogeneous transformation matrix be your guide through the labyrinth of rigid transformations, as you align images and estimate camera poses with finesse.

6. **Affine Transformation**: 🖌️
   Marvel at the versatility of affine transformations, shaping images with the strokes of a digital brush to correct distortions and align perspectives.

7. **Quaternion Representation**: 🔮
   Peer into the depths of quaternion representation, where rotations in 3D space unfold with elegance and grace, free from the shackles of gimbal lock.

## 👉 [View Notebook on Kaggle](https://www.kaggle.com/code/zulqarnainalipk/imc-24-explained/)



## Acknowledgments 🙏
I acknowledge the organizers at the Czech Technical University in Prague for providing the dataset and the competition platform. Additionally, I extend my gratitude to the computer vision community for their contributions to image registration techniques and algorithms.

Let's embark on the journey of image matching and registration! Feel free to reach out if you have any questions or need assistance along the way.
👉 [Visit my Profile](https://www.kaggle.com/zulqarnainalipk) 👈


## 💬 Share Your Thoughts! 💡

Your feedback is like treasure to us! Your brilliant ideas and insights fuel our ongoing improvement. Got something to say, ask, or suggest? Don't hold back!

📬 Drop me a line via email: [zulqar445ali@gmail.com](mailto:zulqar445ali@gmail.com)


---

# Pip Install Libraries

In [1]:
from IPython.display import clear_output
get_ipython().system('pip install -r /kaggle/input/check-image-orientation/requirements.txt')
get_ipython().system('pip install --no-index /kaggle/input/imc2024-packages-lightglue-rerun-kornia/* --no-deps')
get_ipython().system('mkdir -p /root/.cache/torch/hub/checkpoints')
get_ipython().system('cp /kaggle/input/aliked/pytorch/aliked-n16/1/* /root/.cache/torch/hub/checkpoints/')
get_ipython().system('cp /kaggle/input/lightglue/pytorch/aliked/1/* /root/.cache/torch/hub/checkpoints/')
get_ipython().system('cp /kaggle/input/lightglue/pytorch/aliked/1/aliked_lightglue.pth /root/.cache/torch/hub/checkpoints/aliked_lightglue_v0-1_arxiv-pth')
get_ipython().system('cp /kaggle/input/check-image-orientation/2020-11-16_resnext50_32x4d.zip /root/.cache/torch/hub/checkpoints/')
clear_output(wait=False)




# Import Libraries

In [2]:
from pathlib import Path
from copy import deepcopy
import numpy as np
import math
import pandas as pd
import pandas.api.types
from itertools import combinations
import sys, torch, h5py, pycolmap, datetime
from PIL import Image
from pathlib import Path
import torch.nn.functional as F
import torchvision.transforms.functional as TF
import kornia as K
import kornia.feature as KF
from lightglue.utils import load_image
from lightglue import LightGlue, ALIKED, match_pair
from transformers import AutoImageProcessor, AutoModel
from check_orientation.pre_trained_models import create_model
sys.path.append("/kaggle/input/colmap-db-import")
import sqlite3
import os, argparse, h5py, warnings
import numpy as np
from tqdm import tqdm
from PIL import Image, ExifTags
from database import COLMAPDatabase, image_ids_to_pair_id
from h5_to_db import *



---

**Explanation**:


1. **Setting the Maximum Image ID**:
   - `2**31 - 1` calculates the largest 32-bit signed integer value, which is 2 raised to the power of 31 minus 1. This value is 2147483647.
   - The result is stored in the variable `MAX_IMAGE_ID`.
   ```python
   MAX_IMAGE_ID = 2**31 - 1
   ```
   - **Concept**: This is often used to set a maximum value for identifiers, ensuring they fit within a 32-bit integer range. This is crucial for databases or systems where the ID needs to be a 32-bit integer.
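
As a quick sanity check, the value can be compared against NumPy's own limit for 32-bit signed integers. This is a minimal sketch; the comparison with `np.iinfo` is an illustration and does not appear in the original notebook.

```python
import numpy as np

MAX_IMAGE_ID = 2**31 - 1

# np.iinfo reports the limits of a fixed-width integer type.
int32_max = int(np.iinfo(np.int32).max)

print(MAX_IMAGE_ID)               # 2147483647
print(MAX_IMAGE_ID == int32_max)  # True
```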



In [3]:
IS_PYTHON3 = sys.version_info[0] >= 3

MAX_IMAGE_ID = 2**31 - 1

---

# 📷 Creating the Cameras Table in SQL 🛠️

#### Explanation of the Code

1. **Table Creation Statement**:
   - The provided SQL statement is used to create a table named `cameras` if it doesn't already exist. This ensures that running the script multiple times won't result in errors due to the table already existing.
   ```sql
   CREATE TABLE IF NOT EXISTS cameras (
       camera_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
       model INTEGER NOT NULL,
       width INTEGER NOT NULL,
       height INTEGER NOT NULL,
       params BLOB,
       prior_focal_length INTEGER NOT NULL
   )
   ```

2. **Table Columns**:
   - **`camera_id`**: 
     - `INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL`: This defines the `camera_id` as the primary key for the table. It will automatically increment with each new entry, ensuring a unique identifier for each record.
   - **`model`**:
     - `INTEGER NOT NULL`: This column stores the integer ID of the camera model (the projection and distortion model in COLMAP's convention, not a hardware model number) and cannot be null.
   - **`width`**:
     - `INTEGER NOT NULL`: This column stores the image width in pixels and cannot be null.
   - **`height`**:
     - `INTEGER NOT NULL`: This column stores the image height in pixels and cannot be null.
   - **`params`**:
     - `BLOB`: This column stores the camera's intrinsic parameters (such as focal length and principal point) as raw bytes. BLOB is the SQLite storage class for arbitrary binary data.
   - **`prior_focal_length`**:
     - `INTEGER NOT NULL`: This column stores a flag (0 or 1) indicating whether the focal length in `params` is a trusted prior, and cannot be null.

3. **Concepts**:
   - **SQL Data Types and Constraints**:
     - **INTEGER**: Used to store whole numbers.
     - **PRIMARY KEY**: A unique identifier for table records. `AUTOINCREMENT` ensures each new record gets a unique value automatically.
     - **NOT NULL**: Ensures that a column cannot have a NULL value.
     - **BLOB**: Stands for Binary Large Object, used to store large amounts of binary data.
   - **SQL Table Creation**:
     - `CREATE TABLE IF NOT EXISTS`: Ensures that the table creation command does not fail if the table already exists.
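
To see the statement in action, the sketch below creates the table in a throwaway in-memory database and stores one camera. The intrinsics, image size, and model ID are made-up values, and packing `params` as `float64` bytes is an assumption based on COLMAP's convention rather than something this notebook states.

```python
import sqlite3
import numpy as np

CREATE_CAMERAS_TABLE = """CREATE TABLE IF NOT EXISTS cameras (
    camera_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    model INTEGER NOT NULL,
    width INTEGER NOT NULL,
    height INTEGER NOT NULL,
    params BLOB,
    prior_focal_length INTEGER NOT NULL)"""

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(CREATE_CAMERAS_TABLE)
conn.execute(CREATE_CAMERAS_TABLE)  # second run is a no-op thanks to IF NOT EXISTS

# Hypothetical intrinsics (focal length, cx, cy), packed as raw float64 bytes.
params = np.array([1200.0, 640.0, 480.0], dtype=np.float64)
cur = conn.execute(
    "INSERT INTO cameras (model, width, height, params, prior_focal_length) "
    "VALUES (?, ?, ?, ?, ?)",
    (0, 1280, 960, params.tobytes(), 0),
)
camera_id = cur.lastrowid  # AUTOINCREMENT assigned this ID automatically

row = conn.execute("SELECT width, height, params FROM cameras").fetchone()
restored = np.frombuffer(row[2], dtype=np.float64)
print(camera_id, row[0], row[1], restored)
```

Note that `camera_id` is never passed in the `INSERT`; SQLite fills it in, which is the behavior `AUTOINCREMENT` provides.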

### Study Sources

1. **SQL Data Types**:
   - Understanding the different SQL data types and their uses is crucial for database design.
   - **Source**: [SQL Data Types](https://www.w3schools.com/sql/sql_datatypes.asp)

2. **SQL PRIMARY KEY and AUTOINCREMENT**:
   - The use of primary keys and the AUTOINCREMENT attribute is essential for ensuring unique identifiers in a database.
   - **Source**: [PRIMARY KEY and AUTOINCREMENT](https://www.sqlite.org/autoinc.html)

3. **BLOB Data Type**:
   - Learning about the BLOB data type is important for storing large binary data in a database.
   - **Source**: [BLOB Data Type](https://www.sqlite.org/datatype3.html)



In [4]:
CREATE_CAMERAS_TABLE = """CREATE TABLE IF NOT EXISTS cameras (
    camera_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    model INTEGER NOT NULL,
    width INTEGER NOT NULL,
    height INTEGER NOT NULL,
    params BLOB,
    prior_focal_length INTEGER NOT NULL)"""

---

# 🖼️ Creating the Descriptors Table in SQL 🔍

#### Explanation of the Code

1. **Table Creation Statement**:
   - This SQL statement creates a table named `descriptors` if it doesn't already exist. This ensures that the script can be run multiple times without causing errors due to the table already existing.
   ```sql
   CREATE TABLE IF NOT EXISTS descriptors (
       image_id INTEGER PRIMARY KEY NOT NULL,
       rows INTEGER NOT NULL,
       cols INTEGER NOT NULL,
       data BLOB,
       FOREIGN KEY(image_id) REFERENCES images(image_id) ON DELETE CASCADE
   )
   ```

2. **Table Columns**:
   - **`image_id`**:
     - `INTEGER PRIMARY KEY NOT NULL`: This defines `image_id` as the primary key for the table. It must be unique and cannot be null.
     - **Concept**: In this context, `image_id` is used to uniquely identify each record in the `descriptors` table and link it to a corresponding record in the `images` table.
   - **`rows`**:
     - `INTEGER NOT NULL`: This column stores the number of rows in the descriptor matrix, ensuring it's an integer and cannot be null.
   - **`cols`**:
     - `INTEGER NOT NULL`: This column stores the number of columns in the descriptor matrix, ensuring it's an integer and cannot be null.
   - **`data`**:
     - `BLOB`: This column stores the descriptor data in a binary large object format. BLOB is used for storing large amounts of binary data.
   - **`FOREIGN KEY` Constraint**:
     - `FOREIGN KEY(image_id) REFERENCES images(image_id) ON DELETE CASCADE`: This establishes a foreign key relationship between `image_id` in the `descriptors` table and `image_id` in the `images` table. The `ON DELETE CASCADE` clause ensures that if a record in the `images` table is deleted, the corresponding records in the `descriptors` table will also be deleted automatically.
     - **Concept**: This ensures referential integrity between the `descriptors` and `images` tables, maintaining a consistent relationship.
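
One practical caveat is worth a small demonstration: SQLite does not enforce foreign keys unless `PRAGMA foreign_keys = ON` is issued on the connection. The sketch below uses a simplified `images` table (only two columns, an assumption made for brevity) to show the cascade removing a descriptor row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE images (image_id INTEGER PRIMARY KEY NOT NULL, name TEXT)")
conn.execute(
    """CREATE TABLE descriptors (
        image_id INTEGER PRIMARY KEY NOT NULL,
        rows INTEGER NOT NULL,
        cols INTEGER NOT NULL,
        data BLOB,
        FOREIGN KEY(image_id) REFERENCES images(image_id) ON DELETE CASCADE)"""
)
conn.execute("INSERT INTO images VALUES (1, 'a.png')")
conn.execute("INSERT INTO descriptors VALUES (1, 0, 128, NULL)")

# Deleting the parent row should also delete the dependent descriptor row.
conn.execute("DELETE FROM images WHERE image_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM descriptors").fetchone()[0]
print(remaining)  # 0
```

Without the `PRAGMA` line, the descriptor row would survive the delete, so the cascade is only as reliable as the connection setup.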

### Study Sources

1. **SQL Foreign Key Constraints**:
   - Understanding foreign keys and how they enforce referential integrity is crucial for relational database design.
   - **Source**: [SQL Foreign Key](https://www.w3schools.com/sql/sql_foreignkey.asp)

2. **SQL ON DELETE CASCADE**:
   - Learning how the `ON DELETE CASCADE` clause works helps in designing databases where dependent records are automatically managed.
   - **Source**: [SQL ON DELETE CASCADE](https://www.sqlshack.com/sql-server-on-delete-cascade-and-on-update-cascade/)



In [5]:
CREATE_DESCRIPTORS_TABLE = """CREATE TABLE IF NOT EXISTS descriptors (
    image_id INTEGER PRIMARY KEY NOT NULL,
    rows INTEGER NOT NULL,
    cols INTEGER NOT NULL,
    data BLOB,
    FOREIGN KEY(image_id) REFERENCES images(image_id) ON DELETE CASCADE)"""

---

# 📸 Creating the Images Table in SQL 🖼️

#### Explanation of the Code

1. **Table Creation Statement**:
   - This SQL statement creates a table named `images` if it doesn't already exist, preventing errors if the table is created multiple times.
   ```sql
   CREATE TABLE IF NOT EXISTS images (
       image_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
       name TEXT NOT NULL UNIQUE,
       camera_id INTEGER NOT NULL,
       prior_qw REAL,
       prior_qx REAL,
       prior_qy REAL,
       prior_qz REAL,
       prior_tx REAL,
       prior_ty REAL,
       prior_tz REAL,
       CONSTRAINT image_id_check CHECK(image_id >= 0 and image_id < {}),
       FOREIGN KEY(camera_id) REFERENCES cameras(camera_id)
   )
   ```

2. **Dynamic Constraint**:
   - The `{}` placeholder in the `CHECK` constraint is dynamically filled with the value of `MAX_IMAGE_ID`, ensuring that `image_id` is within the valid range of `0` to `MAX_IMAGE_ID - 1`.
   ```python
   .format(MAX_IMAGE_ID)
   ```

3. **Table Columns**:
   - **`image_id`**:
     - `INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL`: This defines `image_id` as the primary key with an auto-increment feature and ensures it cannot be null.
   - **`name`**:
     - `TEXT NOT NULL UNIQUE`: This column stores the image name, ensuring it's unique and not null.
   - **`camera_id`**:
     - `INTEGER NOT NULL`: This column stores the ID of the camera that took the image, ensuring it's not null.
   - **`prior_qw`, `prior_qx`, `prior_qy`, `prior_qz`**:
     - `REAL`: These columns store the prior quaternion values for the image.
   - **`prior_tx`, `prior_ty`, `prior_tz`**:
     - `REAL`: These columns store the prior translation values for the image.
   - **`CONSTRAINT image_id_check`**:
     - `CHECK(image_id >= 0 and image_id < {})`: Ensures `image_id` is within a valid range.
   - **`FOREIGN KEY(camera_id)`**:
     - `REFERENCES cameras(camera_id)`: Ensures that `camera_id` in the `images` table corresponds to `camera_id` in the `cameras` table, maintaining referential integrity.

4. **Concepts**:
   - **SQL Data Types and Constraints**:
     - **INTEGER**: Used to store whole numbers.
     - **REAL**: Used to store floating-point numbers.
     - **TEXT**: Used to store text strings.
     - **PRIMARY KEY**: A unique identifier for table records with `AUTOINCREMENT` to automatically generate a unique value.
     - **NOT NULL**: Ensures that a column cannot have a NULL value.
     - **UNIQUE**: Ensures all values in a column are unique.
   - **FOREIGN KEY**:
     - Ensures that values in the `camera_id` column correspond to values in the `camera_id` column of the `cameras` table.
   - **CHECK Constraint**:
     - Ensures that the values in the `image_id` column fall within the specified range.
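
The effect of the `CHECK` constraint can be observed directly. The following sketch uses a trimmed-down version of the table (the camera and pose columns are omitted, which is a simplification for illustration) and tries to insert an ID equal to `MAX_IMAGE_ID`, which lies just outside the allowed range.

```python
import sqlite3

MAX_IMAGE_ID = 2**31 - 1

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE images (
        image_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
        name TEXT NOT NULL UNIQUE,
        CONSTRAINT image_id_check CHECK(image_id >= 0 and image_id < {}))
    """.format(MAX_IMAGE_ID)
)

# An in-range ID is accepted.
conn.execute("INSERT INTO images (image_id, name) VALUES (?, ?)", (5, "ok.png"))

# An ID equal to MAX_IMAGE_ID violates the CHECK constraint.
try:
    conn.execute(
        "INSERT INTO images (image_id, name) VALUES (?, ?)",
        (MAX_IMAGE_ID, "too_big.png"),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)  # True
```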

### Study Sources

1. **SQL CHECK Constraint**:
   - Learning how to use the `CHECK` constraint to enforce conditions on column values is essential for data integrity.
   - **Source**: [SQL CHECK Constraint](https://www.w3schools.com/sql/sql_check.asp)

2. **SQL Data Types**:
   - Understanding SQL data types like `INTEGER`, `REAL`, and `TEXT` is crucial for designing database schemas.
   - **Source**: [SQL Data Types](https://www.w3schools.com/sql/sql_datatypes.asp)

3. **SQL FOREIGN KEY Constraints**:
   - Ensuring referential integrity with foreign key constraints is a fundamental aspect of relational database design.
   - **Source**: [SQL Foreign Key](https://www.w3schools.com/sql/sql_foreignkey.asp)



In [6]:

CREATE_IMAGES_TABLE = """CREATE TABLE IF NOT EXISTS images (
    image_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    name TEXT NOT NULL UNIQUE,
    camera_id INTEGER NOT NULL,
    prior_qw REAL,
    prior_qx REAL,
    prior_qy REAL,
    prior_qz REAL,
    prior_tx REAL,
    prior_ty REAL,
    prior_tz REAL,
    CONSTRAINT image_id_check CHECK(image_id >= 0 and image_id < {}),
    FOREIGN KEY(camera_id) REFERENCES cameras(camera_id))
""".format(MAX_IMAGE_ID)

---

# 🔧 Creating the Two-View Geometries Table in SQL 📐

#### Explanation of the Code

1. **Table Creation Statement**:
   - This SQL statement creates a table named `two_view_geometries` if it doesn't already exist. This ensures that the script can be run multiple times without causing errors due to the table already existing.
   ```sql
   CREATE TABLE IF NOT EXISTS two_view_geometries (
       pair_id INTEGER PRIMARY KEY NOT NULL,
       rows INTEGER NOT NULL,
       cols INTEGER NOT NULL,
       data BLOB,
       config INTEGER NOT NULL,
       F BLOB,
       E BLOB,
       H BLOB
   )
   ```

2. **Table Columns**:
   - **`pair_id`**:
     - `INTEGER PRIMARY KEY NOT NULL`: This defines `pair_id` as the primary key for the table, ensuring it is unique and not null.
   - **`rows`**:
     - `INTEGER NOT NULL`: This column stores the number of rows in the data matrix, ensuring it is an integer and cannot be null.
   - **`cols`**:
     - `INTEGER NOT NULL`: This column stores the number of columns in the data matrix, ensuring it is an integer and cannot be null.
   - **`data`**:
     - `BLOB`: This column stores the matches that are consistent with the estimated geometry (the inliers), as a `rows` by `cols` matrix packed into a binary large object.
   - **`config`**:
     - `INTEGER NOT NULL`: This column stores an integer code for the type of two-view geometry that was estimated (in COLMAP's convention) and cannot be null.
   - **`F`**:
     - `BLOB`: This column stores the fundamental matrix as a binary large object.
   - **`E`**:
     - `BLOB`: This column stores the essential matrix as a binary large object.
   - **`H`**:
     - `BLOB`: This column stores the homography matrix as a binary large object.

3. **Concepts**:
   - **Fundamental Matrix (F)**:
     - Represents the epipolar geometry between two views.
   - **Essential Matrix (E)**:
     - Encodes the relative rotation and translation between two cameras.
   - **Homography Matrix (H)**:
     - Relates the coordinates of corresponding points in two views assuming a planar scene.

### Study Sources

1. **Epipolar Geometry and Fundamental Matrix**:
   - Understanding the fundamental matrix is crucial for working with epipolar geometry in computer vision.
   - **Source**: [Epipolar Geometry](https://en.wikipedia.org/wiki/Epipolar_geometry)

2. **Essential Matrix**:
   - The essential matrix is key in recovering the relative pose between two calibrated views.
   - **Source**: [Essential Matrix](https://en.wikipedia.org/wiki/Essential_matrix)

3. **Homography Matrix**:
   - A homography matrix is used in computer vision to perform transformations between different planes.
   - **Source**: [Homography (Computer Vision)](https://en.wikipedia.org/wiki/Homography_(computer_vision))



In [7]:
CREATE_TWO_VIEW_GEOMETRIES_TABLE = """
CREATE TABLE IF NOT EXISTS two_view_geometries (
    pair_id INTEGER PRIMARY KEY NOT NULL,
    rows INTEGER NOT NULL,
    cols INTEGER NOT NULL,
    data BLOB,
    config INTEGER NOT NULL,
    F BLOB,
    E BLOB,
    H BLOB)
"""

---

# 🔑 Creating the Keypoints Table in SQL 📝

#### Explanation of the Code

1. **Table Creation Statement**:
   - This SQL statement creates a table named `keypoints` if it doesn't already exist. This ensures that the script can be run multiple times without causing errors due to the table already existing.
   ```sql
   CREATE TABLE IF NOT EXISTS keypoints (
       image_id INTEGER PRIMARY KEY NOT NULL,
       rows INTEGER NOT NULL,
       cols INTEGER NOT NULL,
       data BLOB,
       FOREIGN KEY(image_id) REFERENCES images(image_id) ON DELETE CASCADE
   )
   ```

2. **Table Columns**:
   - **`image_id`**:
     - `INTEGER PRIMARY KEY NOT NULL`: This defines `image_id` as the primary key for the table, ensuring it is unique and not null.
   - **`rows`**:
     - `INTEGER NOT NULL`: This column stores the number of rows in the keypoints matrix, ensuring it is an integer and cannot be null.
   - **`cols`**:
     - `INTEGER NOT NULL`: This column stores the number of columns in the keypoints matrix, ensuring it is an integer and cannot be null.
   - **`data`**:
     - `BLOB`: This column stores the keypoints data as a binary large object. BLOB is used for storing large amounts of binary data.

3. **Foreign Key Constraint**:
   - The `FOREIGN KEY(image_id) REFERENCES images(image_id) ON DELETE CASCADE` constraint ensures referential integrity by linking the `image_id` column in the `keypoints` table to the `image_id` column in the `images` table. The `ON DELETE CASCADE` clause ensures that if a record in the `images` table is deleted, the corresponding records in the `keypoints` table will also be deleted automatically.

### Study Sources

1. **Keypoints in Computer Vision**:
   - Understanding keypoints and their role in computer vision tasks is fundamental for feature extraction and matching.
   - **Source**: [Feature Detection and Description - OpenCV Documentation](https://docs.opencv.org/3.4/db/d27/tutorial_py_table_of_contents_feature2d.html)


In [8]:
CREATE_KEYPOINTS_TABLE = """CREATE TABLE IF NOT EXISTS keypoints (
    image_id INTEGER PRIMARY KEY NOT NULL,
    rows INTEGER NOT NULL,
    cols INTEGER NOT NULL,
    data BLOB,
    FOREIGN KEY(image_id) REFERENCES images(image_id) ON DELETE CASCADE)
"""

---

# 🔗 Creating the Matches Table in SQL 🔄

#### Explanation of the Code

1. **Table Creation Statement**:
   - This SQL statement creates a table named `matches` if it doesn't already exist. This ensures that the script can be run multiple times without causing errors due to the table already existing.
   ```sql
   CREATE TABLE IF NOT EXISTS matches (
       pair_id INTEGER PRIMARY KEY NOT NULL,
       rows INTEGER NOT NULL,
       cols INTEGER NOT NULL,
       data BLOB
   )
   ```

2. **Table Columns**:
   - **`pair_id`**:
     - `INTEGER PRIMARY KEY NOT NULL`: This defines `pair_id` as the primary key for the table, ensuring it is unique and not null.
   - **`rows`**:
     - `INTEGER NOT NULL`: This column stores the number of rows in the matches matrix, ensuring it is an integer and cannot be null.
   - **`cols`**:
     - `INTEGER NOT NULL`: This column stores the number of columns in the matches matrix, ensuring it is an integer and cannot be null.
   - **`data`**:
     - `BLOB`: This column stores the matches data as a binary large object. BLOB is used for storing large amounts of binary data.

3. **Concepts**:
   - **Matches**:
     - Matches represent correspondences between keypoints in different images obtained through feature matching algorithms.
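
A tiny example may help picture what the matches matrix contains. The sketch assumes the common layout where each row holds a pair of keypoint indices (one per image) stored as `uint32`; the keypoint coordinates are made up.

```python
import numpy as np

# Made-up keypoints (x, y) for two images.
kpts1 = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]], dtype=np.float32)
kpts2 = np.array([[12.0, 21.0], [55.0, 61.0]], dtype=np.float32)

# Each row is (index into kpts1, index into kpts2).
matches = np.array([[0, 0], [2, 1]], dtype=np.uint32)
rows, cols = matches.shape  # the values that go into the rows and cols columns

# Look up the matched coordinates in each image.
pts1 = kpts1[matches[:, 0]]
pts2 = kpts2[matches[:, 1]]
print(rows, cols)
print(pts1)
print(pts2)
```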

### Study Sources

1. **Feature Matching in Computer Vision**:
   - Understanding feature matching and its applications is crucial for various computer vision tasks such as image stitching, object recognition, and 3D reconstruction.
   - **Source**: [Feature Matching - OpenCV Documentation](https://docs.opencv.org/3.4/dc/dc3/tutorial_py_matcher.html)



In [9]:
CREATE_MATCHES_TABLE = """CREATE TABLE IF NOT EXISTS matches (
    pair_id INTEGER PRIMARY KEY NOT NULL,
    rows INTEGER NOT NULL,
    cols INTEGER NOT NULL,
    data BLOB)"""

---

# 🏷️ Creating Name Index and All Tables in SQL 🛠️

#### Explanation of the Code

1. **Name Index Creation Statement**:
   - This SQL statement creates a unique index named `index_name` on the `name` column of the `images` table if it doesn't already exist. This ensures that the `name` column values are unique, preventing duplicate entries for image names.
   ```sql
   CREATE UNIQUE INDEX IF NOT EXISTS index_name ON images(name)
   ```

2. **Combining Table Creation Statements**:
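   - The `CREATE_ALL` statement combines all table creation statements, plus the name index statement, into a single SQL script string. Each statement is separated by a semicolon (`;`), allowing multiple SQL commands to be executed in sequence. Because the string holds several statements, Python's `sqlite3` module runs it with `executescript()`; `execute()` accepts only one statement at a time.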
   
   ```python
   CREATE_ALL = "; ".join([
       CREATE_CAMERAS_TABLE,
       CREATE_IMAGES_TABLE,
       CREATE_KEYPOINTS_TABLE,
       CREATE_DESCRIPTORS_TABLE,
       CREATE_MATCHES_TABLE,
       CREATE_TWO_VIEW_GEOMETRIES_TABLE,
       CREATE_NAME_INDEX
   ])
   ```

3. **Concepts**:
   - **SQL Index**:
     - An index is a database structure that improves the speed of data retrieval operations on a database table at the cost of additional space and decreased performance on data modification operations. A unique index ensures that no two rows of a table have duplicate values in the indexed column or columns.
   - **`str.join()`**:
     - The `join()` string method in Python concatenates a sequence of strings, placing the separator string it is called on between them. It is unrelated to the SQL `JOIN` clause.
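
The joined string is a multi-statement script, so it needs `executescript()` rather than `execute()`. The sketch below uses two stand-in statements instead of the full set of `CREATE_*` strings (a simplification for illustration); the notebook's strings would be joined and run the same way.

```python
import sqlite3

# Two stand-in statements; the notebook's CREATE_* strings work the same way.
statements = [
    "CREATE TABLE IF NOT EXISTS images "
    "(image_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)",
    "CREATE UNIQUE INDEX IF NOT EXISTS index_name ON images(name)",
]
script = "; ".join(statements)

conn = sqlite3.connect(":memory:")
conn.executescript(script)  # execute() would reject more than one statement

# List what was created by querying SQLite's schema table.
names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type IN ('table', 'index') ORDER BY name"
)]
print(names)
```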

### Study Sources

1. **SQL Index**:
   - Understanding indexes in databases and their types is crucial for optimizing query performance.
   - **Source**: [SQL Index - Tutorialspoint](https://www.tutorialspoint.com/sql/sql-indexes.htm)

2. **Python Join Method**:
   - Learning about the `join()` method in Python is important for concatenating strings efficiently.
   - **Source**: [Python Join Method - w3schools](https://www.w3schools.com/python/ref_string_join.asp)


In [10]:
CREATE_NAME_INDEX = \
    "CREATE UNIQUE INDEX IF NOT EXISTS index_name ON images(name)"

CREATE_ALL = "; ".join([
    CREATE_CAMERAS_TABLE,
    CREATE_IMAGES_TABLE,
    CREATE_KEYPOINTS_TABLE,
    CREATE_DESCRIPTORS_TABLE,
    CREATE_MATCHES_TABLE,
    CREATE_TWO_VIEW_GEOMETRIES_TABLE,
    CREATE_NAME_INDEX
])


---

**Explanation**


The `image_ids_to_pair_id` function takes two image IDs, `image_id1` and `image_id2`, and returns a single pair ID. The two IDs are packed into one integer by treating the smaller ID as the high-order digit in base `MAX_IMAGE_ID`.

Here's how the function works:

- It compares the two image IDs `image_id1` and `image_id2`. If `image_id1` is greater than `image_id2`, it swaps their values. This step ensures consistency in the pair ID calculation regardless of the order of the input image IDs.

- The pair ID is then calculated by multiplying the smaller image ID (`image_id1`) by the maximum image ID (`MAX_IMAGE_ID`) and adding the larger image ID (`image_id2`). Because every valid image ID is smaller than `MAX_IMAGE_ID`, each unordered pair of image IDs maps to a distinct pair ID.

- The calculated pair ID is returned as the result.

Here's the function in code:

```python
def image_ids_to_pair_id(image_id1, image_id2):
    # Ensure image_id1 is smaller than or equal to image_id2
    if image_id1 > image_id2:
        image_id1, image_id2 = image_id2, image_id1
    # Calculate pair ID
    return image_id1 * MAX_IMAGE_ID + image_id2
```

### Example:
```python
image_id1 = 10
image_id2 = 20
pair_id = image_ids_to_pair_id(image_id1, image_id2)
print(pair_id)  # Output: 21474836490
```

In this example, `image_id1` is smaller than `image_id2`, so no swap is needed and the pair ID is calculated as `image_id1 * MAX_IMAGE_ID + image_id2`, that is `10 * 2147483647 + 20`, resulting in `21474836490`.

In [11]:
def image_ids_to_pair_id(image_id1, image_id2):
    if image_id1 > image_id2:
        image_id1, image_id2 = image_id2, image_id1
    return image_id1 * MAX_IMAGE_ID + image_id2



---

# 🔄 Converting Pair ID to Image IDs in Python 🔢

#### Explanation of the Code

1. **Function Purpose**:
   - The function `pair_id_to_image_ids` takes a `pair_id` and converts it back to the original image IDs (`image_id1` and `image_id2`). This is the reverse operation of the `image_ids_to_pair_id` function.

2. **Parameter and Return**:
   - **Parameter**:
     - `pair_id`: An integer that uniquely identifies a pair of image IDs.
   - **Return**:
     - A tuple containing two integers: `image_id1` and `image_id2`.

3. **Steps in the Function**:
   - **Calculate `image_id2`**:
     - `image_id2 = pair_id % MAX_IMAGE_ID`: This line calculates `image_id2` by taking the modulus of `pair_id` with `MAX_IMAGE_ID`. The modulus operation returns the remainder when `pair_id` is divided by `MAX_IMAGE_ID`, effectively isolating `image_id2`.
   - **Calculate `image_id1`**:
     - `image_id1 = (pair_id - image_id2) // MAX_IMAGE_ID`: This line calculates `image_id1` by subtracting `image_id2` from `pair_id` and then dividing the result by `MAX_IMAGE_ID` with floor division. This isolates `image_id1` as an integer.
   - **Return the Image IDs**:
     - `return image_id1, image_id2`: This returns the tuple of `image_id1` and `image_id2`.

4. **Concepts**:
   - **Modulus Operator (`%`)**:
     - The modulus operator returns the remainder of a division operation. It's useful for breaking down numbers, particularly in encoding and decoding schemes.
   - **Integer Division (`//`)**:
     - In Python 3, the division operator (`/`) always returns a float, even when the division is exact, and a float cannot represent every large pair ID exactly. Floor division (`//`) keeps the result an integer, so it is used here; subtracting `image_id2` first guarantees that the division is exact.

5. **Edge Cases**:
   - This function assumes that `pair_id` was generated using the `image_ids_to_pair_id` function and thus `image_id1` will always be less than or equal to `image_id2`.

### Study Sources

1. **Modulus Operator**:
   - Understanding the modulus operator is essential for various algorithms, particularly those involving cyclical structures and encoding/decoding.
   - **Source**: [Modulus Operator - Real Python](https://realpython.com/python-modulo-operator/)



In [12]:
def pair_id_to_image_ids(pair_id):
    image_id2 = pair_id % MAX_IMAGE_ID
    # Floor division keeps image_id1 an integer (plain `/` returns a float in Python 3).
    image_id1 = (pair_id - image_id2) // MAX_IMAGE_ID
    return image_id1, image_id2
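
A round trip makes the encoding concrete. The sketch below uses `divmod`, which returns the quotient and remainder as integers in one call; the helper name `pair_id_to_image_ids_int` is hypothetical and is not part of the notebook.

```python
MAX_IMAGE_ID = 2**31 - 1

def image_ids_to_pair_id(image_id1, image_id2):
    if image_id1 > image_id2:
        image_id1, image_id2 = image_id2, image_id1
    return image_id1 * MAX_IMAGE_ID + image_id2

def pair_id_to_image_ids_int(pair_id):
    # divmod gives (quotient, remainder), i.e. (image_id1, image_id2), as ints.
    image_id1, image_id2 = divmod(pair_id, MAX_IMAGE_ID)
    return image_id1, image_id2

pair_id = image_ids_to_pair_id(20, 10)  # the order of the inputs does not matter
ids = pair_id_to_image_ids_int(pair_id)
print(pair_id, ids)  # 21474836490 (10, 20)
```

Staying in integer arithmetic matters for large IDs, because pair IDs can exceed the range in which a float represents every integer exactly.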


---

# 🛠️ Converting Array to BLOB in Python 📦

#### Explanation of the Code

1. **Function Purpose**:
   - The function `array_to_blob` converts a NumPy array to a binary large object (BLOB). This is useful for storing arrays in a database as BLOBs.

2. **Parameter and Return**:
   - **Parameter**:
     - `array`: A NumPy array that needs to be converted to a BLOB.
   - **Return**:
     - A binary representation of the NumPy array, suitable for storage as a BLOB.

3. **Python Version Check**:
   - The function uses a global variable `IS_PYTHON3` to check if the Python version is 3 or higher. This variable should be defined elsewhere in the code, typically as:
     ```python
     IS_PYTHON3 = sys.version_info[0] >= 3
     ```
   - **Python 3**:
     - `array.tobytes()`: Returns the raw bytes of the array as a `bytes` object. It replaces `array.tostring()`, a deprecated alias that did the same thing under a misleading name.
   - **Python 2**:
     - `np.getbuffer(array)`: In Python 2, this function returns a buffer object over the array's memory.



### Study Sources

1. **NumPy Arrays**:
   - Understanding NumPy arrays and their methods is crucial for scientific computing and data manipulation in Python.
   - **Source**: [NumPy Array Documentation](https://numpy.org/doc/stable/reference/generated/numpy.array.html)

2. **BLOB Data Type**:
   - The BLOB (Binary Large Object) data type is used to store large binary data such as images, multimedia files, and raw data.
   - **Source**: [BLOB Data Type - Database Guide](https://www.tutorialspoint.com/sql/sql-blob-data-type.htm)



In [13]:
def array_to_blob(array):
    if IS_PYTHON3:
        # tobytes() is the current name for the deprecated tostring().
        return array.tobytes()
    else:
        return np.getbuffer(array)

---

# 🔄 Converting BLOB to Array in Python 🔄

#### Explanation of the Code

1. **Function Purpose**:
   - The function `blob_to_array` converts a binary large object (BLOB) back into a NumPy array. This is useful for retrieving and manipulating arrays stored as BLOBs in a database.

2. **Parameters and Return**:
   - **Parameters**:
     - `blob`: A BLOB that needs to be converted back into a NumPy array.
     - `dtype`: The data type of the resulting NumPy array (e.g., `np.float32`).
     - `shape`: The shape of the resulting NumPy array, with a default value of `(-1,)`, meaning a single-dimensional array with inferred length.
   - **Return**:
     - A NumPy array reconstructed from the BLOB.

3. **Conversion**:
   - `np.frombuffer(blob, dtype=dtype).reshape(*shape)`: Interprets the bytes of the BLOB as a flat array of the given `dtype` and reshapes it to `shape`.
   - `np.frombuffer()` is used on both Python 2 and Python 3, so no `IS_PYTHON3` branch is needed here. It replaces `np.fromstring()`, whose binary mode is deprecated.
   - The returned array is a read-only view of the BLOB's memory. Call `.copy()` on it if it needs to be modified in place.



In [14]:
def blob_to_array(blob, dtype, shape=(-1,)):
    # np.frombuffer replaces the deprecated binary mode of np.fromstring
    # and is available on both Python 2 and Python 3.
    return np.frombuffer(blob, dtype=dtype).reshape(*shape)


---

# 📸 COLMAP Database Management with SQLite in Python 🗄️

#### Explanation of the Code

1. **Class Definition: `COLMAPDatabase`**
   - The `COLMAPDatabase` class extends `sqlite3.Connection`, adding functionality specific to managing a COLMAP database.
   - **COLMAP**: A general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline for 3D reconstruction.

2. **Static Method: `connect`**
   - **Purpose**: To establish a connection to a database using the `COLMAPDatabase` class.
   - **Usage**:
     ```python
     db = COLMAPDatabase.connect('path_to_database.db')
     ```

3. **Constructor: `__init__`**
   - **Purpose**: To initialize the database connection and provide methods to create various tables.
   - **Table Creation Methods**: 
     - `create_tables`, `create_cameras_table`, `create_descriptors_table`, `create_images_table`, `create_two_view_geometries_table`, `create_keypoints_table`, `create_matches_table`, `create_name_index`.
   - **Usage**:
     ```python
     db.create_tables()
     ```

4. **Method: `add_camera`**
   - **Purpose**: To insert a new camera into the `cameras` table.
   - **Parameters**:
     - `model`, `width`, `height`, `params`, `prior_focal_length`, `camera_id`.
   - **Usage**:
     ```python
     db.add_camera(model, width, height, params)
     ```

5. **Method: `add_image`**
   - **Purpose**: To insert a new image into the `images` table.
   - **Parameters**:
     - `name`, `camera_id`, `prior_q`, `prior_t`, `image_id`.
   - **Usage**:
     ```python
     db.add_image(name, camera_id)
     ```

6. **Method: `add_keypoints`**
   - **Purpose**: To insert keypoints associated with an image into the `keypoints` table.
   - **Parameters**:
     - `image_id`, `keypoints`.
   - **Usage**:
     ```python
     db.add_keypoints(image_id, keypoints)
     ```

7. **Method: `add_descriptors`**
   - **Purpose**: To insert descriptors associated with an image into the `descriptors` table.
   - **Parameters**:
     - `image_id`, `descriptors`.
   - **Usage**:
     ```python
     db.add_descriptors(image_id, descriptors)
     ```

8. **Method: `add_matches`**
   - **Purpose**: To insert matches between two images into the `matches` table.
   - **Parameters**:
     - `image_id1`, `image_id2`, `matches`.
   - **Usage**:
     ```python
     db.add_matches(image_id1, image_id2, matches)
     ```

9. **Method: `add_two_view_geometry`**
   - **Purpose**: To insert two-view geometry information between two images into the `two_view_geometries` table.
   - **Parameters**:
     - `image_id1`, `image_id2`, `matches`, `F`, `E`, `H`, `config`.
   - **Usage**:
     ```python
     db.add_two_view_geometry(image_id1, image_id2, matches)
     ```
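
The pattern behind `COLMAPDatabase.connect` can be tried in isolation. The sketch below is a simplified stand-in, not the class described above: `TinyDatabase`, its `notes` table, and `add_note` are hypothetical names chosen for illustration. It shows how `sqlite3.connect(..., factory=...)` returns an instance of the subclass, and how `cursor.lastrowid` yields the automatically assigned key when `None` is passed for an `INTEGER PRIMARY KEY` column.

```python
import sqlite3


class TinyDatabase(sqlite3.Connection):
    """Hypothetical Connection subclass with one helper method."""

    @staticmethod
    def connect(database_path):
        # factory= makes sqlite3 build a TinyDatabase instead of a plain Connection.
        return sqlite3.connect(database_path, factory=TinyDatabase)

    def create_tables(self):
        self.executescript(
            "CREATE TABLE IF NOT EXISTS notes ("
            "note_id INTEGER PRIMARY KEY, "
            "text TEXT NOT NULL);"
        )

    def add_note(self, text, note_id=None):
        # Passing None (NULL) for the key lets SQLite assign the next ID.
        cursor = self.execute("INSERT INTO notes VALUES (?, ?)", (note_id, text))
        return cursor.lastrowid


db = TinyDatabase.connect(":memory:")
db.create_tables()
first_id = db.add_note("first")
second_id = db.add_note("second")
print(type(db).__name__, first_id, second_id)  # TinyDatabase 1 2
db.close()
```

This is the same reason `add_camera` and `add_image` can return usable IDs when `camera_id` or `image_id` is left as `None`.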

### Detailed Concepts

1. **NumPy Arrays**:
   - Used extensively for numerical operations and data storage.
   - **Functions and dtypes**: `np.asarray()`, `np.ascontiguousarray()`, `np.zeros()`, `np.eye()`, `np.float32`, `np.float64`, `np.uint8`, `np.uint32`.

2. **BLOB Conversion**:
   - Functions `array_to_blob` and `blob_to_array` handle the conversion between NumPy arrays and binary large objects for storage in the database.

3. **SQLite3 Connection**:
   - **Inheritance**: The class inherits from `sqlite3.Connection` to leverage SQLite's capabilities.
   - **Methods**: `executescript`, `execute`.

4. **Lambda Functions**:
   - Used for concise, anonymous functions.
   - Example: `self.create_tables = lambda: self.executescript(CREATE_ALL)`

5. **SQL Commands**:
   - **INSERT INTO**: Used for inserting new records into tables.
   - **CREATE TABLE**: Used for creating new tables in the database.
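
As a small illustration of the BLOB conversion and SQL commands listed above, the sketch below writes a NumPy array into an in-memory SQLite table and reads it back. The `demo` table and its column names are hypothetical; `tobytes()` and `np.frombuffer()` are the non-deprecated NumPy calls for this conversion. The dtype and shape are not stored in the BLOB itself, so they have to be recorded separately (here as `n_rows` and `n_cols`) or known in advance.

```python
import sqlite3

import numpy as np

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo (id INTEGER PRIMARY KEY, n_rows INTEGER, n_cols INTEGER, data BLOB)"
)

# Serialise a (3, 2) float32 array to raw bytes and store it with its shape.
keypoints = np.arange(6, dtype=np.float32).reshape(3, 2)
conn.execute(
    "INSERT INTO demo VALUES (?, ?, ?, ?)",
    (1,) + keypoints.shape + (keypoints.tobytes(),),
)

# Read the bytes back and rebuild the array from dtype + shape.
n_rows, n_cols, blob = conn.execute(
    "SELECT n_rows, n_cols, data FROM demo WHERE id = 1"
).fetchone()
restored = np.frombuffer(blob, dtype=np.float32).reshape(n_rows, n_cols)

print(restored.shape)                       # (3, 2)
print(np.array_equal(restored, keypoints))  # True
conn.close()
```

Reading the BLOB with the wrong dtype does not raise an error as long as the byte count divides evenly; it silently produces different numbers, which is why the read side must use the same dtype as the write side.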

### Study Sources

1. **NumPy Arrays**:
   - Understanding NumPy array operations and data types.
   - **Source**: [NumPy Array Documentation](https://numpy.org/doc/stable/reference/generated/numpy.array.html)

2. **SQLite3 in Python**:
   - Learning about SQLite3 database operations in Python.
   - **Source**: [SQLite3 - Python Documentation](https://docs.python.org/3/library/sqlite3.html)

3. **BLOB Data Type**:
   - Understanding BLOBs for storing binary data.
   - **Source**: [BLOB Data Type - Database Guide](https://www.tutorialspoint.com/sql/sql-blob-data-type.htm)

4. **Lambda Functions in Python**:
   - Learning about lambda functions for concise code.
   - **Source**: [Lambda Functions - Real Python](https://realpython.com/python-lambda/)



In [15]:

class COLMAPDatabase(sqlite3.Connection):

    @staticmethod
    def connect(database_path):
        return sqlite3.connect(database_path, factory=COLMAPDatabase)


    def __init__(self, *args, **kwargs):
        super(COLMAPDatabase, self).__init__(*args, **kwargs)

        self.create_tables = lambda: self.executescript(CREATE_ALL)
        self.create_cameras_table = \
            lambda: self.executescript(CREATE_CAMERAS_TABLE)
        self.create_descriptors_table = \
            lambda: self.executescript(CREATE_DESCRIPTORS_TABLE)
        self.create_images_table = \
            lambda: self.executescript(CREATE_IMAGES_TABLE)
        self.create_two_view_geometries_table = \
            lambda: self.executescript(CREATE_TWO_VIEW_GEOMETRIES_TABLE)
        self.create_keypoints_table = \
            lambda: self.executescript(CREATE_KEYPOINTS_TABLE)
        self.create_matches_table = \
            lambda: self.executescript(CREATE_MATCHES_TABLE)
        self.create_name_index = lambda: self.executescript(CREATE_NAME_INDEX)

    def add_camera(self, model, width, height, params,
                   prior_focal_length=False, camera_id=None):
        params = np.asarray(params, np.float64)
        cursor = self.execute(
            "INSERT INTO cameras VALUES (?, ?, ?, ?, ?, ?)",
            (camera_id, model, width, height, array_to_blob(params),
             prior_focal_length))
        return cursor.lastrowid

    def add_image(self, name, camera_id,
                  prior_q=np.zeros(4), prior_t=np.zeros(3), image_id=None):
        cursor = self.execute(
            "INSERT INTO images VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
            (image_id, name, camera_id, prior_q[0], prior_q[1], prior_q[2],
             prior_q[3], prior_t[0], prior_t[1], prior_t[2]))
        return cursor.lastrowid

    def add_keypoints(self, image_id, keypoints):
        assert(len(keypoints.shape) == 2)
        assert(keypoints.shape[1] in [2, 4, 6])

        keypoints = np.asarray(keypoints, np.float32)
        self.execute(
            "INSERT INTO keypoints VALUES (?, ?, ?, ?)",
            (image_id,) + keypoints.shape + (array_to_blob(keypoints),))

    def add_descriptors(self, image_id, descriptors):
        descriptors = np.ascontiguousarray(descriptors, np.uint8)
        self.execute(
            "INSERT INTO descriptors VALUES (?, ?, ?, ?)",
            (image_id,) + descriptors.shape + (array_to_blob(descriptors),))

    def add_matches(self, image_id1, image_id2, matches):
        assert(len(matches.shape) == 2)
        assert(matches.shape[1] == 2)

        if image_id1 > image_id2:
            matches = matches[:,::-1]

        pair_id = image_ids_to_pair_id(image_id1, image_id2)
        matches = np.asarray(matches, np.uint32)
        self.execute(
            "INSERT INTO matches VALUES (?, ?, ?, ?)",
            (pair_id,) + matches.shape + (array_to_blob(matches),))

    def add_two_view_geometry(self, image_id1, image_id2, matches,
                              F=np.eye(3), E=np.eye(3), H=np.eye(3), config=2):
        assert(len(matches.shape) == 2)
        assert(matches.shape[1] == 2)

        if image_id1 > image_id2:
            matches = matches[:,::-1]

        pair_id = image_ids_to_pair_id(image_id1, image_id2)
        matches = np.asarray(matches, np.uint32)
        F = np.asarray(F, dtype=np.float64)
        E = np.asarray(E, dtype=np.float64)
        H = np.asarray(H, dtype=np.float64)
        self.execute(
            "INSERT INTO two_view_geometries VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            (pair_id,) + matches.shape + (array_to_blob(matches), config,
             array_to_blob(F), array_to_blob(E), array_to_blob(H)))




---

# 📝 Example Usage of COLMAP Database Management with SQLite 🗄️

#### Explanation of the Code

1. **Function Definition: `example_usage`**
   - This function demonstrates the use of the `COLMAPDatabase` class to create and manipulate a COLMAP database.

2. **Imports and Argument Parsing**
   - Import necessary modules: `os`, `argparse`, `numpy`.
   - **Argument Parsing**: Define and parse the command-line argument `--database_path` with a default value of `"database.db"`.
     ```python
     parser = argparse.ArgumentParser()
     parser.add_argument("--database_path", default="database.db")
     args, unknown = parser.parse_known_args()
     ```

3. **Database Path Check**
   - Check if the database path already exists to avoid overwriting an existing database.
     ```python
     if os.path.exists(args.database_path):
         print("ERROR: database path already exists -- will not modify it.")
         return
     ```

4. **Database Connection and Table Creation**
   - Establish a connection to the database and create all required tables.
     ```python
     db = COLMAPDatabase.connect(args.database_path)
     db.create_tables()
     ```

5. **Adding Cameras**
   - Create dummy camera entries and add them to the database.
     ```python
     model1, width1, height1, params1 = 0, 1024, 768, np.array((1024., 512., 384.))
     model2, width2, height2, params2 = 2, 1024, 768, np.array((1024., 512., 384., 0.1))
     camera_id1 = db.add_camera(model1, width1, height1, params1)
     camera_id2 = db.add_camera(model2, width2, height2, params2)
     ```

6. **Adding Images**
   - Create dummy image entries and add them to the database.
     ```python
     image_id1 = db.add_image("image1.png", camera_id1)
     image_id2 = db.add_image("image2.png", camera_id1)
     image_id3 = db.add_image("image3.png", camera_id2)
     image_id4 = db.add_image("image4.png", camera_id2)
     ```

7. **Adding Keypoints**
   - Create and add random keypoints associated with each image.
     ```python
     num_keypoints = 1000
     keypoints1 = np.random.rand(num_keypoints, 2) * (width1, height1)
     keypoints2 = np.random.rand(num_keypoints, 2) * (width1, height1)
     keypoints3 = np.random.rand(num_keypoints, 2) * (width2, height2)
     keypoints4 = np.random.rand(num_keypoints, 2) * (width2, height2)
     db.add_keypoints(image_id1, keypoints1)
     db.add_keypoints(image_id2, keypoints2)
     db.add_keypoints(image_id3, keypoints3)
     db.add_keypoints(image_id4, keypoints4)
     ```

8. **Adding Matches**
   - Create and add random matches between pairs of images.
     ```python
     M = 50
     matches12 = np.random.randint(num_keypoints, size=(M, 2))
     matches23 = np.random.randint(num_keypoints, size=(M, 2))
     matches34 = np.random.randint(num_keypoints, size=(M, 2))
     db.add_matches(image_id1, image_id2, matches12)
     db.add_matches(image_id2, image_id3, matches23)
     db.add_matches(image_id3, image_id4, matches34)
     ```

9. **Commit and Verification**
   - Commit changes to the database.
   - Verify the stored cameras, keypoints, and matches.
     ```python
     db.commit()
     # Verify cameras
     rows = db.execute("SELECT * FROM cameras")
     camera_id, model, width, height, params, prior = next(rows)
     params = blob_to_array(params, np.float64)
     assert camera_id == camera_id1
     assert model == model1 and width == width1 and height == height1
     assert np.allclose(params, params1)
     # Verify keypoints
     keypoints = dict(
         (image_id, blob_to_array(data, np.float32, (-1, 2)))
         for image_id, data in db.execute(
             "SELECT image_id, data FROM keypoints"))
     assert np.allclose(keypoints[image_id1], keypoints1)
     assert np.allclose(keypoints[image_id2], keypoints2)
     assert np.allclose(keypoints[image_id3], keypoints3)
     assert np.allclose(keypoints[image_id4], keypoints4)
     # Verify matches
     pair_ids = [image_ids_to_pair_id(*pair) for pair in
                 ((image_id1, image_id2),
                  (image_id2, image_id3),
                  (image_id3, image_id4))]
     matches = dict(
         (pair_id_to_image_ids(pair_id),
          blob_to_array(data, np.uint32, (-1, 2)))
         for pair_id, data in db.execute("SELECT pair_id, data FROM matches")
     )
     assert np.all(matches[(image_id1, image_id2)] == matches12)
     assert np.all(matches[(image_id2, image_id3)] == matches23)
     assert np.all(matches[(image_id3, image_id4)] == matches34)
     ```

10. **Clean Up**
    - Close the database connection and remove the database file to clean up.
      ```python
      db.close()
      if os.path.exists(args.database_path):
          os.remove(args.database_path)
      ```


### Study Sources

1. **NumPy Arrays**:
   - Understanding NumPy array operations and data types.
   - **Source**: [NumPy Array Documentation](https://numpy.org/doc/stable/reference/generated/numpy.array.html)

2. **SQLite3 in Python**:
   - Learning about SQLite3 database operations in Python.
   - **Source**: [SQLite3 - Python Documentation](https://docs.python.org/3/library/sqlite3.html)

3. **BLOB Data Type**:
   - Understanding BLOBs for storing binary data.
   - **Source**: [BLOB Data Type - Database Guide](https://www.tutorialspoint.com/sql/sql-blob-data-type.htm)

4. **Lambda Functions in Python**:
   - Learning about lambda functions for concise code.
   - **Source**: [Lambda Functions - Real Python](https://realpython.com/python-lambda/)



In [16]:
def example_usage():
    import os
    import argparse
    import numpy as np
    #from your_database_module import COLMAPDatabase, blob_to_array, image_ids_to_pair_id, pair_id_to_image_ids  # replace with actual import paths

    parser = argparse.ArgumentParser()
    parser.add_argument("--database_path", default="database.db")
    args, unknown = parser.parse_known_args()

    if os.path.exists(args.database_path):
        print("ERROR: database path already exists -- will not modify it.")
        return

    # Open the database.
    db = COLMAPDatabase.connect(args.database_path)

    # For convenience, try creating all the tables upfront.
    db.create_tables()

    # Create dummy cameras.
    model1, width1, height1, params1 = 0, 1024, 768, np.array((1024., 512., 384.))
    model2, width2, height2, params2 = 2, 1024, 768, np.array((1024., 512., 384., 0.1))

    camera_id1 = db.add_camera(model1, width1, height1, params1)
    camera_id2 = db.add_camera(model2, width2, height2, params2)

    # Create dummy images.
    image_id1 = db.add_image("image1.png", camera_id1)
    image_id2 = db.add_image("image2.png", camera_id1)
    image_id3 = db.add_image("image3.png", camera_id2)
    image_id4 = db.add_image("image4.png", camera_id2)

    # Create dummy keypoints.
    num_keypoints = 1000
    keypoints1 = np.random.rand(num_keypoints, 2) * (width1, height1)
    keypoints2 = np.random.rand(num_keypoints, 2) * (width1, height1)
    keypoints3 = np.random.rand(num_keypoints, 2) * (width2, height2)
    keypoints4 = np.random.rand(num_keypoints, 2) * (width2, height2)

    db.add_keypoints(image_id1, keypoints1)
    db.add_keypoints(image_id2, keypoints2)
    db.add_keypoints(image_id3, keypoints3)
    db.add_keypoints(image_id4, keypoints4)

    # Create dummy matches.
    M = 50
    matches12 = np.random.randint(num_keypoints, size=(M, 2))
    matches23 = np.random.randint(num_keypoints, size=(M, 2))
    matches34 = np.random.randint(num_keypoints, size=(M, 2))

    db.add_matches(image_id1, image_id2, matches12)
    db.add_matches(image_id2, image_id3, matches23)
    db.add_matches(image_id3, image_id4, matches34)

    # Commit the data to the file.
    db.commit()

    # Read and check cameras.
    rows = db.execute("SELECT * FROM cameras")

    camera_id, model, width, height, params, prior = next(rows)
    params = blob_to_array(params, np.float64)
    assert camera_id == camera_id1
    assert model == model1 and width == width1 and height == height1
    assert np.allclose(params, params1)

    camera_id, model, width, height, params, prior = next(rows)
    params = blob_to_array(params, np.float64)
    assert camera_id == camera_id2
    assert model == model2 and width == width2 and height == height2
    assert np.allclose(params, params2)

    # Read and check keypoints.
    keypoints = dict(
        (image_id, blob_to_array(data, np.float32, (-1, 2)))
        for image_id, data in db.execute(
            "SELECT image_id, data FROM keypoints"))

    assert np.allclose(keypoints[image_id1], keypoints1)
    assert np.allclose(keypoints[image_id2], keypoints2)
    assert np.allclose(keypoints[image_id3], keypoints3)
    assert np.allclose(keypoints[image_id4], keypoints4)

    # Read and check matches.
    pair_ids = [image_ids_to_pair_id(*pair) for pair in
                ((image_id1, image_id2),
                 (image_id2, image_id3),
                 (image_id3, image_id4))]

    matches = dict(
        (pair_id_to_image_ids(pair_id),
         blob_to_array(data, np.uint32, (-1, 2)))
        for pair_id, data in db.execute("SELECT pair_id, data FROM matches")
    )

    assert np.all(matches[(image_id1, image_id2)] == matches12)
    assert np.all(matches[(image_id2, image_id3)] == matches23)
    assert np.all(matches[(image_id3, image_id4)] == matches34)

    # Clean up.
    db.close()

    if os.path.exists(args.database_path):
        os.remove(args.database_path)

if __name__ == "__main__":
    example_usage()




---

# 🖼️ Setting Up Image Matching Challenge Environment 🚀

#### Explanation of the Code

1. **`IMC_PATH` Variable**:
   - `IMC_PATH` is a string variable that holds the path to the directory where the data for the Image Matching Challenge of 2024 is located.
   - This variable is used to specify the location of datasets, images, or other relevant files needed for the Image Matching Challenge tasks.
   
2. **`DEVICE` Variable**:
   - `DEVICE` is a variable that determines the device on which PyTorch operations will be performed.
   - It checks if a CUDA-enabled GPU is available using `torch.cuda.is_available()`.
   - If a GPU is available, `DEVICE` is set to "cuda", indicating that operations should be performed on the GPU for faster computation.
   - If a GPU is not available, `DEVICE` is set to "cpu", indicating that operations will be performed on the CPU.
   - This ensures that the code can adapt to different computing environments, leveraging GPU acceleration when available for faster computation.

3. **`clear_output(wait=False)`**:
   - `clear_output` (from `IPython.display`) clears the output of the current cell, which keeps the notebook output clean.
   - With `wait=False`, the output is cleared immediately. With `wait=True`, clearing is deferred until new output is available to replace it.

#### Study Sources

1. **PyTorch Device Assignment**:
   - Understanding how PyTorch assigns operations to different devices (CPU/GPU).
   - **Source**: [PyTorch Documentation - Device Assignment](https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.device)

2. **CUDA Availability Check**:
   - Learning how to check if a CUDA-enabled GPU is available in PyTorch.
   - **Source**: [PyTorch Forums - CUDA Availability Check](https://discuss.pytorch.org/t/how-to-check-if-pytorch-is-using-the-gpu/311/3)

3. **Jupyter `clear_output` Function**:
   - Understanding the usage of the `clear_output` function in Jupyter notebooks.
   - **Source**: [Jupyter Documentation - Output Clearing](https://ipython.org/ipython-doc/3/api/generated/IPython.display.html#IPython.display.clear_output)



In [17]:
IMC_PATH = '/kaggle/input/image-matching-challenge-2024'
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
clear_output(wait=False)



---

# 🔄 Rotating Image Using PyTorch 🖼️

#### Explanation of the Code

This function `rotate_image` rotates an image by a multiple of 90 degrees using a PyTorch model for rotation prediction.

1. **Function Signature**:
   - `rotate_image(image, rotation)`: This function takes two parameters:
     - `image`: The input image tensor to be rotated.
     - `rotation`: A PyTorch classification model that predicts the orientation class of the input image.

2. **Image Rotation Process**:
   - The function loops at most four times, since four quarter turns cover the full 360 degrees.
   - In each iteration, it adds a batch dimension (`image[None, ...]`), runs the `rotation` model, and takes the `argmax` of the output as the predicted class.
   - If the predicted class is 0 (no rotation needed), the loop breaks and the image is returned in its current orientation.
   - Otherwise, the image is rotated by 90 degrees counter-clockwise with `rot90(dims=[1, 2])`, which turns the height and width axes of a `(C, H, W)` tensor, and the model is queried again.
   - If class 0 is never predicted, the image has been turned four times and ends up back in its original orientation.

3. **`torch.no_grad()` Context Manager**:
   - The `with torch.no_grad():` context manager is used to ensure that no gradient calculations are performed during inference.
   - This is beneficial for inference-only operations, as it reduces memory consumption and speeds up computation by avoiding unnecessary gradient tracking.

4. **Output**:
   - The function returns the rotated image tensor.
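
The control flow of the loop can be checked without a trained network. The following sketch is a NumPy stand-in, not the notebook's PyTorch code: `fake_rotation_model` is a hypothetical predictor that returns class 0 only when the brightest pixel sits in the top-left corner, `rotate_until_upright` is a hypothetical name, and `np.rot90` over axes `(1, 2)` plays the role of `Tensor.rot90(dims=[1, 2])`.

```python
import numpy as np


def fake_rotation_model(image):
    # Hypothetical stand-in for the classifier: class 0 means "upright",
    # defined here as the brightest pixel being at the top-left corner.
    return 0 if image[0, 0, 0] == image.max() else 1


def rotate_until_upright(image, model):
    # Same structure as rotate_image: at most four quarter turns.
    turns = 0
    for _ in range(4):
        if model(image) == 0:
            break
        image = np.rot90(image, axes=(1, 2))  # one quarter turn in the H-W plane
        turns += 1
    return image, turns


# A 1x2x2 "image" whose brightest pixel starts at the bottom-left corner.
image = np.array([[[0.0, 0.0],
                   [9.0, 0.0]]])
upright, turns = rotate_until_upright(image, fake_rotation_model)
print(turns)             # 3
print(upright[0, 0, 0])  # 9.0
```

Three turns are needed because each quarter turn moves the bright pixel one corner further: bottom-left, bottom-right, top-right, then top-left.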

#### Study Sources

1. **Image Rotation in PyTorch**:
   - Understanding how to perform image rotation using PyTorch tensors and functions.
   - **Source**: [PyTorch Documentation - Image Manipulation](https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.functional.rotate)

2. **`torch.no_grad()` Context Manager**:
   - Learning about the `torch.no_grad()` context manager and its usage in PyTorch.
   - **Source**: [PyTorch Documentation - Autograd Mechanics](https://pytorch.org/docs/stable/autograd.html#torch.autograd.no_grad)

3. **Rotation Prediction Models**:
   - Exploring techniques and models for predicting the rotation angle of images.
   - **Source**: [Rotation Prediction Using Deep Learning - Research Paper](https://arxiv.org/abs/1807.02029)



In [18]:
def rotate_image(image,rotation):
    for i in range(4):
        with torch.no_grad():
            pred = rotation(image[None,...]).argmax()
        if pred == 0: break
        image = image.rot90(dims=[1,2])
    return image



---

# 🔍 Image Overlap Detection with Feature Matching 🖼️

#### Explanation of the Code

This function `overlap_detection` detects the overlap region between two images using feature extraction and matching techniques.

1. **Function Signature**:
   - `overlap_detection(extractor, matcher, image0, image1, min_matches)`: This function takes five parameters:
     - `extractor`: A feature extractor model used to extract features from the images.
     - `matcher`: A feature matcher model used to match features between the images.
     - `image0`: The first input image tensor.
     - `image1`: The second input image tensor.
     - `min_matches`: The minimum number of matches required to consider the overlap valid.

2. **Feature Extraction and Matching**:
   - The function first extracts features from both input images using the provided `extractor` model.
   - It then matches the extracted features between the two images using the `matcher` model.
   - If the number of matches is less than `min_matches`, the function returns the extracted features and matches without further processing.

3. **Bounding Box Calculation**:
   - If the number of matches is sufficient, the function calculates the bounding boxes for the matched keypoints in both images.
   - It finds the minimum and maximum coordinates of the matched keypoints in each image and calculates the width and height of the bounding box.

4. **Image Cropping**:
   - Using the calculated bounding boxes, the function crops the overlapping regions from both images.
   - It then calls the `match_pair` function (not provided here) to re-match features in the cropped regions.

5. **Adjusting Keypoint Coordinates**:
   - After matching features in the cropped regions, the function adds each crop's `left` and `top` offsets to the x and y keypoint coordinates, so that the keypoints are expressed in the coordinates of the original, uncropped images.

6. **Output**:
   - The function returns the extracted features (`feats0_c` and `feats1_c`) and matches (`matches01_c`) in the cropped regions.
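
The bounding-box and coordinate-shift steps are easy to get wrong, because keypoints are stored as `(x, y)` while image arrays are indexed as `[row, column]`. The sketch below reproduces just that arithmetic with NumPy. The keypoint values and the 100x200 image are made up for illustration, and no feature extractor or matcher is involved.

```python
import numpy as np

# Hypothetical matched keypoints in (x, y) order for one image.
matched_kpts = np.array([[40.7, 20.2],
                         [120.3, 65.9],
                         [75.0, 30.5]])
image = np.zeros((100, 200))  # height 100, width 200

# Bounding box of the matches, truncated to integer pixels.
left, top = matched_kpts.min(axis=0).astype(int)
right, bottom = matched_kpts.max(axis=0).astype(int)
width, height = right - left, bottom - top

# Crop with [rows, cols] indexing: y selects rows, x selects columns.
crop = image[top:top + height, left:left + width]

# A keypoint found in the crop is shifted back by the crop's origin.
kpt_in_crop = np.array([10.0, 5.0])
kpt_in_full = kpt_in_crop + np.array([left, top])

print(left, top, width, height)  # 40 20 80 45
print(crop.shape)                # (45, 80)
print(kpt_in_full)               # [50. 25.]
```

Note the argument order: the crop is taken as `(top, left, height, width)`, whereas the offsets are added back as `(left, top)` to match the `(x, y)` keypoint layout.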

#### Study Sources

1. **Feature Extraction and Matching in Computer Vision**:
   - Understanding the process of feature extraction and matching for image analysis tasks.
   - **Source**: [Feature Extraction and Matching Techniques - Computer Vision Foundation](https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Schroff_FaceNet_A_Unified_CVPR_2015_paper.pdf)

2. **Bounding Box Calculation**:
   - Learning about techniques to calculate bounding boxes for objects or regions of interest in images.
   - **Source**: [Bounding Box Calculation - Towards Data Science](https://towardsdatascience.com/bounding-box-detection-challenges-and-techniques-dac34cf2de40)

3. **Image Cropping in PyTorch**:
   - Exploring methods to crop regions of interest from images using PyTorch.
   - **Source**: [PyTorch Documentation - Image Transformations](https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.functional.crop)


In [19]:
def overlap_detection(extractor, matcher, image0, image1, min_matches):
    feats0, feats1, matches01 = match_pair(extractor, matcher, image0, image1)
    if len(matches01['matches']) < min_matches:
        return feats0, feats1, matches01
    kpts0, kpts1, matches = feats0["keypoints"], feats1["keypoints"], matches01["matches"]
    m_kpts0, m_kpts1 = kpts0[matches[..., 0]], kpts1[matches[..., 1]]
    left0, top0 = m_kpts0.numpy().min(axis=0).astype(int)
    width0, height0 = m_kpts0.numpy().max(axis=0).astype(int)
    height0 -= top0
    width0 -= left0
    left1, top1 = m_kpts1.numpy().min(axis=0).astype(int)
    width1, height1 = m_kpts1.numpy().max(axis=0).astype(int)
    height1 -= top1
    width1 -= left1
    crop_box0 = (top0, left0, height0, width0)
    crop_box1 = (top1, left1, height1, width1)
    cropped_img_tensor0 = TF.crop(image0, *crop_box0)
    cropped_img_tensor1 = TF.crop(image1, *crop_box1)
    feats0_c, feats1_c, matches01_c = match_pair(extractor, matcher, cropped_img_tensor0, cropped_img_tensor1)
    feats0_c['keypoints'][...,0] += left0
    feats0_c['keypoints'][...,1] += top0
    feats1_c['keypoints'][...,0] += left1
    feats1_c['keypoints'][...,1] += top1
    return feats0_c, feats1_c, matches01_c



---

# 🌱 Seed Resetting for Reproducibility 🔄

#### Explanation of the Code

This function `reset_seed` resets the random seed for PyTorch and NumPy to ensure reproducibility in machine learning experiments.

1. **Function Signature**:
   - `reset_seed(seed)`: This function takes one parameter:
     - `seed`: The value of the seed to reset the random number generators.

2. **Resetting PyTorch Seed**:
   - `torch.manual_seed(seed)`: Sets the seed for generating random numbers in PyTorch on the CPU.
   - `torch.cuda.manual_seed_all(seed)`: Sets the seed for generating random numbers in PyTorch on all available GPUs.

3. **Resetting NumPy Seed**:
   - `np.random.seed(seed)`: Sets the seed for generating random numbers in NumPy.

4. **Importance of Seed Resetting**:
   - Resetting the random seed ensures that the same sequence of random numbers is generated each time the code is run with the same seed.
   - This is crucial for reproducibility in machine learning experiments, as it allows researchers to obtain consistent results across different runs.

5. **Usage**:
   - Call this function at the beginning of an experiment or before running any code that involves random number generation to ensure reproducibility.
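
The effect of reseeding can be verified directly. The sketch below uses only NumPy so that it runs without PyTorch or a GPU; it is a stand-in for the NumPy half of `reset_seed`, and the helper name `draw` is hypothetical.

```python
import numpy as np


def draw(seed, n=3):
    # Reseed the global NumPy generator, then draw n numbers.
    np.random.seed(seed)
    return np.random.rand(n)


first = draw(42)
second = draw(42)  # same seed, same sequence
other = draw(43)   # different seed, different sequence

print(np.array_equal(first, second))  # True
print(np.array_equal(first, other))   # False
```

Seeding fixes the random number streams, but it does not by itself make every GPU operation deterministic; the PyTorch randomness notes listed in the Study Sources describe the additional settings.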

#### Study Sources

1. **Random Seed and Reproducibility**:
   - Understanding the importance of setting random seeds for reproducible results in machine learning experiments.
   - **Source**: [Reproducible Research and Random Seeds - Medium](https://medium.com/acing-ai/reproducible-research-and-random-seeds-5876a1f34fc0)

2. **PyTorch Seed Setting**:
   - Learning about setting random seeds in PyTorch for reproducibility.
   - **Source**: [PyTorch Documentation - Randomness](https://pytorch.org/docs/stable/notes/randomness.html)

3. **NumPy Random Seed**:
   - Exploring how to set random seeds in NumPy for reproducibility.
   - **Source**: [NumPy Documentation - Random Sampling](https://numpy.org/doc/stable/reference/random/index.html)



In [20]:

def reset_seed(seed):
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    np.random.seed(seed)





---

# 📄 Parsing Sample Submission Data 📊

#### Explanation of the Code

This function `parse_sample_submission` parses the sample submission data from a given file and organizes it into a dictionary for further analysis.

1. **Function Signature**:
   - `parse_sample_submission(data_path)`: This function takes one parameter:
     - `data_path`: The path to the sample submission data file.

2. **Parsing Sample Submission Data**:
   - The function reads the content of the file line by line.
   - It prints the first line (the header) for reference and does not parse it.
   - For each subsequent line, it splits on commas and keeps `image_path`, `dataset`, and `scene`; the last two fields (the placeholder rotation and translation) are discarded.
   - It constructs a dictionary `data_dict` to store the parsed data. The dictionary is organized by `dataset` and `scene`.
   - Each `dataset` contains multiple `scenes`, and each `scene` contains a list of image paths.
   - The image paths are converted to `Path` objects for convenient manipulation.

3. **Printing Dataset Information**:
   - After parsing, the function prints the number of images for each `dataset` and `scene` combination.

4. **Returning Data Dictionary**:
   - Finally, the function returns the parsed data dictionary `data_dict`.
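
The nested `dataset` → `scene` → image list layout can be sketched on an in-memory CSV. The file contents and image names below are hypothetical, and `setdefault` is used here as a compact alternative to the explicit `if ... not in` checks in the notebook.

```python
import io
from pathlib import Path

# Hypothetical sample in the same column layout as the submission header.
sample_csv = io.StringIO(
    "image_path,dataset,scene,rotation_matrix,translation_vector\n"
    "test/church/images/a.png,church,church,0;0,0;0\n"
    "test/church/images/b.png,church,church,0;0,0;0\n"
    "test/pond/images/c.png,pond,pond,0;0,0;0\n"
)

data_dict = {}
for i, line in enumerate(sample_csv):
    line = line.strip()
    if i == 0 or not line:
        continue  # skip the header and blank lines
    image_path, dataset, scene, _, _ = line.split(",")
    # Create the nested containers on first use, then append the path.
    data_dict.setdefault(dataset, {}).setdefault(scene, []).append(Path(image_path))

for dataset, scenes in data_dict.items():
    for scene, images in scenes.items():
        print(f"{dataset} / {scene} -> {len(images)} images")
```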

#### Study Sources

1. **File Parsing in Python**:
   - Understanding how to read and parse data from files in Python.
   - **Source**: [Python Documentation - File Input and Output](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files)

2. **Dictionary Data Structure**:
   - Learning about dictionaries in Python and how to organize data using key-value pairs.
   - **Source**: [Real Python - Dictionaries in Python](https://realpython.com/python-dicts/)

3. **Pathlib Module**:
   - Exploring the `pathlib` module for working with file paths in Python.
   - **Source**: [Python Documentation - pathlib](https://docs.python.org/3/library/pathlib.html)



In [21]:
def parse_sample_submission(data_path):
    data_dict = {}
    with open(data_path, "r") as f:
        for i, l in enumerate(f):
            if i == 0:
                print("header:", l)

            if i > 0 and l.strip():  # skip the header and any blank lines
                image_path, dataset, scene, _, _ = l.strip().split(',')
                if dataset not in data_dict:
                    data_dict[dataset] = {}
                if scene not in data_dict[dataset]:
                    data_dict[dataset][scene] = []
                data_dict[dataset][scene].append(Path(IMC_PATH +'/'+ image_path))

    for dataset in data_dict:
        for scene in data_dict[dataset]:
            print(f"{dataset} / {scene} -> {len(data_dict[dataset][scene])} images")

    return data_dict





---

# 📝 Converting Array to String Representation 🔄

#### Explanation of the Code

This function `arr_to_str` converts a NumPy array to a string representation by flattening it and joining its elements.

1. **Function Signature**:
   - `arr_to_str(a)`: This function takes one parameter:
     - `a`: The input NumPy array to be converted to a string.

2. **Flattening the Array**:
   - The function first reshapes the input array `a` using `.reshape(-1)`, which flattens the array into a 1-dimensional shape.
   - This ensures that regardless of the input array's dimensions, it will be converted to a 1D array.

3. **Joining Array Elements**:
   - The flattened array elements are then converted to strings using a list comprehension `[str(x) for x in a.reshape(-1)]`.
   - Each element is converted to a string to prepare for joining.

4. **Joining with Separator**:
   - The `join` method is used to concatenate the string representations of array elements into a single string.
   - A semicolon (`;`) is used as the separator to distinguish between individual elements.

5. **Returning String Representation**:
   - The function returns the concatenated string representation of the input array.
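
As a small round-trip sketch, the snippet below encodes a 3x3 identity matrix and parses it back. The inverse helper `str_to_arr` is hypothetical and only added for illustration.

```python
import numpy as np

def arr_to_str(a):
    return ";".join([str(x) for x in a.reshape(-1)])

def str_to_arr(s, shape):
    """Hypothetical inverse: parse a semicolon-separated string back into an array."""
    return np.array([float(x) for x in s.split(";")]).reshape(shape)

R = np.eye(3)
encoded = arr_to_str(R)
print(encoded)  # 1.0;0.0;0.0;0.0;1.0;0.0;0.0;0.0;1.0

decoded = str_to_arr(encoded, (3, 3))
print(np.array_equal(decoded, R))
```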



In [22]:

def arr_to_str(a):
    return ";".join([str(x) for x in a.reshape(-1)])





---

# 📄 Creating Submission File from Results 🖋️

#### Explanation of the Code

This function `create_submission` generates a submission CSV file based on the results obtained from processing the input data.

1. **Function Signature**:
   - `create_submission(results, data_dict, base_path)`: This function takes three parameters:
     - `results`: A dictionary containing the processed results, typically obtained from some analysis or computation.
     - `data_dict`: A dictionary containing information about the dataset and scenes.
     - `base_path`: The base path to be used for constructing relative paths of image files.

2. **Writing Header**:
   - The function opens the submission file in write mode and writes the header line containing column names: `"image_path,dataset,scene,rotation_matrix,translation_vector\n"`.

3. **Iterating Over Datasets and Scenes**:
   - It iterates over each dataset in the `data_dict`.
   - For each dataset, it checks if corresponding results are available in the `results` dictionary. If not, it initializes an empty dictionary.
   - It then iterates over each scene in the dataset and checks if scene-specific results are available. If not, it initializes empty dictionaries for rotation (`R`) and translation (`t`).

4. **Writing Data to CSV**:
   - For each image in the dataset and scene, it retrieves the rotation matrix (`R`) and translation vector (`t`) from the results. If not available, it defaults to identity matrix and zero vector, respectively.
   - It constructs the relative image path with respect to the `base_path`.
   - It writes the image path, dataset, scene, rotation matrix, and translation vector to the CSV file.
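
The fallback behaviour for images without a pose can be sketched for a single row. The helper `submission_row` and the `/data` base path are assumptions made for this example; the notebook writes rows directly inside `create_submission`.

```python
import numpy as np
from pathlib import Path

def arr_to_str(a):
    return ";".join([str(x) for x in a.reshape(-1)])

def submission_row(image, dataset, scene, scene_res, base_path):
    """Format one CSV row, using an identity/zero pose for unregistered images."""
    pose = scene_res.get(image)
    if pose is None:
        R, t = np.eye(3), np.zeros(3)
    else:
        R, t = pose["R"], pose["t"]
    rel = image.relative_to(base_path).as_posix()
    return f"{rel},{dataset},{scene},{arr_to_str(R)},{arr_to_str(t)}"

base = Path("/data")  # hypothetical base path
img = Path("/data/test/church/images/a.png")

# No pose is known for this image, so the row carries the default pose.
row = submission_row(img, "church", "church", {}, base)
print(row)
```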

#### Study Sources

1. **CSV File Writing in Python**:
   - Understanding how to write data to CSV files using Python.
   - **Source**: [Real Python - Reading and Writing CSV Files](https://realpython.com/python-csv/)

2. **Dictionary Manipulation in Python**:
   - Learning about techniques for working with dictionaries in Python, including iteration and key-value access.
   - **Source**: [Python Documentation - Dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries)

3. **Pathlib Module**:
   - Exploring the `pathlib` module for working with file paths in Python.
   - **Source**: [Python Documentation - pathlib](https://docs.python.org/3/library/pathlib.html)



In [23]:

def create_submission(results,data_dict,base_path):    
    with open("submission.csv", "w") as f:
        f.write("image_path,dataset,scene,rotation_matrix,translation_vector\n")
        
        for dataset in data_dict:
            if dataset in results:
                res = results[dataset]
            else:
                res = {}
            
            for scene in data_dict[dataset]:
                if scene in res:
                    scene_res = res[scene]
                else:
                    scene_res = {"R":{}, "t":{}}
                    
                for image in data_dict[dataset][scene]:
                    if image in scene_res:
                        R = scene_res[image]["R"].reshape(-1)
                        T = scene_res[image]["t"].reshape(-1)
                    else:
                        R = np.eye(3).reshape(-1)
                        T = np.zeros((3))
                    image_path = str(image.relative_to(base_path))
                    f.write(f"{image_path},{dataset},{scene},{arr_to_str(R)},{arr_to_str(T)}\n")





---

# 🚀 Running Image Matching Pipeline 🖼️

#### Explanation of the Code

This function `run` orchestrates the entire image matching pipeline, from data loading to submission creation.

1. **Function Signature**:
   - `run(data_path, get_pairs, keypoints_matches, ransac_and_sparse_reconstruction, submit=True)`: This function takes five parameters:
     - `data_path`: The path to the sample submission data file.
     - `get_pairs`: Function for obtaining pairs of images for matching.
     - `keypoints_matches`: Function for extracting keypoints and computing matches between images.
     - `ransac_and_sparse_reconstruction`: Function for performing RANSAC and sparse reconstruction.
     - `submit`: Boolean flag that selects the image subdirectory used when building result keys: `test` when `True` (the default), `train` otherwise.

2. **Parsing Sample Submission Data**:
   - The function calls `parse_sample_submission` to parse the sample submission data and store it in the `data_dict` variable.

3. **Iterating Over Datasets and Scenes**:
   - It iterates over each dataset and scene in the parsed data.
   - For each dataset-scene pair, it retrieves the image paths and initializes the `results` dictionary to store the processed results.

4. **Running Image Matching Pipeline**:
   - It calls the provided functions `get_pairs`, `keypoints_matches`, and `ransac_and_sparse_reconstruction` in that order.
   - These functions obtain candidate image pairs, extract keypoints and compute matches, and perform RANSAC and sparse reconstruction, respectively.
   - From the reconstructions returned, it keeps the one with the most registered images and stores each registered image's rotation matrix and translation vector (taken from `cam_from_world`) in `results`.

5. **Generating Submission**:
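   - The function calls `create_submission` after every scene, so `submission.csv` is rewritten with all results gathered so far. The `submit` flag does not gate this step; it only chooses between the `test` and `train` subdirectories when result keys are built.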
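
Inside `run`, the reconstruction with the most registered images is the one whose poses are kept. A minimal sketch of that selection is shown below, using `SimpleNamespace` stand-ins instead of real reconstruction objects (only an `images` mapping is assumed).

```python
from types import SimpleNamespace

# Stand-ins for reconstruction objects; only the `images` mapping matters here.
maps = {
    0: SimpleNamespace(images={1: "a.png", 2: "b.png"}),
    1: SimpleNamespace(images={1: "a.png", 2: "b.png", 3: "c.png", 4: "d.png"}),
    2: SimpleNamespace(images={1: "a.png"}),
}

# Pick the index of the reconstruction with the most registered images.
best_idx = max(maps, key=lambda idx: len(maps[idx].images)) if maps else None
print(best_idx)
```

Using `max` with a key function is a compact equivalent of the explicit loop in the notebook for non-empty reconstructions, and the `if maps else None` guard covers the case where no reconstruction was produced.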

#### Study Sources

1. **Function Composition in Python**:
   - Understanding how to compose functions together to build complex workflows in Python.
   - **Source**: [Real Python - Function Composition](https://realpython.com/python-functional-programming/)

2. **Data Pipelines in Machine Learning**:
   - Learning about the concept of data pipelines and how they are used to organize and execute machine learning workflows.
   - **Source**: [Towards Data Science - Building Machine Learning Pipelines](https://towardsdatascience.com/building-machine-learning-pipelines-b86f5f12f7eb)

3. **Image Matching and Reconstruction**:
   - Exploring the techniques and algorithms involved in image matching, keypoint extraction, feature matching, and sparse reconstruction.
   - **Source**: [OpenCV Documentation - Feature Detection and Description](https://docs.opencv.org/4.x/db/d27/tutorial_py_table_of_contents_feature2d.html)


In [24]:

def run(data_path,get_pairs,keypoints_matches,ransac_and_sparse_reconstruction,submit=True):
    results = {}
    
    data_dict = parse_sample_submission(data_path)
    datasets = list(data_dict.keys())
    
    for dataset in datasets:
        if dataset not in results:
            results[dataset] = {}
            
        for scene in data_dict[dataset]:
            images_dir = data_dict[dataset][scene][0].parent
            results[dataset][scene] = {}
            image_paths = data_dict[dataset][scene]

            index_pairs = get_pairs(image_paths)
            keypoints_matches(image_paths,index_pairs)                
            maps = ransac_and_sparse_reconstruction(images_dir)
            clear_output(wait=False)
            
            path = 'test' if submit else 'train'
            images_registered  = 0
            best_idx = None
            for idx, rec in maps.items():
                if len(rec.images) > images_registered:
                    images_registered = len(rec.images)
                    best_idx = idx

            # Skip scenes with no usable reconstruction; create_submission falls back to identity poses.
            if best_idx is not None:
                for k, im in maps[best_idx].images.items():
                    key = Path(IMC_PATH) / path / scene / "images" / im.name
                    results[dataset][scene][key] = {}
                    results[dataset][scene][key]["R"] = deepcopy(im.cam_from_world.rotation.matrix())
                    results[dataset][scene][key]["t"] = deepcopy(np.array(im.cam_from_world.translation))

            create_submission(results, data_dict, Path(IMC_PATH))





---

# Constants for Translation Thresholds 📏

#### Explanation of the Code

Define a dictionary `translation_thresholds_meters_dict` that maps scene names to arrays of translation thresholds in meters. These thresholds are typically used for evaluating the accuracy of the estimated translations between images in a scene.

1. **Translation Thresholds Dictionary**:
   - The dictionary `translation_thresholds_meters_dict` contains scene names as keys and corresponding arrays of translation thresholds as values.
   - Each scene may have different thresholds depending on factors such as scene complexity or expected level of accuracy.

2. **Scene Names and Translation Thresholds**:
   - Each scene is associated with an array of translation thresholds, which are specified in meters.
   - The thresholds are typically chosen based on the scale of the scene and the desired level of precision in the estimated translations.

3. **EPS (Epsilon) Value**:
   - The `_EPS` constant is defined as a small value that is used to handle numerical precision issues.
   - It is calculated as the machine epsilon multiplied by 4.0, which provides a small margin for comparison operations involving floating-point numbers.

#### Study Sources

1. **Numerical Precision and Machine Epsilon**:
   - Understanding the concept of machine epsilon and its importance in handling numerical precision in floating-point arithmetic.
   - **Source**: [Wikipedia - Machine Epsilon](https://en.wikipedia.org/wiki/Machine_epsilon)

2. **Evaluation Metrics in Computer Vision**:
   - Learning about common evaluation metrics used in computer vision tasks, including thresholds for translation accuracy.
   - **Source**: [Computer Vision: Metrics and Evaluation](https://learnopencv.com/computer-vision-metrics-and-evaluation/)

3. **Scene Complexity and Accuracy Requirements**:
   - Understanding how scene complexity and desired accuracy influence the choice of evaluation thresholds in image processing and computer vision tasks.
   - **Source**: [Computer Vision: Algorithms and Applications](https://www.amazon.com/Computer-Vision-Algorithms-Applications-Richard/dp/1439867998)


In [25]:

_EPS = np.finfo(float).eps * 4.0
translation_thresholds_meters_dict = {
 'multi-temporal-temple-baalshamin':  np.array([0.025,  0.05,  0.1,  0.2,  0.5,  1.0]),
 'pond':                              np.array([0.025,  0.05,  0.1,  0.2,  0.5,  1.0]),
 'transp_obj_glass_cylinder':         np.array([0.0025, 0.005, 0.01, 0.02, 0.05, 0.1]),
 'transp_obj_glass_cup':              np.array([0.0025, 0.005, 0.01, 0.02, 0.05, 0.1]),
 'church':                            np.array([0.025,  0.05,  0.1,  0.2,  0.5,  1.0]),
 'lizard':                            np.array([0.025,  0.05,  0.1,  0.2,  0.5,  1.0]),
 'dioscuri':                          np.array([0.025,  0.05,  0.1,  0.2,  0.5,  1.0]), 
}





---

# Vector Norm Calculation 📐

#### Explanation of the Code

This function `vector_norm` computes the Euclidean norm (or length) of a numpy array. The Euclidean norm is the standard length of a vector and is computed as the square root of the sum of the squared elements. This function can handle both 1-dimensional and multi-dimensional arrays, and allows specifying the axis along which to compute the norm.

1. **Function Definition**:
   - The function `vector_norm` takes three parameters:
     - `data`: The input array whose norm is to be calculated.
     - `axis`: The axis along which to compute the norm. If `None`, the norm is computed over the entire array.
     - `out`: An optional output array to store the result.

2. **Conversion to Numpy Array**:
   - The input `data` is converted to a numpy array of type `float64` to ensure precision in calculations.

3. **Handling 1D Arrays**:
   - If the input array is 1-dimensional and `out` is `None`, the norm is calculated using `np.dot` to compute the dot product of the array with itself, followed by taking the square root.

4. **Handling Multi-Dimensional Arrays**:
   - For multi-dimensional arrays, the elements of the array are squared, and the sum is computed along the specified axis.
   - If `out` is `None`, the sum of the squared elements is stored in a new array, `out`, and the square root of this array is returned.
   - If `out` is provided, the result is stored in the provided output array after summing the squared elements and taking the square root.

5. **In-Place Operations**:
   - The input is always copied into a new `float64` array first, so the caller's data is never modified. Squaring then happens in place on that copy, and when `out` is provided the result is written into it and the function returns `None`.

#### Example Usage

```python
import numpy as np

# Example 1: 1D array
vector = np.array([3, 4])
print(vector_norm(vector))  # Output: 5.0

# Example 2: 2D array, norm along rows (axis=1)
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(vector_norm(matrix, axis=1))  # Output: [3.74165739 8.77496439]
```
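
As a cross-check that does not depend on the notebook's function, the same row-wise norm can be computed by hand and compared with `np.linalg.norm`. The last lines mimic the `out` code path by writing the result into a preallocated array.

```python
import numpy as np

matrix = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Square, sum along the axis, then take the square root.
manual = np.sqrt(np.sum(matrix * matrix, axis=1))
builtin = np.linalg.norm(matrix, axis=1)
print(np.allclose(manual, builtin))

# Writing into a preallocated array, similar to passing `out`.
out = np.empty(2)
np.sqrt(np.sum(matrix * matrix, axis=1), out=out)
print(out)
```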

#### Study Sources

1. **Numpy Documentation**:
   - Understanding numpy array operations and functions such as `np.dot`, `np.sum`, and `np.sqrt`.
   - **Source**: [Numpy Documentation](https://numpy.org/doc/stable/)

2. **Vector Norms in Linear Algebra**:
   - Learning about different types of vector norms and their applications in linear algebra and machine learning.
   - **Source**: [Linear Algebra - Norms](https://www.math.ucla.edu/~yanovsky/Teaching/Math151B/handouts/Norms.pdf)

3. **In-Place Operations in Numpy**:
   - Understanding in-place operations and their benefits in terms of memory efficiency.
   - **Source**: [In-Place Operations in Numpy](https://numpy.org/doc/stable/reference/generated/numpy.ufunc.at.html)



In [26]:
def vector_norm(data, axis=None, out=None):
    '''Return length, i.e. Euclidean norm, of ndarray along axis.'''
    data = np.array(data, dtype=np.float64, copy=True)
    if out is None:
        if data.ndim == 1:
            return math.sqrt(np.dot(data, data))
        data *= data
        out = np.atleast_1d(np.sum(data, axis=axis))
        np.sqrt(out, out)
        return out
    data *= data
    np.sum(data, axis=axis, out=out)
    np.sqrt(out, out)
    return None




---

# Quaternion to Homogeneous Rotation Matrix Conversion 🔄

#### Explanation of the Code

This function, `quaternion_matrix`, converts a quaternion into a homogeneous rotation matrix. Quaternions are often used in computer graphics, robotics, and aerospace for representing rotations because they are more numerically stable and efficient compared to other representations like Euler angles.

1. **Function Definition**:
   - The function `quaternion_matrix` takes a single parameter:
     - `quaternion`: A 4-element array representing the quaternion in scalar-first order (w, x, y, z).

2. **Conversion to Numpy Array**:
   - The input quaternion is converted to a numpy array of type `float64` for precision.

3. **Normalization Check**:
   - The squared norm `n` of the quaternion is computed as its dot product with itself.
   - If `n` is smaller than the threshold `_EPS`, the function returns the 4x4 identity matrix. A near-zero quaternion carries no usable rotation, so it is treated as no rotation.

4. **Normalization and Scaling**:
   - The quaternion is multiplied by \(\sqrt{2/n}\). This normalizes it and also folds in the factor of 2 from the rotation formula, so each entry of the outer product in the next step equals \(2 q_i q_j / n\).

5. **Outer Product Calculation**:
   - The outer product of the quaternion with itself is computed to form a 4x4 matrix. This matrix is used in the construction of the rotation matrix.

6. **Rotation Matrix Construction**:
   - The elements of the 4x4 homogeneous rotation matrix are populated using the elements of the outer product matrix. The top-left 3x3 block holds the rotation; the last row and column are those of the identity, so the matrix carries no translation.
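
The same formula can be written out for the 3x3 rotation block alone. The sketch below is a compact variant, not the notebook's function: `quat_to_rot3` is a hypothetical name, and it assumes the scalar-first (w, x, y, z) convention. It checks that a 90° rotation about the z-axis sends the x-axis to the y-axis.

```python
import math
import numpy as np

def quat_to_rot3(q):
    """3x3 rotation matrix from a scalar-first quaternion (w, x, y, z)."""
    q = np.asarray(q, dtype=np.float64)
    w, x, y, z = q / np.linalg.norm(q)  # normalize to a unit quaternion
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ])

# 90 degree rotation about z: w = cos(45 deg), z = sin(45 deg).
half = math.pi / 4
R = quat_to_rot3([math.cos(half), 0.0, 0.0, math.sin(half)])
print(np.round(R @ np.array([1.0, 0.0, 0.0]), 6))  # x-axis rotated onto the y-axis
```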

#### Example Usage

```python
import numpy as np

# Example quaternion (unit quaternion representing no rotation)
quaternion = [1, 0, 0, 0]

# Convert quaternion to rotation matrix
rotation_matrix = quaternion_matrix(quaternion)
print(rotation_matrix)
```

#### Study Sources

1. **Quaternions in 3D Rotations**:
   - Understanding quaternions and their use in representing rotations in 3D space.
   - **Source**: [Quaternions and spatial rotation](https://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation)

2. **Homogeneous Transformation Matrices**:
   - Learning about homogeneous coordinates and transformation matrices in computer graphics and robotics.
   - **Source**: [Homogeneous Transformation Matrices](https://www.mathworks.com/help/robotics/ug/homogeneous-transformations.html)

3. **Numerical Stability in Quaternion Operations**:
   - The importance of numerical stability when working with quaternions, especially in computational applications.
   - **Source**: [Quaternion Normalization](https://www.oreilly.com/library/view/3d-math-primer/9781568817231/)



In [27]:

def quaternion_matrix(quaternion):
    '''Return homogeneous rotation matrix from quaternion.'''
    q = np.array(quaternion, dtype=np.float64, copy=True)
    n = np.dot(q, q)
    if n < _EPS:
        # print("special case")
        return np.identity(4)
    q *= math.sqrt(2.0 / n)
    q = np.outer(q, q)
    return np.array(
        [
            [
                1.0 - q[2, 2] - q[3, 3],
                q[1, 2] - q[3, 0],
                q[1, 3] + q[2, 0],
                0.0,
            ],
            [
                q[1, 2] + q[3, 0],
                1.0 - q[1, 1] - q[3, 3],
                q[2, 3] - q[1, 0],
                0.0,
            ],
            [
                q[1, 3] - q[2, 0],
                q[2, 3] + q[1, 0],
                1.0 - q[1, 1] - q[2, 2],
                0.0,
            ],
            [0.0, 0.0, 0.0, 1.0],
        ]
    )







---

# Affine Transformation Matrix from Point Correspondences ✨

#### Explanation of the Code

This function, `affine_matrix_from_points`, computes an affine transformation matrix that maps points in `v0` to corresponding points in `v1`. Affine transformations include translation, scaling, rotation, and shearing. The function supports options for including or excluding shearing and scaling in the transformation.

1. **Function Definition**:
   - The function `affine_matrix_from_points` takes the following parameters:
     - `v0`: A set of source points.
     - `v1`: A set of destination points.
     - `shear`: A boolean flag to include/exclude shearing in the transformation.
     - `scale`: A boolean flag to include/exclude scaling in the transformation.
     - `usesvd`: A boolean flag to use Singular Value Decomposition (SVD) for the transformation.

2. **Input Validation**:
   - The function checks if the input arrays have the correct shape and type. If the conditions are not met, it raises a `ValueError`.

3. **Moving Centroids to Origin**:
   - The centroids of `v0` and `v1` are calculated and subtracted from the points to move the centroids to the origin. This step simplifies the computation of the transformation matrix.
   - Transformation matrices `M0` and `M1` are created to move the centroids back to their original positions later.

4. **Affine Transformation**:
   - If `shear` is enabled, the function computes the affine transformation using SVD on the concatenated matrices of `v0` and `v1`.

5. **Rigid Transformation via SVD**:
   - If `shear` is disabled and either `usesvd` is enabled or the points are not 3-dimensional, the function computes the rigid transformation using SVD on the covariance matrix of `v1` and `v0`. If the resulting matrix has a negative determinant (a reflection), it is corrected to a proper rotation. The rotation matrix `R` is then used to form the transformation matrix `M`.

6. **Rigid Transformation via Quaternion**:
   - If `shear` and `usesvd` are both disabled and the points are 3-dimensional, the function computes the rigid transformation using a quaternion. It constructs a symmetric matrix `N`, takes the eigenvector belonging to its largest eigenvalue as the unit quaternion, and converts it to a rotation matrix with `quaternion_matrix`.

7. **Scaling**:
   - If `scale` is enabled and `shear` is disabled, the function applies scaling by adjusting the transformation matrix based on the ratio of RMS deviations from the centroids of `v0` and `v1`.

8. **Moving Centroids Back**:
   - The function undoes the centering by composing `inv(M1) · M · M0`, so the final matrix maps the original `v0` points onto the original `v1` points.

9. **Normalization**:
   - The transformation matrix `M` is divided by its bottom-right element `M[ndims, ndims]` so that this element equals 1.

10. **Return**:
    - The function returns the final transformation matrix `M`.
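
The centering, SVD, and scale steps above can be sketched in a few lines. This is a simplified stand-in that returns rotation, scale, and translation separately rather than a homogeneous matrix; the name `similarity_from_points` and the test transform are assumptions made for the example.

```python
import numpy as np

def similarity_from_points(v0, v1):
    """Least-squares R, s, t such that v1 is approximately s * R @ v0 + t."""
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)
    c0 = v0.mean(axis=1, keepdims=True)
    c1 = v1.mean(axis=1, keepdims=True)
    p0, p1 = v0 - c0, v1 - c1  # move centroids to the origin
    u, _, vh = np.linalg.svd(p1 @ p0.T)  # SVD of the covariance matrix
    d = np.ones(v0.shape[0])
    if np.linalg.det(u @ vh) < 0:
        d[-1] = -1.0  # flip the last axis to avoid a reflection
    R = u @ np.diag(d) @ vh
    s = np.sqrt((p1 ** 2).sum() / (p0 ** 2).sum())  # ratio of RMS deviations
    t = c1 - s * R @ c0
    return R, s, t

# Build destination points from a known transform, then recover it.
rng = np.random.default_rng(0)
v0 = rng.random((3, 6))
angle = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
t_true = np.array([[1.0], [-2.0], [0.5]])
v1 = 2.0 * R_true @ v0 + t_true

R, s, t = similarity_from_points(v0, v1)
print(np.allclose(R, R_true), np.isclose(s, 2.0), np.allclose(t, t_true))
```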

#### Example Usage

```python
import numpy as np

# Example source and destination points
v0 = np.array([[0, 1, 2], [0, 1, 2]])
v1 = np.array([[1, 2, 3], [1, 2, 3]])

# Compute affine transformation matrix
transformation_matrix = affine_matrix_from_points(v0, v1)
print(transformation_matrix)
```

#### Study Sources

1. **Affine Transformations**:
   - Understanding the mathematical foundation and applications of affine transformations in computer vision and graphics.
   - **Source**: [Affine Transformation](https://en.wikipedia.org/wiki/Affine_transformation)

2. **Singular Value Decomposition (SVD)**:
   - Learning about SVD and its applications in solving least squares problems and computing transformations.
   - **Source**: [Singular Value Decomposition](https://en.wikipedia.org/wiki/Singular_value_decomposition)

3. **Quaternions and Rotations**:
   - Understanding how quaternions are used to represent rotations and how to convert them to rotation matrices.
   - **Source**: [Quaternions and 3D Rotation](https://www.oreilly.com/library/view/3d-math-primer/9781568817231/)

4. **Vector Norms and Linear Algebra**:
   - Learning about vector norms and their importance in normalization and other linear algebra operations.
   - **Source**: [Vector Norms](https://mathworld.wolfram.com/VectorNorm.html)



In [28]:



def affine_matrix_from_points(v0, v1, shear=False, scale=True, usesvd=True):
    
    v0 = np.array(v0, dtype=np.float64, copy=True)
    v1 = np.array(v1, dtype=np.float64, copy=True)

    ndims = v0.shape[0]
    if ndims < 2 or v0.shape[1] < ndims or v0.shape != v1.shape:
        raise ValueError("input arrays are of wrong shape or type")

    # move centroids to origin
    t0 = -np.mean(v0, axis=1)
    M0 = np.identity(ndims + 1)
    M0[:ndims, ndims] = t0
    v0 += t0.reshape(ndims, 1)
    t1 = -np.mean(v1, axis=1)
    M1 = np.identity(ndims + 1)
    M1[:ndims, ndims] = t1
    v1 += t1.reshape(ndims, 1)

    if shear:
        # Affine transformation
        A = np.concatenate((v0, v1), axis=0)
        u, s, vh = np.linalg.svd(A.T)
        vh = vh[:ndims].T
        B = vh[:ndims]
        C = vh[ndims: 2 * ndims]
        t = np.dot(C, np.linalg.pinv(B))
        t = np.concatenate((t, np.zeros((ndims, 1))), axis=1)
        M = np.vstack((t, ((0.0,) * ndims) + (1.0,)))
    elif usesvd or ndims != 3:
        # Rigid transformation via SVD of covariance matrix
        u, s, vh = np.linalg.svd(np.dot(v1, v0.T))
        # rotation matrix from SVD orthonormal bases
        R = np.dot(u, vh)
        if np.linalg.det(R) < 0.0:
            # R does not constitute right handed system
            R -= np.outer(u[:, ndims - 1], vh[ndims - 1, :] * 2.0)
            s[-1] *= -1.0
        # homogeneous transformation matrix
        M = np.identity(ndims + 1)
        M[:ndims, :ndims] = R
    else:
        # Rigid transformation matrix via quaternion
        # compute symmetric matrix N
        xx, yy, zz = np.sum(v0 * v1, axis=1)
        xy, yz, zx = np.sum(v0 * np.roll(v1, -1, axis=0), axis=1)
        xz, yx, zy = np.sum(v0 * np.roll(v1, -2, axis=0), axis=1)
        N = [
            [xx + yy + zz, 0.0, 0.0, 0.0],
            [yz - zy, xx - yy - zz, 0.0, 0.0],
            [zx - xz, xy + yx, yy - xx - zz, 0.0],
            [xy - yx, zx + xz, yz + zy, zz - xx - yy],
        ]
        # quaternion: eigenvector corresponding to most positive eigenvalue
        w, V = np.linalg.eigh(N)
        q = V[:, np.argmax(w)]
        # print (vector_norm(q), np.linalg.norm(q))
        q /= vector_norm(q)  # unit quaternion
        # homogeneous transformation matrix
        M = quaternion_matrix(q)

    if scale and not shear:
        # Affine transformation; scale is ratio of RMS deviations from centroid
        v0 *= v0
        v1 *= v1
        M[:ndims, :ndims] *= math.sqrt(np.sum(v1) / np.sum(v0))

    # move centroids back
    M = np.dot(np.linalg.inv(M1), np.dot(M, M0))
    M /= M[ndims, ndims]

    # print("transformation matrix Python Script: ", M)

    return M





---

# IMC 3D Error Metric: Register by Horn Method 🚀

#### Explanation of the Code

This function, `register_by_Horn`, registers estimated camera coordinates (`ev_coord`) to ground truth camera coordinates (`gt_coord`) using an exhaustive, RANSAC-style search over camera triplets. For each triplet it fits a similarity transformation (rotation, translation, and uniform scale), and the goal is to find the transformation that brings the most estimated cameras within the given thresholds of their ground truth positions.

1. **Function Definition**:
   - The function `register_by_Horn` takes the following parameters:
     - `ev_coord`: Estimated camera coordinates.
     - `gt_coord`: Ground truth camera coordinates.
     - `ransac_threshold`: Thresholds for RANSAC inlier detection.
     - `inl_cf`: Inlier coefficient for refining the transformation.
     - `strict_cf`: Strict coefficient for determining strict inliers.

2. **Remove Invalid Cameras**:
   - Cameras with non-finite coordinates are removed from consideration. The indices of valid cameras are stored.

3. **Initialization**:
   - Several variables are initialized to keep track of the best transformation matrix, errors, and inliers.
   - `max_no_inl`: Maximum number of inliers for each RANSAC threshold.
   - `best_inl_err`: Best inlier error for each RANSAC threshold.
   - `best_transf_matrix`: Best transformation matrix for each RANSAC threshold.
   - `best_err`: Best error for each camera and RANSAC threshold.
   - `strict_inl`: Boolean array indicating strict inliers.
   - `triplets_used`: Indices of the camera triplets used for the best transformation matrix.

4. **Run on Camera Triplets**:
   - The function iterates over all possible triplets of camera indices. For each triplet:
     - If all three cameras are already strict inliers for the best current model, the triplet is skipped.
     - A similarity transformation matrix is computed from the three correspondences using the `affine_matrix_from_points` function with `usesvd=False`, which selects the quaternion path.
     - The transformation is applied to the estimated camera coordinates.
     - The error between the transformed estimated coordinates and ground truth coordinates is computed.
     - Inliers are identified based on the error and RANSAC thresholds.
     - If the number of inliers for the current triplet is close to the best model so far, the transformation is refined using all inliers.
     - The refined transformation matrix and inliers are used to update the best model if they provide a better fit.

5. **Best Model**:
   - The best model is stored in a dictionary containing:
     - `valid_cams`: Indices of valid cameras.
     - `no_inl`: Number of inliers for each RANSAC threshold.
     - `err`: Error for each camera and RANSAC threshold.
     - `triplets_used`: Indices of the camera triplets used for the best transformation matrix.
     - `transf_matrix`: Best transformation matrix for each RANSAC threshold.

6. **Return**:
   - The function returns the best model found.
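
The inlier test is evaluated for all thresholds at once through broadcasting. A minimal sketch with made-up squared errors is given below; the array shapes follow the code cell, while the numbers are purely illustrative.

```python
import numpy as np

ransac_threshold = np.array([0.025, 0.05, 0.1])  # meters
ransac_threshold2 = np.expand_dims(ransac_threshold, axis=0) ** 2  # shape (1, r)

# Hypothetical squared distances between aligned and ground truth camera centers.
err = np.array([0.0001, 0.002, 0.005, 0.5])  # shape (n,)

# Compare every camera against every squared threshold: shape (n, r).
inl = np.expand_dims(err, axis=1) < ransac_threshold2
no_inl = np.sum(inl, axis=0)  # inlier count per threshold
print(no_inl)
```

Comparing squared errors against squared thresholds avoids taking a square root for every camera and triplet.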

#### Example Usage

```python
import numpy as np

# Example estimated coordinates, with ground truth built as a scaled and shifted copy
ev_coord = np.random.rand(3, 10)  # 3D coordinates for 10 cameras
gt_coord = 2.0 * ev_coord + np.array([[1.0], [2.0], [3.0]])  # 3D ground truth coordinates for 10 cameras

# RANSAC parameters
ransac_threshold = np.array([0.01, 0.02, 0.05])
inl_cf = 1.2
strict_cf = 0.8

# Compute the best registration model
best_model = register_by_Horn(ev_coord, gt_coord, ransac_threshold, inl_cf, strict_cf)

# Print the best transformation matrix for the first threshold
print("Best Transformation Matrix for the first threshold:\n", best_model["transf_matrix"][0])
```

#### Study Sources

1. **RANSAC Algorithm**:
   - Understanding RANSAC and its applications in robust fitting and outlier detection.
   - **Source**: [RANSAC](https://en.wikipedia.org/wiki/Random_sample_consensus)

2. **Affine Transformations**:
   - Learning about affine transformations and their applications in computer vision and graphics.
   - **Source**: [Affine Transformation](https://en.wikipedia.org/wiki/Affine_transformation)

3. **Singular Value Decomposition (SVD)**:
   - Understanding SVD and its applications in solving least squares problems and computing transformations.
   - **Source**: [Singular Value Decomposition](https://en.wikipedia.org/wiki/Singular_value_decomposition)

4. **Horn's Method**:
   - Understanding Horn's method for aligning two sets of points using quaternions and least squares fitting.
   - **Source**: [Closed-form solution of absolute orientation using unit quaternions](https://people.eecs.berkeley.edu/~ani/teaching/CS294-6/horn.pdf)


In [29]:

# This is the IMC 3D error metric code
def register_by_Horn(ev_coord, gt_coord, ransac_threshold, inl_cf, strict_cf):
    
    # remove invalid cameras, the index is returned
    idx_cams = np.all(np.isfinite(ev_coord), axis=0)
    ev_coord = ev_coord[:, idx_cams]
    gt_coord = gt_coord[:, idx_cams]

    # initialization
    n = ev_coord.shape[1]
    r = ransac_threshold.shape[0]
    ransac_threshold = np.expand_dims(ransac_threshold, axis=0)
    ransac_threshold2 = ransac_threshold**2
    ev_coord_1 = np.vstack((ev_coord, np.ones(n)))

    max_no_inl = np.zeros((1, r))
    best_inl_err = np.full(r, np.inf)
    best_transf_matrix = np.zeros((r, 4, 4))
    best_err = np.full((n, r), np.inf)
    strict_inl = np.full((n, r), False)
    triplets_used = np.zeros((3, r))

    # run on camera triplets
    for ii in range(n-2):
        for jj in range(ii+1, n-1):
            for kk in range(jj+1, n):
                i = [ii, jj, kk]
                triplets_used_now = np.full((n), False)
                triplets_used_now[i] = True
                # if ii, jj and kk are all strict inliers for the current best model, skip this triplet
                if np.all(strict_inl[i]):
                    continue
                # get transformation T by Horn on the triplet camera center correspondences
                transf_matrix = affine_matrix_from_points(ev_coord[:, i], gt_coord[:, i], usesvd=False)
                # apply transformation T to test camera centres
                rotranslated = np.matmul(transf_matrix[:3], ev_coord_1)
                # compute error and inliers
                err = np.sum((rotranslated - gt_coord)**2, axis=0)
                inl = np.expand_dims(err, axis=1) < ransac_threshold2
                no_inl = np.sum(inl, axis=0)
                # if the number of inliers is close to that of the best model so far, go for refinement
                to_ref = np.squeeze(((no_inl > 2) & (no_inl > max_no_inl * inl_cf)), axis=0)
                for q in np.argwhere(to_ref):                        
                    qq = q[0]
                    if np.any(np.all((np.expand_dims(inl[:, qq], axis=1) == inl[:, :qq]), axis=0)):
                        # already done for this set of inliers
                        continue
                    # get transformation T by Horn on the inlier camera center correspondences
                    transf_matrix = affine_matrix_from_points(ev_coord[:, inl[:, qq]], gt_coord[:, inl[:, qq]])
                    # apply transformation T to test camera centres
                    rotranslated = np.matmul(transf_matrix[:3], ev_coord_1)
                    # compute error and inliers
                    err_ref = np.sum((rotranslated - gt_coord)**2, axis=0)
                    err_ref_sum = np.sum(err_ref, axis=0)
                    err_ref = np.expand_dims(err_ref, axis=1)
                    inl_ref = err_ref < ransac_threshold2
                    no_inl_ref = np.sum(inl_ref, axis=0)
                    # update the model if better for each threshold
                    to_update = np.squeeze((no_inl_ref > max_no_inl) | ((no_inl_ref == max_no_inl) & (err_ref_sum < best_inl_err)), axis=0)
                    if np.any(to_update):
                        triplets_used[0, to_update] = ii
                        triplets_used[1, to_update] = jj
                        triplets_used[2, to_update] = kk
                        max_no_inl[:, to_update] = no_inl_ref[to_update]
                        best_err[:, to_update] = np.sqrt(err_ref)
                        best_inl_err[to_update] = err_ref_sum
                        strict_inl[:, to_update] = (best_err[:, to_update] < strict_cf * ransac_threshold[:, to_update])
                        best_transf_matrix[to_update] = transf_matrix


    best_model = {
        "valid_cams": idx_cams,        
        "no_inl": max_no_inl,
        "err": best_err,
        "triplets_used": triplets_used,
        "transf_matrix": best_transf_matrix}
    return best_model





---

# Mean Average Accuracy (mAA) on Cameras ✨

#### Explanation of the Code

The function `mAA_on_cameras` calculates the mean average accuracy (mAA) metric for camera registration errors over a range of thresholds. This metric evaluates the accuracy of camera positions after applying a registration algorithm, using specified error thresholds.

1. **Function Definition**:
   - The function `mAA_on_cameras` takes the following parameters:
     - `err`: A 2D array of shape `(n, t)` containing errors for `n` cameras and `t` thresholds.
     - `thresholds`: A list or array of error thresholds.
     - `n`: The number of cameras.
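     - `skip_top_thresholds`: The number of thresholds to skip from the start of `thresholds` (the strictest ones when the list is sorted in ascending order).
     - `to_dec`: The number of cameras subtracted from both the per-threshold inlier count and the camera count `n` (default is 3, which matches the size of the camera triplet used to fit the registration).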

2. **Compute Auxiliary Matrix**:
   - The function creates a boolean matrix `aux`, which indicates whether each error value in `err` is below the corresponding threshold from `thresholds`, starting from `skip_top_thresholds`.

3. **Calculate mAA**:
   - For each remaining threshold, the function counts the cameras whose error is below it (`np.sum(aux, axis=0)`), subtracts `to_dec`, and clips the result at zero.
   - These per-threshold counts are summed and divided by the number of considered thresholds (`len(thresholds[skip_top_thresholds:])`) times the adjusted number of cameras `(n - to_dec)`.

4. **Return**:
   - The function returns the mAA value, representing the average accuracy of the camera positions.

#### Example Usage

```python
import numpy as np

# Example errors and thresholds
errors = np.random.rand(10, 5)  # Errors for 10 cameras and 5 thresholds
thresholds = [0.01, 0.02, 0.05, 0.1, 0.2]
n = 10
skip_top_thresholds = 1
to_dec = 3

# Compute mAA
maa_value = mAA_on_cameras(errors, thresholds, n, skip_top_thresholds, to_dec)
print("Mean Average Accuracy:", maa_value)
```


### Key Concepts

1. **Thresholds**:
   - The error thresholds are used to determine if a camera's error is acceptable. Lower thresholds mean stricter accuracy requirements.

2. **Auxiliary Matrix**:
   - This boolean matrix `aux` helps identify which cameras have errors below the specified thresholds. It's created by comparing each error in `err` to the corresponding threshold in `thresholds`.

3. **Mean Average Accuracy (mAA)**:
   - mAA is the fraction of cameras whose registration error falls below a threshold, averaged over the thresholds, after discounting `to_dec` cameras from both the inlier count and the camera total. It provides an overall measure of registration accuracy.
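
A small worked example may help to see how the counts and the `to_dec` adjustment interact. The function below, `maa_sketch`, is a hypothetical restatement of the same formula, written so the snippet runs on its own. The error values are made up for illustration.

```python
import numpy as np

def maa_sketch(err, thresholds, n, skip_top_thresholds=0, to_dec=3):
    """Standalone restatement of the mAA formula, for illustration only."""
    kept = np.asarray(thresholds[skip_top_thresholds:])
    below = err[:, skip_top_thresholds:] < kept[None, :]       # (cameras, thresholds)
    counts = np.maximum(below.sum(axis=0) - to_dec, 0)          # inliers per threshold
    return counts.sum() / (len(kept) * (n - to_dec))

# 5 cameras, 2 thresholds; column j holds the errors under the model for threshold j
err = np.array([[0.00, 0.0],
                [0.00, 0.0],
                [0.00, 0.0],
                [0.05, 0.2],
                [0.30, 0.4]])
thresholds = [0.1, 0.5]

value = maa_sketch(err, thresholds, n=5)
print(value)  # 0.75
```

Here 4 cameras are below 0.1 and 5 are below 0.5. After subtracting `to_dec = 3` the counts are 1 and 2, and the result is (1 + 2) / (2 * (5 - 3)) = 0.75.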

### Practical Applications

- **3D Reconstruction**: Evaluating the accuracy of reconstructed camera positions.
- **SLAM (Simultaneous Localization and Mapping)**: Assessing the precision of camera localization in SLAM algorithms.
- **Augmented Reality**: Ensuring accurate camera placement in augmented reality environments.



In [30]:

def mAA_on_cameras(err, thresholds, n, skip_top_thresholds, to_dec=3):
    
    aux = err[:, skip_top_thresholds:] < np.expand_dims(np.asarray(thresholds[skip_top_thresholds:]), axis=0)
    return np.sum(np.maximum(np.sum(aux, axis=0) - to_dec, 0)) / (len(thresholds[skip_top_thresholds:]) * (n - to_dec))





---

# Function to Extract Camera Centers from DataFrame 📷

#### Explanation of the Code

The function `get_camera_centers_from_df` extracts the camera centers from a given DataFrame. The DataFrame contains columns with the rotation matrix and translation vector for each image, and the function computes the camera center for each image using these values.

1. **Function Definition**:
   - The function `get_camera_centers_from_df` takes a single parameter `df`, which is a pandas DataFrame containing the following columns:
     - `image_path`: The file path of the image.
     - `rotation_matrix`: The rotation matrix as a semicolon-separated string.
     - `translation_vector`: The translation vector as a semicolon-separated string.

2. **Initialize Output Dictionary**:
   - An empty dictionary `out` is initialized to store the camera centers.

3. **Iterate Over DataFrame Rows**:
   - The function iterates over each row in the DataFrame using `df.iterrows()`. For each row:
     - Extract the file name, rotation matrix, and translation vector.
     - Convert the rotation matrix and translation vector from strings to numpy arrays.
     - Compute the camera center using the formula \( \text{center} = -R^T \cdot t \), where \( R \) is the rotation matrix and \( t \) is the translation vector.
     - Store the camera center in the output dictionary with the image path as the key.

4. **Return Output Dictionary**:
   - The function returns the dictionary `out` containing the camera centers.

#### Example Usage

```python
import pandas as pd

# Example DataFrame
data = {
    'image_path': ['img1.jpg', 'img2.jpg'],
    'rotation_matrix': ['1;0;0;0;1;0;0;0;1', '0.866;0;-0.5;0;1;0;0.5;0;0.866'],
    'translation_vector': ['1;2;3', '4;5;6']
}
df = pd.DataFrame(data)

# Get camera centers
camera_centers = get_camera_centers_from_df(df)
print(camera_centers)
```

### Key Concepts

1. **Rotation Matrix and Translation Vector**:
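   - The rotation matrix \( R \) and translation vector \( t \) together define the world-to-camera transform \( x_{\text{cam}} = R \, x_{\text{world}} + t \).
   - The vector \( t \) is therefore not the camera position itself: it is the world origin expressed in camera coordinates.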

2. **Camera Center Calculation**:
   - The camera center in world coordinates is calculated as \( \text{center} = -R^T \cdot t \), where \( R^T \) is the transpose of the rotation matrix.

3. **DataFrame Iteration**:
   - The function uses `df.iterrows()` to iterate over each row in the DataFrame, which allows for accessing and processing each row individually.
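
The formula can be checked numerically: the camera center is the world point that the transform \( R x + t \) maps to the camera origin. The sketch below assumes that world-to-camera convention. `camera_center` is a hypothetical helper, and the pose values are arbitrary.

```python
import numpy as np

def camera_center(R, t):
    """Camera position in world coordinates for the transform x_cam = R @ x_world + t."""
    return -R.T @ t

# A rotation of 30 degrees about the y axis and an arbitrary translation
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), 0.0, -np.sin(theta)],
              [0.0, 1.0, 0.0],
              [np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([4.0, 5.0, 6.0])

C = camera_center(R, t)
residual = R @ C + t          # the center must land on the camera origin
print(np.allclose(residual, 0.0))  # True
```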



In [31]:
def get_camera_centers_from_df(df):
    out = {}
    for row in df.iterrows():
        row = row[1]
        fname = row['image_path']
        R = np.array([float(x) for x in (row['rotation_matrix'].split(';'))]).reshape(3,3)
        t = np.array([float(x) for x in (row['translation_vector'].split(';'))]).reshape(3)
        center = -R.T @ t
        out[fname] = center
    return out




---

# Evaluation of Reconstruction with Mean Average Accuracy (mAA) Metric 📈

#### Explanation of the Code

The function `evaluate_rec` evaluates the reconstruction accuracy by comparing the user-provided camera centers with the ground truth camera centers using the mean Average Accuracy (mAA) metric. The process involves extracting camera centers, registering them, and computing the mAA metric.

1. **Function Definition**:
   - The function `evaluate_rec` takes the following parameters:
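     - `gt_df`: Ground truth DataFrame with `image_path`, `rotation_matrix`, and `translation_vector` columns.
     - `user_df`: User DataFrame with the predicted poses in the same format.
     - `inl_cf`: Inlier coefficient for the registration: a triplet model is refined only if its inlier count exceeds `inl_cf` times the best count so far.
     - `strict_cf`: Strict inlier coefficient: cameras with error below `strict_cf` times the threshold are strict inliers, and triplets made only of strict inliers are skipped.
     - `skip_top_thresholds`: Number of thresholds to skip from the start of the list in the mAA calculation.
     - `to_dec`: Number of cameras subtracted from the inlier count and from the camera total in the mAA calculation.
     - `thresholds`: List of distance thresholds, used both for the registration and for the mAA calculation.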

2. **Get Camera Centers**:
   - Extract camera centers from the ground truth and user DataFrames using the `get_camera_centers_from_df` function.

3. **Prepare Data for Evaluation**:
   - Initialize a list `good_cams` to store images present in both ground truth and user data.
   - Initialize matrices `u_cameras` and `g_cameras` to store the camera centers for the images in `good_cams`.

4. **Fill Matrices with Camera Centers**:
   - Populate the matrices `u_cameras` and `g_cameras` with the camera centers for the corresponding images in `good_cams`.

5. **Register Camera Centers**:
   - Register the user camera centers to the ground truth camera centers using the `register_by_Horn` function. This function finds the best transformation matrix for each threshold.

6. **Compute mAA**:
   - Calculate the mean Average Accuracy (mAA) using the `mAA_on_cameras` function, which measures the accuracy of the registered camera centers against the ground truth.

7. **Return mAA**:
   - Return the computed mAA value.

#### Example Usage

```python
import pandas as pd

# Example ground truth and user DataFrames.
# The registration works on camera triplets and mAA subtracts `to_dec` (3 by default)
# from the camera count, so the example needs more than 3 shared cameras.
names = [f'img{i}.jpg' for i in range(1, 6)]
gt_df = pd.DataFrame({
    'image_path': names,
    'rotation_matrix': ['1;0;0;0;1;0;0;0;1'] * 5,
    'translation_vector': ['0;0;0', '1;0;0', '0;1;0', '0;0;1', '1;1;1'],
})
# A submission with the same poses as the ground truth
user_df = gt_df.copy()

# Evaluate reconstruction
mAA = evaluate_rec(gt_df, user_df)
print(f'mAA: {mAA * 100:.2f}%')
```

### Key Concepts

1. **Camera Center Extraction**:
   - The camera centers are extracted using the rotation matrix and translation vector for each image.

2. **RANSAC (Random Sample Consensus)**:
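   - `register_by_Horn` follows a RANSAC-style scheme: it fits a model to each camera triplet, counts the inliers, and keeps the best model per threshold. Unlike classic RANSAC, it enumerates all triplets instead of sampling them at random.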

3. **mAA (mean Average Accuracy)**:
   - mAA measures the accuracy of the registered camera centers against the ground truth camera centers over different thresholds.

4. **Horn's Method**:
   - Horn's method is used for registering the camera centers by finding the optimal transformation matrix.
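
The data preparation in steps 3 and 4 amounts to intersecting two dictionaries and stacking the shared entries column by column. The sketch below shows that idea in isolation. `stack_common_centers` is a hypothetical helper and the camera centers are made-up values.

```python
import numpy as np

def stack_common_centers(user_centers, gt_centers):
    """Return the shared image names and two (3, n) matrices with matching columns."""
    common = [name for name in gt_centers if name in user_centers]
    u = np.zeros((3, len(common)))
    g = np.zeros((3, len(common)))
    for col, name in enumerate(common):
        u[:, col] = user_centers[name]
        g[:, col] = gt_centers[name]
    return common, u, g

gt = {'a.jpg': np.array([0.0, 0.0, 0.0]),
      'b.jpg': np.array([1.0, 0.0, 0.0]),
      'c.jpg': np.array([0.0, 1.0, 0.0])}
user = {'c.jpg': np.array([0.0, 2.0, 0.0]),
        'a.jpg': np.array([0.0, 0.0, 0.0])}

common, u, g = stack_common_centers(user, gt)
print(common)   # ['a.jpg', 'c.jpg']
print(u.shape)  # (3, 2)
```

Note that `evaluate_rec` passes the number of ground truth rows (`gt_df.shape[0]`) to `mAA_on_cameras`, so an image missing from the submission, such as `b.jpg` here, still counts in the denominator.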



In [32]:

def evaluate_rec(gt_df, user_df, inl_cf = 0.8, strict_cf=0.5, skip_top_thresholds=2, to_dec=3,
                 thresholds=[0.005, 0.01, 0.02, 0.03, 0.04, 0.05, 0.1, 0.15, 0.2]):
    # get camera centers
    ucameras = get_camera_centers_from_df(user_df)
    gcameras = get_camera_centers_from_df(gt_df)    

    # the denominator for mAA ratio
    m = gt_df.shape[0]
    
    # get the image list to use
    good_cams = []
    for image_path in gcameras.keys():
        if image_path in ucameras.keys():
            good_cams.append(image_path)
        
    # put corresponding camera centers into matrices
    n = len(good_cams)
    u_cameras = np.zeros((3, n))
    g_cameras = np.zeros((3, n))
    
    ii = 0
    for i in good_cams:
        u_cameras[:, ii] = ucameras[i]
        g_cameras[:, ii] = gcameras[i]
        ii += 1
        
    # Horn camera centers registration, a different best model for each camera threshold
    model = register_by_Horn(u_cameras, g_cameras, np.asarray(thresholds), inl_cf, strict_cf)
    
    # transformation matrix
#     print("\nTransformation matrix for maximum threshold")
    T = np.squeeze(model['transf_matrix'][-1])
#     print(T)
    
    # mAA
    mAA = mAA_on_cameras(model["err"], thresholds, m, skip_top_thresholds, to_dec)
    # print(f'mAA = {mAA * 100 : .2f}% considering {m} input cameras - {to_dec}')
    return mAA






---

# Scoring Evaluation Function 🔍📊

#### Explanation of the Code

The `score` function computes the overall score of a submission by evaluating the reconstruction performance across different scenes or datasets. It calculates the mean Average Accuracy (mAA) across all scenes.

1. **Function Definition**:
   - The `score` function takes two parameters:
     - `solution`: A DataFrame with the ground truth poses (`image_path`, `dataset`, `rotation_matrix`, `translation_vector`).
     - `submission`: A DataFrame with the user-submitted poses in the same format.

2. **Scene-wise Evaluation**:
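   - Iterate over each distinct value of the `dataset` column in the solution DataFrame.
   - Sort both the ground truth and user-submitted rows for that dataset by image path in ascending order.
   - Call `evaluate_rec` with `inl_cf=0`, `strict_cf=-1`, `skip_top_thresholds=0`, `to_dec=3`, and the thresholds from `translation_thresholds_meters_dict[dataset]` to compute the `mAA` for that dataset.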

3. **Compute Overall Score**:
   - Calculate the mean mAA across all scenes to get the overall score for the submission.

4. **Return Score**:
   - Return the computed score as a float value.

#### Example Usage

```python
import pandas as pd

# Example ground truth and submission DataFrames.
# `score` looks up the thresholds in `translation_thresholds_meters_dict`, so every
# value in the `dataset` column must be a key of that dictionary.
# 'scene1' is a placeholder name used for illustration only.
names = [f'img{i}.jpg' for i in range(1, 6)]
solution_df = pd.DataFrame({
    'image_path': names,
    'dataset': ['scene1'] * 5,
    'rotation_matrix': ['1;0;0;0;1;0;0;0;1'] * 5,
    'translation_vector': ['0;0;0', '1;0;0', '0;1;0', '0;0;1', '1;1;1'],
})
# A submission with the same poses as the ground truth
submission_df = solution_df.copy()

# Compute score
overall_score = score(solution_df, submission_df)
print(f'Overall Score: {overall_score:.4f}')
```


### Key Concepts

1. **Scoring Function**:
   - The `score` function evaluates the overall performance of a submission by computing the mean mAA across different scenes.

2. **Scene-wise Evaluation**:
   - It iterates over each scene and computes the mAA metric for that scene individually.

3. **Mean Average Accuracy (mAA)**:
   - mAA measures the accuracy of the registered camera centers against the ground truth camera centers.

4. **Pandas DataFrame Operations**:
   - It performs sorting and filtering operations on DataFrames to process ground truth and user-submitted data.
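
The per-dataset loop can be sketched independently of the reconstruction metric. In the snippet below, `mean_over_datasets` and the toy `coverage` metric are hypothetical stand-ins used only to show the filter, sort, evaluate, and average pattern. The real `score` function calls `evaluate_rec` instead.

```python
import pandas as pd

def mean_over_datasets(solution, submission, metric):
    """Apply `metric` to each dataset's rows and return the unweighted mean."""
    results = []
    for dataset in sorted(solution['dataset'].unique()):
        gt = solution[solution['dataset'] == dataset].sort_values('image_path')
        user = submission[submission['dataset'] == dataset].sort_values('image_path')
        results.append(metric(gt, user))
    return sum(results) / len(results)

def coverage(gt, user):
    """Toy metric: fraction of ground truth images present in the submission."""
    return float(gt['image_path'].isin(user['image_path']).mean())

solution = pd.DataFrame({
    'image_path': ['a1.jpg', 'a2.jpg', 'b1.jpg', 'b2.jpg'],
    'dataset': ['A', 'A', 'B', 'B'],
})
submission = pd.DataFrame({
    'image_path': ['a1.jpg', 'a2.jpg', 'b1.jpg'],
    'dataset': ['A', 'A', 'B'],
})

overall = mean_over_datasets(solution, submission, coverage)
print(overall)  # 0.75
```

Each dataset contributes equally to the final value, regardless of how many images it contains.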


In [33]:
def score(solution: pd.DataFrame, submission: pd.DataFrame) -> float:
    
    scenes = list(set(solution['dataset'].tolist()))
    results_per_dataset = []
    for dataset in scenes:
        print(f"\n*** {dataset} ***")
#         start = time.time()
        gt_ds = solution[solution['dataset'] == dataset]
        user_ds = submission[submission['dataset'] == dataset]
        gt_ds = gt_ds.sort_values(by=['image_path'], ascending = True)
        user_ds = user_ds.sort_values(by=['image_path'], ascending = True)
        result = evaluate_rec(gt_ds, user_ds, inl_cf=0, strict_cf=-1, skip_top_thresholds=0, to_dec=3,
                 thresholds=translation_thresholds_meters_dict[dataset])
#         end = time.time()
        print(f"\nmAA: {round(result,4)}")
#         print("Running time: %s" % (end - start))        
        results_per_dataset.append(result)
    return float(np.array(results_per_dataset).mean())


---

# Function to Generate Image Pairs for Matching 🖼️🔍

#### Explanation of the Code

The `get_pairs` function generates pairs of images from a list of images for matching. It considers two modes of operation: exhaustive and non-exhaustive.

1. **Exhaustive Mode**:
   - If the `EXHAUSTIVE` flag is set to `True`, the function generates all possible combinations of image pairs using `itertools.combinations`.

2. **Non-exhaustive Mode**:
   - If the flag is set to `False`, the function extracts features from each image using a pre-trained DINOv2 model.
   - It builds one embedding per image by max-pooling the model's last hidden state over tokens and L2-normalizing the result, then computes all pairwise distances with `torch.cdist`.
   - Image pairs with distances at or below `DISTANCES_THRESHOLD` are considered potential matches.
   - If an image has no partner within that threshold, its nearest neighbours by distance are added instead (the slice `[1:MIN_PAIRS]` selects `MIN_PAIRS - 1` of them).
   - Pairs with distances at or above `TOLERANCE` are then filtered out.
   - The function returns a list of unique index pairs `(i, j)` with `i < j`.

3. **Input**:
   - `images_list`: A list of image paths.
   - `device`: The device (CPU or GPU) to use for processing.

4. **Output**:
   - A list of tuples representing pairs of indices, where each tuple contains two indices indicating the positions of matching images in the input list.

#### Example Usage

```python
# Example usage with a list of image paths
images_list = ['image1.jpg', 'image2.jpg', 'image3.jpg']
pairs = get_pairs(images_list)
print(pairs)
```

### Key Concepts

1. **Feature Extraction with DINOv2**:
   - The function utilizes a pre-trained DINOv2 model to extract embeddings from images.
   - These embeddings represent the features of each image.

2. **Pairwise Distance Calculation**:
   - It computes pairwise distances between the embeddings of all images.
   - Images with distances below a certain threshold are considered potential matches.

3. **Filtering and Ranking**:
   - The function filters out pairs with distances at or above a tolerance level.
   - Images left without any partner after thresholding fall back to their nearest neighbours in the distance ranking.

4. **Optimization**:
   - All pairwise distances are computed once, in a single `torch.cdist` call, and reused for thresholding, ranking, and filtering.
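
The selection logic can be tried on toy data without loading DINOv2. The sketch below is a NumPy-only approximation of the thresholding and fallback steps, with the tolerance filter omitted for brevity. `select_pairs` and the 2D "embeddings" are assumptions made for illustration and are not part of the notebook.

```python
import numpy as np

def select_pairs(embeddings, dist_threshold, min_pairs):
    """Pair images closer than dist_threshold; isolated images fall back to nearest neighbours."""
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))          # pairwise Euclidean distances
    mask = dist <= dist_threshold
    np.fill_diagonal(mask, False)                     # an image is never paired with itself
    for i in np.where(mask.sum(axis=1) == 0)[0]:      # images with no partner so far
        nearest = np.argsort(dist[i])[1:min_pairs]    # skip position 0 (the image itself)
        mask[i, nearest] = True
    rows, cols = np.where(mask)
    pairs = {(int(min(i, j)), int(max(i, j))) for i, j in zip(rows, cols)}
    return sorted(pairs)

toy_embeddings = np.array([[0.0, 0.0],
                           [0.1, 0.0],
                           [5.0, 5.0],
                           [5.2, 5.0],
                           [20.0, 20.0]])

pairs = select_pairs(toy_embeddings, dist_threshold=0.3, min_pairs=2)
print(pairs)  # [(0, 1), (2, 3), (3, 4)]
```

Image 4 has no neighbour within the threshold, so it is linked to its single nearest image (index 3), mirroring the `[1:MIN_PAIRS]` slice with `MIN_PAIRS = 2`.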



In [34]:
def get_pairs(images_list,device=DEVICE):
    if EXHAUSTIVE:
        return list(combinations(range(len(images_list)), 2)) 
    
    processor = AutoImageProcessor.from_pretrained('/kaggle/input/dinov2/pytorch/base/1/')
    model = AutoModel.from_pretrained('/kaggle/input/dinov2/pytorch/base/1/').eval().to(device)
    embeddings = []
    
    for img_path in images_list:
        image = K.io.load_image(img_path, K.io.ImageLoadType.RGB32, device=device)[None, ...]
        with torch.inference_mode():
            inputs = processor(images=image, return_tensors="pt", do_rescale=False, do_resize=True, 
                               do_center_crop=True, size=224).to(device)
            outputs = model(**inputs)
            embedding = F.normalize(outputs.last_hidden_state.max(dim=1)[0])
        embeddings.append(embedding)
        
    embeddings = torch.cat(embeddings, dim=0)
    distances = torch.cdist(embeddings,embeddings).cpu()
    distances_ = (distances <= DISTANCES_THRESHOLD).numpy()
    np.fill_diagonal(distances_,False)
    z = distances_.sum(axis=1)
    idxs0 = np.where(z == 0)[0]
    for idx0 in idxs0:
        t = np.argsort(distances[idx0])[1:MIN_PAIRS]
        distances_[idx0,t] = True
        
    s = np.where(distances >= TOLERANCE)
    distances_[s] = False
    
    idxs = []
    for i in range(len(images_list)):
        for j in range(len(images_list)):
            if distances_[i][j]:
                idxs += [(i,j)] if i<j else [(j,i)]
    
    idxs = list(set(idxs))
    return idxs

---

# Matching Keypoints between Images 📸🔑

#### Explanation of the Code

The `keypoints_matches` function matches keypoints between pairs of images from a list of images.

1. **Feature Extraction and Matching**:
   - It utilizes ALIKED for keypoint detection and LightGlueMatcher for matching keypoints between images.
   - The function first extracts keypoints and descriptors for each image using ALIKED.
   - It then matches descriptors between pairs of images using LightGlueMatcher.
   
2. **Creating and Storing Keypoints and Descriptors**:
   - Keypoints and descriptors are saved in separate HDF5 files (`keypoints.h5` and `descriptors.h5`).
   - Each image's keypoints and descriptors are stored under their respective image names in the HDF5 files.

3. **Storing Matches**:
   - Matches between keypoints of image pairs are stored in another HDF5 file (`matches.h5`).
   - For each pair of images, the indices of matched keypoints are stored only if the number of matches is at least `MIN_MATCHES`.

#### Input Parameters

- `images_list`: A list of image paths as `pathlib.Path` objects (the function uses `image_path.name` as the HDF5 key).
- `pairs`: A list of index tuples identifying the candidate image pairs to match.

#### Output

- Keypoints, descriptors, and matches are stored in HDF5 files (`keypoints.h5`, `descriptors.h5`, and `matches.h5`, respectively).

#### Example Usage

```python
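# Example usage with a list of image paths and candidate pairs
from pathlib import Path

# The function reads `.name` from each entry, so use Path objects rather than plain strings
images_list = [Path('image1.jpg'), Path('image2.jpg'), Path('image3.jpg')]
pairs = [(0, 1), (1, 2)]  # Example candidate pairs (indices into images_list)
keypoints_matches(images_list, pairs)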
```

#### Key Concepts

1. **Keypoint Detection and Description**:
   - ALIKED is used for keypoint detection and description, providing robust feature extraction.
   
2. **Keypoint Matching**:
   - LightGlueMatcher performs matching between keypoints of different images based on their descriptors.
   - Matches are stored as indices of keypoints in the HDF5 file.

3. **Efficient Storage**:
   - HDF5 format is utilized for efficient storage of keypoints, descriptors, and matches.

4. **Data Persistence**:
   - The function ensures data persistence by storing keypoints, descriptors, and matches in HDF5 files, facilitating later retrieval and analysis.
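
To illustrate what "matches stored as indices" means, the sketch below produces an `(M, 2)` array of index pairs from two descriptor sets. It uses plain mutual nearest neighbour matching as a much simpler stand-in for LightGlue, which is a learned matcher. `mutual_nn_matches` and the toy descriptors are assumptions made for illustration.

```python
import numpy as np

def mutual_nn_matches(desc1, desc2):
    """Return an (M, 2) array of (index in image 1, index in image 2) mutual nearest neighbours."""
    d1 = desc1 / np.linalg.norm(desc1, axis=1, keepdims=True)
    d2 = desc2 / np.linalg.norm(desc2, axis=1, keepdims=True)
    sim = d1 @ d2.T                       # cosine similarity matrix
    nn12 = sim.argmax(axis=1)             # best partner in image 2 for each keypoint of image 1
    nn21 = sim.argmax(axis=0)             # best partner in image 1 for each keypoint of image 2
    idx1 = np.arange(desc1.shape[0])
    mutual = nn21[nn12] == idx1           # keep only pairs that choose each other
    return np.stack([idx1[mutual], nn12[mutual]], axis=1)

desc_a = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
desc_b = np.array([[0.0, 1.0], [1.0, 0.0]])

matches = mutual_nn_matches(desc_a, desc_b)
print(matches.tolist())  # [[0, 1], [1, 0]]
```

Each row refers to positions in the per-image keypoint arrays, which is why the keypoints themselves do not need to be duplicated in `matches.h5`.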

### References
- [ALIKED: A Lighter Keypoint and Descriptor Extraction Network via Deformable Transformation](https://arxiv.org/abs/2304.03608)
- [LightGlue: Local Feature Matching at Light Speed](https://github.com/cvg/LightGlue)

In [35]:
def keypoints_matches(images_list,pairs):
    extractor = ALIKED(max_num_keypoints=MAX_NUM_KEYPOINTS,detection_threshold=DETECTION_THRESHOLD,resize=RESIZE_TO).eval().to(DEVICE)
    matcher = KF.LightGlueMatcher("aliked", {'width_confidence':-1, 'depth_confidence':-1, 'mp':True if 'cuda' in str(DEVICE) else False}).eval().to(DEVICE)
    rotation = create_model("swsl_resnext50_32x4d").eval().to(DEVICE)
    
    with h5py.File("keypoints.h5", mode="w") as f_kp, h5py.File("descriptors.h5", mode="w") as f_desc:  
        for image_path in images_list:
            with torch.inference_mode():
                image = load_image(image_path).to(DEVICE)
                feats = extractor.extract(image)
                f_kp[image_path.name] = feats["keypoints"].squeeze().cpu().numpy()
                f_desc[image_path.name] = feats["descriptors"].squeeze().detach().cpu().numpy()
                
    with h5py.File("keypoints.h5", mode="r") as f_kp, h5py.File("descriptors.h5", mode="r") as f_desc, \
         h5py.File("matches.h5", mode="w") as f_matches:  
        for pair in pairs:
            key1, key2 = images_list[pair[0]].name, images_list[pair[1]].name
            kp1 = torch.from_numpy(f_kp[key1][...]).to(DEVICE)
            kp2 = torch.from_numpy(f_kp[key2][...]).to(DEVICE)
            desc1 = torch.from_numpy(f_desc[key1][...]).to(DEVICE)
            desc2 = torch.from_numpy(f_desc[key2][...]).to(DEVICE)
            with torch.inference_mode():
                _, idxs = matcher(desc1, desc2, KF.laf_from_center_scale_ori(kp1[None]), KF.laf_from_center_scale_ori(kp2[None]))
            if len(idxs): group = f_matches.require_group(key1)
            if len(idxs) >= MIN_MATCHES: group.create_dataset(key2, data=idxs.detach().cpu().numpy())

---

# RANSAC and Sparse Reconstruction Algorithm 🏗️🔍

#### Explanation of the Code

The `ransac_and_sparse_reconstruction` function performs RANSAC-based matching and sparse reconstruction using COLMAP.

1. **Database Creation**:
   - A new COLMAP database is created with a unique name based on the current timestamp.
   - Tables for keypoints, matches, and other necessary entities are initialized in the database.

2. **Adding Keypoints and Matches**:
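   - The keypoints saved earlier in `/kaggle/working/` are imported into the database by `add_keypoints`, which registers the images with a `simple-pinhole` camera model.
   - The precomputed matches are imported by `add_matches`; no new features are extracted at this stage.
   - The changes are committed to the database.

3. **Matching and Reconstruction**:
   - `pycolmap.match_exhaustive` is run on the database. Since the matches are already imported, its role in this pipeline is geometric verification (the RANSAC part of the function name) rather than fresh SIFT matching.
   - Incremental mapping is executed to generate a sparse reconstruction of the scene.
   - The resulting reconstructions are returned.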

4. **Output**:
   - The function returns the sparse reconstructions generated by COLMAP.

#### Input Parameter

- `images_path`: Path to the directory containing images to be reconstructed.

#### Output

- Sparse reconstructions of the scene obtained from COLMAP.

#### Example Usage

```python
# Example usage with the path to the directory containing images
images_path = '/path/to/images/directory'
sparse_reconstructions = ransac_and_sparse_reconstruction(images_path)
```

#### Key Concepts

1. **RANSAC (Random Sample Consensus)**:
   - RANSAC is utilized for robust estimation of geometric transformations from noisy data.
   - It helps in identifying reliable matches between keypoints despite outliers.

2. **Sparse Reconstruction**:
   - COLMAP performs sparse reconstruction by estimating the 3D structure of the scene from images.
   - It generates a set of sparse 3D points and camera poses, representing the scene geometry.

3. **Incremental Mapping**:
   - COLMAP's incremental mapping algorithm incrementally adds images and performs bundle adjustment to refine the scene structure.

4. **Efficient Database Management**:
   - COLMAP database efficiently manages keypoints, matches, and other data structures required for reconstruction.
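
For readers new to RANSAC, the following toy sketch shows the sample, fit, and count-inliers loop on a 2D line fitting problem. It is only an analogy: COLMAP's geometric verification estimates two-view geometry between images, which this snippet does not attempt. `ransac_line` and the data points are assumptions made for illustration.

```python
import numpy as np

def ransac_line(points, threshold, n_iters=200, seed=0):
    """Return a boolean inlier mask for the best line found through two sampled points."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)   # minimal sample
        p, d = points[i], points[j] - points[i]
        norm = np.hypot(d[0], d[1])
        if norm == 0:
            continue
        # perpendicular distance of every point to the line through the two samples
        dist = np.abs(d[0] * (points[:, 1] - p[1]) - d[1] * (points[:, 0] - p[0])) / norm
        inliers = dist < threshold
        if inliers.sum() > best_inliers.sum():                  # keep the largest consensus set
            best_inliers = inliers
    return best_inliers

# Ten points on y = 2x + 1 followed by three outliers
xs = np.arange(10, dtype=float)
line_points = np.stack([xs, 2 * xs + 1], axis=1)
outliers = np.array([[1.0, 10.0], [5.0, -4.0], [8.0, 30.0]])
points = np.vstack([line_points, outliers])

mask = ransac_line(points, threshold=0.1)
print(int(mask.sum()))  # 10
```

The same principle, applied to keypoint correspondences and an epipolar model, is what discards inconsistent matches before incremental mapping.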

### References
- [COLMAP: Efficient Reconstruction of 3D Scenes from Images](https://colmap.github.io/)
- [RANSAC: Random Sample Consensus](https://www.cs.cmu.edu/~16385/s17/Slides/11.2_RANSAC.pdf)

In [36]:
def ransac_and_sparse_reconstruction(images_path):
    now = datetime.datetime.now()
    time_str = now.strftime("%Y-%m-%d_%H-%M-%S")
    db_name = f'colmap_{time_str}.db'
    db = COLMAPDatabase.connect(db_name)
    db.create_tables()
    fname_to_id = add_keypoints(db, '/kaggle/working/', images_path, '', 'simple-pinhole', False)
    add_matches(db, '/kaggle/working/',fname_to_id)
    db.commit()
    
    pycolmap.match_exhaustive(db_name, sift_options={'num_threads':1})
    maps = pycolmap.incremental_mapping(
        database_path=db_name, 
        image_path=images_path,
        output_path='/kaggle/working/', 
        options=pycolmap.IncrementalPipelineOptions({'min_model_size':MIN_MODEL_SIZE, 'max_num_models':MAX_NUM_MODELS, 'num_threads':1})
    )
    return maps

---

# Parameters Configuration 🛠️

#### Similar Pairs Detection Parameters:
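- **Exhaustive**: If `True`, every image pair is matched and the DINOv2-based shortlist is skipped.
- **Minimum Pairs**: In non-exhaustive mode, controls how many nearest neighbours are added for an image that has no partner within the distance threshold.
- **Distances Threshold**: Maximum embedding distance for two images to be paired.
- **Tolerance**: Upper bound on the embedding distance: pairs at or above it are always dropped.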

#### Keypoints Extractor and Matcher Parameters:
- **Max Number of Keypoints**: Maximum number of keypoints to extract from each image.
- **Resize Dimension**: Dimensions to which images are resized before keypoints extraction.
- **Detection Threshold**: Threshold for keypoints detection.
- **Minimum Matches**: Minimum number of matches an image pair needs for its matches to be stored.

#### RANSAC and Sparse Reconstruction Parameters:
- **Minimum Model Size**: Minimum number of images required for a valid reconstruction model.
- **Maximum Number of Models**: Maximum number of reconstruction models to generate.

#### Cross-Validation Parameters:
- **Number of Samples**: Maximum number of images sampled from each (dataset, scene) group for local validation.

#### Submission Control:
- **Submission**: If `True`, the pipeline runs on the sample submission file; if `False`, it runs local validation on sampled training images and prints the mAA.



In [37]:
# SIMILAR PAIRS
EXHAUSTIVE = True
MIN_PAIRS = 50
DISTANCES_THRESHOLD = 0.3
TOLERANCE = 500

# KEYPOINTS EXTRACTOR AND MATCHER
MAX_NUM_KEYPOINTS = 4096
RESIZE_TO = 1280
DETECTION_THRESHOLD = 0.005
MIN_MATCHES = 100

# RANSAC AND SPARSE RECONSTRUCTION
MIN_MODEL_SIZE = 4
MAX_NUM_MODELS = 3

# CROSS VALIDATION
N_SAMPLES = 50

SUBMISSION = True

---

**Explanation**

1. Defining a function `image_path(row)` to create the image paths for the train dataset.
2. Reading the train dataset from a CSV file and applying the `image_path` function to generate the image paths.
3. Sampling a specified number of image paths from each group in the train dataset.
4. Creating a ground truth dataframe (`gt_df`) containing the sampled image paths.
5. Creating a prediction dataframe (`pred_df`) from the ground truth dataframe.
6. Saving the prediction dataframe to a CSV file named `pred_df.csv`.
7. Running the evaluation pipeline (`run`) with the prediction dataframe.
8. Reading the generated submission CSV file (`submission.csv`).
9. Calculating the mean Average Accuracy (mAA) score between the ground truth and prediction dataframes.

Finally, the total mAA score is printed.
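
The sampling in step 3 takes at most `N_SAMPLES` images from each group. A minimal sketch of that pattern, using a hypothetical `sample_per_group` helper and a toy DataFrame, could look like this:

```python
import pandas as pd

def sample_per_group(df, group_cols, n, seed=42):
    """Sample up to n rows from each group, keeping whole groups that are smaller than n."""
    parts = []
    for _, group in df.groupby(group_cols):
        k = min(n, len(group))
        parts.append(group.sample(k, random_state=seed))
    return pd.concat(parts).reset_index(drop=True)

toy = pd.DataFrame({
    'dataset': ['a', 'a', 'a', 'b'],
    'scene': ['s1', 's1', 's1', 's2'],
    'image_path': ['a/1.jpg', 'a/2.jpg', 'a/3.jpg', 'b/1.jpg'],
})

sampled = sample_per_group(toy, ['dataset', 'scene'], n=2)
print(sampled.groupby('dataset').size().to_dict())  # {'a': 2, 'b': 1}
```

A fixed `random_state` keeps the validation subset identical between runs, so mAA values from different parameter settings remain comparable.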

### Example Usage
```python
if not SUBMISSION:
    ...  # Execute the operations (the local validation steps listed above)
```

Set `SUBMISSION = False` to run this local validation block; with `SUBMISSION = True` it is skipped.

In [38]:
if not SUBMISSION:
    def image_path(row):
        row['image_path'] = 'train/' + row['dataset'] + '/images/' + row['image_name']
        return row

    train_df = pd.read_csv(f'{IMC_PATH}/train/train_labels.csv')
    train_df = train_df.apply(image_path,axis=1).drop_duplicates(subset=['image_path'])
    G = train_df.groupby(['dataset','scene'])['image_path']
    image_paths = []
    
    for _, paths in G:
        # Sample at most N_SAMPLES image paths from each (dataset, scene) group
        n = min(N_SAMPLES, len(paths))
        image_paths.extend(paths.sample(n, random_state=42))
        
    gt_df = train_df[train_df.image_path.isin(image_paths)].reset_index(drop=True)
    pred_df = gt_df[['image_path','dataset','scene','rotation_matrix','translation_vector']]
    pred_df.to_csv('pred_df.csv',index=False)
    run('pred_df.csv', get_pairs, keypoints_matches, ransac_and_sparse_reconstruction, submit=False)
    pred_df = pd.read_csv('submission.csv')
    mAA = round(score(gt_df, pred_df),4)
    print('*** Total mean Average Accuracy ***')
    print(f"mAA: {mAA}")
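
To make the printed number easier to read, here is a simplified sketch of the general idea behind a mean Average Accuracy: for each error threshold, take the fraction of items whose error falls within it, then average those fractions over all thresholds. This is not the `score` function used above. The helper `mean_average_accuracy`, the error values, and the thresholds are made up for illustration, and the competition metric involves further steps that are not shown here.

```python
def mean_average_accuracy(errors, thresholds):
    """Average, over thresholds, of the fraction of errors within each threshold."""
    if not errors or not thresholds:
        raise ValueError("errors and thresholds must be non-empty")
    accuracies = [
        sum(e <= t for e in errors) / len(errors)  # accuracy at threshold t
        for t in thresholds
    ]
    return sum(accuracies) / len(accuracies)


# Hypothetical per-image errors and thresholds (arbitrary units).
errors = [0.02, 0.08, 0.3, 1.5]
thresholds = [0.05, 0.1, 0.5]

maa = mean_average_accuracy(errors, thresholds)
print(f"mAA: {round(maa, 4)}")
```

Averaging over several thresholds rewards results that are accurate at tight tolerances without ignoring those that are only roughly correct.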

---

**Explanation**

1. It sets `data_path` to the location of the sample submission CSV file.
2. It calls the `run` function with `data_path`, `get_pairs`, `keypoints_matches`, and `ransac_and_sparse_reconstruction`.

This block runs the entire submission pipeline using the provided functions and the sample submission data. It executes only when `SUBMISSION` is `True`.



In [39]:
if SUBMISSION:
    data_path = IMC_PATH + "/sample_submission.csv"
    run(data_path, get_pairs, keypoints_matches, ransac_and_sparse_reconstruction)

---

## 🌟 Keep Exploring! 🌟

Thanks a bunch for diving into this notebook! If you had a blast or learned something new, why not dive into more of my captivating projects and contributions on my profile?

👉 [Let's Explore More!](https://www.kaggle.com/zulqarnainalipk) 👈

[GitHub](https://github.com/zulqarnainalipk) |
[LinkedIn](https://www.linkedin.com/in/zulqarnainalipk/)

## 💬 Share Your Thoughts! 💡

Your feedback is like treasure to me! Your ideas and insights fuel my ongoing improvement. Got something to say, ask, or suggest? Don't hold back!

📬 Drop me a line via email: [zulqar445ali@gmail.com](mailto:zulqar445ali@gmail.com)

Huge thanks for your time and engagement. Your support is like rocket fuel propelling me to create even more epic content.
Keep coding joyfully, and I wish you stellar success in your data science adventures! 🚀
