# Overlap Reduction Experiment  

## Objective  
This experiment aims to determine the **minimum overlap required** to achieve accurate **3D alignment and reconstruction** while using fewer images.  

### Dataset Selection  
- A set of images of the **same building** is selected, captured with **high initial overlap** (e.g., **80% frontal, 60% lateral**).  
- Preferably, the dataset should feature a **complex structure** (facades, corners, textured roofs) to challenge the reconstruction process.  

### Creating Subsets with Different Overlaps  
Multiple subsets are generated by **progressively reducing** the overlap:  

| Dataset  | Frontal Overlap | Lateral Overlap | Images Retained (vs. Baseline) |
|----------|----------------|-----------------|------------------|
| **Full Dataset** | 80% | 60% | 100% (Baseline) |
| **Set 1** | 70% | 50% | 🔽 -X% |
| **Set 2** | 60% | 40% | 🔽 -Y% |
| **Set 3** | 50% | 30% | 🔽 -Z% |
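
As a rough guide to how the image count might shrink between rows, the sketch below assumes an idealized regular capture grid in which the spacing between neighbouring images is proportional to `(1 - overlap)` in each direction. Under that assumption the image count scales with `1 / ((1 - frontal) * (1 - lateral))`. The function name `retained_fraction` is hypothetical, and real datasets (irregular paths, oblique shots) will deviate from this estimate, so it does not replace the measured values in the table.

```python
def retained_fraction(frontal, lateral, base_frontal=0.80, base_lateral=0.60):
    """Estimate the fraction of baseline images needed at a lower overlap.

    Assumption: images lie on a regular grid, so the spacing between
    neighbours is proportional to (1 - overlap) in each direction.
    Overlaps are given as fractions in [0, 1).
    """
    for value in (frontal, lateral, base_frontal, base_lateral):
        if not 0.0 <= value < 1.0:
            raise ValueError("overlaps must be fractions in [0, 1)")
    along_track = (1.0 - base_frontal) / (1.0 - frontal)
    across_track = (1.0 - base_lateral) / (1.0 - lateral)
    return along_track * across_track


if __name__ == "__main__":
    levels = {"Set 1": (0.70, 0.50), "Set 2": (0.60, 0.40), "Set 3": (0.50, 0.30)}
    for name, (frontal, lateral) in levels.items():
        print(f"{name}: about {retained_fraction(frontal, lateral):.0%} of the baseline images")
```
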

**DeepPrune’s CNN** is applied to select the most informative images for each subset, and each selection is then compared with a **randomly selected subset** of the same size.  
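
A random baseline for this comparison could be drawn as in the following minimal sketch. It assumes the baseline should contain the same number of images as the CNN-selected subset and should be reproducible; the function name `random_subset` and the default seed are illustrative choices, not part of the original script.

```python
import random


def random_subset(image_paths, n_select, seed=42):
    """Draw a reproducible random baseline of n_select images.

    A private random.Random instance is used so the draw does not
    disturb (or depend on) the global random state.
    """
    image_paths = list(image_paths)
    if not 0 <= n_select <= len(image_paths):
        raise ValueError("n_select must be between 0 and the number of images")
    rng = random.Random(seed)
    return sorted(rng.sample(image_paths, n_select))
```

Calling `random_subset(all_paths, len(selected_images))` would then give a baseline that matches the size of a CNN-selected subset, so that differences in reconstruction quality are not explained by image count alone.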

### Processing in Reality Capture  
For each subset:  
1. The images are loaded into **Reality Capture**.  
2. The **image alignment** process is executed, recording:  
   - Number of aligned images  
   - Number of detected key points  
   - Point cloud quality  
3. The **3D mesh** is generated and evaluated based on:  
   - Presence of **geometric errors**  
   - Presence of **gaps or distortions**  
   - **Processing time** and **memory usage**  
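
The quantities recorded in the steps above could be kept in one small record per subset, as in this sketch. It assumes the values are read from the Reality Capture reports and entered by hand; the class `SubsetMetrics`, its field names, and `metrics_to_csv` are hypothetical, and qualitative observations (point cloud quality, gaps, distortions) would still need to be noted separately.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class SubsetMetrics:
    """Measurements noted for one subset after alignment and meshing."""
    subset: str
    images_loaded: int
    images_aligned: int
    keypoints: int
    processing_time_s: float
    peak_memory_mb: float

    @property
    def alignment_rate(self):
        """Fraction of loaded images that were successfully aligned."""
        if self.images_loaded == 0:
            return 0.0
        return self.images_aligned / self.images_loaded


def metrics_to_csv(rows):
    """Serialize a list of SubsetMetrics to CSV text, header included."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(SubsetMetrics)],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()
```

Writing the returned text to a file gives one row per subset, which keeps the later comparisons (image count against time, memory, and alignment rate) in a single table.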

### Comparison and Analysis  
The results are analyzed through visual comparisons:  
- **Number of images vs. Reconstruction accuracy**  
- **Processing time vs. Number of images**  
- **Quality loss in the mesh with lower overlap**  

### Expected Outcome  
The experiment seeks to identify the **optimal point** where the number of images can be **reduced** **without significantly compromising accuracy**.  
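
One way to make the "optimal point" concrete is to pick the smallest subset whose accuracy stays within a tolerance of the baseline. The sketch below assumes a single accuracy score per subset where higher is better, and a 5% relative tolerance; both the score and the threshold are illustrative assumptions, as is the function name `smallest_acceptable_subset`.

```python
def smallest_acceptable_subset(results, baseline, max_relative_loss=0.05):
    """Return the name of the smallest subset that stays close to the baseline.

    results maps a subset name to (number_of_images, accuracy), where a
    higher accuracy is better. A subset is acceptable when its accuracy
    drops by at most max_relative_loss relative to the baseline.
    """
    base_accuracy = results[baseline][1]
    if base_accuracy <= 0:
        raise ValueError("baseline accuracy must be positive")
    acceptable = [
        (n_images, name)
        for name, (n_images, accuracy) in results.items()
        if (base_accuracy - accuracy) / base_accuracy <= max_relative_loss
    ]
    # The baseline itself always qualifies, so the list is never empty.
    return min(acceptable)[1]
```

Ties in image count are broken alphabetically by subset name, which only matters if two subsets have exactly the same size.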


### Preliminary Script
Tasks:
- Extracts visual features from images using ResNet50 to select the most informative ones.
- Filters out redundant images using K-Means clustering, optimizing the selection.
- Generates one subset per frontal overlap level (80%, 70%, 60%, 50%), using the percentage as the share of images to keep.
- Copies the selected images to folders ready for processing in Reality Capture.

import os
import shutil
import numpy as np
import exifread
from sklearn.cluster import KMeans
import torch
import torchvision.transforms as transforms
from torchvision import models
from PIL import Image

# --------------------------
# CONFIGURATION
# --------------------------
BASE_FOLDER = "path_to_photogrammetry_images"  # Path to original images
OUTPUT_FOLDER = "path_to_selected_images"  # Path to save subsets

# Define overlap levels to test
OVERLAP_LEVELS = {"80_60": (80, 60), "70_50": (70, 50), "60_40": (60, 40), "50_30": (50, 30)}

# --------------------------
# FUNCTION TO READ EXIF METADATA
# --------------------------
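def get_exif_metadata(image_path):
    """Return the raw GPS latitude/longitude EXIF values, or {} if the image has none.

    The values are the (degrees, minutes, seconds) ratios stored in EXIF;
    they are not converted to signed decimal degrees here.
    """
    with open(image_path, 'rb') as f:
        # details=False skips maker notes and thumbnails, which GPS lookup does not need
        tags = exifread.process_file(f, details=False)
    gps_data = {}
    if 'GPS GPSLatitude' in tags and 'GPS GPSLongitude' in tags:
        lat = tags['GPS GPSLatitude'].values
        lon = tags['GPS GPSLongitude'].values
        gps_data = {'latitude': lat, 'longitude': lon}
    return gps_data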

# --------------------------
# FUNCTION TO EXTRACT FEATURES WITH CNN (RESNET50)
# --------------------------
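def extract_features(image_path, model, transform):
    """Return the CNN embedding of one image as a 1-D NumPy array."""
    image = Image.open(image_path).convert('RGB')
    image = transform(image).unsqueeze(0)  # add the batch dimension
    with torch.no_grad():
        features = model(image)
    # Flatten (1, C, 1, 1) to (C,) so that all vectors stack into a 2-D matrix
    return features.flatten().numpy()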

# --------------------------
# IMAGE SELECTION BASED ON KEY INFORMATION (DeepPrune)
# --------------------------
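def select_optimal_images(image_folder, overlap_percentage, model, transform):
    """Pick one representative image per K-Means cluster of CNN features.

    overlap_percentage is used as the percentage of images to keep
    (70 keeps roughly 70% of the folder); it is a proxy, not a measured overlap.
    """
    # Ignore non-image files (sidecar files, hidden files, subfolders)
    valid_ext = ('.jpg', '.jpeg', '.png', '.tif', '.tiff')
    image_paths = [
        os.path.join(image_folder, name)
        for name in sorted(os.listdir(image_folder))
        if name.lower().endswith(valid_ext)
    ]
    if not image_paths:
        raise ValueError(f"No images found in {image_folder}")

    feature_vectors = np.array(
        [extract_features(path, model, transform) for path in image_paths]
    )

    # Group similar images with KMeans to avoid redundancy
    n_clusters = max(1, len(image_paths) * overlap_percentage // 100)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=42)
    labels = kmeans.fit_predict(feature_vectors)

    # Select the image closest to each cluster centroid as its representative
    selected_images = []
    for i in range(n_clusters):
        members = np.where(labels == i)[0]
        if members.size == 0:  # possible when many images are near-identical
            continue
        distances = np.linalg.norm(
            feature_vectors[members] - kmeans.cluster_centers_[i], axis=1
        )
        selected_images.append(image_paths[members[np.argmin(distances)]])

    return selected_images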

# --------------------------
# MAIN PROCESSING
# --------------------------
if __name__ == "__main__":
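    # Load pre-trained ResNet50 and drop the final classification layer.
    # The `weights` argument replaces the deprecated `pretrained=True`
    # (requires torchvision >= 0.13); IMAGENET1K_V1 matches the old behaviour.
    resnet50 = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    resnet50 = torch.nn.Sequential(*list(resnet50.children())[:-1])
    resnet50.eval()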
    transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])
    
    for level, (front_overlap, side_overlap) in OVERLAP_LEVELS.items():
        print(f"Processing overlap {level}...")
        output_path = os.path.join(OUTPUT_FOLDER, level)
        os.makedirs(output_path, exist_ok=True)
        
        # Only the frontal overlap drives the selection; side_overlap is not used yet
        selected_images = select_optimal_images(BASE_FOLDER, front_overlap, resnet50, transform)
        
        # Copy selected images to the new folder
        for img in selected_images:
            shutil.copy(img, output_path)
        
        print(f"{len(selected_images)} images selected and saved in {output_path}")
