3DGeoRef is an automated pipeline for georeferencing 3D models using synthetic rendering, AI-powered geolocation, and satellite imagery. The system transforms an arbitrary 3D model into a georeferenced asset aligned with real-world geographic coordinates, ready for integration into GIS systems or 3D viewers.
- Overview
- Features
- Project Structure
- Installation
- Usage
- Pipeline Workflow
- Module Documentation
- Requirements
- License
- Future Updates
- Changelog
Given a 3D model (GLB or GLTF format), 3DGeoRef performs the following operations:
- Synthetic View Generation: Creates multiple rendered views of the model using Blender
- Geolocation Estimation: Estimates geographic coordinates using AI models (GeoCLIP, Ollama, or Gemini)
- Satellite Image Download: Retrieves high-resolution satellite imagery from Mapbox API
- Image Matching: Performs feature matching between synthetic renders and satellite images using Deep Image Matching
- Transformation Computation: Calculates the affine transformation matrix to align the model
- Georeferencing: Applies the transformation and elevation alignment to produce the final georeferenced model
- Multi-Model AI Geolocation: Choose between GeoCLIP, Ollama (llama3.2-vision), or Google Gemini for location estimation
- Automated Pipeline: End-to-end processing from raw 3D model to georeferenced output
- Flexible Execution Modes: Run the full pipeline, geolocation only, or image matching only
- Docker Support: Fully containerized environment with GPU support for CUDA acceleration
- High-Quality Rendering: Blender-based synthetic view generation with HDRI lighting
- Robust Image Matching: Integration with Deep Image Matching for accurate feature correspondence
- Comprehensive Transformation Library: Advanced 3D transformation utilities for precise alignment
```
3DGeoRef/
├── main.py                      # Main entry point for the pipeline
├── Dockerfile                   # Docker image for the main pipeline (with Blender & CUDA)
├── docker-compose.yml           # Docker Compose configuration
├── hdri/                        # HDRI environment maps for rendering
├── pipeline/                    # Core pipeline modules
│   ├── __init__.py
│   ├── core.py                  # PipelineProcessor - main orchestration logic
│   ├── geolocation/             # Geolocation estimation modules
│   │   ├── __init__.py
│   │   ├── geoclip.py           # GeoCLIP-based geolocation
│   │   ├── ollama.py            # Ollama AI-based geolocation
│   │   └── gemini.py            # Google Gemini-based geolocation
│   ├── georeferencing/          # Georeferencing and transformation modules
│   │   ├── __init__.py
│   │   ├── dim.py               # Deep Image Matching integration
│   │   └── transformer.py       # 3D model transformation and alignment
│   ├── rendering/               # 3D rendering modules
│   │   ├── __init__.py
│   │   └── multiview.py         # Blender-based multi-view synthetic rendering
│   ├── services/                # External service integrations
│   │   ├── __init__.py
│   │   └── satellite_downloader.py  # Mapbox satellite imagery download
│   └── utils/                   # Utility modules
│       ├── __init__.py
│       └── transformations.py   # 3D transformation matrix utilities
└── README.md                    # This file
```
The easiest way to run 3DGeoRef is using Docker, which provides a pre-configured environment with all dependencies including Blender, CUDA, and Deep Image Matching.
- Docker Engine 20.10+
- Docker Compose 2.0+
- NVIDIA GPU with CUDA support (optional but recommended)
- NVIDIA Container Toolkit (for GPU acceleration)
```bash
# Clone the repository
git clone https://github.com/3DOM-FBK/3DGeoRef.git
cd 3DGeoRef

# Pull the pre-built Docker image
docker pull 3domfbk/3d-georef:04032026

# Or build the Docker image locally
docker build -t 3domfbk/3d-georef:04032026 .

# Run the container interactively
docker run --rm -it \
  --gpus all \
  -v /path/to/your/data:/data \
  3domfbk/3d-georef:04032026 bash
```

Run the complete georeferencing pipeline on a 3D model:
```bash
python main.py \
  -i /path/to/model.glb \
  -o /path/to/output \
  --geoloc_model gemini
```

Process a 3D model with automatic geolocation using Gemini AI:
```bash
docker run --rm -it \
  --gpus all \
  -v /path/to/data:/data \
  3domfbk/3d-georef:04032026 \
  -i /data/input/model.glb \
  -o /data/output \
  --geoloc_model gemini \
  --gemini_model gemini-2.5-flash \
  --gemini_api_key "YOUR_GEMINI_API_KEY" \
  --mapbox_api_key "YOUR_MAPBOX_API_KEY" \
  --cleanup \
  --streetviews 8 \
  --area_size 500 \
  --zoom 18
```

Use the GeoCLIP model for faster geolocation (no API key required):
```bash
docker run --rm -it \
  --gpus all \
  -v /path/to/data:/data \
  3domfbk/3d-georef:04032026 \
  -i /data/input/building.glb \
  -o /data/output \
  --geoloc_model geoclip \
  --nr_prediction 3 \
  --cleanup
```

Skip geolocation and run only Deep Image Matching with known coordinates:
```bash
docker run --rm -it \
  --gpus all \
  -v /path/to/data:/data \
  3domfbk/3d-georef:04032026 \
  -i /data/input/monument.glb \
  -o /data/output \
  --mode dim \
  --lat 46.0669 \
  --lon 11.1216 \
  --mapbox_api_key "YOUR_MAPBOX_API_KEY" \
  --area_size 300 \
  --zoom 20
```

Use your own orthophoto instead of downloading satellite imagery:
```bash
docker run --rm -it \
  --gpus all \
  -v /path/to/data:/data \
  3domfbk/3d-georef:04032026 \
  -i /data/input/site.glb \
  -o /data/output \
  --ortho /data/orthophoto.tif \
  --lat 48.8582 \
  --lon 2.2945
```

For Ollama-based geolocation, use Docker Compose to run both services:
```bash
# Start the services
docker-compose up -d

# Wait for Ollama to download the model (first run only)
docker-compose logs -f ollama

# Run the pipeline (in a new terminal)
docker exec -it 3dgeoref_python python main.py \
  -i /data/input/model.glb \
  -o /data/output \
  --geoloc_model ollama
```

Run the container interactively for development and debugging:
```bash
docker run --rm -it \
  --gpus all \
  -v /path/to/data:/data \
  3domfbk/3d-georef:04032026

# Inside the container, you can run commands manually:
python main.py -i /data/input/test.glb -o /data/output --mode geoloc
```

| Argument | Type | Default | Description |
|---|---|---|---|
| `-i, --input_file` | str | required | Path to input 3D model (`.glb`/`.gltf`) |
| `-o, --output_folder` | str | required | Output folder for results |
| `--streetviews` | int | 5 | Number of street-view style renderings around the model |
| `--nr_prediction` | int | 1 | Number of GPS predictions (GeoCLIP only) |
| `--area_size` | int | 500 | Side length of square area to download (meters) |
| `--zoom` | int | 18 | Satellite imagery zoom level (18-20 recommended) |
| `--lat` | float | None | Manual latitude (skips geolocation if provided with `--lon`) |
| `--lon` | float | None | Manual longitude (skips geolocation if provided with `--lat`) |
| `--ortho` | str | None | Path to custom orthophoto (skips satellite download) |
| `--mode` | str | auto | Execution mode: `auto`, `geoloc`, or `dim` |
| `--geoloc_model` | str | gemini | Geolocation model: `geoclip`, `ollama`, or `gemini` |
| `--gemini_model` | str | gemini-2.5-flash | Gemini model version to use |
| `--gemini_api_key` | str | None | API key for Google Gemini (can also be set via env var) |
| `--mapbox_api_key` | str | None | API key for Mapbox (can also be set via env var) |
| `--cleanup` | bool | False | Delete temporary working directory in /tmp after execution |
- `auto` (default): Full pipeline from 3D model to georeferenced output
- `geoloc`: Only perform geolocation estimation and stop
- `dim`: Skip geolocation, perform only Deep Image Matching (requires `--lat` and `--lon`)
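As an illustration of how these options fit together, here is a minimal `argparse` sketch mirroring a subset of the table above. It is not the actual `main.py` parser; names and defaults are taken from the table, and validation details are omitted.

```python
import argparse

def build_parser():
    """Illustrative parser covering a subset of 3DGeoRef's CLI options."""
    p = argparse.ArgumentParser(prog="main.py")
    p.add_argument("-i", "--input_file", required=True,
                   help="Path to input 3D model (.glb/.gltf)")
    p.add_argument("-o", "--output_folder", required=True)
    p.add_argument("--mode", choices=["auto", "geoloc", "dim"], default="auto")
    p.add_argument("--geoloc_model",
                   choices=["geoclip", "ollama", "gemini"], default="gemini")
    p.add_argument("--lat", type=float, default=None)
    p.add_argument("--lon", type=float, default=None)
    p.add_argument("--zoom", type=int, default=18)
    p.add_argument("--cleanup", action="store_true")
    return p

# dim mode with manual coordinates, as in the Deep Image Matching example
args = build_parser().parse_args(
    ["-i", "m.glb", "-o", "out", "--mode", "dim",
     "--lat", "46.0669", "--lon", "11.1216"])
```

In `dim` mode, a real implementation would additionally verify that both `--lat` and `--lon` were supplied before skipping geolocation.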
The complete pipeline follows these steps:
- Loads the 3D model into Blender
- Computes bounding box and optimal camera positions
- Generates orthographic top-down view
- Creates multiple street-view perspective renderings
- Applies HDRI lighting for realistic appearance
- Exports rendered images and scaled model
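The camera-placement step above can be sketched as follows: cameras are distributed on a ring around the model's bounding box, each looking at its center. The function name, the ring layout, and the 1.5x radius factor are illustrative assumptions, not the actual `multiview.py` implementation.

```python
import numpy as np

def ring_cameras(bbox_min, bbox_max, n_views=5, radius_factor=1.5):
    """Place n_views cameras on a horizontal ring around the bounding box,
    each with a unit view direction pointing at the box center (sketch)."""
    bbox_min = np.asarray(bbox_min, dtype=float)
    bbox_max = np.asarray(bbox_max, dtype=float)
    center = (bbox_min + bbox_max) / 2.0
    # Radius proportional to half the bounding-box diagonal
    radius = radius_factor * np.linalg.norm(bbox_max - bbox_min) / 2.0
    cameras = []
    for k in range(n_views):
        theta = 2.0 * np.pi * k / n_views
        position = center + radius * np.array([np.cos(theta), np.sin(theta), 0.0])
        look_dir = (center - position) / np.linalg.norm(center - position)
        cameras.append((position, look_dir))
    return cameras
```

In Blender, each `(position, look_dir)` pair would then be assigned to a camera object before rendering a still, with the orthographic top-down view handled separately.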
- GeoCLIP (`geoclip.py`): Fast, offline geolocation using CLIP embeddings
- Ollama (`ollama.py`): Vision-language model (llama3.2-vision) for location reasoning
- Gemini (`gemini.py`): Google's multimodal AI for high-accuracy geolocation
- Filters out top-down views for better accuracy
- Returns GPS coordinates (latitude, longitude)
- Queries Mapbox Static API for satellite tiles
- Downloads tiles at specified zoom level
- Stitches tiles into georeferenced mosaic
- Exports as GeoTIFF with proper coordinate system
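Mapbox tiles follow the standard Web Mercator "slippy map" XYZ scheme, so the tiles covering a coordinate can be computed with the usual formulas below. This is the generic tile math, shown for reference; it is not code from `satellite_downloader.py`.

```python
import math

def deg2tile(lat, lon, zoom):
    """WGS84 lat/lon -> Web Mercator (slippy map) tile indices at a zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

def tile2deg(x, y, zoom):
    """Lat/lon of the tile's north-west corner, used to georeference the mosaic."""
    n = 2 ** zoom
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / n))))
    return lat, lon
```

Stitching then amounts to downloading the tile range covering the requested area, concatenating tiles row by row, and writing the corner coordinates from `tile2deg` into the GeoTIFF's geotransform.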
- Integrates with Deep Image Matching (DIM)
- Extracts and matches keypoints between synthetic renders and satellite imagery
- Uses pycolmap for robust feature matching
- Computes homography and affine transformation matrices
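Given the pixel correspondences that DIM produces between a render and the georeferenced satellite image, a 2D affine transform can be recovered by least squares, as sketched below. A production pipeline would wrap this in robust estimation (e.g. RANSAC) to reject outlier matches; this sketch omits that step.

```python
import numpy as np

def fit_affine_2d(src, dst):
    """Least-squares 2x3 affine mapping src points onto dst points.
    src, dst: (N, 2) arrays of matched keypoint coordinates, N >= 3."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    ones = np.ones((len(src), 1))
    A = np.hstack([src, ones])                        # (N, 3) design matrix
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)  # (3, 2) solution
    return params.T                                   # (2, 3) = [M | t]

# Sanity check: recover a known transform (2x scale, 90° rotation, translation)
T = np.array([[0.0, -2.0, 10.0],
              [2.0,  0.0, -5.0]])
pts = np.random.default_rng(0).uniform(0, 100, (20, 2))
mapped = pts @ T[:, :2].T + T[:, 2]
assert np.allclose(fit_affine_2d(pts, mapped), T)
```

The recovered 2D affine (scale, rotation, translation in the map plane) is then lifted into the 3D transformation applied to the model in the next step.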
- Applies computed transformation to 3D model
- Aligns model to correct elevation using DEM data
- Handles coordinate system conversions (WGS84, UTM, local)
- Exports georeferenced model in original format
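The core of this step, applying a homogeneous transform to the model's vertices and snapping its lowest point to the terrain elevation, can be sketched as below. The function name is hypothetical and the real `transformer.py` additionally handles coordinate-system conversions and DEM lookup.

```python
import numpy as np

def georeference_vertices(vertices, M, ground_elevation):
    """Apply a 4x4 homogeneous transform M to (N, 3) vertices, then shift
    the model vertically so its lowest point sits at ground_elevation."""
    v = np.asarray(vertices, dtype=float)
    homo = np.hstack([v, np.ones((len(v), 1))])      # (N, 4) homogeneous coords
    out = (homo @ M.T)[:, :3]                        # transformed vertices
    out[:, 2] += ground_elevation - out[:, 2].min()  # elevation alignment
    return out
```

For example, with `M` a pure translation into projected map coordinates and `ground_elevation` sampled from a DEM, the returned vertices are ready for export in the model's original format.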
- Saves georeferenced 3D model
- Exports transformation matrices
- Generates debug visualizations
- Creates processing logs and metadata
`PipelineProcessor`: Main orchestration class that manages the entire pipeline.
- Handles logging and temporary directory management
- Coordinates all pipeline stages
- Manages error handling and recovery
- Provides progress tracking
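The responsibilities above can be pictured with the skeleton below: stages run in order inside a temporary working directory, with progress logging and optional cleanup. This is an illustrative sketch of the orchestration pattern, not the actual `core.py` class.

```python
import logging
import shutil
import tempfile
from pathlib import Path

class PipelineProcessor:
    """Sketch of a stage-based orchestrator: runs named stages in order,
    logs progress, and removes the temp working directory on request."""

    def __init__(self, stages, cleanup=False):
        self.stages = stages          # list of (name, callable(workdir))
        self.cleanup = cleanup
        self.log = logging.getLogger("3dgeoref")

    def run(self):
        workdir = Path(tempfile.mkdtemp(prefix="3dgeoref_"))
        results = {}
        try:
            for i, (name, stage) in enumerate(self.stages, start=1):
                self.log.info("[%d/%d] %s", i, len(self.stages), name)
                results[name] = stage(workdir)
            return results
        finally:
            # Mirrors the --cleanup flag: drop the /tmp working directory
            if self.cleanup:
                shutil.rmtree(workdir, ignore_errors=True)
```

Keeping stages as named callables makes the `--mode` options straightforward: `geoloc` runs a prefix of the stage list, while `dim` starts from the matching stage with manually supplied coordinates.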
- `geoclip.py`: GeoCLIP-based geolocation using CLIP embeddings
- `ollama.py`: Ollama AI integration for vision-language geolocation
- `gemini.py`: Google Gemini API integration for multimodal geolocation
- `dim.py`: Deep Image Matching integration for feature correspondence
- `transformer.py`: 3D transformation and georeferencing utilities
`multiview.py`: Blender-based synthetic view generation with HDRI lighting
`satellite_downloader.py`: Mapbox satellite imagery download and mosaicking
`transformations.py`: Comprehensive 3D transformation library
- Rotation matrices (Euler angles, quaternions, axis-angle)
- Translation and scaling
- Homography and affine transformations
- Matrix decomposition and composition
- Coordinate system conversions
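As a flavor of what such a library provides, here is a standard Euler-angle-to-rotation-matrix composition; the function name and convention are illustrative, and the actual `transformations.py` API may differ.

```python
import numpy as np

def rotation_xyz(rx, ry, rz):
    """Rotation matrix from Euler angles (radians), applied in X, Y, Z order."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx  # X rotation first, then Y, then Z

R = rotation_xyz(0.1, 0.2, 0.3)
assert np.allclose(R @ R.T, np.eye(3))    # orthonormal
assert np.isclose(np.linalg.det(R), 1.0)  # proper rotation (no reflection)
```

Such rotations are composed with translation and scale into the 4x4 homogeneous matrices used during georeferencing.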
- OS: Linux (Ubuntu 22.04+ recommended), Windows with WSL2
- RAM: 16 GB minimum, 32 GB recommended
- GPU: NVIDIA GPU with 8+ GB VRAM (optional but highly recommended)
- Storage: 10 GB for Docker images, additional space for data
- Blender 4.4.0+
- Python 3.9+
- CUDA 12.1+ (for GPU acceleration)
- Deep Image Matching (dev branch)
See Dockerfile for complete list. Key dependencies:
- `torch`, `torchvision` (PyTorch)
- `geoclip` (geolocation)
- `pycolmap` (image matching)
- `trimesh`, `open3d` (3D processing)
- `rasterio` (geospatial data)
- `google-genai` (Gemini API)
- `ollama` (Ollama integration)
This project is developed by 3DOM-FBK (Fondazione Bruno Kessler, 3D Optical Metrology unit).
For licensing information, please contact the authors.
- Improved Elevation Alignment: Refine the elevation applied to the 3D model for better integration with Cesium. This involves addressing potential discrepancies between ellipsoidal and orthometric height references when fetching data from OpenTopoData.
- Support for Multiple 3D Formats: Extend the input pipeline to support various 3D formats, specifically point clouds. Currently, the pipeline is optimized for GLB models.
- Image Matching Improvement: Integrated SuperPoint+SuperGlue combined with LoFTR into the Deep Image Matching (DIM) step for more robust feature correspondence.
- Nadir Dimension Estimation: Implemented automatic estimation of the nadir image dimensions using the Gemini model.
- Deep Image Matching: 3DOM-FBK/deep-image-matching
- GeoCLIP: Geolocation estimation using CLIP
- Blender: Open-source 3D creation suite
- Google Gemini: Multimodal AI for geolocation
- Ollama: Local AI model inference
For questions, issues, or contributions, please open an issue on the GitHub repository or contact:
3DOM-FBK
Fondazione Bruno Kessler
Via Sommarive 18, 38123 Trento, Italy
https://3dom.fbk.eu