
Oshkr/mmlandmarks


🏢 MMLandmarks Dataset

Paper Dataset Website

Description

Explore MMLandmarks [here].

Welcome to the MultiModal Landmarks (MMLandmarks) dataset, part of our CVPR 2026 paper:

MMLandmarks: a Cross-View Instance-Level Benchmark for Geo-Spatial Understanding

The codebase with the training and evaluation setup for our results can be found here.

With this dataset, Cross-View Localization is extended for the first time to a continental scale at a fine-grained, instance level. The collection process is inspired by the Google Landmarks Dataset v2 (GLDv2) and combines it with information from OpenStreetMap (OSM) and the National Agriculture Imagery Program (NAIP). The dataset is intended for training models on various geospatial tasks, including Geolocalization, Cross-View Ground-to-Satellite and Satellite-to-Ground localization, and Any-to-Any retrieval.

MMLandmarks is built from $18{,}557$ landmarks in the United States of America, all of which have associated Wikipedia and Wikimedia Commons pages. For each landmark, multiple ground and aerial images are collected, together with a single GPS coordinate (the landmark's geographical center from OSM) and a text description collected from Wikimedia Commons. In total, the dataset contains $329k$ Ground images, $197k$ Aerial images, $18{,}557$ GPS coordinates and $18{,}557$ Text descriptions, split into 3 sets: train, query, and index.

Dataset Statistics

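| Split             | Landmarks | Ground Images | Satellite Images | GPS Coordinates | Text Descriptions |
|-------------------|-----------|---------------|------------------|-----------------|-------------------|
| train             | 17,557    | 310,661       | 186,574          | 17,557          | 17,557            |
| query             | 1,000     | 18,688        | 10,631           | 1,000           | 1,000             |
| index (ground)    | -         | 714,554       | -                | -               | -                 |
| index (satellite) | -         | -             | 99,539           | 99,539          | -                 |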
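
As a quick sanity check, the train and query rows of the table add up to the headline totals quoted in the description ($18{,}557$ landmarks, roughly $329k$ ground and $197k$ aerial images). The sketch below only re-does that arithmetic with the numbers copied from the table; the index rows are left out because the headline totals match train plus query alone.

```python
# Per-split counts copied from the Dataset Statistics table.
train = {"landmarks": 17_557, "ground": 310_661, "satellite": 186_574}
query = {"landmarks": 1_000, "ground": 18_688, "satellite": 10_631}

# Sum the two landmark-bearing splits column by column.
totals = {key: train[key] + query[key] for key in train}
print(totals)
```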

We would like to acknowledge the work of Tobias Weyand, Andre Araujo, Bingyi Cao and Jack Sim, and thank them for their comprehensive GLDv2 repository which has greatly inspired the structure for this repository.

General Information

Download the CSV file containing information about all $18{,}557$ landmarks from the following link:

https://archive.compute.dtu.dk/downloads/public/projects/MMLandmarks/mmlandmarks.csv

  • mmlandmarks.csv CSV with landmark_id, CommonsCategory, WikipediaPage, lat, lon, min_lat, min_lon, max_lat, max_lon, QID, osm_type, osm_id, category, state, hierarchical_category fields. The file gives a full overview of the dataset.
    • landmark_id: integer from 0 to 18557.
    • CommonsCategory: string with the landmark's Wikimedia Commons Category webpage.
    • WikipediaPage: string with the landmark's Wikipedia webpage.
    • lat, lon: GPS coordinates of the landmark's geographical center.
    • [min_lat,min_lon,max_lat,max_lon]: bounding box from the landmark OSM polygon.
    • QID: string with the landmark's Wikidata identifier.
    • osm_type, osm_id: strings with the type and id of the OSM polygon. The associated landmark polygon information can be retrieved at https://www.openstreetmap.org/{osm_type}/{osm_id}.
    • category: string referring to the type of the landmark found from Wikimedia.
    • state: string referring to the state in which the landmark is located.
    • hierarchical_category: string corresponding to the landmark's hierarchical label using the hierarchical extension of GLDv2.
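
A minimal sketch of reading these fields with Python's standard `csv` module and building the OSM link for a landmark. The column names come from the list above; the sample row (coordinates, `osm_id`, `state` value) is made up for illustration, and the helper names `osm_url` and `load_landmarks` are hypothetical, not part of the repository.

```python
import csv
import io

OSM_BASE = "https://www.openstreetmap.org"

def osm_url(osm_type: str, osm_id: str) -> str:
    """Build the OpenStreetMap page URL for a landmark polygon."""
    return f"{OSM_BASE}/{osm_type}/{osm_id}"

def load_landmarks(handle):
    """Parse rows from an open CSV handle, converting numeric fields."""
    rows = []
    for row in csv.DictReader(handle):
        row["landmark_id"] = int(row["landmark_id"])
        for key in ("lat", "lon", "min_lat", "min_lon", "max_lat", "max_lon"):
            row[key] = float(row[key])
        row["osm_url"] = osm_url(row["osm_type"], row["osm_id"])
        rows.append(row)
    return rows

# Made-up sample with a subset of the columns; values are NOT from the dataset.
SAMPLE = (
    "landmark_id,lat,lon,min_lat,min_lon,max_lat,max_lon,osm_type,osm_id,state\n"
    "0,40.0,-75.0,39.9,-75.1,40.1,-74.9,way,123456,ExampleState\n"
)

landmarks = load_landmarks(io.StringIO(SAMPLE))
print(landmarks[0]["osm_url"])
```

To read the real file instead of the sample, pass `open("mmlandmarks.csv", newline="", encoding="utf-8")` to `load_landmarks`.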

Getting started

Follow the instructions below to download the different parts of the dataset. The notebook get_started.ipynb gives a comprehensive introduction to navigating the dataset.

Dataset structure

After downloading and extracting the full dataset, the directory structure is:

MMLandmarks/
├── mmlandmarks.csv
│
├── train/
│   ├── mml_train.csv                       
│   ├── mml_train_ground.csv                
│   ├── mml_train_ground_subset.csv         
│   ├── mml_train_satellite.csv            
│   ├── mml_train_text.csv                  
│   ├── mml_train_licenses.csv              
│   ├── ground/                            
│   │   └── {a}/{b}/{c}/{image_id}.jpg
│   ├── satellite/                          
│   │   └── {a}/{b}/{c}/{image_id}.png
│   └── text/                              
│       └── {a}/{b}/{c}/{text_id}.json
│
├── index/
│   ├── mml_index_ground.csv                
│   ├── mml_index_satellite.csv             
│   ├── ground/                             
│   │   └── {a}/{b}/{c}/{image_id}.jpg
│   └── satellite/                          
│       └── {a}/{b}/{c}/{image_id}.png
│
└── query/
    ├── mml_query.csv                       
    ├── mml_query_ground.csv                
    ├── mml_query_satellite.csv             
    ├── mml_query_text.csv                  
    ├── mml_query_all_satellite.csv         
    ├── mml_query_text_sentences.csv        
    ├── mml_query_licenses.csv              
    ├── ground/                            
    │   └── {a}/{b}/{c}/{image_id}.jpg
    ├── satellite/                          
    │   └── {a}/{b}/{c}/{image_id}.png
    └── text/                            
        └── {a}/{b}/{c}/{text_id}.json

Where {a}, {b}, {c} are the first three characters of the image/json id. For example, a ground image with id 0123456789abcdef is stored at train/ground/0/1/2/0123456789abcdef.jpg.
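
The path rule can be sketched as a small helper. The function name `item_path` and the `EXTENSIONS` mapping are illustrative, not part of the repository; the extensions are read off the directory tree above.

```python
from pathlib import PurePosixPath

# File extension per modality, as shown in the directory tree.
EXTENSIONS = {"ground": "jpg", "satellite": "png", "text": "json"}

def item_path(split: str, modality: str, item_id: str) -> PurePosixPath:
    """Return the relative path of an image/json id inside the dataset root."""
    a, b, c = item_id[:3]  # first three characters of the id
    filename = f"{item_id}.{EXTENSIONS[modality]}"
    return PurePosixPath(split, modality, a, b, c, filename)

print(item_path("train", "ground", "0123456789abcdef"))
```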

Download train set

The training set contains $17{,}557$ landmarks with: $310k$ Ground images, $186k$ Satellite images, $17{,}557$ GPS coordinates and $17{,}557$ Text descriptions.

Downloading the labels and metadata

Downloading the data:

The train/ground is split into 80 TAR files (each of size ~800MB), train/satellite is split into 200 TAR files (each of size ~850MB) and train/text has 1 TAR file (of size ~106MB). The files are located in the train/(ground/satellite/text) directory, and are e.g. named images_000.tar, images_001.tar, ..., images_079.tar for the ground files. To download them, access the following link:

https://archive.compute.dtu.dk/downloads/public/projects/MMLandmarks/train/ground/images_000.tar

And similarly for the other files.
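
For manual downloads, the list of shard URLs can be generated rather than typed out. This is a sketch, assuming every modality follows the same `images_NNN.tar` naming that is shown for the ground files; the text only confirms that pattern for ground, so check the archive listing for satellite and text. The helper name `shard_urls` is hypothetical.

```python
BASE_URL = "https://archive.compute.dtu.dk/downloads/public/projects/MMLandmarks"

def shard_urls(split: str, modality: str, num_shards: int) -> list[str]:
    """List the TAR URLs for one split/modality, numbered from 000."""
    return [
        f"{BASE_URL}/{split}/{modality}/images_{index:03d}.tar"
        for index in range(num_shards)
    ]

train_ground = shard_urls("train", "ground", 80)
print(train_ground[0])
print(train_ground[-1])
```

The resulting list can be handed to any downloader; the provided script below remains the simpler route.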

Using the provided script

mkdir train && cd train

# Downloads all modalities (ground/satellite/text) for train
bash ../mml-download.sh train

# Downloads ground images for train
bash ../mml-download.sh train ground 80

Download index set

The Index set is a large collection of Ground and Aerial images used as a challenging gallery from which to retrieve the correct corresponding landmark information:

  • Ground index: $714{,}554$ images from the GLDv2 index set, where the landmarks in MMLandmarks are filtered out.
  • Satellite index: $99{,}539$ images sampled from the NAIP, with the same distribution as MMLandmarks.
  • GPS index: $99{,}539$ GPS coordinates taken as the centers of the Satellite index set images.

Downloading the labels and metadata

Downloading the data:

The index/ground is split into 80 TAR files (each of size ~1GB), and index/satellite is split into 120 TAR files (each of size ~1GB). The files are located in the index/(ground/satellite) directory, and are e.g. named images_000.tar, images_001.tar, ..., images_079.tar for the ground files. To download them, access the following link:

https://archive.compute.dtu.dk/downloads/public/projects/MMLandmarks/index/ground/images_000.tar

And similarly for the other files.

Using the provided script

mkdir index && cd index

# Downloads all modalities (ground/satellite) for index
bash ../mml-download.sh index

# Downloads satellite images for index
bash ../mml-download.sh index satellite 120

Download query set

The query set contains $1{,}000$ landmarks with: $18{,}688$ Ground images, $1{,}000$ Satellite images, $1{,}000$ GPS coordinates and $1{,}000$ Text descriptions. While only the latest satellite images are used for retrieval in the original paper, we provide the full satellite query set ($10{,}631$ images).

Downloading the labels and metadata

Extra query:

Downloading the data:

The query/ground is split into 4 TAR files (each of size ~900MB), query/satellite is split into 10 TAR files (each of size ~950MB) and query/text has 1 TAR file (of size ~7MB). The files are located in the query/(ground/satellite/text) directory, and are e.g. named images_000.tar, images_001.tar, ..., images_003.tar for the ground files. To download them, access the following link:

https://archive.compute.dtu.dk/downloads/public/projects/MMLandmarks/query/ground/images_000.tar

And similarly for the other files.
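
Before starting, it can help to estimate the disk space needed. The sketch below multiplies the shard counts by the approximate shard sizes quoted in the train, index and query download sections; the sizes are the rounded "~" figures from the text (with 1 GB taken as 1000 MB), so the result is only a rough estimate of the compressed download.

```python
# (number of TAR files, approximate size of each in GB), from the download sections.
SHARDS = {
    ("train", "ground"): (80, 0.8),
    ("train", "satellite"): (200, 0.85),
    ("train", "text"): (1, 0.106),
    ("index", "ground"): (80, 1.0),
    ("index", "satellite"): (120, 1.0),
    ("query", "ground"): (4, 0.9),
    ("query", "satellite"): (10, 0.95),
    ("query", "text"): (1, 0.007),
}

def split_size_gb(split: str) -> float:
    """Approximate download size of one split in GB."""
    return sum(
        count * size
        for (name, _modality), (count, size) in SHARDS.items()
        if name == split
    )

sizes = {split: split_size_gb(split) for split in ("train", "index", "query")}
total_gb = sum(sizes.values())
print(sizes, round(total_gb, 1))
```

Under these figures the full dataset comes to roughly 447 GB of TAR files, before extraction.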

Using the provided script

mkdir query && cd query

# Downloads all modalities (ground/satellite/text) for query
bash ../mml-download.sh query

# Downloads satellite images for query
bash ../mml-download.sh query satellite 10

Checking the download

md5sum files are provided to check the integrity of the downloaded files. Each md5sum file corresponds to one of the TAR files mentioned above and is located in the same directory as that TAR file: (train/index/query)/(ground/satellite/text)/. For example, the md5sum file for images_000.tar of the train ground set can be found via the following link:

https://archive.compute.dtu.dk/downloads/public/projects/MMLandmarks/train/ground/md5.images_000.txt

And similarly for the other files.

When downloading the dataset with the mml-download.sh script, the integrity of the files is already checked as part of the download process.
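
For files downloaded by hand, the check can be sketched with Python's `hashlib`. This assumes the md5 text files use the usual `md5sum` output format (hex digest first, then the file name); that format is an assumption, not something stated above. The helper names are hypothetical.

```python
import hashlib

def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def expected_md5(md5_text: str) -> str:
    """Extract the digest from md5sum-style text (assumed format: '<hex>  <name>')."""
    return md5_text.split()[0].lower()

def verify(tar_path: str, md5_text: str) -> bool:
    """Return True when the file's digest matches the published one."""
    return md5_of_file(tar_path) == expected_md5(md5_text)
```

A typical call would be `verify("images_000.tar", open("md5.images_000.txt").read())`.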

Extracting the data

The file structure follows that of GLDv2: the files in each directory (train, index, query) and modality (ground, satellite, text) are stored as ${a}/${b}/${c}/${id}.ext (with ext: jpg for ground, png for satellite, and json for text). ${a}, ${b}, and ${c} are the first three characters of ${id}, the image/json id found in the CSV files. For example:

  • a ground image from the train set with id 0123456789abcdef is stored in train/ground/0/1/2/0123456789abcdef.jpg.
  • a satellite image from the index set with id 0123456789abcdef is stored in index/satellite/0/1/2/0123456789abcdef.png.
  • a text json from the train set with id 0123456789abcdef is stored in train/text/0/1/2/0123456789abcdef.json.
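
A minimal extraction sketch using Python's `tarfile` module. It assumes each TAR file already contains the `{a}/{b}/{c}/{id}.ext` paths, so that extracting into the split/modality directory reproduces the layout above; that is an assumption about the archives, and the helper name `extract_shard` is hypothetical.

```python
import os
import tarfile

def extract_shard(tar_path: str, dest_dir: str) -> None:
    """Extract one TAR shard into dest_dir, e.g. 'train/ground'."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(tar_path) as archive:
        try:
            # The 'data' filter rejects unsafe members on Python versions that support it.
            archive.extractall(dest_dir, filter="data")
        except TypeError:
            # Older Python without the filter argument.
            archive.extractall(dest_dir)
```

For example, `extract_shard("train/ground/images_000.tar", "train/ground")` would unpack the first ground shard of the train set in place.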

Dataset licenses

Wikimedia Commons licenses:

The ground images are licensed under Creative Commons and Public Domain licenses. The licenses for all images are available here:

National Agriculture Imagery Program (NAIP) license:

The satellite images are provided by the U.S. Department of Agriculture, Farm Service Agency, and are considered public domain information. Users of this dataset should acknowledge USDA Farm Production and Conservation - Business Center, Geospatial Enterprise Operations when using or distributing the satellite imagery.

Release history

May 2026 (version 1.0)

  • Initial version release.

Contact

For any comments/questions/advice/suggestions, feel free to create an issue on this GitHub repository.

Citation

If you make use of this dataset, consider giving the repository a star and citing our paper as:

@InProceedings{Kristoffersen_2026_MMLandmarks,
  author    = {Oskar Kristoffersen and Alba Reinders and Morten R. Hannemose and Anders B. Dahl and Dim P. Papadopoulos},
  title     = {MMLandmarks: a Cross-View Instance-Level Benchmark for Geo-Spatial Understanding},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2026},
}

Acknowledgements

The satellite imagery in MMLandmarks is sourced from the National Agriculture Imagery Program (NAIP). We acknowledge USDA Farm Production and Conservation - Business Center, Geospatial Enterprise Operations for providing this data.

About

Multi-modal dataset with 329k ground views, 197k aerial views, GPS and Text information from 18557 landmarks, collected to support fully contrastive multimodal training and solving a variety of geospatial tasks.
