This is a summary of public geospatial datasets that our team at JHU/APL has developed. We are grateful to our government sponsors and partners in industry and academia and especially to SpaceNet, IEEE DataPort, and Amazon for hosting many of these large datasets at no cost to support public research.
The IARPA Space-Based Machine Automated Recognition Technique (SMART) program was one of the first large-scale research programs aimed at advancing the state of the art in automatically detecting, characterizing, and monitoring large-scale anthropogenic activity in global-scale, multi-source, heterogeneous satellite imagery. The program was created to leverage and advance the latest techniques in artificial intelligence (AI), computer vision (CV), and machine learning (ML) for geospatial applications. JHU/APL led the development of a large, global-scale dataset containing spatio-temporal annotations of large-scale heavy construction activity. More details, as well as links to access the annotation dataset, can be found in our Pubgeo repository here. Additional information about the problem formulation can be found in our 2023 SPIE DCS publication. Questions about the dataset can be directed to pubgeo@jhuapl.edu.
The CORE3D Open Evaluation CodaLab competition offers a public leaderboard to track progress toward accurate urban 3D building modeling with satellite images. The satellite image data used for this competition was released by IARPA in 2018 to enable public research and is hosted by SpaceNet on AWS. JHU/APL developed a baseline solution combining open source projects from Kitware and Cornell, discussed in this IGARSS'21 paper.
JHU/APL developed the Overhead Geopose Challenge in collaboration with DrivenData for the National Geospatial-Intelligence Agency (NGA). JHU/APL extended the Urban Semantic 3D (US3D) dataset, developed the baseline solution, and published this work in a CVPR'21 EarthVision workshop paper. DrivenData produced this blog post with a tutorial describing how to use our code as a benchmark for the contest.
The 2019 Data Fusion Contest (DFC19), organized by the Image Analysis and Data Fusion Technical Committee (IADF TC) of the IEEE Geoscience and Remote Sensing Society (GRSS), the Johns Hopkins University (JHU), and the Intelligence Advanced Research Projects Activity (IARPA), aimed to promote research in semantic 3D reconstruction and stereo using machine intelligence and deep learning applied to satellite images. The DFC19 dataset is available on IEEE DataPort. For more information about the dataset and baselines, please see our WACV'19 paper. For more recent work extending this dataset, please see our CVPR'20 paper and the extended Urban Semantic 3D (US3D) dataset.
IARPA has publicly released DigitalGlobe satellite imagery for the Creation of Operationally Realistic 3D Environment (CORE3D) program to enable performer teams to crowdsource manual labeling efforts and to promote public research that aligns well with the CORE3D program’s objectives. SpaceNet is hosting the CORE3D public dataset in the SpaceNet repository to ensure easy access to the data.
The IARPA Functional Map of the World (fMoW) challenge invited experts from across the government, academia, industry, and developer communities, with or without experience in automated image analysis, to create fast and accurate classification algorithms for building and land use. Although the challenge has now ended, it was responsible for the release of the largest public annotated satellite image dataset to date. The dataset contains over one million images and over one million annotated points of interest, and it is designed to enable the development of novel algorithms and data fusion techniques that address several computer vision and remote sensing research problems. Source code for the baseline algorithm and winning solutions, as well as links to download the fMoW dataset, are available in this GitHub repository. For more information about the dataset and baseline, please see our CVPR'18 paper.
The Urban 3D Challenge published a large-scale dataset containing 2D orthorectified RGB images and 3D Digital Surface Models and Digital Terrain Models generated from commercial satellite imagery, covering over 360 square kilometers of terrain and containing roughly 157,000 annotated building footprints. All image products are provided at 50 cm ground sample distance. This unique large-scale 2D/3D dataset gives researchers an opportunity to apply machine learning techniques to further improve state-of-the-art performance. SpaceNet is hosting the Urban 3D Challenge dataset in the SpaceNet repository to ensure easy access to the data. Open source projects for winning solutions are hosted by TopCoder.
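To give a rough sense of scale, the coverage and ground sample distance quoted above can be turned into an approximate pixel count per raster layer. The sketch below is only back-of-the-envelope arithmetic: it assumes the coverage figure is an area of about 360 square kilometers and that every pixel is a 0.5 m by 0.5 m square. The function name `pixels_for_area` is hypothetical and is not part of any dataset tooling.

```python
def pixels_for_area(area_km2: float, gsd_m: float) -> int:
    """Approximate number of raster cells needed to cover an area at a given
    ground sample distance (GSD), assuming square pixels and no overlap."""
    area_m2 = area_km2 * 1_000_000      # 1 km^2 = 1,000,000 m^2
    pixel_area_m2 = gsd_m * gsd_m       # ground footprint of one pixel
    return round(area_m2 / pixel_area_m2)


if __name__ == "__main__":
    # Assumed inputs taken from the description: ~360 km^2 at 50 cm GSD.
    n = pixels_for_area(360, 0.5)
    print(f"{n:,} pixels per raster layer")
```

Under these assumptions, each full-coverage layer (RGB band, DSM, or DTM) holds on the order of a billion pixels, which is why the data is typically distributed as tiles rather than as a single image.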
The IARPA Multi-View Stereo 3D Mapping Challenge invited experts from across government, academia, industry and solver communities to derive accurate 3D point clouds from multi-view satellite imagery. The challenge provided nearly 50 commercial satellite images covering 100 square kilometers. SpaceNet is hosting the Multi-View Stereo 3D Mapping dataset in the SpaceNet repository to ensure easy access to the data.