WinSyn is a dataset of photographs of building windows from around the world. It is described in the publication WinSyn: A High Resolution Testbed for Synthetic Data by Tom Kelly (KAUST), John Femiani (MiamiOh), and Peter Wonka (KAUST). See also the procedural model and a short video introduction.
This document describes the full dataset, which contains the high resolution photos as jpg and raw files output from the camera. There are also pre-rendered datasets which make various assumptions; these are probably easier to use, and are available from the KAUST datastore:
- 9k labeled real photos at 1024px resolution
- 89k 512px crops or 97k 1024px crops
- 72k 2024px photos of windows
Our synthetic renders & code are available:
- synthetic renders of windows and variations (alpha release)
- synthetic renders of windows: render-time "ms" dataset (alpha release)
- the synthetic model code
The rest of this document describes the organisation of the full dataset and the tools available to process it. As well as the original jpg and raw photos, you can find label polygons, crop information, location information, and a simple website to view the data. There are tools to generate the above pre-rendered datasets and to crop images.
The data directory is the "root" of the project. It contains various folders (photos, metadata_single_elements, metadata_website, etc…) which each contain a different type of data.
The contents of these folders are divided into batches (subfolders) such as tom_london_20220418 for easier processing. They are named for the first name of the photographer, principal location, and date. For example, metadata relating to the image:
data/photos/tom_london_20220418/IMG_0206.JPG
can be found in:
data/**/tom_london_20220418/IMG_0206.*
where ** is a metadata folder (metadata_single_elements, metadata_window_labels, ...) and * is an extension dependent on the data type (usually .json). You can see a summary of available information for each photograph at the bottom of the photo webpage:
data/metadata_website/tom_london_20220418/IMG_0206.html
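For example, a minimal Python sketch (using the example batch and image above) that gathers every file the dataset holds for a single photograph, following the layout just described:

```python
from pathlib import Path

DATA_ROOT = Path("data")
batch, image = "tom_london_20220418", "IMG_0206"

# Matches the photo itself plus every per-type file sharing its batch and stem,
# e.g. metadata_single_elements/.../IMG_0206.json and metadata_website/.../IMG_0206.html
for path in sorted(DATA_ROOT.glob(f"*/{batch}/{image}.*")):
    print(f"{path.parent.parent.name:30s} {path.name}")
```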
The project's data and code are split between different sources. Image files are available from the KAUST data repository, while metadata and source code are available in this GitHub repo.
Jpg and raw files are available from the Globus datastore (notes on downloading):
- photos
- the photos in jpg format and raw format (see the sketch after this list for one way to develop the raw files)
- the total file size is around 4 TB
- the photographers were provided with this guidance document
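If you need to work with the raw files programmatically, the sketch below shows one possible approach using the third-party rawpy library (not part of this project). The file name and extension are placeholders; the raw format depends on the camera used by each photographer.

```python
import rawpy                 # third-party raw decoder; one option, not required by this project
from PIL import Image

# Placeholder path: the raw extension (.CR2, .NEF, .ARW, ...) depends on the camera used.
raw_path = "data/photos/tom_london_20220418/IMG_0206.CR2"

with rawpy.imread(raw_path) as raw:
    rgb = raw.postprocess()  # demosaic to an 8-bit RGB numpy array with default settings

Image.fromarray(rgb).save("IMG_0206_developed.png")
```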
Other metadata available from this repository:
- metadata_single_elements
- The crop information used to identify single rectangular samples of windows.
- These were created manually with the crop_tool.py script by Tom, Michaela, and Prem. The goals were to identify single windows that could be sent for per-pixel labelling and to regularise over the different equipment and styles of the many photographers.
- Clusters of windows were sometimes annotated as a single window if they had shared frames/interconnected patterns.
- With glass-façades, we tried to take 2-3 repetitions in each direction.
- We tried to take no more than 4 similar windows from one façade - in one or multiple images.
- We tried to keep the crops square where possible.
- The tags window, door, glass_facade, shop, church, and abnormal are treated as windows. Some other tags (materials, facades) are available but are not consistently applied.
- This file also contains per-photo tags such as deleted (the file should not be included in the dataset) and the rotation tags rot90, rot180, and rot270 (used where the EXIF rotation data is incorrect); see the sketch after this list for one way these tags might be applied.
- metadata_window_labels
- Per-pixel labels created by LYD for the first 3002 images. Annotated for the first 11 (12 including none) classes. The instructions given to the labellers were collated in this document.
- Described as polygons; per-pixel bitmap datasets can be created with process_labels.py.
- These labels should mostly not overlap.
- metadata_window_labels_2
- Per-pixel labels created by LYD for the next 6000 images. Annotated for 13 (14 including none) classes.
- These labels are defined with a z-order and may overlap; they are rendered in this order to create the per-pixel labels (see the rasterisation sketch after this list).
- metadata_location
- Describes the locations of the photographs.
- locations_data.json contains the per-batch location information used for creating per-image location description files.
- The locations are of varying degrees of accuracy. The location source for an image can be coarse (located only by a named city), camera (a GPS location captured by the camera), or track (a location computed from the photograph time and a separately captured GPS track). Several batches have no geolocation information available.
- We were unable to verify the location accuracy of the batches mussa_dar_es_salaam_20221212, jan_cebu_20230120, and sarabjot_newdehli_20230315.
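As an illustration of how the per-photo crop metadata might be consumed, here is a minimal Python sketch. The JSON field names (tags, crops, tag) are assumptions made for this example; check the actual files or the fast_crop scripts for the real schema.

```python
import json
from pathlib import Path

from PIL import Image

WINDOW_LIKE = {"window", "door", "glass_facade", "shop", "church", "abnormal"}

def load_photo_and_crops(photo_path: Path, crop_json_path: Path):
    """Skip deleted photos, apply manual rotation tags, and return the window-like crops."""
    meta = json.loads(crop_json_path.read_text())
    tags = meta.get("tags", [])                      # field name assumed

    if "deleted" in tags:                            # deleted photos are excluded from the dataset
        return None

    image = Image.open(photo_path)
    rotations = {"rot90": 90, "rot180": 180, "rot270": 270}
    for tag in tags:                                 # fix images whose EXIF rotation is wrong
        if tag in rotations:
            image = image.rotate(rotations[tag], expand=True)

    crops = [c for c in meta.get("crops", [])        # field names assumed
             if c.get("tag") in WINDOW_LIKE]
    return image, crops
```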
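Similarly, a minimal sketch of the kind of polygon-to-bitmap rasterisation performed by process_labels.py. The JSON keys (polygons, points, class, z) and the class-to-index mapping are assumptions for illustration; the real keys are defined by the label files and the processing script.

```python
import json
from pathlib import Path

import numpy as np
from PIL import Image, ImageDraw

def render_label_bitmap(label_json_path: Path, width: int, height: int,
                        class_to_index: dict) -> np.ndarray:
    """Rasterise labelled polygons into a per-pixel class map.

    Polygons are drawn in ascending z-order, so overlapping labels
    (as in metadata_window_labels_2) are resolved by the higher polygon.
    """
    labels = json.loads(label_json_path.read_text())
    canvas = Image.new("L", (width, height), 0)      # 0 = the 'none' class
    draw = ImageDraw.Draw(canvas)

    for poly in sorted(labels["polygons"], key=lambda p: p.get("z", 0)):  # keys assumed
        points = [tuple(pt) for pt in poly["points"]]                     # [[x, y], ...] assumed
        draw.polygon(points, fill=class_to_index[poly["class"]])

    return np.asarray(canvas)
```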
The following can be generated once you have the raw data and the metadata repository:
- metadata_website
- The very simple website created by the build_website.py script.
- An example of the site is currently hosted by miamiOh.
- It should be hosted on a webserver (it fetches html content for the batches).
- metadata_website/index.html shows all photographs. Select the radio button next to a batch to view all the photos for that batch. An icon with a red cross indicates the photo has been deleted; an icon showing labels indicates the photo has been labelled.
- Clicking on an icon will take you to the webpage summarising all available information for each photo.
- metadata_website/crops.html similarly shows an icon for each crop.
- metadata_website/map shows the locations of the photos. Zooming in and clicking on a blue marker will show an icon and link to the photo summary page.
- metadata_cook/dataset_cook_crops_{resolution}px_{time.time()}.zip
- Datasets of images cropped to particular sizes, with or without accompanying labels, created by the render_crops.py script (similarly for render_crops_and_labels.py); see the sketch after this list for inspecting the resulting archive.
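A small sketch for inspecting one of these cooked archives without extracting it; it assumes the zip simply contains the cropped image files, and the archive name below is a placeholder for whatever resolution and timestamp your run produced.

```python
import io
import zipfile

from PIL import Image

# Placeholder name: substitute the actual resolution and timestamp of your cooked zip.
cooked_zip = "metadata_cook/dataset_cook_crops_512px_1700000000.zip"

with zipfile.ZipFile(cooked_zip) as zf:
    image_names = [n for n in zf.namelist() if n.lower().endswith((".jpg", ".png"))]
    print(f"{len(image_names)} images in the archive")

    # Peek at the first crop to confirm its resolution.
    with zf.open(image_names[0]) as fh:
        crop = Image.open(io.BytesIO(fh.read()))
        print(crop.size)
```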
Data processing scripts from the code repository
A collection of scripts to process the data is available from the fast_crop repo. These include scripts to render crops and labels, build the website, crop photos, summarise and validate the dataset, and create some of the published figures.
- The dataset grew organically as resources and applications were added to the project, so early image metadata may be of lower quality than later metadata.
- The batches started as a day of photography (for Tom and Michaela) or a contract (one or two thousand images) for the Upwork freelancers.
- The "tom_archive_xxx" folders contain images taken before the project started, from a variety of hardware and locations. We mostly do not have location information for these.
- The exploratory datasets were created by Tom (on holiday) in the UK and Denmark; later, Michaela contributed images from Austria and Germany. In the third phase we contracted freelancers on the Upwork platform to collect images from other locations around the world. Adherence to the provided guidance document was generally good, but a minority of the freelance photographers did not follow it; their images were largely deleted during cropping.
- The guidance documents provided to the photographers and labellers evolved as edge-cases and issues with data collection were identified. For example, specific instructions for labelling unusual configurations of sunshades and Egyptian windows were added. The dates of major changes are noted at the top of these documents.
- There is an easy partition containing only easy labelled crops (about 4k images). We manually selected rectangular windows with few other classes. You can find the partition in the easy branch.
We would like to thank our lead photographers Michaela Nömayr and Florian Rist, and engineer Prem Chedella, as well as our contributing photographers: Aleksandr Aleshkin, Angela Markoska, Artur Oliveira, Brian Benton, Chris West, Christopher Byrne, Elsayed Saafan, Florian Rist, George Iliin, Ignacio De Barrio, Jan Cuales, Kaitlyn Jackson, Kalina Mondzholovska, Kubra Ayse Guzel, Lukas Bornheim, Maria Jose Balestena, Michaela Nömayr, Mihai-Alexandru Filoneanu, Mokhtari Sid Ahmed Salim, Mussa Ubapa, Nestor Angulo Caballero, Nicklaus Suarez, Peter Fountain, Prem Chedella, Samantha Martucci, Sarabjot Singh, Scarlette Li, Serhii Malov, Simon R. A. Kelly, Stephanie Foden, Surafel Sunara, Tadiyos Shala, Susana Gomez, Vasileios Notis, Yuan Yuan, and finally LYD for the labeling.
We also thank the Blender procedural artists Gabriel de Laubier for the UCP Wood material and Simon Thommes for the fantastic Br'cks material. Both were modified and used in our procedural model.
Please cite the paper below if you use our work.
@inproceedings{winsyn,
  title={WinSyn: A High Resolution Testbed for Synthetic Data},
  author={Tom Kelly and John Femiani and Peter Wonka},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  month={June},
  year={2024}
}