These tools were designed to (1) prepare orthomosaics for ingestion into computer vision systems that expect a collection of smaller images, (2) prepare a spatially referenced list of annotations that maps onto an orthomosaic for model training or validation, and (3) collate the output of a computer vision system or manual annotation tool into a single spatially referenced list of annotations that maps onto the source orthomosaic. The tools were developed for drone maps of seal colonies and rainforest vegetation, but they can be applied to any instance of object detection in large-format spatial imagery with minimal adaptation.
tile_orthomosaic splits a large-format spatial image into small tiles of a specified dimension, with a specified overlap between adjacent tiles, while preserving spatial metadata in a JSON file (default name: tiling_scheme.json) that can be used to reintegrate data from the unreferenced tiles into a spatially referenced dataset.
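In case it helps to see the idea, here is a minimal sketch of the tiling loop using rasterio. The tile size, overlap, filenames, and JSON fields here are illustrative assumptions, not the script's actual interface:

```python
import json
import rasterio
from rasterio.windows import Window

TILE_SIZE, OVERLAP = 1024, 128        # pixels; placeholders, not the script's defaults
STEP = TILE_SIZE - OVERLAP

scheme = {"tiles": []}
with rasterio.open("orthomosaic.tif") as src:
    # Record the source georeferencing once, so tile pixel offsets can be
    # mapped back to world coordinates later.
    scheme["crs"] = str(src.crs)
    scheme["transform"] = list(src.transform)[:6]
    for row_off in range(0, src.height, STEP):
        for col_off in range(0, src.width, STEP):
            window = Window(col_off, row_off,
                            min(TILE_SIZE, src.width - col_off),
                            min(TILE_SIZE, src.height - row_off))
            profile = src.profile.copy()
            # Tiles are written without georeferencing; the spatial context
            # lives entirely in tiling_scheme.json.
            profile.update(width=window.width, height=window.height,
                           crs=None, transform=None)
            name = f"tile_{row_off}_{col_off}.tif"
            with rasterio.open(name, "w", **profile) as dst:
                dst.write(src.read(window=window))
            scheme["tiles"].append({"file": name,
                                    "col_off": col_off, "row_off": row_off})

with open("tiling_scheme.json", "w") as f:
    json.dump(scheme, f, indent=2)
```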
crop_annotations_from_orthomosaic reads in a CSV of points with lat/lon coordinates, buffers the points into boxes, then crops each box from an orthomosaic. Output annotations are provided as tif files, with metadata encoded in the filename. The script could easily be modified to output metadata in a separate file, or to include more or different information in the filename. Spatial metadata is preserved in the source input data but not included in the output tif files; this too could be modified if desired.
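A rough sketch of the buffer-and-crop step might look like the following. The CSV column names ("lat", "lon", "label") and the buffer half-width are assumptions for illustration, not the script's actual parameters:

```python
import csv
import rasterio
from rasterio.windows import from_bounds

BUFFER = 1.0   # half-width of the box around each point, in the raster's CRS units

with rasterio.open("orthomosaic.tif") as src, open("points.csv") as f:
    for i, row in enumerate(csv.DictReader(f)):
        # NOTE: assumes the point coordinates are already in the raster's CRS;
        # if the raster is projected and the points are lat/lon, reproject
        # them first (e.g. with pyproj).
        x, y = float(row["lon"]), float(row["lat"])
        window = from_bounds(x - BUFFER, y - BUFFER, x + BUFFER, y + BUFFER,
                             transform=src.transform)
        chip = src.read(window=window)
        profile = src.profile.copy()
        # As described above, no spatial metadata is written to the output
        # tifs; the crop's location is encoded in the filename instead.
        profile.update(width=chip.shape[2], height=chip.shape[1],
                       crs=None, transform=None)
        with rasterio.open(f"{row['label']}_{x:.6f}_{y:.6f}_{i}.tif",
                           "w", **profile) as dst:
            dst.write(chip)
```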
global_annotations_to_tile reads in a CSV of points with lat/lon coordinates, buffers the points into boxes, then partitions the boxes to their appropriate tiles using tiling_scheme.json. The output annotations are provided in a format that can be read into VGG Image Annotator (https://www.robots.ox.ac.uk/~vgg/software/via/). Script inputs could easily be modified to receive a shapefile or KML instead of a CSV, and script outputs could easily be modified to yield an alternative annotation format.
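To illustrate the tile-assignment step, the sketch below routes a geographic box onto every tile it intersects and emits rows in VIA's CSV layout (a "rect" region per row). The tiling_scheme.json fields follow the tiling sketch above and are assumptions about the script's actual file layout:

```python
import csv, json
from affine import Affine

with open("tiling_scheme.json") as f:
    scheme = json.load(f)
inv = ~Affine(*scheme["transform"])   # world coordinates -> orthomosaic pixels
TILE_SIZE = 1024                      # must match the value used when tiling

def via_rows(lon_min, lat_min, lon_max, lat_max, label):
    """Assign one box (world coordinates) to every tile it intersects."""
    # North-up raster assumed: the max latitude maps to the top pixel row.
    (px0, py0), (px1, py1) = inv * (lon_min, lat_max), inv * (lon_max, lat_min)
    rows = []
    for t in scheme["tiles"]:
        # Clip the box to this tile's pixel extent.
        x0 = max(px0, t["col_off"]); x1 = min(px1, t["col_off"] + TILE_SIZE)
        y0 = max(py0, t["row_off"]); y1 = min(py1, t["row_off"] + TILE_SIZE)
        if x0 >= x1 or y0 >= y1:
            continue                  # no overlap with this tile
        shape = {"name": "rect",
                 "x": int(x0 - t["col_off"]), "y": int(y0 - t["row_off"]),
                 "width": int(x1 - x0), "height": int(y1 - y0)}
        rows.append([t["file"], 0, "{}", 1, 0,
                     json.dumps(shape), json.dumps({"class": label})])
    return rows

with open("via_annotations.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["filename", "file_size", "file_attributes", "region_count",
                "region_id", "region_shape_attributes", "region_attributes"])
    # w.writerows(via_rows(...)) for each buffered point
```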
tile_annotations_to_global reads in annotations in VGG Image Annotator CSV format - as might be generated during manual image annotation or by a computer vision system - and collates them into a spatially referenced dataset, output as a shapefile. Redundant annotations can arise from overlap between tiles or from permissive parameters in a computer vision output; we remove such redundancies using non-max suppression (NMS) on a within-class basis. NMS involves threshold parameters that determine which annotations are discarded; these can be customized in-script, though we have assigned values that have worked for us so far. We recommend inspecting the output of this script (probably in a GIS program) to confirm that it contains no redundancies and omits no important data points; some manual editing may still be required after NMS. Certain classes might entail high overlap among adjacent annotations (e.g. animals in coitus) that NMS cannot effectively distinguish; such cases might require annotating multiple individuals as a distinct class. Note the comments in the last cell: at the time of writing, the latest versions of GDAL and Fiona do not always play well together. I was able to reconcile these issues with a custom python environment, but your mileage may vary depending on your system. As usual, script inputs could easily be modified to ingest alternative annotation formats, and script outputs could easily be modified to yield an alternate export filetype or method - this might be preferable if you cannot get a viable Fiona installation working on your system.
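For reference, here is a minimal sketch of the within-class NMS idea; the 0.5 IoU threshold is a placeholder, not the value the script ships with:

```python
def iou(a, b):
    """Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms_per_class(dets, thresh=0.5):
    """dets: list of (box, score, cls). Keeps the highest-scoring box among
    overlapping detections, comparing only within the same class."""
    kept = []
    for box, score, cls in sorted(dets, key=lambda d: -d[1]):
        if all(c != cls or iou(box, b) < thresh for b, _, c in kept):
            kept.append((box, score, cls))
    return kept
```

Raising the threshold keeps more overlapping boxes (useful for genuinely clustered individuals), while lowering it discards more aggressively; this is the trade-off behind the manual-inspection recommendation above.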
Please direct any questions or comments to gl7176@gmail.com and I will assist if able!