Overview

Hootenanny:

  1. A gathering at which folksingers entertain often with the audience joining in

Conflation:

  1. Fancy word for merge

Hootenanny is an open source conflation tool developed with machine learning techniques to facilitate automated and semi-automated conflation of critical Foundation GEOINT features in the topographic domain. In short, it merges multiple maps into a single seamless map.

Hootenanny conflation occurs at the dataset level, where the user’s workflow determines the best reference dataset, source content, geometry, and attributes to transfer to the output map. Hootenanny's internal processing leverages the key value pair structure of OpenStreetMap (OSM) for improved utility and applicability to broader user groups. Normalized attributes can be used to aid in feature matching, and OSM’s free tagging system allows the map to include an unlimited number of attributes describing each feature.

Goals

  • Automatically combine geospatial features for decision making
  • Allow for reviewing and manually resolving features which cannot be automatically matched with sufficient certainty
  • Maintain geometry and attribute provenance for combined features
  • Create up-to-date routable transportation networks from multiple sources

Conflatable Feature Types

  • Areas
  • Buildings
  • Points of Interest (POIs)
  • Power Lines
  • Railways
  • Rivers
  • Roads

Additional feature types may have custom conflation routines created for them in JavaScript by using Hootenanny's Generic Conflation capability. Any feature that does not fit into the list above is conflated with Generic Geometry Conflation.

Conflation Workflows

  • Reference Conflation (default) - Keep the best of both - Conflate the best geometry and tag parts of map B into map A, favoring map A's data. Use this type of conflation when you want conflated output based on the best state of both input datasets.
  • Horizontal Conflation (aka Cookie Cutter Conflation) - Replace a section - Either 1) Define a region in map A and replace data in that region with data in the same region from map B OR 2) Define a region in map A to preserve and replace data outside of it with data outside of the region from map B. Use this type of conflation if you have a specific region of your dataset that you would like to replace with data from another dataset or you would like to surround your dataset with new data.
  • Differential Conflation - Add new features - Conflate map A with B where the only data added to the output from B is in areas that don't overlap with A. Use this type of conflation when you want to fill in holes in your dataset with data from another source without modifying any data in the first dataset.
  • Differential Conflation With Tags - Add new features and new tags to existing features - This workflow is the same as Differential Conflation with the added step of transferring tags to existing features in map A from features in map B. Conflate map A with B where only tags are transferred from B to matching features in A and entire features are added from B to A in areas where B does not overlap with A.
  • Attribute Conflation - Transfer attributes over to geometries - Conflate map A with B where only tags are transferred from B to matching features in A and no changes are made to A's geometries. Use this type of conflation when the first dataset's geometry is superior to the second dataset's, but the second dataset's attributes are superior to those of the first.
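
The difference between Differential Conflation and Attribute Conflation is easiest to see on toy data. The sketch below is not Hootenanny's implementation or API. It assumes features are plain objects with a hypothetical `id`, a bounding box `bbox` (`[minX, minY, maxX, maxY]`), and OSM-style `tags`, and it assumes the matches between A and B are already known. Letting B's value win when both maps carry the same key is also an assumption made for the example.

```javascript
// Toy illustration only: none of these names are part of Hootenanny.
// A feature is { id, bbox: [minX, minY, maxX, maxY], tags: {...} }.

// True when two bounding boxes intersect.
function overlaps(a, b) {
  return a[0] <= b[2] && b[0] <= a[2] && a[1] <= b[3] && b[1] <= a[3];
}

// Differential Conflation: keep all of A untouched and add only those
// features from B that do not overlap anything in A.
function differentialConflate(mapA, mapB) {
  const added = mapB.filter(fb => !mapA.some(fa => overlaps(fa.bbox, fb.bbox)));
  return mapA.concat(added);
}

// Attribute Conflation: keep A's geometry and transfer tags from the
// matching feature in B. `matches` maps an A id to a B id (assumed given).
// In this sketch B's value wins when both features carry the same key.
function attributeConflate(mapA, mapB, matches) {
  const byId = new Map(mapB.map(f => [f.id, f]));
  return mapA.map(fa => {
    const fb = byId.get(matches[fa.id]);
    if (!fb) return fa; // no match: A's feature is left as it was
    return { id: fa.id, bbox: fa.bbox, tags: Object.assign({}, fa.tags, fb.tags) };
  });
}

const mapA = [{ id: 'a1', bbox: [0, 0, 1, 1], tags: { highway: 'road' } }];
const mapB = [
  { id: 'b1', bbox: [0.5, 0.5, 1.5, 1.5], tags: { highway: 'primary', name: 'Main St' } },
  { id: 'b2', bbox: [5, 5, 6, 6], tags: { building: 'yes' } }
];

const diff = differentialConflate(mapA, mapB);
const attr = attributeConflate(mapA, mapB, { a1: 'b1' });
console.log(diff.map(f => f.id)); // a1 and b2; b1 overlaps a1 and is left out
console.log(attr[0].tags);        // A's geometry is unchanged, B's tags are merged in
```

In these terms, Differential Conflation With Tags would apply both steps: transfer tags onto matched features in A, then add the non-overlapping features from B.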

Attribute Translation

Hootenanny leverages the OSM key value pair tag concept to support translation between various data schemas and supports automated schema conversion between:

  • Topographic Data Store (TDS)
  • Multi-National Geospatial Co-Production Program (MGCP)
  • Geonames
  • OSM
  • others

Users can define their own custom translations in JavaScript or Python.
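
As a rough idea of what a custom JavaScript translation might look like, here is a minimal sketch that maps attributes from a made-up source schema onto OSM tags. The function name `translateToOsm`, its parameters, and the input columns (`TYPE`, `NAME`, `LANES`) are assumptions for illustration; consult the translation documentation for the entry points Hootenanny actually expects.

```javascript
// Sketch of a custom translation. The function name, its parameters and
// the input schema (TYPE, NAME, LANES) are assumptions for illustration.
const TYPE_TO_TAG = {
  ROAD: { highway: 'road' },
  RAIL: { railway: 'rail' },
  RIVER: { waterway: 'river' }
};

function translateToOsm(attrs, layerName, geometryType) {
  const tags = {};
  // Map the source feature type onto an OSM key value pair.
  Object.assign(tags, TYPE_TO_TAG[attrs.TYPE] || {});
  // Normalize free-text and numeric attributes into string tag values.
  if (attrs.NAME) tags.name = String(attrs.NAME).trim();
  if (attrs.LANES) tags.lanes = String(attrs.LANES);
  // In this sketch, null signals that no usable tags were produced.
  return Object.keys(tags).length > 0 ? tags : null;
}

const tags = translateToOsm({ TYPE: 'ROAD', NAME: ' Main St ', LANES: 2 }, 'roads', 'Line');
console.log(tags);
```

Keeping the schema mapping in a lookup table, as above, makes it straightforward to extend the translation to more feature types without touching the function body.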

Feature Validation and Cleaning

Hootenanny has a variety of map cleaning capabilities to automatically correct erroneous data. In addition to the built-in cleaning operations, Hootenanny is integrated with the feature cleaning capabilities in JOSM. For situations where you want feature validation only and no automatic cleaning, JOSM validation may be used alone. There is more information on Hootenanny validation and cleaning here.

Feature Filtering

Hootenanny lets you select which features from your data are conflated, which can save you some pre-conflation data wrangling. Some examples:

Hootenanny has many additional available filters that can also be specified to perform feature filtering during conflation.
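
Conceptually, filtering splits the input into features that take part in conflation and features that are set aside. The toy sketch below shows that idea with a tag-based predicate; the function name and the predicate style are hypothetical and do not reflect Hootenanny's own filter syntax or options.

```javascript
// Toy illustration: choose which features take part in conflation.
// The predicate style is an assumption, not Hootenanny's filter syntax.
function selectForConflation(features, predicate) {
  const selected = [];
  const setAside = [];
  for (const f of features) {
    (predicate(f) ? selected : setAside).push(f);
  }
  return { selected, setAside };
}

const features = [
  { id: 1, tags: { building: 'yes' } },
  { id: 2, tags: { highway: 'residential' } },
  { id: 3, tags: { building: 'school', name: 'Central School' } }
];

// Select buildings only; everything else is set aside in this sketch.
const isBuilding = f => 'building' in f.tags;
const split = selectForConflation(features, isBuilding);
console.log(split.selected.map(f => f.id));
console.log(split.setAside.map(f => f.id));
```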

When To Use

No automated map conflation technology is perfect. If you are conflating a relatively small number of features, need perfectly conflated output, and want to avoid spending time configuring Hootenanny options, you may be best served by conflating them manually.

For larger datasets, Hootenanny can be used standalone or as an initial step in conjunction with a crowdsourced campaign to conflate new data into your dataset. You will find that the conflation automation provided by Hootenanny saves effort overall, and that most inaccuracies in the conflated output fall within a small subset of the input data, which is flagged for human review so it can be corrected manually later.
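
The idea of flagging uncertain cases for human review can be sketched as a three-way decision on a match score. The thresholds, the `score` field, and the function name below are hypothetical and chosen only for illustration; they are not Hootenanny's actual values or API.

```javascript
// Toy illustration: route candidate pairs by match confidence.
// The thresholds and the `score` field are hypothetical.
function classify(score, matchThreshold = 0.8, missThreshold = 0.3) {
  if (score >= matchThreshold) return 'match'; // confident enough to merge automatically
  if (score <= missThreshold) return 'miss';   // confident the features are different
  return 'review';                             // uncertain: flag for a human
}

const candidates = [
  { a: 'a1', b: 'b1', score: 0.95 },
  { a: 'a2', b: 'b2', score: 0.55 },
  { a: 'a3', b: 'b3', score: 0.10 }
];

const outcome = candidates.map(c => ({ pair: c.a + '/' + c.b, result: classify(c.score) }));
console.log(outcome);
```

Under this scheme only the middle band reaches a reviewer, which is why automated conflation tends to save effort on larger datasets.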

Scalability

Hootenanny currently does not strive to conflate data at the global level. An earlier implementation of Hootenanny supported a map-reduce architecture that was capable of global conflation for some data types, but it was shelved due to a general lack of interest and the cost of maintaining a seldom-used capability. As a result, some of the conflation algorithms are still capable of supporting distributed computing, with some limitations.

Hootenanny generally scales well on a single machine from the larger city level up to the smaller country level, depending on the density of the data being conflated and the RAM available on the machine. Beyond that, new algorithms may need to be developed to handle very large quantities of map data.

Configuration

There is a wide range of configuration options available to customize the conflation and translation workflows.

Web User Interface

Hootenanny's web user interface is built upon the open source Mapbox iD Editor, which provides an intuitive and user-friendly conflation experience.

Web Services API

Access to Hootenanny's core capabilities is exposed through a web services API for those wishing to develop their own conflation clients. The web services use OAuth authentication.

Command Line Interface

Command line access is available to aid in custom scripting of conflation capabilities.

Example:

# Conflate two datasets together
hoot conflate input1.osm input2.osm output.osm

More examples

Programming Language Bindings

Hootenanny has Node.js bindings available which expose core conflation capabilities for creating custom workflows.

Example:

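// Conflate two datasets together (requires the HOOT_HOME environment variable)
var hoot = require(process.env.HOOT_HOME + '/lib/HootJs');
var map = new hoot.OsmMap();
// Load both inputs into the same map; the last argument marks which input each feature came from
hoot.loadMap(map, "input1.osm", false, 1);
hoot.loadMap(map, "input2.osm", false, 2);
// Conflate the map in place, then write the result
new hoot.UnifyingConflator().apply(map);
hoot.saveMap(map, "output.osm");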

Additional Features

In addition to running conflation jobs with map data, Hootenanny also provides finer-grained capabilities:

Documentation

Installation

Support

Don't hesitate to ask for help if features aren't conflating how you expect them to or if you're experiencing difficulty while installing the software. If you have any support questions, please create an issue in this repository.

As there are a lot of different conflation scenarios out in the wild, there is no one-size-fits-all conflation workflow or algorithm. Hootenanny attempts to capture most conflation scenarios with the default configuration options, but sometimes you will need to modify configuration options specific to the data you are conflating in order to get the best results.

Additionally, new software features may reach the user interface multiple development cycles after they first become available in the CLI. If you find a conflation feature you wish to use that is mentioned in the CLI documentation but is not present in the UI, let us know.

Development