Skip to content

Latest commit

 

History

History
500 lines (325 loc) · 18.3 KB

HootenannyConfigOptions.asciidoc

File metadata and controls

500 lines (325 loc) · 18.3 KB

Configuration Options

Configuration settings can dramatically change the operation of Hootenanny. The configuration options can be set in a configuration file ($HOOT_HOME/conf/hoot.json), environment variables or specified on the command line. Most of the configuration values are documented in more detail in the following sections.

These configuration options are advanced manipulations of the Hootenanny engine and may result in unexpected results. Please create an issue at https://github.com/ngageoint/hootenanny if you have specific questions.

Specifying Configuration Options

There are two ways to specify configuration options on the command line interface.

  1. Specify a single option with --define or -D

  2. Specify multiple options in a configuration file with --conf or -C

An example using -D is below:

hoot convert \
  -D ogr.writer.script=test-files/io/SampleTranslation.js \
  -D ogr.writer.pre.layer.name=bar \
  test-files/io/SampleTranslation.osm \
  PG:"dbname='osm_gis2' host='localhost' port='5432' user='hoot' password='hoottest'"

If you want to specify some values for a short iteration, or within a script you can specify the settings in the HOOT_OPTIONS environment variables. This environment variable uses the same syntax as the command line. For example:

export HOOT_OPTIONS="-D ogr.writer.script=test-files/io/SampleTranslation.js \
  -D ogr.writer.pre.layer.name=bar"
hoot test-files/io/SampleTranslation.osm \
  PG:"dbname='osm_gis2' host='localhost' port='5432' user='hoot' password='hoottest'"

If you find yourself using a set of defines routinely you can specify them within a JSON configuration file such as:

{
  "ogr.writer.script" : "test-files/io/SampleTranslation.js",
  "ogr.writer.pre.layer.name" : "bar"
}

If this file is placed in $HOOT_HOME/conf/ then it will be found with a relative path. E.g.

hoot convert \
  -C Example.json \
  test-files/io/SampleTranslation.osm \
  PG:"dbname='osm_gis2' host='localhost' port='5432' user='hoot' password='hoottest'"

The $HOOT_HOME/conf/hoot.json configuration file is used with all commands without specifying the configuration on the command line.

Conflation Configuration

There are a number of other options that can impact the lower working conflation functions (such as tag merging), but these options impact the top level conflation operation.

Configuration parameters:

  • add.review.tags - This enables a number of review tags that will be created when a feature match should be reviewed by a user. The valid values are yes for enabled and empty for disabled.

  • <feature-type>.match.threshold - where feature-type = building, highway, or poi; The threshold for calling a relationship a match. The higher the value the lower the True Positive Results (TPR), but likely also the lower the False Positive Results (FPR).

  • <feature-type>.miss.threshold - where feature-type = building, highway, or poi; The higher the value the lower the True Positive Results (TPR), but likely also the lower the False Positive Results (FPR).

Map Writer Configuration

OGR Writer

The OGR Writer supports advanced translation of OSM formatted files into well defined tabular formats such as Shapefile and FileGDB. See the --convert documentation for specific examples of how to use the OGR Writer.

Configuration Parameters:

  • ogr.writer.create.all.layers - A boolean value ( true or false ). If true all layers defined in the schema are created whether or not they contain features. If false (the default) then only layers that contain features are created.

  • ogr.writer.pre.layer.name - A string to prepend onto layer names before writing. This is most useful when exporting a number of layers into a single database.

  • ogr.writer.script - The script to use for performing the translations.

  • ogr.strict.checking - Takes one of three values; on , off or warn . This reports issues between the defined schema and the produced output. If strict checking is on then an error will be reported and translation will stop. If warn is specified a warning will be reported in most instances and the program will continue running. If off is specified then most common errors are silenced. Defaults to on.

  • ogr.split.o2s - A boolean value ( true or false ). If true, the list of OSM tags for o2s (Out Of Spec) features will be split into four 254 character attributes on export. This is to avoid errors when exporting to ESRI Shapefiles.

  • ogr.esri.fcsubtype - A boolean value ( true or false ). If true, the ESRI specific FcSubType field and associated values are added to the output.

This driver will likely work with most supported OGR formats, but is only actively tested with Shapefile and FileGDB. FileGDB will only work if it has been compiled into your version of GDAL. If not, see this .

Examples:

# Convert example .osm file and write the resulting layers to a PostGIS database.
# The PostGIS database must be initialized first (using instructions found with the PostGIS documentation).
hoot convert \
  -D ogr.writer.script=test-files/io/SampleTranslation.js \
  -D ogr.writer.pre.layer.name=bar \
  test-files/io/SampleTranslation.osm \
  PG:"dbname='osm_gis2' host='localhost' port='5432' user='hoot' password='hoottest'"

# Same as above.
hoot convert \
  -D ogr.writer.pre.layer.name=bar \
  test-files/io/SampleTranslation.osm \
  PG:"dbname='osm_gis2' host='localhost' port='5432' user='hoot' password='hoottest'" --trans test-files/io/SampleTranslation.js

# Convert example .osm file and write the resulting layers to a directory full of Shapefiles.
# The directory will be named tmp/MyShapefiles, not tmp/MyShapefiles.shp
hoot convert \
  -D ogr.writer.script=test-files/io/SampleTranslation.js \
  test-files/io/SampleTranslation.osm \
  tmp/MyShapefiles.shp

Map Operation Configuration

Rubber Sheeting

The hoot::RubberSheet operation will find common high confidence intersection matches and rubber sheet the two data sets to bring them into alignment.

Example usage:

hoot conflate -D map.cleaner.transforms=hoot::RubberSheet test-files/ToyTestA.osm test-files/ToyTestB.osm tmp/dum.osm

Splitting Ways

The hoot::WaySplitterOp will split ways greater than a specified max length into ways that are less than or equal to max length. All distance calculations are performed in a projected plane.

The parameter way.splitter.max.length can be used to specify this max length.

Example usage:

hoot clean -D map.cleaner.transforms=hoot::WaySplitterOp -D way.splitter.max.length=100 test-files/ToyTestA.osm tmp/dum.osm

Perty Duplicate POI

Randomly duplicates POIs in the given map.

Hoot proper name: hoot::PertyDuplicatePoiOp

Configuration parameters:

  • perty.duplicate.poi.p - The probability that any feature will be duplicated.

  • perty.duplicate.poi.duplicate.sigma - The number of times a feature will be duplicated where the number of times is defined as round(abs(N(0, sigma^2^))) + 1 . Setting sigma to 0 will guarantee that it will only be duplicated one time.

  • perty.duplicate.poi.move.multiplier - Determines the position of the new point where the distance from the old point is given as N(0, sigma^2^) x moveMultiplier where sigma is the circular error of the reference point divided by two.

Perty Duplicate POI Example Use:
hoot perty -D perty.ops=hoot::PertyDuplicatePoiOp -D perty.duplicate.poi.p=0.5 TigerDc.osm Pertified.osm

Map Visitor Configuration

Map visitors perform operations on individual elements in a map. For example, reprojection is a simple visitor operation where each element can be changed without direct knowledge of any other element.

Perty Way Split

Randomly split ways on the given map.

Hoot proper name: hoot::PertyWaySplitVisitor

Configuration parameters:

  • perty.way.split.probability - The probability between 0.0 and 1.0 that a way will be generalized into a simpler form. Defaults to 0.1.

  • perty.way.split.min.node.spacing - The minimum spacing in meters that may occur between nodes created by way splits. Defaults to 1.0.

Perty Way Split Example Use:
hoot perty -D perty.ops=hoot::PertyWaySplitVisitor -D perty.way.split.probability=0.5 TigerDc.osm Pertified.osm

Perty Way Generalize

Randomly generalize (simplify) ways on the given map by removing nodes from them.

Hoot proper name: hoot::PertyWayGeneralizeVisitor

Configuration parameters:

  • perty.way.generalize.probability - The probability between 0.0 and 1.0 that a way will be generalized into a simpler form. Defaults to 0.1.

  • perty.way.generalize.epsilon - Distance parameter that determines to what degree a way is generalized. Higher values result in more generalization (more nodes are removed from a way). Defaults to 5.0.

Perty Way Generalize Example Use:
hoot perty -D perty.ops=hoot::PertyWayGeneralizeVisitor -D perty.way.generalize.probability=0.5 -D perty.way.generalize.epsilon=5.0 TigerDc.osm Pertified.osm

Perty Remove Element

Randomly remove elements from the given map.

Hoot proper name: hoot::PertyRemoveRandomElementVisitor

Configuration parameters:

  • perty.remove.random.p - The probability that any element will be removed. Defaults to 0.05.

Perty Remove Element Example Use:
hoot perty -D perty.ops=hoot::PertyRemoveRandomElementVisitor -D perty.remove.random.p=0.5 TigerDc.osm Pertified.osm

Perty Remove Tag

Randomly remove tags from the given map.

Hoot proper name: hoot::PertyRemoveTagVisitor

Configuration parameters:

  • perty.remove.tag.p - The probability that any tag will be removed. Defaults to 0.05.

  • perty.remove.tag.visitor.exempt.tag.keys - A list of tag key names which are exempt from the random tag removal done by PertyRemoveTagVisitor. This is useful for preventing the removal of tags Hootenanny relies on during conflation.

  • perty.remove.tag.visitor.substitution.keys - A list of tag keys which, rather than being removed by the PertyRemoveTagVisitor, will have their values replaced instead. The tag keys in the list match one to one with the replacement values in perty.remove.tag.visitor.subsitution.values.

  • perty.remove.tag.visitor.substitution.values - A list of tag values which, should be substituted by PertyRemoveTagVisitor. The tag values in the list match one to one with the replacement values in perty.remove.tag.visitor.subsitution.keys.

Perty Remove Tag Example Use:
hoot perty -D perty.ops=hoot::PertyRemoveTagVisitor -D perty.remove.tag.p=0.5 -D perty.remove.tag.visitor.exempt.tag.keys="REF1;REF2" -D perty.remove.tag.visitor.substitution.keys=highway -D perty.remove.tag.visitor.substitution.values=road TigerDc.osm Pertified.osm

Perty Name Visitor

Randomly changes names by replacing characters with random characters or randomly swapping characters.

If a character is marked for change then there are even odds that it will be swapped or replaced. If the character is swapped it can be swapped with any character in the name (uniformly distributed). If the character is replaced it will be randomly replaced by a-z or space (uniformly distributed). Any script will be handled properly, but the random characters used for replacement do not change.

Hoot proper name: hoot::PertyNameVisitor

Configuration parameters:

  • perty.name.p - The probability that any name will be changed. Defaults to 0.05.

  • perty.name.change.p - The probability that any character will be changed in a name. Defaults to 0.05.

Perty Name Visitor Example Use:
hoot perty -D perty.ops=hoot::PertyNameVisitor -D perty.name.change.p=0.5 TigerDc.osm Pertified.osm

Tag Merging Configuration

When two features/elements are merged into a single feature/element the tags must be merged into a single set of tags. Depending on the use case there may be many different desirable ways to merge tags. Below are the options exposed to the user for merging tags. The default way to set this configuration for many functions is through the tag.merger.default configuration key. Some older operations may not support configurable merging techniques. The default value is hoot::AverageTagMerger .

Overwrite Tag 1

Use the value hoot::OverwriteTag1Merger . For example:

hoot conflate -D tag.merger.default=hoot::OverwriteTag1Merger input1.osm input2.osm output.osm

Overwrite the values in the first input with the values in the second input. This will still keep all unique names, but all other tags will simply be overwritten. For example:

Tags on First Input

key

value

name

Simonovsky Val Expressway

name:en

Simonovsky Val Expressway

highway

trunk

ref

RU 24

oneway

yes

lanes

2

uuid

foo

Tags on Second Input

key

value

highway

secondary

name

Simonovsky Val Expy

name:en

Simonovsky Val Expressway

name:ru

улица Симоновский Вал

ref

24

oneway

true

bridge

yes

uuid

bar

Result

key

value

name

Simonovsky Val Expy

name:en

Simonovsky Val Expressway

name:ru

улица Симоновский Вал

alt_name

Simonovsky Val Expressway;RU 24

bridge

yes

highway

secondary

ref

24

oneway

true

lanes

2

uuid

bar

Overwrite Tag 2

Use the value hoot::OverwriteTag2Merger . For example:

hoot conflate -D tag.merger.default=hoot::OverwriteTag2Merger input1.osm input2.osm output.osm

Overwrite the values in the second input with the values in the first input. This will still keep all unique names, but all other tags will simply be overwritten. For example:

Tags on Input 1

key

value

name

Simonovsky Val Expressway

name:en

Simonovsky Val Expressway

highway

trunk

ref

RU 24

oneway

yes

lanes

2

uuid

foo

Tags on Input 2

key

value

highway

secondary

name

Simonovsky Val Expy

name:en

Simonovsky Val Expressway

name:ru

улица Симоновский Вал

ref

24

oneway

true

bridge

yes

uuid

bar

Result

key

value

name

Midland Expressway

name:en

Midland Expressway

name:ru

улица Симоновский Вал

alt_name

24;Simonovsky Val Expy

bridge

yes

highway

trunk

ref

RU 24

oneway

yes

lanes

2

uuid

foo

Average Tags

Use an internal method to average the tags between two features.

Use the value hoot::AverageTagMerger . For example:

hoot conflate -D tag.merger.default=hoot::AverageTagMerger input1.osm input2.osm output.osm
Generalize Tag Merger

Use an internal method to generalize the tags between two features. For instance, generalizing between highway=primary and highway=secondary results in highway=road since road is the least general form that represents both primary and secondary.

Use the value hoot::GeneralizeTagMerger . For example:

hoot conflate -D tag.merger.default=hoot::GeneralizeTagMerger input1.osm input2.osm output.osm
Provenance Aware Overwrite Tag Merger

This tag merger assumes features in different input files have their provenance tracked with a tag with the key "hoot:source". This tag merger will ensure that tag value is not overwritten for either feature being merged. Instead, the values from both features are concatenated together.

Use the value hoot::ProvenanceAwareOverwriteTagMerger . For example:

hoot conflate -D tag.merger.default=hoot::ProvenanceAwareOverwriteTagMerger input1.osm input2.osm output.osm

Road Conflation Configuration

This will be replaced with new documentation in the not-so-distant future as our conflation routines evolve.

For road conflation only you can specify manipulators that determine how the conflation routines run. There are many experimental operations available, but the two most useful are listed below.

Standard Way Manipulator

Matches and merges a partial or full way to a partial or full way. The result is a new, averaged, conflated way as well as one or more partial "left-overs". This is the default configuration option and can be run with:

hoot conflate input1.osm input2.osm output.osm

or

hoot conflate -D conflator.manipulators=hoot::WayMerger input1.osm input2.osm output.osm

Snap Way Manipulator

This is most useful if you have one input data set that clearly has superior geometry.

Matches and merges a partial or full way to a partial or full way. The result is a new conflated way where the second input is snapped to the first input as well as one or more partial "left-overs". This can be run with:

hoot conflate -D conflator.manipulators=hoot::WaySnapMerger input1.osm input2.osm output.osm

There is also a convenience configuration file that enables both the Snap Way Manipulator( conflator.manipulators=hoot::WaySnapMerger ) and the Overwrite Tag 2tag merging operation ( tag.merger.default=hoot::OverwriteTag2Merger ). This is most useful if the first input is superior in terms of geometry and attribution and the second data set is just supplementing with missing attributes and geometries. This can be run with:

hoot conflate -C core/Snap.json input1.osm input2.osm output.osm

Configuration Parameters