Configuration settings can dramatically change the operation of Hootenanny. The configuration options can be set in a configuration file ($HOOT_HOME/conf/hoot.json), environment variables or specified on the command line. Most of the configuration values are documented in more detail in the following sections.
These configuration options are advanced manipulations of the Hootenanny engine and may result in unexpected results. Please create an issue at https://github.com/ngageoint/hootenanny if you have specific questions.
There are two ways to specify configuration options on the command line interface.
-
Specify a single option with --define or -D
-
Specify multiple options in a configuration file with --conf or -C
An example using -D is below:
hoot convert \ -D ogr.writer.script=test-files/io/SampleTranslation.js \ -D ogr.writer.pre.layer.name=bar \ test-files/io/SampleTranslation.osm \ PG:"dbname='osm_gis2' host='localhost' port='5432' user='hoot' password='hoottest'"
If you want to specify some values for a short iteration, or within a script you
can specify the settings in the HOOT_OPTIONS
environment variables. This
environment variable uses the same syntax as the command line. For example:
export HOOT_OPTIONS="-D ogr.writer.script=test-files/io/SampleTranslation.js \ -D ogr.writer.pre.layer.name=bar" hoot test-files/io/SampleTranslation.osm \ PG:"dbname='osm_gis2' host='localhost' port='5432' user='hoot' password='hoottest'"
If you find yourself using a set of defines routinely you can specify them within a JSON configuration file such as:
{ "ogr.writer.script" : "test-files/io/SampleTranslation.js", "ogr.writer.pre.layer.name" : "bar" }
If this file is placed in $HOOT_HOME/conf/ then it will be found with a relative path. E.g.
hoot convert \ -C Example.json \ test-files/io/SampleTranslation.osm \ PG:"dbname='osm_gis2' host='localhost' port='5432' user='hoot' password='hoottest'"
The $HOOT_HOME/conf/hoot.json configuration file is used with all commands without specifying the configuration on the command line.
There are a number of other options that can impact the lower working conflation functions (such as tag merging), but these options impact the top level conflation operation.
Configuration parameters:
-
add.review.tags - This enables a number of review tags that will be created when a feature match should be reviewed by a user. The valid values are yes for enabled and empty for disabled.
-
<feature-type>.match.threshold - where feature-type = building, highway, or poi; The threshold for calling a relationship a match. The higher the value the lower the True Positive Results (TPR), but likely also the lower the False Positive Results (FPR).
-
<feature-type>.miss.threshold - where feature-type = building, highway, or poi; The higher the value the lower the True Positive Results (TPR), but likely also the lower the False Positive Results (FPR).
The OGR Writer supports advanced translation of OSM formatted files into well defined tabular formats such as Shapefile and FileGDB. See the --convert documentation for specific examples of how to use the OGR Writer.
Configuration Parameters:
-
ogr.writer.create.all.layers - A boolean value ( true or false ). If true all layers defined in the schema are created whether or not they contain features. If false (the default) then only layers that contain features are created.
-
ogr.writer.pre.layer.name - A string to prepend onto layer names before writing. This is most useful when exporting a number of layers into a single database.
-
ogr.writer.script - The script to use for performing the translations.
-
ogr.strict.checking - Takes one of three values; on , off or warn . This reports issues between the defined schema and the produced output. If strict checking is on then an error will be reported and translation will stop. If warn is specified a warning will be reported in most instances and the program will continue running. If off is specified then most common errors are silenced. Defaults to on.
-
ogr.split.o2s - A boolean value ( true or false ). If true, the list of OSM tags for o2s (Out Of Spec) features will be split into four 254 character attributes on export. This is to avoid errors when exporting to ESRI Shapefiles.
-
ogr.esri.fcsubtype - A boolean value ( true or false ). If true, the ESRI specific FcSubType field and associated values are added to the output.
This driver will likely work with most supported OGR formats, but is only actively tested with Shapefile and FileGDB. FileGDB will only work if it has been compiled into your version of GDAL. If not, see this .
Examples:
# Convert example .osm file and write the resulting layers to a PostGIS database. # The PostGIS database must be initialized first (using instructions found with the PostGIS documentation). hoot convert \ -D ogr.writer.script=test-files/io/SampleTranslation.js \ -D ogr.writer.pre.layer.name=bar \ test-files/io/SampleTranslation.osm \ PG:"dbname='osm_gis2' host='localhost' port='5432' user='hoot' password='hoottest'" # Same as above. hoot convert \ -D ogr.writer.pre.layer.name=bar \ test-files/io/SampleTranslation.osm \ PG:"dbname='osm_gis2' host='localhost' port='5432' user='hoot' password='hoottest'" --trans test-files/io/SampleTranslation.js # Convert example .osm file and write the resulting layers to a directory full of Shapefiles. # The directory will be named tmp/MyShapefiles, not tmp/MyShapefiles.shp hoot convert \ -D ogr.writer.script=test-files/io/SampleTranslation.js \ test-files/io/SampleTranslation.osm \ tmp/MyShapefiles.shp
The hoot::RubberSheet operation will find common high confidence intersection matches and rubber sheet the two data sets to bring them into alignment.
Example usage:
hoot conflate -D map.cleaner.transforms=hoot::RubberSheet test-files/ToyTestA.osm test-files/ToyTestB.osm tmp/dum.osm
The hoot::WaySplitterOp will split ways greater than a specified max length into ways that are less than or equal to max length. All distance calculations are performed in a projected plane.
The parameter way.splitter.max.length can be used to specify this max length.
Example usage:
hoot clean -D map.cleaner.transforms=hoot::WaySplitterOp -D way.splitter.max.length=100 test-files/ToyTestA.osm tmp/dum.osm
Randomly duplicates POIs in the given map.
Hoot proper name: hoot::PertyDuplicatePoiOp
Configuration parameters:
-
perty.duplicate.poi.p - The probability that any feature will be duplicated.
-
perty.duplicate.poi.duplicate.sigma - The number of times a feature will be duplicated where the number of times is defined as round(abs(N(0, sigma^2^))) + 1 . Setting sigma to 0 will guarantee that it will only be duplicated one time.
-
perty.duplicate.poi.move.multiplier - Determines the position of the new point where the distance from the old point is given as N(0, sigma^2^) x moveMultiplier where sigma is the circular error of the reference point divided by two.
Map visitors perform operations on individual elements in a map. For example, reprojection is a simple visitor operation where each element can be changed without direct knowledge of any other element.
Randomly split ways on the given map.
Hoot proper name: hoot::PertyWaySplitVisitor
Configuration parameters:
-
perty.way.split.probability - The probability between 0.0 and 1.0 that a way will be generalized into a simpler form. Defaults to 0.1.
-
perty.way.split.min.node.spacing - The minimum spacing in meters that may occur between nodes created by way splits. Defaults to 1.0.
Randomly generalize (simplify) ways on the given map by removing nodes from them.
Hoot proper name: hoot::PertyWayGeneralizeVisitor
Configuration parameters:
-
perty.way.generalize.probability - The probability between 0.0 and 1.0 that a way will be generalized into a simpler form. Defaults to 0.1.
-
perty.way.generalize.epsilon - Distance parameter that determines to what degree a way is generalized. Higher values result in more generalization (more nodes are removed from a way). Defaults to 5.0.
Randomly remove elements from the given map.
Hoot proper name: hoot::PertyRemoveRandomElementVisitor
Configuration parameters:
-
perty.remove.random.p - The probability that any element will be removed. Defaults to 0.05.
Randomly remove tags from the given map.
Hoot proper name: hoot::PertyRemoveTagVisitor
Configuration parameters:
-
perty.remove.tag.p - The probability that any tag will be removed. Defaults to 0.05.
-
perty.remove.tag.visitor.exempt.tag.keys - A list of tag key names which are exempt from the random tag removal done by PertyRemoveTagVisitor. This is useful for preventing the removal of tags Hootenanny relies on during conflation.
-
perty.remove.tag.visitor.substitution.keys - A list of tag keys which, rather than being removed by the PertyRemoveTagVisitor, will have their values replaced instead. The tag keys in the list match one to one with the replacement values in perty.remove.tag.visitor.subsitution.values.
-
perty.remove.tag.visitor.substitution.values - A list of tag values which, should be substituted by PertyRemoveTagVisitor. The tag values in the list match one to one with the replacement values in perty.remove.tag.visitor.subsitution.keys.
Randomly changes names by replacing characters with random characters or randomly swapping characters.
If a character is marked for change then there are even odds that it will be swapped or replaced. If the character is swapped it can be swapped with any character in the name (uniformly distributed). If the character is replaced it will be randomly replaced by a-z or space (uniformly distributed). Any script will be handled properly, but the random characters used for replacement do not change.
Hoot proper name: hoot::PertyNameVisitor
Configuration parameters:
-
perty.name.p - The probability that any name will be changed. Defaults to 0.05.
-
perty.name.change.p - The probability that any character will be changed in a name. Defaults to 0.05.
When two features/elements are merged into a single feature/element the tags must be merged into a single set of tags. Depending on the use case there may be many different desirable ways to merge tags. Below are the options exposed to the user for merging tags. The default way to set this configuration for many functions is through the tag.merger.default configuration key. Some older operations may not support configurable merging techniques. The default value is hoot::AverageTagMerger .
Use the value hoot::OverwriteTag1Merger . For example:
hoot conflate -D tag.merger.default=hoot::OverwriteTag1Merger input1.osm input2.osm output.osm
Overwrite the values in the first input with the values in the second input. This will still keep all unique names, but all other tags will simply be overwritten. For example:
Tags on First Input
key |
value |
name |
Simonovsky Val Expressway |
name:en |
Simonovsky Val Expressway |
highway |
trunk |
ref |
RU 24 |
oneway |
yes |
lanes |
2 |
uuid |
foo |
Tags on Second Input
key |
value |
highway |
secondary |
name |
Simonovsky Val Expy |
name:en |
Simonovsky Val Expressway |
name:ru |
улица Симоновский Вал |
ref |
24 |
oneway |
true |
bridge |
yes |
uuid |
bar |
Result
key |
value |
name |
Simonovsky Val Expy |
name:en |
Simonovsky Val Expressway |
name:ru |
улица Симоновский Вал |
alt_name |
Simonovsky Val Expressway;RU 24 |
bridge |
yes |
highway |
secondary |
ref |
24 |
oneway |
true |
lanes |
2 |
uuid |
bar |
Use the value hoot::OverwriteTag2Merger . For example:
hoot conflate -D tag.merger.default=hoot::OverwriteTag2Merger input1.osm input2.osm output.osm
Overwrite the values in the second input with the values in the first input. This will still keep all unique names, but all other tags will simply be overwritten. For example:
Tags on Input 1
key |
value |
name |
Simonovsky Val Expressway |
name:en |
Simonovsky Val Expressway |
highway |
trunk |
ref |
RU 24 |
oneway |
yes |
lanes |
2 |
uuid |
foo |
Tags on Input 2
key |
value |
highway |
secondary |
name |
Simonovsky Val Expy |
name:en |
Simonovsky Val Expressway |
name:ru |
улица Симоновский Вал |
ref |
24 |
oneway |
true |
bridge |
yes |
uuid |
bar |
Result
key |
value |
name |
Midland Expressway |
name:en |
Midland Expressway |
name:ru |
улица Симоновский Вал |
alt_name |
24;Simonovsky Val Expy |
bridge |
yes |
highway |
trunk |
ref |
RU 24 |
oneway |
yes |
lanes |
2 |
uuid |
foo |
Use an internal method to average the tags between two features.
Use the value hoot::AverageTagMerger . For example:
hoot conflate -D tag.merger.default=hoot::AverageTagMerger input1.osm input2.osm output.osm
Use an internal method to generalize the tags between two features. For instance, generalizing between highway=primary and highway=secondary results in highway=road since road is the least general form that represents both primary and secondary.
Use the value hoot::GeneralizeTagMerger . For example:
hoot conflate -D tag.merger.default=hoot::GeneralizeTagMerger input1.osm input2.osm output.osm
This tag merger assumes features in different input files have their provenance tracked with a tag with the key "hoot:source". This tag merger will ensure that tag value is not overwritten for either feature being merged. Instead, the values from both features are concatenated together.
Use the value hoot::ProvenanceAwareOverwriteTagMerger . For example:
hoot conflate -D tag.merger.default=hoot::ProvenanceAwareOverwriteTagMerger input1.osm input2.osm output.osm
This will be replaced with new documentation in the not-so-distant future as our conflation routines evolve.
For road conflation only you can specify manipulators that determine how the conflation routines run. There are many experimental operations available, but the two most useful are listed below.
Matches and merges a partial or full way to a partial or full way. The result is a new, averaged, conflated way as well as one or more partial "left-overs". This is the default configuration option and can be run with:
hoot conflate input1.osm input2.osm output.osm
or
hoot conflate -D conflator.manipulators=hoot::WayMerger input1.osm input2.osm output.osm
This is most useful if you have one input data set that clearly has superior geometry.
Matches and merges a partial or full way to a partial or full way. The result is a new conflated way where the second input is snapped to the first input as well as one or more partial "left-overs". This can be run with:
hoot conflate -D conflator.manipulators=hoot::WaySnapMerger input1.osm input2.osm output.osm
There is also a convenience configuration file that enables both the Snap Way Manipulator( conflator.manipulators=hoot::WaySnapMerger ) and the Overwrite Tag 2tag merging operation ( tag.merger.default=hoot::OverwriteTag2Merger ). This is most useful if the first input is superior in terms of geometry and attribution and the second data set is just supplementing with missing attributes and geometries. This can be run with:
hoot conflate -C core/Snap.json input1.osm input2.osm output.osm