Thomas Schwotzer edited this page Oct 5, 2020 · 16 revisions

Convert OSM to OHDM data

I2O figure

java -classpath <your-jdbc-driver.jar> -jar OHDMConverter.jar -i intermediate_parameter -d ohdm_parameter

A fresh OHDM instance is usually set up from an OSM file that was first parsed into the intermediate database.

In this second step, geo objects, geometries and their relations are extracted from the original OSM data. The process has three steps: processing nodes, ways and relations.

A little movie shows the process with example data.

About objects and geometries in the flux of time

Geographic systems always deal with two things: objects and geometries, even if they don't do so explicitly. A geometry is the shape of something and is best described as a vector geometry. The something being described is the object. See also the n-m relationship between objects and geometries found in every GIS.

OSM mixes both concepts, which is perfectly fine because OSM does not deal with time. Objects and geometries tend to be persistent; their relationship changes more often. Imagine a shop moving from location A to B.

The geometries of A and B will not (necessarily) change. Neither will the shop object: it is still the same shop, just at another location. What changed is the relationship between the object (the shop) and its geometry (from A to B). A system that deals with historic location-based data must describe that relationship explicitly.

Converting OSM

Examples:

<node id="xx" ... lat=".." lon=".."/>

describes a pure geometry. It carries no object information. This node (point) will only (!) be used as part of a way or relation, not as an object by itself.

<node id="xx" ... lat=".." lon="..">
  <tag k='highway' v='traffic_signals' />
</node>

That node describes a geometry (a point) but also an object: a traffic signal. OHDM considers every OSM entity to be an object if it has either a name or a map feature.

<node id="xx" lat=".." lon="..">
  <tag k='name' v='here_is_a_name' />
</node>

Why? Users felt the urge to give this point a name. That is an object.

Each OSM entity can carry more tags.

<node id='xx' ... lat='52.4413316' lon='13.5936747'>
    <tag k='crossing' v='traffic_signals' />
    <tag k='highway' v='traffic_signals' />
    <tag k='tactile_paving' v='yes' />
    <tag k='traffic_signals:direction' v='backward' />
    <tag k='traffic_signals:sound' v='yes' />
</node>

Here are the rules:

  • OSM entities are objects if they have a name tag or a map feature
  • all other tags are considered to describe the relationship between object and geometry.
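The two rules can be sketched in a few lines of Python. This is only an illustration, not the converter's code: the set MAP_FEATURE_KEYS is a hypothetical, incomplete stand-in for whatever list of map features the converter really uses, and the node id is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical, incomplete set of map feature keys (an assumption for
# this sketch; the real converter has its own list).
MAP_FEATURE_KEYS = {"highway", "building", "amenity", "shop"}

def classify(osm_xml):
    """Split an OSM entity into (is_object, name, map_features, other_tags)."""
    element = ET.fromstring(osm_xml.strip())
    tags = {t.get("k"): t.get("v") for t in element.findall("tag")}
    name = tags.pop("name", None)
    map_features = {k: v for k, v in tags.items() if k in MAP_FEATURE_KEYS}
    other_tags = {k: v for k, v in tags.items() if k not in MAP_FEATURE_KEYS}
    # Rule 1: a name tag or a map feature makes the entity an object.
    is_object = name is not None or bool(map_features)
    # Rule 2: everything else describes the object-geometry relationship.
    return is_object, name, map_features, other_tags

node = """
<node id='1' lat='52.4413316' lon='13.5936747'>
    <tag k='crossing' v='traffic_signals' />
    <tag k='highway' v='traffic_signals' />
    <tag k='tactile_paving' v='yes' />
</node>
"""
is_object, name, map_features, other_tags = classify(node)
```

With the assumed key set, the node is an object because of its highway tag, while crossing and tactile_paving end up among the other tags.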

OSM to OHDM conversion in a nutshell

Objects and geometries form an n-m relation in OHDM. Objects are stored in the object table in the OHDM database. Geometries are stored in one of three tables (points, lines, polygons). The object_geometry table establishes the relation between both. Besides the foreign keys to object and geometry, it contains timestamps (valid_since, valid_until) and a classid (a number describing the map feature (subclass)). It also keeps all other tags (those other than name and map feature).
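To make the table layout more concrete, here is a rough sketch using an in-memory SQLite database in place of PostGIS, with geometries kept as WKT text. Only the table names and the columns valid_since, valid_until and classid come from this page; every other column name (id, name, wkt, object_id, geometry_id, geometry_type, tags) is an assumption, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE object   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE points   (id INTEGER PRIMARY KEY, wkt TEXT);
CREATE TABLE lines    (id INTEGER PRIMARY KEY, wkt TEXT);
CREATE TABLE polygons (id INTEGER PRIMARY KEY, wkt TEXT);
CREATE TABLE object_geometry (
    id            INTEGER PRIMARY KEY,
    object_id     INTEGER NOT NULL REFERENCES object(id),
    geometry_id   INTEGER NOT NULL,  -- row in points, lines or polygons
    geometry_type TEXT NOT NULL,     -- 'point', 'line' or 'polygon'
    classid       INTEGER,           -- map feature (sub)class
    tags          TEXT,              -- all remaining tags
    valid_since   TEXT,
    valid_until   TEXT
);
""")

# Invented sample data: one shop that moved from one point to another.
conn.execute("INSERT INTO object (id, name) VALUES (1, 'shop')")
conn.executemany(
    "INSERT INTO points (id, wkt) VALUES (?, ?)",
    [(1, "POINT(13.59 52.44)"), (2, "POINT(13.60 52.45)")],
)
conn.executemany(
    """INSERT INTO object_geometry
       (object_id, geometry_id, geometry_type, valid_since, valid_until)
       VALUES (?, ?, 'point', ?, ?)""",
    [(1, 1, "2010-01-01", "2015-06-30"), (1, 2, "2015-07-01", "2020-10-05")],
)

# The same object is linked to two geometries over consecutive periods.
rows = conn.execute("""
    SELECT o.name, p.wkt
    FROM object_geometry og
    JOIN object o ON o.id = og.object_id
    JOIN points p ON p.id = og.geometry_id AND og.geometry_type = 'point'
    ORDER BY og.valid_since
""").fetchall()
```

The object and both points each exist once; only the two object_geometry rows record when the shop was where.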

The conversion process steps through all three intermediate database tables (nodes, ways, relations) and produces OHDM entries.

Nodes

Nodes without a name tag or a map feature description are ignored in the first round. Nodes with such a tag describe an object. In this case, a new entry in the points table is created.

A new entry in the object table is created if the node has a name tag. Otherwise, the pseudo-object called OHDM is used for the following step.

A new entry in object_geometry is created. It references the newly added point and the object. All further tags and the classid are copied from the intermediate database. The valid_since column is set to the timestamp found in OSM. (It can strongly be assumed that this object existed before it was described in OSM. But what date should be taken instead? That's part of other OHDM sub projects.)

The valid_until column is set to now, the date on which the conversion is performed. We strongly assume that the world will still exist afterwards. We are on the safe side with that assumption, though.
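The node step described above might look roughly like the following sketch. It is not the converter's actual code: the dict layout of a node, the plain Python lists standing in for the tables, and the sample values (timestamp, classid 42) are all assumptions made for illustration.

```python
from datetime import date

PSEUDO_OBJECT = "OHDM"  # shared stand-in for objects without a name

def convert_node(node, objects, points, object_geometry, today=None):
    """Convert one intermediate node; return False if it is skipped."""
    if node.get("name") is None and node.get("classid") is None:
        return False  # pure geometry: ignored in the first round
    points.append((node["lon"], node["lat"]))
    if node.get("name") is not None:
        objects.append(node["name"])  # named nodes get their own object
        object_ref = node["name"]
    else:
        object_ref = PSEUDO_OBJECT
    object_geometry.append({
        "object": object_ref,
        "point": len(points) - 1,
        "classid": node.get("classid"),
        "tags": dict(node.get("tags", {})),
        "valid_since": node["timestamp"],  # timestamp found in OSM
        "valid_until": (today or date.today()).isoformat(),  # conversion date
    })
    return True

objects, points, object_geometry = [], [], []
# Hypothetical sample nodes: an unnamed traffic signal and a bare point.
signal = {"lat": 52.4413316, "lon": 13.5936747, "timestamp": "2019-03-01",
          "name": None, "classid": 42, "tags": {"tactile_paving": "yes"}}
bare = {"lat": 52.0, "lon": 13.0, "timestamp": "2019-03-01",
        "name": None, "classid": None, "tags": {}}
converted = [
    convert_node(n, objects, points, object_geometry, today=date(2020, 10, 5))
    for n in (signal, bare)
]
```

The unnamed traffic signal is attached to the pseudo-object, and the bare point produces no entries at all.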

Ways

Each way is transformed into a linestring or a polygon. This can be very time-consuming in a large database because all node information must be selected from the nodes table (which had more than 5 billion entries in 2019). The linestring is stored in the lines table.
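As a minimal sketch of this transformation, the function below builds a WKT string from a way's node references. The rule used to pick between linestring and polygon (a closed way with at least four node references becomes a polygon) is a simplifying assumption for this example and not necessarily the converter's rule; the function name and input layout are hypothetical as well.

```python
def way_to_wkt(node_refs, coords):
    """Build WKT for a way.

    node_refs: ordered list of node ids
    coords:    mapping from node id to (lon, lat), i.e. the node lookup
               that is expensive on a large nodes table
    """
    ring = ", ".join(f"{coords[n][0]} {coords[n][1]}" for n in node_refs)
    # Assumption: closed ways are treated as polygons, all others as lines.
    closed = len(node_refs) >= 4 and node_refs[0] == node_refs[-1]
    return f"POLYGON(({ring}))" if closed else f"LINESTRING({ring})"

coords = {1: (0, 0), 2: (1, 0), 3: (1, 1)}
open_way = way_to_wkt([1, 2, 3], coords)
closed_way = way_to_wkt([1, 2, 3, 1], coords)
```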

Object creation is identical to node processing. The object_geometry table is filled in the same way.

Relations

From a data structure perspective, relations can describe two very different things: a multipolygon or an association of objects.

Relations are used to describe polygons with one or more holes. OHDM transforms such geometries and creates a new entry in the polygons table.
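A polygon with holes can be written as WKT with one outer ring followed by the inner rings. The sketch below assumes the relation members have already been resolved into closed coordinate rings labelled with the OSM roles outer and inner, and it handles exactly one outer ring; both the input layout and the function names are assumptions for illustration.

```python
def ring_wkt(ring):
    """Format one closed ring of (lon, lat) pairs as a WKT ring."""
    return "(" + ", ".join(f"{x} {y}" for x, y in ring) + ")"

def multipolygon_relation_to_wkt(members):
    """members: list of (role, ring) pairs with role 'outer' or 'inner'."""
    outers = [ring for role, ring in members if role == "outer"]
    inners = [ring for role, ring in members if role == "inner"]
    if len(outers) != 1:
        raise ValueError("this sketch handles exactly one outer ring")
    # WKT lists the outer ring first, then each hole.
    return "POLYGON(" + ", ".join(ring_wkt(r) for r in [outers[0], *inners]) + ")"

members = [
    ("outer", [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]),
    ("inner", [(1, 1), (2, 1), (2, 2), (1, 2), (1, 1)]),
]
wkt = multipolygon_relation_to_wkt(members)
```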

Relations can also describe associations. In an association, a thing is made up of a number of other things. Moreover, these other things can have an identity of their own. (A campus is a campus. A university can have more than one campus. The university would be an aggregation. Each campus still has its own identity.)

In that case, an object (e.g. university) refers to other objects (e.g. campus1, campus2). This association is described in the object_geometry table: an object (target_object) is referenced instead of a geometry. The rest is identical to nodes and ways.
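The university example can be sketched with a few object_geometry rows held in a Python list. Only the name target_object comes from this page; the other keys and the helper function are hypothetical, and the sketch assumes that associations do not form cycles.

```python
# Each row links an object either to a geometry or to another object.
object_geometry = [
    {"object": "university", "geometry": None, "target_object": "campus1"},
    {"object": "university", "geometry": None, "target_object": "campus2"},
    {"object": "campus1", "geometry": ("polygon", 1), "target_object": None},
    {"object": "campus2", "geometry": ("polygon", 2), "target_object": None},
]

def geometries_of(obj, rows):
    """Collect an object's geometries, following target_object references."""
    found = []
    for row in rows:
        if row["object"] != obj:
            continue
        if row["target_object"] is not None:
            # Association: descend into the referenced object.
            found.extend(geometries_of(row["target_object"], rows))
        else:
            found.append(row["geometry"])
    return found

university_geometries = geometries_of("university", object_geometry)
```

The university has no geometry of its own here; its shape is the set of polygons reached through its campuses, each of which remains an object in its own right.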

OHDM transforms OSM data into PostGIS geometries and extracts object information and the relationship between both. That's the basis for producing time-aware maps.