
duplicate key error in slim mode #16

Closed
milovanderlinden opened this issue Apr 8, 2013 · 10 comments

@milovanderlinden

I am merging two planet extracts from planet.openstreetmap.nl: the planets for Aruba and Curacao. Although both islands are pretty far apart, they share one common relation: the nautical boundary.

During import in regular mode, all goes well. But in slim mode, I am getting errors:

--2013-04-08 15:43:51-- http://planet.openstreetmap.nl/aruba/planet-aruba-130408.osm.pbf
Resolving planet.openstreetmap.nl (planet.openstreetmap.nl)... 2a00:d10:101::13:1, 93.186.179.161
Connecting to planet.openstreetmap.nl (planet.openstreetmap.nl)|2a00:d10:101::13:1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1069130 (1.0M)
Saving to: ‘planet-aruba-130408.osm.pbf’

100%[===============================================================================================================================================================================>] 1,069,130 1.83MB/s in 0.6s

2013-04-08 15:43:52 (1.83 MB/s) - ‘planet-aruba-130408.osm.pbf’ saved [1069130/1069130]

osm2pgsql SVN version 0.81.0 (64bit id space)

Using projection SRS 4326 (Latlong)
Setting up table: planet_osm_point
NOTICE: table "planet_osm_point_tmp" does not exist, skipping
Setting up table: planet_osm_line
NOTICE: table "planet_osm_line_tmp" does not exist, skipping
Setting up table: planet_osm_polygon
NOTICE: table "planet_osm_polygon_tmp" does not exist, skipping
Setting up table: planet_osm_roads
NOTICE: table "planet_osm_roads_tmp" does not exist, skipping
Allocating memory for dense node cache
Allocating dense node cache in one big chunk
Allocating memory for sparse node cache
Sharing dense sparse
Node-cache: cache=800MB, maxblocks=102401*8192, allocation method=11
Mid: pgsql, scale=10000000 cache=800
Setting up table: planet_osm_nodes
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "planet_osm_nodes_pkey" for table "planet_osm_nodes"
Setting up table: planet_osm_ways
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "planet_osm_ways_pkey" for table "planet_osm_ways"
Setting up table: planet_osm_rels
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "planet_osm_rels_pkey" for table "planet_osm_rels"

Reading in file: planet-aruba-130408.osm.pbf
Processing: Node(112k 112.2k/s) Way(8k 4.40k/s) Relation(50 50.00/s) parse time: 3s

Node stats: total(112174), max(2170360891) in 1s
Way stats: total(8792), max(196359215) in 2s
Relation stats: total(51), max(2323309) in 0s
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads

Going over pending ways...
4885 ways are pending

Using 1 helper-processes
Helper process 0 out of 1 initialised
Process 0 finished processing 4885 ways in 4 sec

All child processes exited

4885 Pending ways took 4s at a rate of 1221.25/s
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads

Going over pending relations...
0 relations are pending

Using 1 helper-processes
Process 0 finished processing 0 relations in 0 sec

All child processes exited

node cache: stored: 112174(100.00%), storage efficiency: 50.92% (dense blocks: 18, sparse nodes: 100935), hit rate: 100.00%
Sorting data and creating indexes for planet_osm_roads
Sorting data and creating indexes for planet_osm_point
Sorting data and creating indexes for planet_osm_polygon
Sorting data and creating indexes for planet_osm_line
Stopping table: planet_osm_nodes
Stopping table: planet_osm_ways
Building index on table: planet_osm_ways (fastupdate=off)
Stopping table: planet_osm_rels
Building index on table: planet_osm_rels (fastupdate=off)
Stopped table: planet_osm_nodes in 0s
Stopped table: planet_osm_rels in 0s
Analyzing planet_osm_roads finished
Analyzing planet_osm_line finished
Analyzing planet_osm_polygon finished
Analyzing planet_osm_point finished
Stopped table: planet_osm_ways in 0s
Copying planet_osm_roads to cluster by geometry finished
Creating geometry index on planet_osm_roads
Creating osm_id index on planet_osm_roads
Copying planet_osm_line to cluster by geometry finished
Creating geometry index on planet_osm_line
Creating indexes on planet_osm_roads finished
Creating osm_id index on planet_osm_line
All indexes on planet_osm_roads created in 2s
Completed planet_osm_roads
Copying planet_osm_polygon to cluster by geometry finished
Creating geometry index on planet_osm_polygon
Creating osm_id index on planet_osm_polygon
Creating indexes on planet_osm_line finished
All indexes on planet_osm_line created in 2s
Completed planet_osm_line
Creating indexes on planet_osm_polygon finished
Copying planet_osm_point to cluster by geometry finished
Creating geometry index on planet_osm_point
All indexes on planet_osm_polygon created in 2s
Completed planet_osm_polygon
Creating osm_id index on planet_osm_point
Creating indexes on planet_osm_point finished
All indexes on planet_osm_point created in 4s
Completed planet_osm_point

Osm2pgsql took 15s overall

Reading in file: planet-curacao-130408.osm.pbf
Processing: Node(34k 34.7k/s) Way(5k 5.02k/s) Relation(30 30.00/s) parse time: 1s

Node stats: total(34735), max(2229592357) in 1s
Way stats: total(5020), max(213219417) in 0s
Relation stats: total(33), max(2676830) in 0s
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
COPY_END for COPY planet_osm_rels FROM STDIN;
failed: ERROR: duplicate key value violates unique constraint "planet_osm_rels_pkey"
DETAIL: Key (id)=(2323309) already exists.
CONTEXT: COPY planet_osm_rels, line 30: "2323309 1 354 {268396336,202476411,202476412,86298925,202487355,202487361,202487362,86297905,8629810..."

@milovanderlinden

By the way: this issue is with a self-compiled osm2pgsql built from git master. My system is Debian wheezy.

@apmon

apmon commented Apr 8, 2013

This is a well-known issue and one that is not likely to be addressed any time soon.

While the rendering tables allow multiple geometries per OSM feature/id and therefore don't have a unique constraint on osm_id, the slim tables are used for update processing and thus need to do lookups based on osm_id, for which the id has to be unique.

For performance reasons osm2pgsql uses the COPY command, which batches up inserts into a single command. As such, if there is a unique constraint (duplicate key) violation, the whole batch fails. Because osm2pgsql does stream-based processing, it does not have the ability to go back and reprocess that batch of nodes/ways/relations to properly deal with the duplicate entry. Therefore, it cannot recover from such a duplicate key failure and terminates.
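The failure mode is easy to reproduce outside of osm2pgsql. A minimal sketch, assuming a local PostgreSQL server and a hypothetical database named gis; the temporary table only mimics the primary key that planet_osm_rels has on id:

psql -d gis <<'SQL'
-- scratch table with the same kind of primary key as planet_osm_rels
CREATE TEMP TABLE rels_demo (id bigint PRIMARY KEY);
-- the second row repeats the id, so the whole COPY statement fails and is
-- rolled back; the first row is lost along with it
COPY rels_demo (id) FROM STDIN;
2323309
2323309
\.
SQL

Recovering would mean re-sending the batch without the offending row, which is exactly the replay machinery osm2pgsql does not have.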

Osm2pgsql therefore doesn't support importing multiple files into one database in slim mode if they contain duplicate OSM features. Instead, I would recommend merging the two files prior to importing with osm2pgsql, for example with osmosis, which supports de-duplication on merge.
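For the two extracts from this issue, such a merge could look like the following sketch (standard osmosis --read-pbf/--merge/--write-pbf tasks; the output file name is only an example):

osmosis --read-pbf planet-aruba-130408.osm.pbf --read-pbf planet-curacao-130408.osm.pbf --merge --write-pbf merged.osm.pbf

The resulting merged.osm.pbf then contains the shared relation only once and can be imported in slim mode in a single run.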

@pnorman

pnorman commented Sep 12, 2014

@apmon, can we close this, as importing multiple overlapping files in slim mode isn't something osm2pgsql does?

@apmon

apmon commented Sep 16, 2014

Yes, closing this, as it is not likely that we will support overlapping files in slim mode anytime soon. Given that the duplicate key errors happen somewhere deep in the COPY operations, osm2pgsql would have to catch them and then replay all statements in the failed COPY operation. However, there is currently no mechanism to cache/buffer those statements for replay, as everything is done in a memoryless streaming process.

@apmon closed this as completed Sep 16, 2014
@mirabilos

@apmon can you point out some howto for:

• Import several extracts once (e.g. Belgium plus Nordrhein-Westfalen plus Rheinland-Pfalz plus Luxembourg)
• Keeping these up to date

Thanks! I’m trying to dig through the information here, on the OSM wiki, on the switch2osm page, and in the Osmosis documentation, but I can’t find anything comprehensible to me for doing just that. I don’t think I’ve got the disc space needed for a full Europe extract right now.

By the way… what are the disc space requirements for those extracts from Geofabrik? I’m using up about 8 GiB for the entire program installation plus an imported Belgium extract already…

@iqqmuT

iqqmuT commented Apr 3, 2015

Use osmconvert to merge multiple pbf files:

  1. Download pbf files
  2. Convert each pbf file to o5m file: osmconvert extract1.osm.pbf -o=extract1.o5m
  3. Merge o5m files into one pbf file: osmconvert extract1.o5m extract2.o5m -o=merged.osm.pbf
  4. Now you can import merged.osm.pbf with osm2pgsql in slim mode
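Applied to the two extracts from this issue, the whole sequence might look like this sketch (the database name gis and the --create/--slim flags are only example choices for the final import):

osmconvert planet-aruba-130408.osm.pbf -o=aruba.o5m
osmconvert planet-curacao-130408.osm.pbf -o=curacao.o5m
osmconvert aruba.o5m curacao.o5m -o=merged.osm.pbf
osm2pgsql --create --slim -d gis merged.osm.pbf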

@mirabilos

Ah, nice. Does that also work for the diffs?

@ReckyXXX

You can import the merged file as @iqqmuT says, but that will drop the already loaded map data in the PostgreSQL database and start a new import process.

On the other hand, you can subtract the already existing data from the new region file using:

osmconvert new_region.pbf -o=new_region.o5m
osmconvert already_loaded_map.pbf -o=already_loaded_map.o5m
osmconvert new_region.o5m --subtract already_loaded_map.o5m -o=new_region_cleaned.o5m
osmconvert new_region_cleaned.o5m -o=new_region_cleaned.pbf

And then you can just append the new file to PostgreSQL using:
osm2pgsql --append ... new_region_cleaned.pbf
without duplicate key errors.

@mirabilos

mirabilos commented Feb 20, 2017 via email

@ReckyXXX

@mirabilos :

This doesn’t scale up to, say, five regions.

You can merge all new regions together into one file and then add it to PostgreSQL using my method.
The idea is that you don't need to merge the already loaded map file with them and wait for it all to load anew.
Subtracting with osmconvert is not very time-consuming, and using it with osm2pgsql --append instead of --create can save a lot of time.
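A rough sketch of that combined workflow, assuming two hypothetical new extracts (region1, region2), an original import made from already_loaded_map.pbf, and an example database named gis that was imported with --slim:

# convert the new extracts and merge them into one file
osmconvert region1.osm.pbf -o=region1.o5m
osmconvert region2.osm.pbf -o=region2.o5m
osmconvert region1.o5m region2.o5m -o=new_regions.o5m
# subtract everything that is already in the database
osmconvert already_loaded_map.pbf -o=already_loaded_map.o5m
osmconvert new_regions.o5m --subtract already_loaded_map.o5m -o=new_regions_cleaned.o5m
osmconvert new_regions_cleaned.o5m -o=new_regions_cleaned.pbf
# append only the cleaned data to the existing slim import
osm2pgsql --append --slim -d gis new_regions_cleaned.pbf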
