Create a rake task for populating the database #282

gravitystorm · 2013-05-14T08:58:01Z

Currently the installation notes spend about 90 lines explaining how to populate the database from an extract. A substantial portion of this concerns osmosis options and resetting sequences.

This could be simplified by creating a rake task that calls osmosis with the configured database parameters and takes care of the sequences.

tomhughes · 2013-05-14T09:32:32Z

Well, populating the database should very much be relegated to an "optional extras" section of the documentation - most people just looking to install the rails code in order to develop it should never need to do any sort of import.

So I don't object to this idea, but really the easiest solution for most people is not to bother, which also means you don't have to explain how to install osmosis, making things even simpler!

gravitystorm · 2013-05-14T09:35:42Z

Oh indeed - I'm just aiming to reduce the length of the 'optional extras' section too!

pnorman · 2013-07-15T18:45:18Z

Once the schema is set up it is "just" one osmosis command to load in data. Perhaps just redirect to the osmosis install instructions and give the command?

gravitystorm · 2013-07-15T18:56:06Z

The original instructions have lots of resetting of sequences - does osmosis now handle all of these?

pnorman · 2013-07-15T19:00:16Z

That's for loading data then editing it locally, which is more specialized. If all you need to do is load some data to have something to view, it's pretty easy. Perhaps have a rake task to set sequences to MAX(id) or whatever is appropriate, and run the task after loading data?
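
A minimal sketch of what such a sequence-resetting step might generate, in Ruby since a rake task would be Ruby. The table list and the `<table>_id_seq` naming are assumptions rather than something confirmed in this thread, and `sequence_reset_sql` is a hypothetical helper; it only builds the SQL and does not touch a database.

```ruby
# Hypothetical table list; the real schema may need more (or different) entries.
TABLES = %w[current_nodes current_ways current_relations changesets users].freeze

# Build one setval statement per table, assuming each table has an integer
# "id" column backed by a sequence named "<table>_id_seq" (an assumption).
# COALESCE keeps the statement valid when the table is empty.
def sequence_reset_sql(tables = TABLES)
  tables.map do |table|
    "SELECT setval('#{table}_id_seq', COALESCE((SELECT MAX(id) FROM #{table}), 1));"
  end
end

# Print the statements so they can be reviewed before being run by hand.
puts sequence_reset_sql
```

The output could be piped into psql after an import, once the table and sequence names have been checked against the actual schema.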

danstowell · 2013-09-17T07:20:26Z

Just to note that I'm currently hoping to try some tweaks to the /browse/ pages, which do require a data import, or else there are no items to browse. (Especially since, by default, the search box's results come from remote.)

At present there is no documentation (since the doc has gone from wiki into the CONFIG.md, which tells me literally nothing except this issue number), and no script. We are stuck! Please, something... (even if it's just a link to deprecated wiki instructions...)

tomhughes · 2013-09-17T07:34:34Z

@danstowell I usually just jump into the editor and draw something when I want to test something like that - that way I can be sure it will have the tags I need for whatever I'm testing.

But sure, I'm not going to refuse a patch that provides (sensible) instructions on how to load some data. I believe it is a non-trivial thing to do however, so they will need to be well written.

pnorman · 2013-09-17T07:42:56Z

> Just to note that I'm currently hoping to try some tweaks to the /browse/ pages, which do require a data import, or else there are no items to browse. (Especially since, by default, the search box's results come from remote.)

Are you wanting to import data, or import then edit data? The first is relatively easy - create the database then use the osmosis --write-apidb task to import data (keeping in mind that --write-apidb is amazingly slow). To do it properly (not giving superuser to every postgres account involved) can be a bit annoying.

If the latter, then you need to muck about with sequences so when you add a node it gets a suitable ID.

Having set up an apidb database a few times for testing, I can say that it was never properly documented anywhere.

danstowell · 2013-09-19T07:40:01Z

I only need to import data. (I need fairly rich data, so drawing a few ways myself isn't quite enough for me.)

Just for the record (for anyone drafting a rake task!), here's what I've done which seems to have given me a usable read copy of the apidb:

1. Of course I had first done the database setup described in INSTALL.md, so I already had the databases and the postgres user set up.
2. Ensure the database is empty (so that the import doesn't clash on IDs):
   osmosis --truncate-apidb host="localhost" database="openstreetmap" user="openstreetmap" password="" validateSchemaVersion="no"
3. Load the dump into the db (for Greater London, this took about 20 minutes on my laptop):
   osmosis --read-pbf file=greater-london-latest.osm.pbf --write-apidb host="localhost" database="openstreetmap" user="openstreetmap" password="" validateSchemaVersion="no"
4. Optional step: postgres tools to reclaim disk space:
   sudo -u postgres vacuumdb -afvz
   sudo -u postgres reindexdb

This does not reset the sequences, as mentioned above in this thread. However it doesn't seem to need any weird privileges in the postgres user accounts. But then, since I don't know the database internals I don't know if the sequences issue is the only quirk of the (development) database I've landed myself with.
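
For anyone drafting the rake task, here is a rough sketch of how the two osmosis invocations above might be assembled from database settings. The `DB` hash and the helper names are hypothetical; a real task would presumably read the settings from the application's database configuration. Only the osmosis options already shown in this comment are used.

```ruby
# Connection settings as they might appear in a database config (hypothetical values).
DB = { host: "localhost", database: "openstreetmap", user: "openstreetmap", password: "" }.freeze

# Turn the settings into the key=value arguments used by the apidb tasks.
def apidb_args(db)
  %i[host database user password].map { |key| "#{key}=#{db.fetch(key)}" } + ["validateSchemaVersion=no"]
end

# Argument list for emptying the database before an import.
def truncate_command(db)
  ["osmosis", "--truncate-apidb", *apidb_args(db)]
end

# Argument list for loading a PBF extract into the database.
def import_command(db, pbf_path)
  ["osmosis", "--read-pbf", "file=#{pbf_path}", "--write-apidb", *apidb_args(db)]
end

# Display only; to execute, pass the array to system(*command) so no shell quoting is needed.
puts truncate_command(DB).join(" ")
puts import_command(DB, "greater-london-latest.osm.pbf").join(" ")
```

Keeping each command as an array rather than a single string means a password or file name containing spaces would not need escaping when handed to `system`.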

hanchao · 2016-09-19T09:26:14Z

@danstowell Error occurred. Any help?

[hanchao@map ~]$ osmosis --truncate-apidb host="******" database="******" user="******" password="******" validateSchemaVersion="no"
Sep 19, 2016 5:14:58 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Osmosis Version 0.45
Sep 19, 2016 5:14:58 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Preparing pipeline.
Sep 19, 2016 5:14:58 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Launching pipeline execution.
Sep 19, 2016 5:14:58 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline executing, waiting for completion.
Sep 19, 2016 5:14:59 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline complete.
Sep 19, 2016 5:14:59 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Total execution time: 674 milliseconds.

[hanchao@map ~]$ osmosis --read-pbf file=china-latest.osm.pbf --write-apidb host="******" database="******" user="******" password="****" validateSchemaVersion="no"
Sep 19, 2016 5:15:11 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Osmosis Version 0.45
Sep 19, 2016 5:15:11 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Preparing pipeline.
Sep 19, 2016 5:15:11 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Launching pipeline execution.
Sep 19, 2016 5:15:11 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline executing, waiting for completion.
Sep 19, 2016 5:16:18 PM org.openstreetmap.osmosis.core.pipeline.common.ActiveTaskManager waitForCompletion
SEVERE: Thread for task 1-read-pbf failed
org.openstreetmap.osmosis.core.OsmosisRuntimeException: Unable to insert user with id 3642735 into the database.
        at org.openstreetmap.osmosis.apidb.v0_6.impl.UserManager.insertUser(UserManager.java:143)
        at org.openstreetmap.osmosis.apidb.v0_6.impl.UserManager.addOrUpdateUser(UserManager.java:191)
        at org.openstreetmap.osmosis.apidb.v0_6.ApidbWriter.process(ApidbWriter.java:1098)
        at crosby.binary.osmosis.OsmosisBinaryParser.parseDense(OsmosisBinaryParser.java:138)
        at org.openstreetmap.osmosis.osmbinary.BinaryParser.parse(BinaryParser.java:124)
        at org.openstreetmap.osmosis.osmbinary.BinaryParser.handleBlock(BinaryParser.java:68)
        at org.openstreetmap.osmosis.osmbinary.file.FileBlock.process(FileBlock.java:135)
        at org.openstreetmap.osmosis.osmbinary.file.BlockInputStream.process(BlockInputStream.java:34)
        at crosby.binary.osmosis.OsmosisReader.run(OsmosisReader.java:45)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "users_display_name_idx"
  Detail: Key (display_name)=(Nodes&Roads) already exists.
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2103)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1836)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:512)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
        at org.openstreetmap.osmosis.apidb.v0_6.impl.UserManager.insertUser(UserManager.java:140)
        ... 9 more

Sep 19, 2016 5:16:18 PM org.openstreetmap.osmosis.core.Osmosis main
SEVERE: Execution aborted.
org.openstreetmap.osmosis.core.OsmosisRuntimeException: One or more tasks failed.
        at org.openstreetmap.osmosis.core.pipeline.common.Pipeline.waitForCompletion(Pipeline.java:146)
        at org.openstreetmap.osmosis.core.Osmosis.run(Osmosis.java:92)
        at org.openstreetmap.osmosis.core.Osmosis.main(Osmosis.java:37)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launchStandard(Launcher.java:330)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:238)
        at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
        at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
        at org.codehaus.classworlds.Launcher.main(Launcher.java:47)

mojodna · 2016-09-19T17:29:54Z

@hanchao this is the root cause: drolbr/Overpass-API#257 (even if it didn't come from Overpass).

Here's a shell script that will remap user IDs after the extract has been converted to XML: https://github.com/AmericanRedCross/posm-replay-tool/blob/c25d8e1f62af44e0664190723eba51eef7b93adc/remap-userid.sh

My notes in https://github.com/AmericanRedCross/posm-replay-tool/blob/28deac4193859b71af34d845f013194a37871870/LOCAL.md#initialization explain what's going on and steps to fix it (locally).

hanchao · 2016-09-20T03:06:57Z

@mojodna Thanks. This helped greatly.

User IDs 3642735 and 1708958 have the same name (Nodes&Roads) in the extract:

<node id="1546831475" version="2" timestamp="2016-08-31T17:30:44Z" uid="3642735" user="Nodes&amp;Roads" changeset="41831739" lat="34.6455964" lon="110.3129604"/>
<node id="1582851239" version="3" timestamp="2016-02-18T20:41:39Z" uid="1708958" user="Nodes&amp;Roads" changeset="37297109" lat="38.9261979" lon="113.8765729"/>

https://www.openstreetmap.org/api/0.6/node/1582851239

<node id="1582851239" visible="true" version="3" changeset="37297109" timestamp="2016-02-18T20:41:39Z" user="georhoko" uid="1708958" lat="38.9261979" lon="113.8765729"/>

The name in china-latest.osm.pbf has not been updated:
http://download.geofabrik.de/asia/china-latest.osm.pbf
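
A small sketch of how an extract could be checked for this before importing, assuming the (uid, display name) pairs have already been pulled out of the file by some other means. `conflicting_names` is a hypothetical helper, and the sample pairs are the two nodes quoted above.

```ruby
# (uid, display name) pairs as they might be pulled from an extract.
PAIRS = [[3642735, "Nodes&Roads"], [1708958, "Nodes&Roads"], [1708958, "Nodes&Roads"]].freeze

# Return a hash of display names that appear with more than one distinct uid,
# mapped to the sorted list of uids using that name.
def conflicting_names(pairs)
  pairs.uniq
       .group_by { |_uid, name| name }
       .transform_values { |rows| rows.map(&:first).sort }
       .select { |_name, uids| uids.size > 1 }
end

# Any name listed here would trip the unique constraint on display_name.
p conflicting_names(PAIRS)
```

An empty result would suggest the display_name constraint should not be hit by this particular problem, though it says nothing about other integrity issues in the data.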

tomhughes · 2017-01-23T10:21:23Z

@maxdeepfield Your comment has nothing to do with this thread and in any case is not appropriate here - please ask on the dev or rails-dev mailing lists, or on #osm-dev on IRC if you need help using the code.

maxdeepfield · 2019-11-25T16:45:43Z

Reading "about 90 lines explaining how to populate the database" is not a very hard task if those lines exist and work. Any progress on this?

maxdeepfield · 2019-12-02T20:58:10Z

Anyway, osmosis --write-apidb works fine with Geofabrik extracts. After the import I can see and edit data, then read the result back via osmosis --read-apidb-current, which together with --write-pbf gives a new dataset.

It also works with XML/OSM files exported directly from the openstreetmap website via bounding box, so it is super easy to test on "real" data.

If the process completes without errors, do I need to care about anything else?
In what situation would I actually find myself not "able to edit the data I have loaded"?

prusswan · 2020-06-19T13:25:35Z

I managed to import my region extract from Geofabrik after resolving various data issues. These can be resolved with some database knowledge and a background understanding of the issues in #1988, #2449 and #2543 (e.g. data extracts need to satisfy some integrity conditions).

Anyway, I feel the real issue now may be that users don't realize that "populating the database from an OSM extract" requires the data being imported to meet certain conditions for the step to "just work" (and possibly for the proposed rake task to run without getting tripped up). Would it help to have another section covering the kind of data preparation which may be required?

victorovento · 2022-08-16T20:44:46Z

Well, I think this script will never be written.

gravitystorm mentioned this issue Mar 23, 2014

sequences #723

Closed

This comment was marked as off-topic.

adam21 mentioned this issue Oct 19, 2015

Undefined method when trying to edit #1072

Closed

mmd-osm mentioned this issue Oct 1, 2016

Two user_id use the same display_name in the .pbf data file #1303

Closed

zerebubuth mentioned this issue May 20, 2017

Adding osm extract #1545

Closed

gravitystorm added the dx Developer Experience label Dec 18, 2019

saerdnaer added a commit to saerdnaer/openstreetmap-website that referenced this issue Sep 25, 2022

add some notes about importing data

b594bfc

relates to openstreetmap#282

saerdnaer mentioned this issue Sep 25, 2022

add some notes about importing data to INSTALL.md #3719

Closed
