diff --git a/docs/flex.md b/docs/flex.md index f9e1246e5..48e066b7d 100644 --- a/docs/flex.md +++ b/docs/flex.md @@ -36,14 +36,14 @@ The following functions are defined: * `osm2pgsql.define_table(options)`: Define a table. This is the more flexible function behind all the other `define_*_table()` functions. It gives you more control than the more convenient other functions. -* `osm2pgsql.mark_way(id)`: Mark the OSM way with the specified id. This way - will be processed (again) in stage 2. You are expected to define one or more of the following functions: -* `osm2pgsql.process_node()`: Called for each node. -* `osm2pgsql.process_way()`: Called for each way. -* `osm2pgsql.process_relation()`: Called for each relation. +* `osm2pgsql.process_node()`: Called for each new or changed node. +* `osm2pgsql.process_way()`: Called for each new or changed way. +* `osm2pgsql.process_relation()`: Called for each new or changed relation. +* `osm2pgsql.select_relation_members()`: Called for each deleted or added + relation. See below for more details. Osm2pgsql also provides some additional functions in the [lua-lib.md](Lua helper library). @@ -76,7 +76,7 @@ stored as is, relation ids will be stored as negative numbers. With the `osm2pgsql.define_table()` function you can also define tables that * don't have any ids, but those tables will never be updated by osm2pgsql * take *any OSM object*, in this case the type of object is stored in an - additional column. + additional `char(1)` column. * are in a specific PostgresSQL tablespace (set option `data_tablespace`) or that get their indexes created in a specific tablespace (set option `index_tablespace`). @@ -242,25 +242,72 @@ a default transformation. These are the defaults: ## Stages -Osm2pgsql processes the data in up to two stages. You can mark ways in stage 1 -for processing in stage 2 by calling `osm2pgsql.mark_way(id)`. If you don't -mark any ways, nothing will be done in stage 2. +When processing OSM data, osm2pgsql reads the input file(s) in order, nodes +first, then ways, then relations. This means that when the ways are read and +processed, osm2pgsql can't know yet whether a way is in a relation (or in +several). But for some use cases we need to know in which relations a way is +and what the tags of these relations are or the roles of those member ways. +The typical case are relations of type `route` (bus routes etc.) where we +might want to render the `name` or `ref` from the route relation onto the +way geometry. + +The osm2pgsql flex backend supports this use case by adding an additional +"reprocessing" step. Osm2pgsql will call the Lua function +`osm2pgsql.select_relation_members()` for each added, modified, or deleted +relation. Your job is to figure out which way members in that relation might +need the information from the relation to be rendered correctly and return +those ids in a Lua table with the only field 'ways'. This is usually done with +a function like this: -You can look at `osm2pgsql.stage` to see in which stage you are. +``` +function osm2pgsql.select_relation_members(relation) + if relation.tags.type == 'route' then + return { ways = osm2pgsql.way_member_ids(relation) } + end +end +``` + +Instead of using the helper function `osm2pgsql.way_member_ids()` which +returns the ids of all way members, you can write your own code, for instance +if you want to check the roles. + +Note that `select_relation_members()` is called for deleted relations and for +the old version of a modified relation as well as for new relations and the +new version of a modified relation. This is needed, for instance, to correctly +mark member ways of deleted relations, because they need to be updated, too. +The decision whether a way is to be marked or not can only be based on the +tags of the relation and/or the roles of the members. If you take other +information into account, updates might not work correctly. -In stage 1 you can only look at each OSM object on its own. You can see -its id and tags (and possibly timestamp, changeset, user, etc.), but you don't -know how this OSM objects relates to other OSM objects (for instance whether a -way you are looking at is a member in a relation). If this is enough to decide -in which database table(s) and with what data an OSM object should end up in, -then you can process the OSM object in stage 1. If, on the other hand, you -need some extra information, you have to defer processing to the second stage. +In addition you have to store whatever information you need about the relation +in your `process_relation()` function in a global variable. + +After all relations are processed, osm2pgsql will reprocess all marked ways by +calling the `process_way()` function for them again. This time around you have +the information from the relation in the global variable and can use it. + +If you don't mark any ways, nothing will be done in this reprocessing stage. + +(It is currently not possible to mark nodes or relations. This might or might +not be added in future versions of osm2pgsql.) + +You can look at `osm2pgsql.stage` to see in which stage you are. You want to do all the processing you can in stage 1, because it is faster -and there is less memory overhead. For most use cases, stage 1 is enough. If -it is not, use stage 1 to store information about OSM objects you will need -in stage 2 in some global variable. In stage 2 you can read this information -again and use it to decide where and how to store the data in the database. +and there is less memory overhead. For most use cases, stage 1 is enough. + +Processing in two stages can add quite a bit of overhead. Because this feature +is new, there isn't much operational experience with it. So be a bit careful +when you are experimenting and watch memory and disk space consumption and +any extra time you are using. Keep in mind that: + +* All data stored in stage 1 for use in stage 2 in your Lua script will use + main memory. +* Keeping track of way ids marked in stage 1 needs some memory. +* To do the extra processing in stage 2, time is needed to get objects out + of the object store and reprocess them. +* Osm2pgsql will create an id index on all way tables to look up ways that + need to be deleted and re-created in stage 2. ## Command line options diff --git a/docs/lua-lib.md b/docs/lua-lib.md index 865e9df11..65901d239 100644 --- a/docs/lua-lib.md +++ b/docs/lua-lib.md @@ -36,6 +36,22 @@ if object.tags.highway then end ``` +## `way_member_ids` + +Synopsis: `osm2pgsql.way_member_ids(RELATION)` + +Description: Return an array table with the ids of all way members of RELATION. + +Example: + +``` +function osm2pgsql.select_relation_members(relation) + if relation.tags.type == 'route' then + return { ways = osm2pgsql.way_member_ids(relation) } + end +end +``` + ## `make_clean_tags_func` Synopsis: `osm2pgsql.make_clean_tags_func(KEYS)` diff --git a/flex-config/route-relations.lua b/flex-config/route-relations.lua index cfd251c5d..7a83e1ed6 100644 --- a/flex-config/route-relations.lua +++ b/flex-config/route-relations.lua @@ -22,8 +22,13 @@ tables.routes = osm2pgsql.define_relation_table('routes', { { column = 'tags', type = 'hstore' }, }) --- This will be used to store lists of relation ids queryable by way id -by_way_id = {} +-- This will be used to store information about relations queryable by member +-- way id. It is a table of tables. The outer table is indexed by the way id, +-- the inner table indexed by the relation id. This way even if the information +-- about a relation is added twice, it will be in there only once. It is +-- always good to write your osm2pgsql Lua code in an idempotent way, i.e. +-- it can be called any number of times and will lead to the same result. +local w2r = {} function clean_tags(tags) tags.odbl = nil @@ -40,54 +45,59 @@ function osm2pgsql.process_way(object) return end - -- In stage 1: Mark all remaining ways so we will see them again in stage 2 - if osm2pgsql.stage == 1 then - osm2pgsql.mark_way(object.id) - return - end - - -- We are now in stage 2 - clean_tags(object.tags) - -- Data we will store in the "highways" table always has the way tags + -- Data we will store in the "highways" table always has the tags from + -- the way local row = { tags = object.tags } - -- If there is any data from relations, add it in - local d = by_way_id[object.id] + -- If there is any data from parent relations, add it in + local d = w2r[object.id] if d then - table.sort(d.refs) - table.sort(d.ids) - row.rel_refs = table.concat(d.refs, ',') - row.rel_ids = '{' .. table.concat(d.ids, ',') .. '}' + local refs = {} + local ids = {} + for rel_id, rel_ref in pairs(d) do + refs[#refs + 1] = rel_ref + ids[#ids + 1] = rel_id + end + table.sort(refs) + table.sort(ids) + row.rel_refs = table.concat(refs, ',') + row.rel_ids = '{' .. table.concat(ids, ',') .. '}' end tables.highways:add_row(row) end -function osm2pgsql.process_relation(object) +-- This function is called for every added, modified, or deleted relation. +-- Its only job is to return the ids of all member ways of the specified +-- relation we want to see in stage 2 again. It MUST NOT store any information +-- about the relation! +function osm2pgsql.select_relation_members(relation) -- Only interested in relations with type=route, route=road and a ref + if relation.tags.type == 'route' and relation.tags.route == 'road' and relation.tags.ref then + return { ways = osm2pgsql.way_member_ids(relation) } + end +end + +-- The process_relation() function should store all information about way +-- members that might be needed in stage 2. +function osm2pgsql.process_relation(object) if object.tags.type == 'route' and object.tags.route == 'road' and object.tags.ref then tables.routes:add_row({ - tags = object.tags, - geom = { create = 'line' } + tags = object.tags }) - -- Go through all the members and store relation ids and refs so it + -- Go through all the members and store relation ids and refs so they -- can be found by the way id. for _, member in ipairs(object.members) do if member.type == 'w' then - if not by_way_id[member.ref] then - by_way_id[member.ref] = { - ids = {}, - refs = {} - } + if not w2r[member.ref] then + w2r[member.ref] = {} end - local d = by_way_id[member.ref] - table.insert(d.ids, object.id) - table.insert(d.refs, object.tags.ref) + w2r[member.ref][object.id] = object.tags.ref end end end diff --git a/src/init.lua b/src/init.lua index f0f4cf4d0..eb6316921 100644 --- a/src/init.lua +++ b/src/init.lua @@ -29,8 +29,14 @@ function osm2pgsql.define_area_table(_name, _columns, _options) return _define_table_impl('area', _name, _columns, _options) end -function osm2pgsql.mark_way(id) - return osm2pgsql.mark('w', id) +function osm2pgsql.way_member_ids(relation) + local ids = {} + for _, member in ipairs(relation.members) do + if member.type == 'w' then + ids[#ids + 1] = member.ref + end + end + return ids end function osm2pgsql.clamp(value, low, high) diff --git a/src/middle-pgsql.cpp b/src/middle-pgsql.cpp index 41db506f5..9a9007d78 100644 --- a/src/middle-pgsql.cpp +++ b/src/middle-pgsql.cpp @@ -580,8 +580,6 @@ void middle_pgsql_t::commit() m_db_copy.sync(); // release the copy thread and its query connection m_copy_thread->finish(); - - m_db_connection.close(); } void middle_pgsql_t::flush() { m_db_copy.sync(); } diff --git a/src/osmdata.cpp b/src/osmdata.cpp index fa5cb55f4..48b6177eb 100644 --- a/src/osmdata.cpp +++ b/src/osmdata.cpp @@ -114,6 +114,10 @@ void osmdata_t::relation_modify(osmium::Relation const &rel) const { auto &slim = slim_middle(); + for (auto &out : m_outs) { + out->select_relation_members(rel.id()); + } + slim.relation_delete(rel.id()); slim.relation_set(rel); @@ -223,6 +227,18 @@ class multithreaded_processor process_queue("relation", std::move(list), &output_t::pending_relation); } + /** + * Process all relations in the list in stage1c. + * + * \param list List of relation ids to work on. The list is moved into the + * function. + */ + void process_relations_stage1c(idlist_t &&list) + { + process_queue("relation", std::move(list), + &output_t::pending_relation_stage1c); + } + /** * Collect expiry tree information from all clones and merge it back * into the original outputs. @@ -371,27 +387,41 @@ progress_display_t osmdata_t::process_file(osmium::io::File const &file, return handler.progress(); } -void osmdata_t::process_stage1b() const +void osmdata_t::process_dependents() const { - if (m_dependency_manager->has_pending()) { - multithreaded_processor proc{m_conninfo, m_mid, m_outs, - (std::size_t)m_num_procs}; + multithreaded_processor proc{m_conninfo, m_mid, m_outs, + (std::size_t)m_num_procs}; + // stage 1b processing: process parents of changed objects + if (m_dependency_manager->has_pending()) { proc.process_ways(m_dependency_manager->get_pending_way_ids()); proc.process_relations( m_dependency_manager->get_pending_relation_ids()); proc.merge_expire_trees(); } + + // stage 1c processing: mark parent relations of marked objects as changed + for (auto &out : m_outs) { + for (auto const id : out->get_marked_way_ids()) { + m_dependency_manager->way_changed(id); + } + } + + // process parent relations of marked ways + if (m_dependency_manager->has_pending()) { + proc.process_relations_stage1c( + m_dependency_manager->get_pending_relation_ids()); + } } -void osmdata_t::process_stage2() const +void osmdata_t::reprocess_marked() const { for (auto &out : m_outs) { - out->stage2_proc(); + out->reprocess_marked(); } } -void osmdata_t::process_stage3() const +void osmdata_t::postprocess_database() const { // All the intensive parts of this are long-running PostgreSQL commands. // They will be run in a thread pool. @@ -432,10 +462,10 @@ void osmdata_t::stop() const } if (m_append) { - process_stage1b(); + process_dependents(); } - process_stage2(); + reprocess_marked(); - process_stage3(); + postprocess_database(); } diff --git a/src/osmdata.hpp b/src/osmdata.hpp index 62aea8744..4b2af8b2e 100644 --- a/src/osmdata.hpp +++ b/src/osmdata.hpp @@ -48,8 +48,8 @@ class osmdata_t osmium::Box const &bbox) const; /** - * Rest of the processing (stages 1b, 2, and 3). This is called once - * after process_file() was called for each input file. + * Rest of the processing (stages 1b, 1c, 2, and database postprocessing). + * This is called once after process_file() was called for each input file. */ void stop() const; @@ -68,21 +68,20 @@ class osmdata_t private: /** - * Run stage 1b processing: Process dependent objects. - * In append mode we need to process dependent objects that were marked - * earlier. + * Run stage 1b and stage 1c processing: Process dependent objects in + * append mode. */ - void process_stage1b() const; + void process_dependents() const; /** - * Run stage 2 processing: Process objects marked in stage 1 (if any). + * Run stage 2 processing: Reprocess objects marked in stage 1 (if any). */ - void process_stage2() const; + void reprocess_marked() const; /** - * Run stage 3 processing: Clustering and index creation. + * Run postprocessing on database: Clustering and index creation. */ - void process_stage3() const; + void postprocess_database() const; slim_middle_t &slim_middle() const noexcept; diff --git a/src/output-flex.cpp b/src/output-flex.cpp index e6598ded7..0153ce3ef 100644 --- a/src/output-flex.cpp +++ b/src/output-flex.cpp @@ -57,7 +57,6 @@ static std::mutex lua_mutex; } TRAMPOLINE(app_define_table, define_table) -TRAMPOLINE(app_mark, mark) TRAMPOLINE(app_get_bbox, get_bbox) TRAMPOLINE(table_name, name) TRAMPOLINE(table_schema, schema) @@ -512,22 +511,6 @@ void output_flex_t::write_row(table_connection_t *table_connection, copy_mgr->finish_line(); } -int output_flex_t::app_mark() -{ - char const *type_name = luaL_checkstring(lua_state(), 1); - if (!type_name) { - return 0; - } - - osmium::object_id_type const id = luaL_checkinteger(lua_state(), 2); - - if (type_name[0] == 'w') { - m_stage2_way_ids->set(id); - } - - return 0; -} - // Gets all way nodes from the middle the first time this is called. std::size_t output_flex_t::get_way_nodes() { @@ -798,6 +781,10 @@ int output_flex_t::table_tostring() int output_flex_t::table_add_row() { + if (m_disable_add_row) { + return 0; + } + if (m_calling_context != calling_context::process_node && m_calling_context != calling_context::process_way && m_calling_context != calling_context::process_relation) { @@ -828,10 +815,6 @@ int output_flex_t::table_add_row() throw std::runtime_error{ "Trying to add way to table '{}'"_format(table.name())}; } - if (m_in_stage2) { - delete_from_table(&table_connection, osmium::item_type::way, - m_context_way->id()); - } add_row(&table_connection, *m_context_way); } else if (m_context_relation) { if (!table.matches_type(osmium::item_type::relation)) { @@ -1020,8 +1003,6 @@ void output_flex_t::add_row(table_connection_t *table_connection, void output_flex_t::call_lua_function(prepared_lua_function_t func, osmium::OSMObject const &object) { - std::lock_guard guard{lua_mutex}; - m_calling_context = func.context(); lua_pushvalue(lua_state(), func.index()); // the function to call @@ -1039,6 +1020,13 @@ void output_flex_t::call_lua_function(prepared_lua_function_t func, m_calling_context = calling_context::main; } +void output_flex_t::get_mutex_and_call_lua_function( + prepared_lua_function_t func, osmium::OSMObject const &object) +{ + std::lock_guard guard{lua_mutex}; + call_lua_function(func, object); +} + void output_flex_t::pending_way(osmid_t id) { if (!m_process_way) { @@ -1055,33 +1043,130 @@ void output_flex_t::pending_way(osmid_t id) auto &way = m_buffer.get(0); m_context_way = &way; - call_lua_function(m_process_way, way); + get_mutex_and_call_lua_function(m_process_way, way); m_context_way = nullptr; m_num_way_nodes = std::numeric_limits::max(); m_buffer.clear(); } +void output_flex_t::select_relation_members(osmium::Relation const &relation) +{ + if (!m_select_relation_members) { + return; + } + + std::lock_guard guard{lua_mutex}; + + m_context_relation = &relation; + call_lua_function(m_select_relation_members, relation); + m_context_relation = nullptr; + + // If the function returned nil there is nothing to be marked. + if (lua_type(lua_state(), -1) == LUA_TNIL) { + lua_pop(lua_state(), 1); // return value (nil) + return; + } + + if (lua_type(lua_state(), -1) != LUA_TTABLE) { + throw std::runtime_error{"select_relation_members() returned something " + "other than nil or a table"}; + } + + // We have established that we have a table. Get the 'ways' field... + lua_getfield(lua_state(), -1, "ways"); + int const ltype = lua_type(lua_state(), -1); + + // No 'ways' field, that is okay, nothing to be marked. + if (ltype == LUA_TNIL) { + lua_pop(lua_state(), 2); // return value (a table), ways field (nil) + return; + } + + if (ltype != LUA_TTABLE) { + throw std::runtime_error{ + "Table returned from select_relation_members() contains 'ways' " + "field, but it isn't an array table"}; + } + + // Iterate over the 'ways' table to get all ids... + lua_pushnil(lua_state()); + while (lua_next(lua_state(), -2) != 0) { + if (!lua_isnumber(lua_state(), -2)) { + throw std::runtime_error{ + "Table returned from select_relation_members() contains 'ways' " + "field, but it isn't an array table"}; + } + + osmid_t const id = lua_tointeger(lua_state(), -1); + if (id == 0) { + throw std::runtime_error{ + "Table returned from select_relation_members() contains 'ways' " + "field, which must contain an array of non-zero integer way " + "ids"}; + } + + m_stage2_way_ids->set(id); + lua_pop(lua_state(), 1); // value pushed by lua_next() + } + + lua_pop(lua_state(), 2); // return value (a table), ways field (a table) +} + +void output_flex_t::select_relation_members(osmid_t id) +{ + if (!m_select_relation_members) { + return; + } + + if (!m_mid->relation_get(id, m_rels_buffer)) { + return; + } + + select_relation_members(m_rels_buffer.get(0)); + + m_rels_buffer.clear(); +} + void output_flex_t::pending_relation(osmid_t id) { - if (!m_process_relation) { + if (!m_process_relation && !m_select_relation_members) { return; } - // Try to fetch the relation from the DB - // Note that we cannot use the global buffer here because - // we cannot keep a reference to the relation and an autogrow buffer - // might be relocated when more data is added. if (!m_mid->relation_get(id, m_rels_buffer)) { return; } + auto const &relation = m_rels_buffer.get(0); + select_relation_members(relation); delete_from_tables(osmium::item_type::relation, id); + if (m_process_relation) { + m_context_relation = &relation; + get_mutex_and_call_lua_function(m_process_relation, relation); + m_context_relation = nullptr; + } + + m_rels_buffer.clear(); +} + +void output_flex_t::pending_relation_stage1c(osmid_t id) +{ + if (!m_process_relation) { + return; + } + + if (!m_mid->relation_get(id, m_rels_buffer)) { + return; + } auto const &relation = m_rels_buffer.get(0); + m_disable_add_row = true; m_context_relation = &relation; - call_lua_function(m_process_relation, relation); + get_mutex_and_call_lua_function(m_process_relation, relation); m_context_relation = nullptr; + m_disable_add_row = false; + m_rels_buffer.clear(); } @@ -1113,7 +1198,7 @@ void output_flex_t::node_add(osmium::Node const &node) } m_context_node = &node; - call_lua_function(m_process_node, node); + get_mutex_and_call_lua_function(m_process_node, node); m_context_node = nullptr; } @@ -1126,7 +1211,7 @@ void output_flex_t::way_add(osmium::Way *way) } m_context_way = way; - call_lua_function(m_process_way, *way); + get_mutex_and_call_lua_function(m_process_way, *way); m_context_way = nullptr; m_num_way_nodes = std::numeric_limits::max(); } @@ -1137,8 +1222,10 @@ void output_flex_t::relation_add(osmium::Relation const &relation) return; } + select_relation_members(relation); + m_context_relation = &relation; - call_lua_function(m_process_relation, relation); + get_mutex_and_call_lua_function(m_process_relation, relation); m_context_relation = nullptr; } @@ -1185,6 +1272,7 @@ void output_flex_t::way_delete(osmid_t osm_id) void output_flex_t::relation_delete(osmid_t osm_id) { + select_relation_members(osm_id); delete_from_tables(osmium::item_type::relation, osm_id); } @@ -1228,7 +1316,8 @@ output_flex_t::clone(std::shared_ptr const &mid, { return std::make_shared( mid, *get_options(), copy_thread, true, m_lua_state, m_process_node, - m_process_way, m_process_relation, m_tables, m_stage2_way_ids); + m_process_way, m_process_relation, m_select_relation_members, m_tables, + m_stage2_way_ids); } output_flex_t::output_flex_t( @@ -1237,6 +1326,7 @@ output_flex_t::output_flex_t( std::shared_ptr lua_state, prepared_lua_function_t process_node, prepared_lua_function_t process_way, prepared_lua_function_t process_relation, + prepared_lua_function_t select_relation_members, std::shared_ptr> tables, std::shared_ptr stage2_way_ids) : output_t(mid, o), m_tables(std::move(tables)), @@ -1246,7 +1336,8 @@ output_flex_t::output_flex_t( m_buffer(32768, osmium::memory::Buffer::auto_grow::yes), m_rels_buffer(1024, osmium::memory::Buffer::auto_grow::yes), m_process_node(process_node), m_process_way(process_way), - m_process_relation(process_relation) + m_process_relation(process_relation), + m_select_relation_members(select_relation_members) { assert(copy_thread); @@ -1284,7 +1375,6 @@ void output_flex_t::init_lua(std::string const &filename) luaX_add_table_func(lua_state(), "define_table", lua_trampoline_app_define_table); - luaX_add_table_func(lua_state(), "mark", lua_trampoline_app_mark); lua_setglobal(lua_state(), "osm2pgsql"); @@ -1335,25 +1425,42 @@ void output_flex_t::init_lua(std::string const &filename) // Check whether the process_* functions are available and store them on // the Lua stack for fast access later lua_getglobal(lua_state(), "osm2pgsql"); + m_process_node = prepared_lua_function_t{ lua_state(), calling_context::process_node, "process_node"}; m_process_way = prepared_lua_function_t{ lua_state(), calling_context::process_way, "process_way"}; m_process_relation = prepared_lua_function_t{ lua_state(), calling_context::process_relation, "process_relation"}; + m_select_relation_members = prepared_lua_function_t{ + lua_state(), calling_context::select_relation_members, + "select_relation_members", 1}; lua_remove(lua_state(), 1); // global "osm2pgsql" } -void output_flex_t::stage2_proc() +idset_t const &output_flex_t::get_marked_way_ids() +{ + if (m_stage2_way_ids->empty()) { + fmt::print(stderr, "Skipping stage 1c (no marked ways).\n"); + } else { + fmt::print(stderr, + "Entering stage 1c processing of {} ways...\n"_format( + m_stage2_way_ids->size())); + m_stage2_way_ids->sort_unique(); + } + + return *m_stage2_way_ids; +} + +void output_flex_t::reprocess_marked() { if (m_stage2_way_ids->empty()) { - fmt::print(stderr, "Skipping stage 2 (no marked ways).\n"); + fmt::print(stderr, "No marked ways (Skipping stage 2).\n"); return; } - fmt::print(stderr, "Entering stage 2...\n"); - m_in_stage2 = true; + fmt::print(stderr, "Reprocess marked ways (stage 2)...\n"); if (!m_options.append) { fmt::print(stderr, "Creating id indexes...\n"); @@ -1380,7 +1487,7 @@ void output_flex_t::stage2_proc() lua_setfield(lua_state(), -2, "stage"); lua_pop(lua_state(), 1); // osm2pgsql - fmt::print(stderr, "Entering stage 2 processing of {} ways...\n"_format( + fmt::print(stderr, "There are {} ways to reprocess...\n"_format( m_stage2_way_ids->size())); m_stage2_way_ids->sort_unique(); @@ -1390,8 +1497,10 @@ void output_flex_t::stage2_proc() continue; } auto &way = m_buffer.get(0); - way_add(&way); + way_modify(&way); } + + // We don't need these any more so can free the memory. m_stage2_way_ids->clear(); } diff --git a/src/output-flex.hpp b/src/output-flex.hpp index d2b51c9f6..bf7a47a7b 100644 --- a/src/output-flex.hpp +++ b/src/output-flex.hpp @@ -37,7 +37,8 @@ enum class calling_context main = 0, ///< In main context, i.e. the Lua script outside any callbacks process_node = 1, ///< In the process_node() callback process_way = 2, ///< In the process_way() callback - process_relation = 3 ///< In the process_relation() callback + process_relation = 3, ///< In the process_relation() callback + select_relation_members = 4 ///< In the select_relation_members() callback }; /** @@ -95,6 +96,7 @@ class output_flex_t : public output_t prepared_lua_function_t process_node = {}, prepared_lua_function_t process_way = {}, prepared_lua_function_t process_relation = {}, + prepared_lua_function_t select_relation_members = {}, std::shared_ptr> tables = std::make_shared>(), std::shared_ptr stage2_way_ids = std::make_shared()); @@ -115,10 +117,14 @@ class output_flex_t : public output_t void stop(thread_pool_t *pool) override; void sync() override; - void stage2_proc() override; + idset_t const &get_marked_way_ids() override; + void reprocess_marked() override; void pending_way(osmid_t id) override; void pending_relation(osmid_t id) override; + void pending_relation_stage1c(osmid_t id) override; + + void select_relation_members(osmid_t id) override; void node_add(osmium::Node const &node) override; void way_add(osmium::Way *way) override; @@ -135,7 +141,7 @@ class output_flex_t : public output_t void merge_expire_trees(output_t *other) override; int app_define_table(); - int app_mark(); + int app_mark_way(); int app_get_bbox(); int table_tostring(); @@ -146,6 +152,7 @@ class output_flex_t : public output_t private: void init_clone(); + void select_relation_members(osmium::Relation const &relation); /** * Call a Lua function that was "prepared" earlier with the OSMObject @@ -154,6 +161,10 @@ class output_flex_t : public output_t void call_lua_function(prepared_lua_function_t func, osmium::OSMObject const &object); + /// Aquire the lua_mutex and the call `call_lua_function()`. + void get_mutex_and_call_lua_function(prepared_lua_function_t func, + osmium::OSMObject const &object); + void init_lua(std::string const &filename); flex_table_t &create_flex_table(); @@ -192,16 +203,25 @@ class output_flex_t : public output_t std::shared_ptr> m_tables; std::vector m_table_connections; + // This is shared between all clones of the output and must only be + // accessed while protected using the lua_mutex. std::shared_ptr m_stage2_way_ids; std::shared_ptr m_copy_thread; + // This is shared between all clones of the output and must only be + // accessed while protected using the lua_mutex. std::shared_ptr m_lua_state; geom::osmium_builder_t m_builder; expire_tiles m_expire; osmium::memory::Buffer m_buffer; + + // This buffer is used for one relation at a time only. Members of the + // relation are stored in m_buffer instead. If we would store more objects + // in this buffer, it might autogrow which would invalidate the reference + // we take to the relation inside. osmium::memory::Buffer m_rels_buffer; osmium::Node const *m_context_node = nullptr; @@ -213,10 +233,15 @@ class output_flex_t : public output_t prepared_lua_function_t m_process_node; prepared_lua_function_t m_process_way; prepared_lua_function_t m_process_relation; + prepared_lua_function_t m_select_relation_members; calling_context m_calling_context = calling_context::main; - bool m_in_stage2 = false; + /** + * This is set before calling stage1c process_relation() to disable the + * add_row() command. + */ + bool m_disable_add_row = false; }; #endif // OSM2PGSQL_OUTPUT_FLEX_HPP diff --git a/src/output.hpp b/src/output.hpp index 5259ccdf8..b2fbd0c2b 100644 --- a/src/output.hpp +++ b/src/output.hpp @@ -10,6 +10,8 @@ * Associated tags: name, type etc. */ +#include + #include "options.hpp" #include "thread-pool.hpp" @@ -35,12 +37,21 @@ class output_t virtual void stop(thread_pool_t *pool) = 0; virtual void sync() = 0; - virtual void stage2_proc() {} + virtual osmium::index::IdSetSmall const &get_marked_way_ids() + { + static osmium::index::IdSetSmall const ids{}; + return ids; + } + + virtual void reprocess_marked() {} virtual bool need_forward_dependencies() const noexcept { return true; } virtual void pending_way(osmid_t id) = 0; virtual void pending_relation(osmid_t id) = 0; + virtual void pending_relation_stage1c(osmid_t) {} + + virtual void select_relation_members(osmid_t) {} virtual void node_add(osmium::Node const &node) = 0; virtual void way_add(osmium::Way *way) = 0; diff --git a/tests/data/test_output_flex_extra.lua b/tests/data/test_output_flex_extra.lua index 6b868ab55..1c0f37b38 100644 --- a/tests/data/test_output_flex_extra.lua +++ b/tests/data/test_output_flex_extra.lua @@ -21,52 +21,46 @@ tables.routes = osm2pgsql.define_table{ } } -local by_way_id = {} +local w2r = {} function osm2pgsql.process_way(object) - if osm2pgsql.stage == 1 then - osm2pgsql.mark_way(object.id) - return - end - local row = { tags = object.tags, geom = { create = 'line' } } - -- if there is any data from relations, add it in - local d = by_way_id[object.id] + local d = w2r[object.id] if d then - local keys = {} - for k,v in pairs(d.refs) do - keys[#keys + 1] = k + local refs = {} + for rel_id, rel_ref in pairs(d) do + refs[#refs + 1] = rel_ref end + table.sort(refs) - row.refs = table.concat(keys, ',') - -- row.rel_ids = '{' .. table.concat(d.ids, ',') .. '}' + row.refs = table.concat(refs, ',') end tables.highways:add_row(row) end +function osm2pgsql.select_relation_members(relation) + if relation.tags.type == 'route' then + return { ways = osm2pgsql.way_member_ids(relation) } + end +end + function osm2pgsql.process_relation(object) if object.tags.type ~= 'route' then return end local mlist = {} - for i, member in ipairs(object.members) do + for _, member in ipairs(object.members) do if member.type == 'w' then - osm2pgsql.mark_way(member.ref) - if not by_way_id[member.ref] then - by_way_id[member.ref] = { - ids = {}, - refs = {} - } + if not w2r[member.ref] then + w2r[member.ref] = {} end - local d = by_way_id[member.ref] - table.insert(d.ids, object.id) - d.refs[object.tags.ref] = 1 + w2r[member.ref][object.id] = object.tags.ref mlist[#mlist + 1] = member.ref end end diff --git a/tests/regression.py b/tests/regression.py index daf48aa16..54962b048 100644 --- a/tests/regression.py +++ b/tests/regression.py @@ -601,7 +601,7 @@ class TestPgsqlUpdate(BaseUpdateRunner, unittest.TestCase, class TestPgsqlUpdateParallel(BaseUpdateRunner, unittest.TestCase, PgsqlBaseTests): - extra_params = ['--slim', '--number-processes', '16'] + extra_params = ['--slim', '--number-processes', '15'] class TestPgsqlUpdateSmallCache(BaseUpdateRunner, unittest.TestCase, PgsqlBaseTests): diff --git a/tests/test-output-flex-extra.cpp b/tests/test-output-flex-extra.cpp index bfa7b95ce..3ec33a947 100644 --- a/tests/test-output-flex-extra.cpp +++ b/tests/test-output-flex-extra.cpp @@ -182,3 +182,348 @@ TEST_CASE("relation data on ways") CHECK(0 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11'")); CHECK(2 == conn.get_count("osm2pgsql_test_highways", "refs IS NULL")); } + +TEST_CASE("relation data on ways: delete or re-tag relation") +{ + testing::opt_t options = testing::opt_t() + .slim() + .flex("test_output_flex_extra.lua") + .srs(PROJ_LATLONG); + + // create database with three ways and a relation on two of them + REQUIRE_NOTHROW( + db.run_import(options, "n10 v1 dV x10.0 y10.0\n" + "n11 v1 dV x10.0 y10.2\n" + "n12 v1 dV x10.2 y10.2\n" + "n13 v1 dV x10.2 y10.0\n" + "n14 v1 dV x10.3 y10.0\n" + "n15 v1 dV x10.4 y10.0\n" + "w20 v1 dV Thighway=primary Nn10,n11,n12\n" + "w21 v1 dV Thighway=secondary Nn12,n13\n" + "w22 v1 dV Thighway=secondary Nn13,n14,n15\n" + "r30 v1 dV Ttype=route,ref=X11 Mw20@,w21@\n")); + + auto conn = db.db().connect(); + + CHECK(3 == conn.get_count("osm2pgsql_test_highways")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'primary'")); + CHECK(2 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'secondary'")); + + CHECK(2 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs IS NULL")); + + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '20,21'")); + + SECTION("delete relation") + { + REQUIRE_NOTHROW(db.run_import(options.append(), "r30 v2 dD\n")); + } + + SECTION("change tags on relation") + { + REQUIRE_NOTHROW(db.run_import(options.append(), + "r30 v2 dV Ttype=foo Mw20@,w21@\n")); + } + + CHECK(3 == conn.get_count("osm2pgsql_test_highways")); + CHECK(0 == conn.get_count("osm2pgsql_test_routes")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'primary'")); + CHECK(2 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'secondary'")); + + CHECK(0 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11'")); + CHECK(3 == conn.get_count("osm2pgsql_test_highways", "refs IS NULL")); + + CHECK(0 == conn.get_count("osm2pgsql_test_routes", "members = '20,21'")); +} + +TEST_CASE("relation data on ways: delete way in other relation") +{ + testing::opt_t options = testing::opt_t() + .slim() + .flex("test_output_flex_extra.lua") + .srs(PROJ_LATLONG); + + // create database with three ways and two relations on them + REQUIRE_NOTHROW( + db.run_import(options, "n10 v1 dV x10.0 y10.0\n" + "n11 v1 dV x10.0 y10.2\n" + "n12 v1 dV x10.2 y10.2\n" + "n13 v1 dV x10.2 y10.0\n" + "n14 v1 dV x10.3 y10.0\n" + "n15 v1 dV x10.4 y10.0\n" + "w20 v1 dV Thighway=primary Nn10,n11,n12\n" + "w21 v1 dV Thighway=secondary Nn12,n13\n" + "w22 v1 dV Thighway=secondary Nn13,n14,n15\n" + "r30 v1 dV Ttype=no-route Mw20@,w21@\n" + "r31 v1 dV Ttype=route,ref=X11 Mw21@,w22@\n")); + + auto conn = db.db().connect(); + + CHECK(3 == conn.get_count("osm2pgsql_test_highways")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'primary'")); + CHECK(2 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'secondary'")); + + CHECK(2 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs IS NULL")); + + CHECK(0 == conn.get_count("osm2pgsql_test_routes", "members = '20,21'")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '21,22'")); + + SECTION("change way node list") + { + REQUIRE_NOTHROW(db.run_import(options.append(), + "w20 v2 dV Thighway=primary Nn10,n11\n")); + } + + SECTION("change way tags") + { + REQUIRE_NOTHROW(db.run_import( + options.append(), + "w20 v2 dV Thighway=primary,name=foo Nn10,n11,n12\n")); + } + + SECTION("change way node") + { + REQUIRE_NOTHROW( + db.run_import(options.append(), "n10 v2 dV x11.0 y10.0\n")); + } + + CHECK(3 == conn.get_count("osm2pgsql_test_highways")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'primary'")); + CHECK(2 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'secondary'")); + + CHECK(2 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs IS NULL")); + + CHECK(0 == conn.get_count("osm2pgsql_test_routes", "members = '20,21'")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '21,22'")); +} + +TEST_CASE("relation data on ways: changing things in one relation should not change output") +{ + testing::opt_t options = testing::opt_t() + .slim() + .flex("test_output_flex_extra.lua") + .srs(PROJ_LATLONG); + + // create database with three ways and two relations on them + REQUIRE_NOTHROW( + db.run_import(options, "n10 v1 dV x10.0 y10.0\n" + "n11 v1 dV x10.0 y10.2\n" + "n12 v1 dV x10.2 y10.2\n" + "n13 v1 dV x10.2 y10.0\n" + "n14 v1 dV x10.3 y10.0\n" + "n15 v1 dV x10.4 y10.0\n" + "w20 v1 dV Thighway=primary Nn10,n11,n12\n" + "w21 v1 dV Thighway=secondary Nn12,n13\n" + "w22 v1 dV Thighway=secondary Nn13,n14,n15\n" + "r30 v1 dV Ttype=route,ref=Y11 Mw20@,w21@\n" + "r31 v1 dV Ttype=route,ref=X11 Mw21@,w22@\n")); + + auto conn = db.db().connect(); + + CHECK(3 == conn.get_count("osm2pgsql_test_highways")); + CHECK(2 == conn.get_count("osm2pgsql_test_routes")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'primary'")); + CHECK(2 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'secondary'")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'Y11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11,Y11'")); + + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '20,21'")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '21,22'")); + + SECTION("new version of relation") + { + REQUIRE_NOTHROW(db.run_import( + options.append(), "r30 v2 dV Ttype=route,ref=Y11 Mw20@,w21@\n")); + } + + SECTION("change way node list") + { + REQUIRE_NOTHROW(db.run_import(options.append(), + "w20 v2 dV Thighway=primary Nn10,n11\n")); + } + + SECTION("change way tags") + { + REQUIRE_NOTHROW(db.run_import( + options.append(), + "w20 v2 dV Thighway=primary,name=foo Nn10,n11,n12\n")); + } + + SECTION("change way node") + { + REQUIRE_NOTHROW( + db.run_import(options.append(), "n10 v2 dV x11.0 y10.0\n")); + } + + CHECK(3 == conn.get_count("osm2pgsql_test_highways")); + CHECK(2 == conn.get_count("osm2pgsql_test_routes")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'primary'")); + CHECK(2 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'secondary'")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'Y11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11,Y11'")); + + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '20,21'")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '21,22'")); +} + +TEST_CASE("relation data on ways: change relation (two rels)") +{ + testing::opt_t options = testing::opt_t() + .slim() + .flex("test_output_flex_extra.lua") + .srs(PROJ_LATLONG); + + // create database with three ways and two relations on them + REQUIRE_NOTHROW( + db.run_import(options, "n10 v1 dV x10.0 y10.0\n" + "n11 v1 dV x10.0 y10.2\n" + "n12 v1 dV x10.2 y10.2\n" + "n13 v1 dV x10.2 y10.0\n" + "n14 v1 dV x10.3 y10.0\n" + "n15 v1 dV x10.4 y10.0\n" + "w20 v1 dV Thighway=primary Nn10,n11,n12\n" + "w21 v1 dV Thighway=secondary Nn12,n13\n" + "w22 v1 dV Thighway=secondary Nn13,n14,n15\n" + "r30 v1 dV Ttype=route,ref=Y11 Mw20@,w21@\n" + "r31 v1 dV Ttype=route,ref=X11 Mw21@,w22@\n")); + + auto conn = db.db().connect(); + + CHECK(3 == conn.get_count("osm2pgsql_test_highways")); + CHECK(2 == conn.get_count("osm2pgsql_test_routes")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'primary'")); + CHECK(2 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'secondary'")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'Y11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11,Y11'")); + + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '20,21'")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '21,22'")); + + REQUIRE_NOTHROW(db.run_import( + options.append(), "r30 v2 dV Ttype=route,ref=Z11 Mw20@,w21@\n")); + + CHECK(3 == conn.get_count("osm2pgsql_test_highways")); + CHECK(2 == conn.get_count("osm2pgsql_test_routes")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'primary'")); + CHECK(2 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'secondary'")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'Z11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11,Z11'")); + + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '20,21'")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '21,22'")); +} + +TEST_CASE("relation data on ways: change relation (three rels)") +{ + testing::opt_t options = testing::opt_t() + .slim() + .flex("test_output_flex_extra.lua") + .srs(PROJ_LATLONG); + + // create database with three ways and two relations on them + REQUIRE_NOTHROW( + db.run_import(options, "n10 v1 dV x10.0 y10.0\n" + "n11 v1 dV x10.0 y10.2\n" + "n12 v1 dV x10.2 y10.2\n" + "n13 v1 dV x10.2 y10.0\n" + "n14 v1 dV x10.3 y10.0\n" + "n15 v1 dV x10.4 y10.0\n" + "w20 v1 dV Thighway=primary Nn10,n11,n12\n" + "w21 v1 dV Thighway=secondary Nn12,n13\n" + "w22 v1 dV Thighway=secondary Nn13,n14,n15\n" + "r30 v1 dV Ttype=route,ref=Y11 Mw20@,w21@\n" + "r31 v1 dV Ttype=route,ref=X11 Mw21@,w22@\n" + "r32 v1 dV Ttype=route,ref=Z11 Mw22@\n")); + + auto conn = db.db().connect(); + + CHECK(3 == conn.get_count("osm2pgsql_test_highways")); + CHECK(3 == conn.get_count("osm2pgsql_test_routes")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'primary'")); + CHECK(2 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'secondary'")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'Y11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11,Y11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11,Z11'")); + + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '20,21'")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '21,22'")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '22'")); + + SECTION("change way node list") + { + REQUIRE_NOTHROW(db.run_import(options.append(), + "w20 v2 dV Thighway=primary Nn10,n11\n")); + } + + SECTION("change way tags") + { + REQUIRE_NOTHROW(db.run_import( + options.append(), + "w20 v2 dV Thighway=primary,name=foo Nn10,n11,n12\n")); + } + + SECTION("change way node") + { + REQUIRE_NOTHROW( + db.run_import(options.append(), "n10 v2 dV x11.0 y10.0\n")); + } + + + CHECK(3 == conn.get_count("osm2pgsql_test_highways")); + CHECK(3 == conn.get_count("osm2pgsql_test_routes")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'primary'")); + CHECK(2 == conn.get_count("osm2pgsql_test_highways", + "tags->'highway' = 'secondary'")); + + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'Y11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11,Y11'")); + CHECK(1 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11,Z11'")); + CHECK(0 == conn.get_count("osm2pgsql_test_highways", "refs = 'X11'")); + + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '20,21'")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '21,22'")); + CHECK(1 == conn.get_count("osm2pgsql_test_routes", "members = '22'")); +}