Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using osm2pgsql flex output #1

Open
joto opened this issue Nov 3, 2023 · 5 comments
Open

Using osm2pgsql flex output #1

joto opened this issue Nov 3, 2023 · 5 comments
Assignees
Labels
enhancement New feature or request

Comments

@joto
Copy link

joto commented Nov 3, 2023

With modern osm2pgsql you don't need all the postprocessing SQL files. All that is done in there can be done by creating a config file for osm2pgsql that creates the data in the database in the format you need from the beginning. This is not only easier, it should also be a lot faster.

Here is a config file that should get you one table called planet_osm with everything in it very similar to what you had before:

local dtable = osm2pgsql.define_table{
    name = 'planet_osm',
    ids = { type = 'any', id_column = 'osm_id', type_column = 'osm_type' },
    columns = {
        { column = 'tags',  type = 'hstore' }, -- or use jsonb!
        { column = 'way',  type = 'geometry', not_null = true },
    },
    indexes = {
        { column = 'tags', method = 'gin' },
        { column = 'way', method = 'gist' }
    }
}

local function process(object, geometry)
    dtable:insert({
        tags = object.tags,
        way = geometry
    })
end

function osm2pgsql.process_node(object)
    process(object, object:as_point())
end

function osm2pgsql.process_way(object)
    if object.is_closed then
        process(object, object:as_polygon())
    else
        process(object, object:as_linestring())
    end
end

function osm2pgsql.process_relation(object)
    local t = object.tags.type
    if t == 'multipolygon' or t == 'boundary' then
        process(object, object:as_multipolygon())
    end
end

To import use

osm2pgsql -d osm -O flex -S config.lua planet-latest.osm.pbf

As an added benefit, you can use osm2pgsql to keep this database up-to-date, something that wasn't possible before due to the postprocessing.

@loganwilliams
Copy link
Contributor

@joto Thank you for this, looks to be a huge improvement. I will implement this when I do the next database update.

@loganwilliams loganwilliams added the enhancement New feature or request label Jan 22, 2024
@loganwilliams
Copy link
Contributor

@joto This works great. To be able to update the database, I'd add --slim to the arguments and then run something like osm2pgsql-replication update --verbose -- --output flex --style config.lua?

@joto
Copy link
Author

joto commented Jan 26, 2024

@loganwilliams Yes. If you have version 1.9.0 (or above) of osm2pgsql you don't even need any extra parameters to the osm2pgsql-replication command, because osm2pgsql remembers the settings from the import.

@loganwilliams
Copy link
Contributor

Hm, I'm seeing significantly degraded performance when testing queries against the planet_osm table that this produces vs. the planet_osm view that I was using in earlier versions. I wonder why this could be? Query planner:

old:

 Limit  (cost=257805.96..401022.30 rows=100 width=80)
   ->  Unique  (cost=257805.96..10217070.66 rows=6954 width=80)
         ->  Nested Loop  (cost=257805.96..10216526.84 rows=54382 width=80)
               ->  Gather Merge  (cost=257805.28..265904.59 rows=69542 width=127)
                     Workers Planned: 2
                     ->  Sort  (cost=256805.26..256877.70 rows=28976 width=127)
                           Sort Key: (st_centroid(planet_osm_point.way)), ((planet_osm_point.tags -> 'name'::text)), (st_y(st_transform(st_centroid(planet_osm_point.way), 4326))), (st_x(st_transform(st_centroid(planet_osm_point.way), 4326)))
                           ->  Result  (cost=42.20..254657.76 rows=28976 width=127)
                                 ->  Parallel Append  (cost=42.20..254657.76 rows=28976 width=127)
                                       ->  Parallel Bitmap Heap Scan on planet_osm_point  (cost=4778.71..251871.97 rows=28706 width=125)
                                             Recheck Cond: ((way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry) AND (tags ? 'shop'::text))
                                             ->  Bitmap Index Scan on planet_osm_point_way_tags_idx  (cost=0.00..4761.49 rows=68894 width=0)
                                                   Index Cond: ((way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry) AND (tags ? 'shop'::text))
                                       ->  Parallel Bitmap Heap Scan on planet_osm_polygon  (cost=42.20..2243.33 rows=324 width=292)
                                             Recheck Cond: ((way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry) AND (tags ? 'shop'::text))
                                             ->  Bitmap Index Scan on planet_osm_polygon_way_tags_idx  (cost=0.00..42.06 rows=551 width=0)
                                                   Index Cond: ((way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry) AND (tags ? 'shop'::text))
                                       ->  Parallel Bitmap Heap Scan on planet_osm_line  (cost=9.54..397.59 rows=57 width=333)
                                             Recheck Cond: ((way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry) AND (tags ? 'shop'::text))
                                             ->  Bitmap Index Scan on planet_osm_line_way_tags_idx  (cost=0.00..9.52 rows=97 width=0)
                                                   Index Cond: ((way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry) AND (tags ? 'shop'::text))
               ->  Append  (cost=0.68..101.02 rows=3 width=196)
                     ->  Index Scan using planet_osm_point_way_tags_idx on planet_osm_point planet_osm_point_1  (cost=0.68..33.66 rows=1 width=32)
                           Index Cond: ((way && st_expand(planet_osm_point.way, '100'::double precision)) AND (way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry) AND (tags @> '"amenity"=>"place_of_worship"'::hstore))
                           Filter: st_dwithin(planet_osm_point.way, way, '100'::double precision)
                     ->  Index Scan using planet_osm_line_way_tags_idx on planet_osm_line planet_osm_line_1  (cost=0.68..33.67 rows=1 width=272)
                           Index Cond: ((way && st_expand(planet_osm_point.way, '100'::double precision)) AND (way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry) AND (tags @> '"amenity"=>"place_of_worship"'::hstore))
                           Filter: st_dwithin(planet_osm_point.way, way, '100'::double precision)
                     ->  Index Scan using planet_osm_polygon_way_tags_idx on planet_osm_polygon planet_osm_polygon_1  (cost=0.68..33.68 rows=1 width=223)
                           Index Cond: ((way && st_expand(planet_osm_point.way, '100'::double precision)) AND (way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry) AND (tags @> '"amenity"=>"place_of_worship"'::hstore))
                           Filter: st_dwithin(planet_osm_point.way, way, '100'::double precision)
 JIT:
   Functions: 35
   Options: Inlining false, Optimization false, Expressions true, Deforming true
(34 rows)

new:

 Limit  (cost=974297.51..1049612.74 rows=100 width=80)
   ->  Unique  (cost=974297.51..40719655.34 rows=52772 width=80)
         ->  Nested Loop  (cost=974297.51..40718909.38 rows=74596 width=80)
               ->  Gather Merge  (cost=973649.49..979795.65 rows=52772 width=327)
                     Workers Planned: 2
                     ->  Sort  (cost=972649.46..972704.43 rows=21988 width=327)
                           Sort Key: (st_centroid(planet_osm.way)), ((planet_osm.tags -> 'name'::text)), (st_y(st_transform(st_centroid(planet_osm.way), 4326))), (st_x(st_transform(st_centroid(planet_osm.way), 4326)))
                           ->  Parallel Bitmap Heap Scan on planet_osm  (cost=765309.86..971063.64 rows=21988 width=327)
                                 Recheck Cond: ((tags ? 'shop'::text) AND (way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry))
                                 ->  BitmapAnd  (cost=765309.86..765309.86 rows=52772 width=0)
                                       ->  Bitmap Index Scan on planet_osm_tags_idx  (cost=0.00..51232.34 rows=5710979 width=0)
                                             Index Cond: (tags ? 'shop'::text)
                                       ->  Bitmap Index Scan on planet_osm_way_idx  (cost=0.00..714050.88 rows=11234444 width=0)
                                             Index Cond: (way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry)
               ->  Bitmap Heap Scan on planet_osm planet_osm_1  (cost=648.02..677.04 rows=1 width=254)
                     Recheck Cond: ((way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry) AND (tags @> '"amenity"=>"place_of_worship"'::hstore))
                     Filter: st_dwithin(planet_osm.way, way, '100'::double precision)
                     ->  BitmapAnd  (cost=648.02..648.02 rows=1 width=0)
                           ->  Bitmap Index Scan on planet_osm_way_idx  (cost=0.00..75.60 rows=1123 width=0)
                                 Index Cond: ((way && st_expand(planet_osm.way, '100'::double precision)) AND (way && '0103000020110F00000100000005000000769F620324CD5FC1F38C35A62D1A5341769F620324CD5FC129634C157A8054413DA2C93BD5175DC129634C157A8054413DA2C93BD5175DC1F38C35A62D1A5341769F620324CD5FC1F38C35A62D1A5341'::geometry))
                           ->  Bitmap Index Scan on planet_osm_tags_idx  (cost=0.00..572.04 rows=56539 width=0)
                                 Index Cond: (tags @> '"amenity"=>"place_of_worship"'::hstore)
 JIT:
   Functions: 12
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(25 rows)

@joto
Copy link
Author

joto commented Feb 15, 2024

It is difficult to see from that EXPLAIN output alone what is happening here, without having access to the database. Looks like maybe some of the indexes behave differently for some reason or another.

I think the way forward would probably be to first go back to the original database layout with the three tables but use the flex output to generate it. And keep the original VIEW. That would already be a simplification compared to the situation before. Ideally the tables would then look the same and behave the same. Once that works reliably you can introduce more changes. But it is all a question of what kinds of queries you are doing exactly and what you want the database to do at that point.

@loganwilliams loganwilliams self-assigned this Mar 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants