Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for TWKB output #1085

Open
giggls opened this issue Mar 3, 2020 · 5 comments
Open

Support for TWKB output #1085

giggls opened this issue Mar 3, 2020 · 5 comments

Comments

@giggls
Copy link

giggls commented Mar 3, 2020

TWKB is a multi-purpose format for serializing vector geometry data into a byte buffer, with an emphasis on minimizing size of the buffer. The format is already supported by PostGIS.

See https://github.com/TWKB/Specification/blob/master/twkb.md for details.

It is somewhat similar to PBF format.

While this will impact the database import and query speed in some way it will help to reduce the size of production tables in a significant way.

Importing a small extract of the planet I was able to reduce the size of the standard production tables by 36.3 % using seven digits after the decimal point, which should be the original resolution of the imported OSM data.

Here are the results of calling pg_relation_size on them:

 twkb_osm_point | planet_osm_point                                                                                                          
----------------+-----------------
 77 MB          | 86 MB
(1 Zeile)

 twkb_osm_line | planet_osm_line 
---------------+----------------
 98 MB         | 168 MB
(1 Zeile)

 twkb_osm_polygon | planet_osm_polygon 
------------------+-------------------
 226 MB           | 377 MB
(1 Zeile)

 twkb_osm_roads | planet_osm_roads 
----------------+-----------------
 16 MB          | 24 MB
(1 Zeile)

All tables:

----------------+-----------------
 417 MB         | 655 MB

A TWKB output option might well make sense using the generic backend as well as the new flex backend.

@joto
Copy link
Collaborator

joto commented Jan 26, 2023

I have done a lot of refactoring on the code that creates the geometries for PostgreSQL/PostGIS in the last months. It should now be relatively straightforward to add in code for generating TWKB instead of the default EWKB. But it's not something high on anybody's priority list and it's unclear whether it is worth the extra complexity.

If somebody wants to tackle this: Look for the geom_to_ewkb() function and a a geom_to_twkb() which does basically the same thing only for TWKB. Then add some kind of flag so the user can choose which one is called from the Lua config. Please talk to us here first, though, before you spend a lot of time on it.

@giggls
Copy link
Author

giggls commented Jan 26, 2023

I have been using ST_AsTWKB function from PostGIS to generate the tables for my example. Maybe these can be re-used.

TWKB will only make sense for an epsg:4326 version of the database.

If I correctly understand the geom_to_ewkb function correctly, the geometry is passed in geom format which I assume would be floating-point, right?

I don't know if converting fixed-point raw osm to floating point and back to fixed-point (for twkb) will work well.

@joto
Copy link
Collaborator

joto commented Jan 26, 2023

Unfortunately you can not use ST_AsTWKB() with osm2pgsql, because we are using COPY to import the data which doesn't allow putting a function in there. You could do something with a trigger though. If PostGIS would simply allow "magic" conversion of the normal geometry type to twkb it would simply work. I don't see any reason why that wouldn't be possible but PostGIS doesn't allow it.

Yes, the geometry classes in osm2pgsql use double precision floating point values for the coordinates. That's what you have to work with.

@rustprooflabs
Copy link
Contributor

If PostGIS would simply allow "magic" conversion of the normal geometry type to twkb it would simply work.

Would it be possible to define a generated column via the Lua style? If we could create generated columns it could be defined with ST_AsTWKB(). Of course that duplicates the geometry in the database but I already do that in a variety of scenarios because the benefit is often worth that storage overhead.

@giggls
Copy link
Author

giggls commented Jan 26, 2023

The sole reason for using TWKB is to reduce table size thus adding an additional column would invalidate using TWKB at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants