Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#936 causes postgres to run out of memory #938

Closed
arichnad opened this issue Jun 22, 2020 · 30 comments
Closed

#936 causes postgres to run out of memory #938

arichnad opened this issue Jun 22, 2020 · 30 comments

Comments

@arichnad
Copy link
Contributor

@TomPohys @daliborjanak

Most recent commit ( c86f4a5, #936 ) causes postgres to run out of memory and terminate.

my .env has this at the end.

COPY_CONCURRENCY=30
UV_THREADPOOL_SIZE=8

MIN_ZOOM=0
MAX_ZOOM=14
BBOX=-125,25,-66,49

ran ./quickstart.sh north-america

postgres docker log says this:

LOG:  server process (PID 383) was terminated by signal 9: Killed
DETAIL:  Failed process was running: CREATE MATERIALIZED VIEW osm_building_block_gen1 AS
	SELECT *
	FROM osm_building_block_gen1();

I have 32gb of ram on this machine.

from quickstart:

Importing /sql/parallel/building.sql (md5 46d71248779b990be89fbf01ac27b833  -  254 lines) into Postgres...
Timing is on.
psql:/sql/parallel/building.sql:1: NOTICE:  Processing layer building
DO
Time: 1.028 ms
psql:/sql/parallel/building.sql:5: NOTICE:  schema "buildings" does not exist, skipping
DROP TRIGGER
Time: 0.185 ms
psql:/sql/parallel/building.sql:6: NOTICE:  trigger "trigger_flag" for relation "osm_building_polygon" does not exist, skipping
DROP TRIGGER
Time: 0.262 ms
CREATE FUNCTION
Time: 11.451 ms
psql:/sql/parallel/building.sql:78: NOTICE:  materialized view "osm_building_block_gen1" does not exist, skipping
DROP MATERIALIZED VIEW
Time: 0.213 ms
psql:psql:/sql/parallel/building.sql:/sql/parallel/water__waterway.sql:82:502:  server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
psql:/sql/parallel/building.sql:82: fatal: connection to server was lost
psql:psql:/sql/parallel/poi.sql:168: WARNING:  terminating connection because of crash of another server process

*** WARNING detected, aborting
                              make: *** [Makefile:357: import-sql] Error 1

If I run the same thing with the previous commit ( 6ac544f ), I do not get this error.

Please let me know if there's something I am doing wrong. I can test again with a different .env file if you would like.

@TomPohys
Copy link
Member

Hi @arichnad. Could you please check the configuration of PostgreSQL? It can be set up by PGtune

You can set up postgres.conf in a postgres container.

@arichnad
Copy link
Contributor Author

arichnad commented Jun 25, 2020

Hello, @TomPohys I'm using quickstart.sh. (sorry if this is too much detail, but it calls "make db-start-preloaded" which uses the "postgres" service in the "docker-compose.yml" file. That starts the "openmaptiles/postgis-preloaded:5.2" docker image, which has a default postgres configuration). So, any questions you have about the configuration of my postgres can be found in the openmaptiles/postgis-preloaded:5.2 image.

I asked postgres for its configuration though, using this: docker exec openmaptiles_postgres_1 psql -Uopenmaptiles --command='select * from pg_file_settings;'

And it replied with this:

                sourcefile                | sourceline | seqno |            name            |      setting       | applied | error 
------------------------------------------+------------+-------+----------------------------+--------------------+---------+-------
 /var/lib/postgresql/data/postgresql.conf |         59 |     1 | listen_addresses           | *                  | t       | 
 /var/lib/postgresql/data/postgresql.conf |         64 |     2 | max_connections            | 100                | t       | 
 /var/lib/postgresql/data/postgresql.conf |        113 |     3 | shared_buffers             | 128MB              | t       | 
 /var/lib/postgresql/data/postgresql.conf |        127 |     4 | dynamic_shared_memory_type | posix              | t       | 
 /var/lib/postgresql/data/postgresql.conf |        449 |     5 | log_timezone               | Etc/UTC            | t       | 
 /var/lib/postgresql/data/postgresql.conf |        553 |     6 | datestyle                  | iso, mdy           | t       | 
 /var/lib/postgresql/data/postgresql.conf |        555 |     7 | timezone                   | Etc/UTC            | t       | 
 /var/lib/postgresql/data/postgresql.conf |        568 |     8 | lc_messages                | en_US.utf8         | t       | 
 /var/lib/postgresql/data/postgresql.conf |        570 |     9 | lc_monetary                | en_US.utf8         | t       | 
 /var/lib/postgresql/data/postgresql.conf |        571 |    10 | lc_numeric                 | en_US.utf8         | t       | 
 /var/lib/postgresql/data/postgresql.conf |        572 |    11 | lc_time                    | en_US.utf8         | t       | 
 /var/lib/postgresql/data/postgresql.conf |        575 |    12 | default_text_search_config | pg_catalog.english | t       | 
(12 rows)

It seems like maybe you want me to change the postgis or postgis-preloaded section of https://github.com/openmaptiles/openmaptiles-tools repo to tune the postgres server differently? Is that true? I'm fine doing that, but I want to be sure that's what you think I should do.

To reiterate, the current config works on every commit up until the most recent one ( #936 ). 6ac544f works just fine.

@pka
Copy link

pka commented Jun 30, 2020

I saw the following error message in the logs which might be the same problem:

psql:/sql/parallel/building.sql:82: ERROR:  invalid memory alloc request size 1448987188
    CONTEXT:  PL/pgSQL function osm_building_block_gen1() line 6 at FOR over SELECT rows

@arichnad
Copy link
Contributor Author

arichnad commented Jul 1, 2020

@pka Nice, thanks for the info. Have you been able to test the previous version? 6ac544f? Does that version work for you? Thanks!

@TomPohys
Copy link
Member

TomPohys commented Jul 2, 2020

Hi @arichnad, I am sorry for the late answer. Could you please try to do it manually? Like from readme by step-by-step import and generating? There is not used postgis-preloaded.

The PostgreSQL configuration can be changed in pgdata/postgres.conf (can by done by ALTER SYSTEM). And you do not need to change openmaptiles-tools repo.

Thanks

@arichnad
Copy link
Contributor Author

arichnad commented Jul 2, 2020

@TomPohys Yes, thank you for your reply. I followed your process, and it did work!

However, you agree quickstart/postgis-preloaded (and/or #936) needs to be fixed, right? This is still a new bug introduced by the most recent commit? quickstart is mentioned at the top of the readme.

No, on a second try, it is still failing. I just ran:

make clean then make then make import-sql and got LOG: server process (PID 126) was terminated by signal 9: Killed

The PostgreSQL configuration can be changed in pgdata/postgres.conf (can by done by ALTER SYSTEM).

Should we describe this in the documentation then? I'm confused if this is something only I need to do. I've been using openmaptiles for years, and only now I've needed to do this. To be clear, before c86f4a5 this wasn't a problem.

@TomPohys
Copy link
Member

TomPohys commented Jul 6, 2020

Hi @arichnad, the PostgreSQL configuration is mostly OK with default values. It is very strange that PostgreSQL is Killed. When I tested the building aggregation it was fine and result was awesome. Now I am on slow internet, but tomorrow I will try to figure out, what cause the PostgreSQL kill.

@StephenAtty
Copy link

I just got hit by this after it spending 5 days building a materialized view (it only used 1 core)

ERROR: invalid memory alloc request size 1454083140
CONTEXT: PL/pgSQL function osm_building_block_gen1() line 6 at FOR over SELECT rows
STATEMENT: CREATE MATERIALIZED VIEW osm_building_block_gen1 AS
SELECT *
FROM osm_building_block_gen1();

@arichnad
Copy link
Contributor Author

arichnad commented Jul 7, 2020

@TomPohys what regions did you test against? Did you do the planet? Or certain continents? I will test against whatever you've tested against.

@pka @StephenAtty I'm not sure I've seen the "invalid memory alloc". Where do you see this? docker logs? output from quickstart? Thanks!

@StephenAtty can you test the old version? 6ac544f? Thanks.

@Duiesel
Copy link

Duiesel commented Jul 7, 2020

@pka same to me.
ERROR: invalid memory alloc request size 3186865628 CONTEXT: PL/pgSQL function osm_building_block_gen1() line 6 at FOR over SELECT rows STATEMENT: CREATE MATERIALIZED VIEW osm_building_block_gen1 AS SELECT * FROM osm_building_block_gen1();

@StephenAtty
Copy link

@arichnad - its in the docker logs.

I'll try that version but it will be a few days until I can tell you if the problem has gone away due to the length of time the materialized view takes to build.

@TomPohys
Copy link
Member

TomPohys commented Jul 8, 2020

Hi @arichnad, it was tested on the Czech Republic and Switzerland. And it was passing Github Actions testing.

@rakzcs
Copy link

rakzcs commented Jul 9, 2020

Same issue for me when trying to generate France. Postgres gets killed. Using 16 cores and 32gb of ram.

@Duiesel
Copy link

Duiesel commented Jul 9, 2020

@TomPohys what regions did you test against? Did you do the planet? Or certain continents? I will test against whatever you've tested against.

@pka @StephenAtty I'm not sure I've seen the "invalid memory alloc". Where do you see this? docker logs? output from quickstart? Thanks!

@StephenAtty can you test the old version? 6ac544f? Thanks.

Thanks @arichnad, switching to old version 6ac544fc9610b5 resolve the problem.

@arichnad
Copy link
Contributor Author

arichnad commented Jul 9, 2020

@TomPohys @daliborjanak I've tested with "czech-republic" and "switzerland". They do not fail on your commit. (./quickstart.sh switzerland and ./quickstart.sh czech-republic succeed for me on c86f4a5)

Can you test master against "north-america" or planet? I think they will fail for you. Based on what @Duiesel and I are reporting above, 6ac544f doesn't have this problem with larger areas.

@rakzcs Thanks. Can you test against the old version? 6ac544f? Thanks.

@rakzcs
Copy link

rakzcs commented Jul 10, 2020

psql:/sql/parallel/water__waterway.sql:psql:502:psql:psql:/sql/parallel/poi.sql:/sql/parallel/building.sql:/sql/parallel/transportation__transportation_name.sql:168:82:1129:   psql:server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
/sql/parallel/housenumber.sql:WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
psql: 23:/sql/parallel/building.sql:WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
82:  WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
fatal: psql:connection to server was lost
/sql/parallel/poi.sql:psql:psql:168:/sql/parallel/water__waterway.sql:/sql/parallel/transportation__transportation_name.sql: 502:1129: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
 server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
psql:psql:psql:/sql/parallel/poi.sql:/sql/parallel/water__waterway.sql:psql:/sql/parallel/housenumber.sql:168:502:/sql/parallel/transportation__transportation_name.sql:23:  1129: fatal: fatal:  connection to server was lost
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost
fatal: psql:connection to server was lost
/sql/parallel/housenumber.sql:23: fatal: connection to server was lost
Importing /sql/parallel/landuse.sql (md5 2cf933c5c89b3c66327819350aba4313  -  225 lines) into Postgres...
Importing /sql/parallel/mountain_peak.sql (md5 4b5628e9db5a07ca87be6238ca6efe0f  -  104 lines) into Postgres...
Importing /sql/parallel/park.sql (md5 941d7ddc94358f2d7bc30123696684b9  -  506 lines) into Postgres...
psql:psql:psql:   error: error: error: could not connect to server: FATAL:  the database system is in recovery mode
could not connect to server: FATAL:  the database system is in recovery mode
could not connect to server: FATAL:  the database system is in recovery mode
Makefile:357: recipe for target 'import-sql' failed
make: *** [import-sql] Error 123

This is what i get when trying to generate france

@pka
Copy link

pka commented Jul 13, 2020

@arichnad After returning from holidays, I can report that my Europe import finished successfully (after 10 days...) with 6ac544f. I also tested Switzerland first (what else?), which finishes with #936 without problems. I can't remember where I saw the error message, probably in the PostgreSQL container output.

@arichnad
Copy link
Contributor Author

arichnad commented Jul 14, 2020

@TomPohys or @daliborjanak Can you test master against "north-america" or "europe" or planet? I think they will fail for you.

Based on what @Duiesel @pka and I are reporting above, 6ac544f doesn't have this problem with these larger areas.

@TomPohys
Copy link
Member

TomPohys commented Jul 16, 2020

Hi @arichnad, I test building aggregation on France too. There is a problem with RAM consuming of this building aggregation. For the Czech Republic or Switzerland is 32GB OK. For France, I have to extend swapfile to 64GB. During import-sql was consumed all 32GB of RAM and around 32GB of swapfile.

There has to be a way how to optimize RAM consuming for building aggregation on zoom 13. I am thinking about some area splitting or something like that.

@StephenAtty
Copy link

Europe generated to level 14 in just under 9 days under 6ac544f

@arichnad
Copy link
Contributor Author

@TomPohys @frodrigo is there an easy way to disable the building-aggregation-zoom-level-13 feature? I'm trying to figure out how to do this: by modifying building.yaml and building.sql, but it seems difficult. Thoughts on how to do this? Thanks in advance!

@frodrigo
Copy link
Contributor

@arichnad yes, like this makina-maps@6327477

@TomPohys
Copy link
Member

Hi @arichnad. If you do not want buildings on zoom level 13, you can delete lines 84 - 96 in building.sql https://github.com/openmaptiles/openmaptiles/blob/master/layers/building/building.sql#L84-L96

after this change, you need to run again make import-sql to make the changes in PostgreSQL function layer_building

If you still want building on zoom level 13, you can uncomment generalized table in mapping.yaml https://github.com/openmaptiles/openmaptiles/blob/master/layers/building/mapping.yaml#L1-L6, which create a table building_polygon_gen1 which you can use for zoom level 13. Then change table in https://github.com/openmaptiles/openmaptiles/blob/master/layers/building/building.sql#L93 and then run again make import-osm (which create table building_polygon_gen1) and make import-sql (which modify function layer_building)

@arichnad
Copy link
Contributor Author

arichnad commented Aug 13, 2020

perfect answers. thanks @TomPohys and @frodrigo.

arichnad added a commit to cyclemap/openmaptiles-cycle that referenced this issue Aug 17, 2020
buildings at zoom level 13 were going through a new "optimization" step that was crashing.

New optimization step described here:
	openmaptiles#936

Bugs about optimization failures:
	openmaptiles#938
	openmaptiles#937
arichnad added a commit to cyclemap/openmaptiles-cycle that referenced this issue Aug 17, 2020
buildings at zoom level 13 were going through a new optimization step that was crashing (running out of memory I think).

New optimization step described here:
	openmaptiles#936

Bugs about optimization here:
	openmaptiles#938
	openmaptiles#937
@arichnad
Copy link
Contributor Author

@TomPohys I noticed you closed this ticket. I still get postgres crashing on large datasets (North America, Planet, Europe, etc). I'm very happy I have your temporary work-around, but I'm not sure that's a long-term solution.

If the new goal is to require 96gb of memory (64 swap + 32) for "Planet", then I'm fine with that, but I'm not sure Planet has been tested with 96gb, has it?

Thoughts? Thanks.

@TomPohys TomPohys reopened this Aug 31, 2020
@TomPohys
Copy link
Member

Hi @arichnad, I closed it because I thought, that this issue was solved (the building layer is hugely memory consuming) and quick workaround was proposed (extend swap or disable buildings on zoom level 13).

We can open a new issue building layer optimization to not be so memory demanding.

Is it OK?

@arichnad
Copy link
Contributor Author

arichnad commented Aug 31, 2020

Your new issue #974 is fine. I'll close this. Thanks!

arichnad added a commit to cyclemap/openmaptiles-cycle that referenced this issue Sep 2, 2020
buildings at zoom level 13 were going through a new optimization step that was crashing (running out of memory I think).

New optimization step described here:
	openmaptiles#936

Bugs about optimization here:
	openmaptiles#938
	openmaptiles#937
arichnad added a commit to cyclemap/openmaptiles-cycle that referenced this issue Sep 5, 2020
buildings at zoom level 13 were going through a new optimization step that was crashing (running out of memory I think).

New optimization step described here:
	openmaptiles#936

Bugs about optimization here:
	openmaptiles#938
	openmaptiles#937
arichnad added a commit to cyclemap/openmaptiles-cycle that referenced this issue Sep 16, 2020
buildings at zoom level 13 were going through a new optimization step that was crashing (running out of memory I think).

New optimization step described here:
	openmaptiles#936

Bugs about optimization here:
	openmaptiles#938
	openmaptiles#937
arichnad added a commit to cyclemap/openmaptiles-cycle that referenced this issue Sep 26, 2020
buildings at zoom level 13 were going through a new optimization step that was crashing (running out of memory I think).

New optimization step described here:
	openmaptiles#936

Bugs about optimization here:
	openmaptiles#938
	openmaptiles#937
arichnad added a commit to cyclemap/openmaptiles-cycle that referenced this issue Oct 3, 2020
buildings at zoom level 13 were going through a new optimization step that was crashing (running out of memory I think).

New optimization step described here:
	openmaptiles#936

Bugs about optimization here:
	openmaptiles#938
	openmaptiles#937
@shermanw
Copy link

FWIW, I ran into this exact same memory issue while processing the Europe dataset (md5 c5d996b967d0af078af2e5817e1d43c8) from geofabrik.

psql:/sql/parallel/building.sql:82: ERROR: invalid memory alloc request size 1448987188

....and I was running on system with 256GB of RAM. I tested using a cfc243e and 7216593 etc... I tried the suggested workaround here https://github.com/openmaptiles/openmaptiles/blob/master/layers/building/building.sql#L84-L96 and I got this error:

psql:/sql/parallel/building.sql:82: ERROR: invalid memory alloc request size 1482909420^M CONTEXT: PL/pgSQL function osm_building_block_gen1() line 6 at FOR over SELECT rows

I reverted to 6ac544f and was able to complete a successful import.

@arichnad
Copy link
Contributor Author

@shermanw you might want to mention it on the new ticket (974) as well, as they are discussing how to fix 936 there.

@TomPohys
Copy link
Member

Hi @shermanw, maybe I miss one step and that is comment out a line https://github.com/openmaptiles/openmaptiles/blob/master/layers/building/building.yaml#L23.

There is

CREATE MATERIALIZED VIEW osm_building_block_gen1 AS
SELECT *
FROM osm_building_block_gen1();

which rise an error

arichnad added a commit to cyclemap/openmaptiles-cycle that referenced this issue Oct 22, 2020
buildings at zoom level 13 were going through a new optimization step that was crashing (running out of memory I think).

New optimization step described here:
	openmaptiles#936

Bugs about optimization here:
	openmaptiles#938
	openmaptiles#937
arichnad added a commit to cyclemap/openmaptiles-cycle that referenced this issue Oct 29, 2020
buildings at zoom level 13 were going through a new optimization step that was crashing (running out of memory I think).

New optimization step described here:
	openmaptiles#936

Bugs about optimization here:
	openmaptiles#938
	openmaptiles#937
arichnad added a commit to cyclemap/openmaptiles-cycle that referenced this issue Nov 4, 2020
buildings at zoom level 13 were going through a new optimization step that was crashing (running out of memory I think).

New optimization step described here:
	openmaptiles#936

Bugs about optimization here:
	openmaptiles#938
	openmaptiles#937
arichnad added a commit to cyclemap/openmaptiles-cycle that referenced this issue Nov 19, 2020
buildings at zoom level 13 were going through a new optimization step that was crashing (running out of memory I think).

New optimization step described here:
	openmaptiles#936

Bugs about optimization here:
	openmaptiles#938
	openmaptiles#937
arichnad added a commit to cyclemap/openmaptiles-cycle that referenced this issue Jan 25, 2023
buildings at zoom level 13 were going through a new optimization step that was crashing (running out of memory I think).

New optimization step described here:
	openmaptiles#936

Bugs about optimization here:
	openmaptiles#938
	openmaptiles#937
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants