
Heap exhausted during garbage collection #962

AnrDaemon opened this issue May 7, 2019 · 44 comments

@AnrDaemon

While trying to run a full new import using fresh pgloader container, I stumbled upon

2019-05-07T11:56:36.063000Z DATA < #(" 0" "1e73536d-364a-4ac5-9556-26080ae675be"
    "a2159ea6-dcd4-4bfb-a659-8a30d17471a3" " 7" "000" "0" " 0" "008" "" " 2"
    "2014-06-03" "НСТ Весна-6" "5074" "5074"
    "c24a92ab-0163-469a-a56a-081cf40ddbe4" "НСТ Весна-6" "46436000000"
    "46736000" "21" "5a11404e-3a4f-4943-b60c-51480729dcfb" "000"
    "500000080000260" "142180" "02bf0bee-3882-4364-8ad1-dc004a0ef2be" "50"
    "тер" "2014-02-28" "0260" "" "" "2016-08-09" "000" "0000" "000" " 0"
    "3e5ba4cb-7b5e-4258-86e0-8b91ca39d601" "0000" "" "0")
2019-05-07T11:56:36.065000Z DATA < #(" 0" "2dbae092-4d28-4692-bfd7-509c92eb6896"
    "30e0f893-20d3-4c65-b476-7db50ad93492" " 7" "000" "0" " 0" "008" "" " 2"
    "2014-06-03" "Овальная" "5074" "5074"
    "5e1ee261-ee43-4004-8c93-2f5f33d3c864" "Овальная" "46436000000" "46736000"
    "21" "5a11404e-3a4f-4943-b60c-51480729dcfb" "000" "500000080000051"
    "142180" "2e68e5bf-0f7b-4bab-aac9-e99e57445dfc" "50" "ул" "2014-02-28"
    "0051" "" "" "2016-04-14" "000" "0000" "000" " 0"
    "3e5ba4cb-7b5e-4258-86e0-8b91ca39d601" "0000" "" "0")
Heap exhausted during garbage collection: 288 bytes available, 432 requested.
 Gen StaPg UbSta LaSta LUbSt Boxed Unboxed LB   LUB  !move  Alloc  Waste   Trig    WP  GCs Mem-age
   0:     0     0     0     0     0     0     0     0     0        0     0 10737418    0   0  0.0000
   1:     0     0     0     0     0     0     0     0     0        0     0 10737418    0   0  0.0000
   2: 28005 28307     0     0   855  9423     0     0    12 334403376 2386128 10737418    0   0  0.5853
   3: 10322 32767     0     0   793  7934     8     0     0 284273120 1955360 10737418    0   0  0.0000
   4:     0     0     0     0     0     0     0     0     0        0     0 10737418    0   0  0.0000
   5:     0     0     0     0  1027  9584    38     0    50 345980496 2965936 356717914  769   1  0.0000
   6:     0     0     0     0  1886  1220     0     0     0 101777408     0  2000000 1787   0  0.0000
   Total bytes allocated    = 1066434400
   Dynamic-space-size bytes = 1073741824
GC control variables:
   *GC-INHIBIT* = true
   *GC-PENDING* = false
   *STOP-FOR-GC-PENDING* = false
fatal error encountered in SBCL pid 133(tid 140627020216064):
Heap exhausted, game over.

Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb> 

I can assert that nothing has significantly changed on my end; these are the exact same unaltered files I imported last time. The only change was in the loader script, and it was minimal:

-        SET search_path to 'fias'
+        SET search_path TO 'fias'
@dimitri
Owner

dimitri commented May 7, 2019

You can try to play with batch size options, and in particular reduce the prefetch rows setting, which defaults to 100000 rows (see https://pgloader.readthedocs.io/en/latest/pgloader.html#with).
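For reference, a sketch of what such a load file could look like (the connection strings are placeholders, not taken from this issue):

```
LOAD DATABASE
     FROM mysql://user:password@source-host/source_db
     INTO postgresql://user:password@target-host/target_db
     WITH prefetch rows = 10000;
```

Lowering prefetch rows reduces how many rows pgloader holds in memory at once, at some cost in throughput.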

@AnrDaemon
Author

But why would that change anything, if the same database was loaded just fine previously? Were there any server changes that may affect it?

@AnrDaemon
Author

Also, how do I find the optimal setting? What should I look at?

@dimitri
Owner

dimitri commented May 8, 2019

The problem is with the SBCL garbage collector, and I don't have a proper answer for how to avoid it in general. I guess at some point we should get back to talking with the SBCL maintainers and see about improving the GC there. It sounds like too much of a task for me to handle, though, unfortunately.

@AnrDaemon
Author

Sad story.
It settled down at 25k records for now, but given your explanation, I'm compelled to make it configurable.

@phoe
Collaborator

phoe commented May 8, 2019

What heap size was SBCL started with? Does it help if you start SBCL with a bigger heap?

@AnrDaemon
Author

How can I know? I'm using the provided Docker container and not setting any extra options.
However, to me the error text indicates a bug in the GC.

@phoe
Collaborator

phoe commented May 9, 2019

I see.

It is not necessarily a GC bug per se - it is possible that pgloader requires a bigger heap to function properly with this database. The provided Docker container needs to be modified to take this into account.
Normally this is achieved through running SBCL with a --dynamic-space-size flag and specifying the new heap size. I do not know how to pass this flag into the container as I'm not a container person.
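As a sketch of what that could look like once supported (the entrypoint behavior and argument forwarding are assumptions, not confirmed for the current container):

```shell
# Sketch: run the containerized pgloader with a 4 GB heap.
# Assumes the image entrypoint forwards arguments to the pgloader
# binary and that the binary still processes --dynamic-space-size.
docker run --rm -v "$(pwd)":/data dimitri/pgloader \
    pgloader --dynamic-space-size 4096 /data/import.load
```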

@phoe
Collaborator

phoe commented May 9, 2019

@dimitri Is there any kind of flag for the container that allows specifying the SBCL heap size? If not, there should be one.

@dimitri
Owner

dimitri commented May 10, 2019

I'm not sure it's possible to set that on a saved image at run-time; my understanding was that it needs to happen when cooking the image somehow?

@phoe
Collaborator

phoe commented May 10, 2019

From #sbcl on Freenode:

17:00 < phoe> I have a SAVE-LISP-AND-DIE'd SBCL image that executes an application. 
              Can I pass a --dynamic-size argument to that binary like I would to a 
              normal SBCL image?
17:03 < phoe> Or is it the case of :SAVE-RUNTIME-OPTIONS argument passed to S-L-A-D?
17:05 < jackdaniel> save-runtime-options makes the executable inherit dynamic size, 
                    if save-runtime-options is nil, then executable will process 
                    --dynamic-space-size argument and set appropriate values

Via the SBCL manual:

:save-runtime-options
If true, values of runtime options --dynamic-space-size and --control-stack-size that were used to start sbcl are stored in the standalone executable, and restored when the executable is run. This also inhibits normal runtime option processing, causing all command line arguments to be passed to the toplevel. Meaningless if :executable is nil.
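In src/save.lisp terms, the change under discussion would look roughly like this (a sketch; the toplevel function name is a hypothetical placeholder, not pgloader's actual code):

```lisp
;; Sketch: save an executable whose runtime still parses
;; --dynamic-space-size at startup. With :save-runtime-options nil,
;; the binary processes runtime flags itself instead of inheriting
;; the build-time heap size.
(sb-ext:save-lisp-and-die "pgloader"
                          :toplevel #'pgloader::main ; hypothetical entry point
                          :executable t
                          :save-runtime-options nil)
```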

@dimitri
Owner

dimitri commented May 10, 2019

Nice! Do you want to try to hack ./src/save.lisp so that we can pass a dynamic space size either when building the docker container, or maybe to the resulting pgloader image? I guess that with the current version of things the --dynamic-space-size might work already, it's just not documented, right?

@phoe
Collaborator

phoe commented May 10, 2019

I guess that with the current version of things the --dynamic-space-size might work already, it's just not documented, right?

No, currently :SAVE-RUNTIME-OPTIONS is set to T:

:save-runtime-options t

This means that the heap size cannot be modified from outside if I understand it correctly.

So far it seems we can either pass --dynamic-space-size or the normal pgloader argv, but not both at the same time. One option might be starting SBCL with a predetermined heap size in the build script, so the executable is built with a bigger heap by default. Where in the build script is the sbcl binary invoked?

I'm also exploring other options on #sbcl on Freenode.

@dimitri
Owner

dimitri commented May 10, 2019

Have a look at https://github.com/dimitri/pgloader/blob/master/Makefile#L9 for how to define the dynamic space size in the build process for pgloader. Currently, this only works for the legacy build system that relies on buildapp. I intend to switch to the new build system at src/save.lisp sometime, and it's now used in both Travis and Docker builds to make sure it's reliable.

It should be fairly easy to add --dynamic-space-size $(DYNSIZE) to SBCL builds when using the make save target?
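A sketch of what that could look like in the Makefile (the target layout is an assumption beyond DYNSIZE, which the Makefile already defines):

```make
# Sketch: hand the heap size (in megabytes) to the SBCL process
# that runs the save script, so the image is built with a larger heap.
DYNSIZE ?= 4096

save:
	sbcl --dynamic-space-size $(DYNSIZE) --non-interactive \
	     --load src/save.lisp
```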

@phoe
Collaborator

phoe commented May 10, 2019

Yes, it doesn't seem hard to do that.

Nonetheless, you cannot use save.lisp for setting the heap size. It is a Lisp file, so it is only invoked once SBCL has already started, by which point the heap size is already set in stone.

@mdrozdz17

mdrozdz17 commented Jun 17, 2019

I just ran into this issue this past weekend. I am using one Linux server with a load file that targets 2 RDS instances:

LOAD DATABASE
FROM mysql://confuser:PASSWORD@ac5enz3hx9sixl.cq1bwppz1j9k.us-east-1.rds.amazonaws.com/confluence
INTO pgsql://confuser:PASSWORD@ac16f4chba0zdi6.cq1bwppz1j9k.us-east-1.rds.amazonaws.com/confluence

   CAST type bigint when (= 20 precision) to bigint drop typemod

   ALTER schema 'confluence' rename to 'public'; 

Here is my version:
pgloader version "3.6.1"
compiled with SBCL 1.3.1.debian

I was reading that I could possibly increase the --dynamic-space-size, but once I am in the sbcl terminal I am unsure of the syntax to increase it. Does anyone know?

@phoe
Collaborator

phoe commented Jun 17, 2019

I was reading that i could possibly increase the --dynamic-space-size but once I am in the sbcl terminal i am unsure the syntax on how to increase it. Does anyone know?

Try the syntax:

$ sbcl --dynamic-space-size 8192

Where 8192 is your preferred heap size, in megabytes.
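To double-check the setting from inside, you can ask SBCL directly; a sketch (SB-EXT:DYNAMIC-SPACE-SIZE returns the size in bytes in recent SBCL versions and may be absent in very old ones; the standard (room) report is a portable fallback):

```shell
# Sketch: start SBCL with a ~3.5 GB heap and print the configured
# size in megabytes.
sbcl --dynamic-space-size 3500 --non-interactive \
     --eval '(format t "~a MB~%" (/ (sb-ext:dynamic-space-size) 1048576))'
```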

@mdrozdz17

Looks like 8192 is out of range, but I tried 3500 and that brought me to the terminal. Do you know if there is a way to verify this now?

I'll also try the migration to see if it gets me past this error for now to see if that helps.

@mdrozdz17

mdrozdz17 commented Jun 17, 2019

I just tried it and it looks like I'm still running into the issue:

Heap exhausted during garbage collection: 32768 bytes available, 115984 requested.
 Gen StaPg  UbSta LaSta LUbSt Boxed Unboxed   LB   LUB !move      Alloc     Waste     Trig   WP GCs Mem-age
   0:     0      0     0     0     0       0    0     0     0          0         0 85899345    0   0  0.0000
   1:     0      0     0     0     0       0    0     0     0          0         0 85899345    0   0  0.0000
   2: 42723 248946     0     0  1980  133485    0 47815    55 5150103072 855615968 85899345    0   0  1.5745
   3:  5770  27142     0     0  1342   11646 1900     0   154  484057216   3792768  2000000 2474   0  0.4397
   4:     0      0     0     0     0       0    0     0     0          0         0  2000000    0   0  0.0000
   5:     0      0     0     0     0       0    0     0     0          0         0  2000000    0   0  0.0000
   6:     0      0     0     0  3826    1329    0     0     0  168919040         0  2000000 3622   0  0.0000
   Total bytes allocated    = 7421137312
   Dynamic-space-size bytes = 8589934592
GC control variables:
   *GC-INHIBIT* = true
   *GC-PENDING* = false
   *STOP-FOR-GC-PENDING* = false
fatal error encountered in SBCL pid 1850(tid 140736755595008):
Heap exhausted, game over.

@mdrozdz17

I was able to get past this error. Basically, at first I was specifying DYNSIZE=8192, and this got me the error above. As a test I upped it to DYNSIZE=20000 and that got me past my error.

@dimitri
Owner

dimitri commented Jun 17, 2019

You can also try to limit the amount of memory that's used by pgloader by tweaking the batch size parameters, starting with prefetch rows = 10000 (it defaults to 100 000 which is a lot of rows if you have a table with either lots of columns or maybe very large values). See https://pgloader.readthedocs.io/en/latest/pgloader.html#batch-behaviour-options.

@appy2401

appy2401 commented Sep 6, 2019

I was able to get past this error. Basically at first i was specifying DYNSIZE=8192, and this got me the error above. As a test i upped the DYNSIZE=20000 and it got me past my error.

Can you please let me know how that can be done from the command line? I am facing the same issue while migrating my Confluence database to PostgreSQL.

@LincolnBryant

Hi @appy2401, if you are building from source, you can run something like

make DYNSIZE=20000 pgloader

which will build a binary that has the increased heap size.

if you want to tweak the prefetch rows parameter, you can build a command file that looks like this:

load database
    from <source data>
    into <destination postgres>
    with prefetch rows = 10000;

@hasufell

hasufell commented Oct 31, 2019

Sorry, I don't understand how this is usable with huge database imports when the garbage collector barfs out. Given that I don't want to deal with the esoteric toolchain, what's the best option, try the python version 2?

In addition, pgloader --with "prefetch rows = 1000000" ./db.sqlite3 postgresql:///foo barfs out with an unreadable error

#<CLOSURE (LAMBDA (COMMAND-LINE-ARGUMENTS::VALUE)
            :IN
            COMMAND-LINE-ARGUMENTS::MAKE-OPTION-ACTION) {10061971BB}> fell through ETYPECASE expression.
Wanted one of (NULL KEYWORD SYMBOL CONS).

@phoe
Collaborator

phoe commented Oct 31, 2019

The latter is a type-error and is a bug in logic, and this warrants a separate ticket. We would need a backtrace to debug it properly.

@dimitri How does one produce a Lisp backtrace with pgloader?

@cgricb

cgricb commented Feb 3, 2020

prefetch rows = 10000 worked for me.

@bruno-lopes

Using prefetch rows, I could load more data, but the problem continues.

@dimitri
Owner

dimitri commented Mar 27, 2020

@dimitri How does one produce a Lisp backtrace with pgloader?

Using --debug triggers condition handling that uses trivial-backtrace and provides some level of information. The code being multi-threaded (using lparallel) sometimes makes those stack traces not as useful as one would like, though I think I got it into good shape after some tweaking.

@dastanko

The problem persists even when using a smaller prefetch rows = 1000.

@itworx4biz

Hi, I also faced this failure.
It occurred when loading with 4 workers.
Reducing workers and using only 2 helped.

Cheers, G.

@hekep

hekep commented Jun 14, 2021

Hi, I also faced this failure.
It occurred when loading with 4 workers.
Reducing workers and using only 2 helped.

Cheers, G.

How did you reduce the workers?
I have tried:

pgloader --debug --verbose --logfile=./pgloader.log --with "batch concurrency = 1 " mysql://db:db@192.168.255.40/db postgresql://db:db@localhost

@mecampbellsoup

Strangely I think that removing the --debug flag solved this issue in my case... is that possible?! (Sorry I have nearly lost my mind tweaking my loadfile/banging my head against this wall, i.e. Heap exhausted during garbage collection error printed and dropping into the backtrace debugger.)

@bearlin

bearlin commented Dec 16, 2021

Reducing the default prefetch rows works for me.

@AnrDaemon
Author

Strangely I think that removing the --debug flag solved this issue in my case...

If it affects the total heap usage, then that's possible.

@mowshon

mowshon commented May 29, 2022

@mecampbellsoup thank you. I removed the --debug flag and the process started without any problems. Thanks for your comment, I was about to give up.

@nilay-g

nilay-g commented Oct 13, 2022

You can also try to limit the amount of memory that's used by pgloader by tweaking the batch size parameters, starting with prefetch rows = 10000 (it defaults to 100 000 which is a lot of rows if you have a table with either lots of columns or maybe very large values). See https://pgloader.readthedocs.io/en/latest/pgloader.html#batch-behaviour-options.

This one worked for me. Here is the sample command:

pgloader --with "prefetch rows = 10000" mysql://user_name:password@127.0.0.1/db_name postgresql://user_name:password@127.0.0.1/db_name

@linux-root

I still got "Heap exhausted during garbage collection: 65536 bytes available, 75968 requested." when migrating about 1.5 GB of data, even with prefetch rows = 1000 set.
I tried to build from source with the command make DYNSIZE=8192 save, but it still didn't work out. I'm wondering where the value of 65536 bytes comes from?

@AnrDaemon
Author

"64k available" is how many bytes were left on the heap at the time the call was made. Simply put, it comes from the heap manager itself.

@linux-root

linux-root commented Nov 24, 2022

Finally, setting prefetch rows = 100 works in my case. For those who also face this problem, try reducing prefetch rows. By the way, I also got the error debugger invoked on a PGLOADER.CONNECTION:DB-CONNECTION-ERROR in thread, and adding the options workers = 8, concurrency = 1 solved that. Thanks @AnrDaemon for opening this issue, and thanks to everyone contributing to the solution!
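Pulling together the knobs mentioned in this thread, a load file combining them might look like this (a sketch with placeholder connection strings; tune the numbers to your data):

```
LOAD DATABASE
     FROM mysql://user:password@source-host/source_db
     INTO postgresql://user:password@target-host/target_db
     WITH workers = 2, concurrency = 1,
          prefetch rows = 1000;
```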

@MichaelAnckaert

I'm currently also experiencing the heap exhausted issue.

I tried building pgloader (master branch) using make DYNSIZE=60000 pgloader, which should allocate 60 GB to SBCL when running pgloader. FYI: I'm running this on a host with 128 GB of RAM, so no system issue here.

I'm running pgloader with 10 workers, a concurrency of 1, and 10k prefetch rows. The max memory usage on the system is less than 1 GB before heap exhaustion.

Two different errors are reported, no pattern when one error or another occurs:

First error:

Heap exhausted during garbage collection: 80 bytes available, 96 requested.
Gen  Boxed   Code    Raw  LgBox LgCode  LgRaw  Pin       Alloc     Waste        Trig      WP GCs Mem-age
fatal error encountered in SBCL pid 14277 tid 14289:
GC invariant lost, file "gencgc.c", line 488

Second error:

Heap exhausted during garbage collection: 49152 bytes available, 63808 requested.
Gen  Boxed   Code    Raw  LgBox LgCode  LgRaw  Pin       Alloc     Waste        Trig      WP GCs Mem-age
 2     590      0  12842      0      0   3151   26   249156768  22539104    10737418       0   0  2.5581
 3     539      0   6726   1335      0  12490 1286   325262496  20276064     2000000    1590   0  0.0000
 4       0      0      0      0      0      0    0           0         0     2000000       0   0  0.0000
 5       0      0      0      0      0      0    0           0         0     2000000       0   0  0.0000
 6    3317     13   2765    604      0    324    0   113057072   2007760     2000000    6836   0  0.0000
 7     228      0   7681      0      0  12687    0   314159808  23285056     2000000       0   0  0.0000
           Total bytes allocated    =    1001636144
           Dynamic-space-size bytes =    1073741824
GC control variables:
   *GC-INHIBIT* = true
   *GC-PENDING* = false
   *STOP-FOR-GC-PENDING* = false
fatal error encountered in SBCL pid 14305 tid 14315:
Heap exhausted, game over.

The second error reports a dynamic-space-size of 1073741824 bytes, which equals 1 GB. Very different from the value specified in DYNSIZE.

FYI: I'm trying to switch to a CCL based build, but fail due to #1479

@MichaelAnckaert

Quick follow-up: it seems that I can pass the --dynamic-space-size option directly to pgloader. Memory usage output seems to confirm that the virtual memory for the pgloader processes is now 60 GB.

@philsmy

philsmy commented Jun 15, 2023

@MichaelAnckaert how did you pass that option to pgloader?

@PaulLockett

@MichaelAnckaert @philsmy It hurts my heart to be at this same point you all were, trying to set the dynamic-space-size to a larger number.

Please come back and tell us how you did this. 🙏🏾

@horat

horat commented Dec 29, 2023

pgloader --dynamic-space-size 262144 -v migrate.load
did actually increase memory usage and not crash. pgloader eventually used around 34 GB to transfer 11.7 GB successfully.
I am unsure if this is a problem in pgloader or if I just have to throw more memory at it 😄
