Heap exhausted during garbage collection #962
You can try to play with the batch size options, and in particular reduce the prefetch rows setting, which defaults to 100000 rows (see https://pgloader.readthedocs.io/en/latest/pgloader.html#with).
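For instance, reducing it in the load command file's WITH clause might look like this (a sketch; the 10000 value is an arbitrary example, not taken from this thread):

```
WITH prefetch rows = 10000
```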
But what could have changed, if the same database was loaded just fine previously? Any server changes that may affect it?
Also, how do I find the optimal setting? What should I look at?
The problem is with the SBCL Garbage Collector, and I don't have proper answers for how to avoid it in general. I guess at some point we should get back to speaking with the SBCL maintainers and see about improving the GC there. It sounds like too big a task for me to handle, though, unfortunately.
Sad story.
What heap size was SBCL started with? Does it help if you start SBCL with a bigger heap?
How can I know? I'm using the provided Docker container, and not setting any extra options.
I see. It is not necessarily a GC bug per se; it is possible that pgloader requires a bigger heap to function properly with this database. The provided Docker container needs to be modified to take this into account.
@dimitri Is there any kind of flag for the container that allows specifying the SBCL heap size? If not, there should be one.
I'm not sure it's possible to set that on a saved image at run time; my understanding was that it needs to happen when cooking the image somehow?
Via the SBCL manual: the runtime accepts a `--dynamic-space-size <size>` option that sets the heap size at startup, with the size in megabytes by default.
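To illustrate, a quick check against SBCL itself (a sketch; the 4096 value is an arbitrary example):

```
# start SBCL with a 4 GB heap; the argument is in megabytes by default
sbcl --dynamic-space-size 4096

# then, at the REPL, confirm the configured heap size in bytes:
# * (sb-ext:dynamic-space-size)
```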
Nice! Do you want to try to hack on that?
No, currently not (see line 83 at commit 954eca0). This means that the heap size cannot be modified from outside, if I understand it correctly. So far, it seems like we have a choice of being able to pass one kind of option or the other; I'm also exploring other alternatives.
Have a look at https://github.com/dimitri/pgloader/blob/master/Makefile#L9 for how to define the dynamic space size in the build process for pgloader. Currently, this only works for the legacy build system that relies on buildapp. I intend to switch to the new build system at some point; it should be fairly easy to add support for it there as well.
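In practice, the build-time knob looks like this (a sketch, assuming a checkout of the pgloader sources and the buildapp-based Makefile; the DYNSIZE value is in megabytes):

```
# build a pgloader binary with an 8 GB heap baked in
make DYNSIZE=8192 pgloader

# the Makefile places the binary under build/bin/
./build/bin/pgloader --version
```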
Yes, it doesn't seem hard to do that. Nonetheless, you cannot use the same mechanism there.
I just ran into this issue this past weekend. I am using one Linux server with a load file that migrates between 2 RDS instances, starting with `LOAD DATABASE …`.
Here is my version: … I was reading that I could possibly increase the heap size.
Try the syntax `make DYNSIZE=8192 pgloader`, where the `DYNSIZE` value is the heap size in megabytes.
Looks like 8192 is out of range, but I tried 3500 and that brought me back to the terminal. Do you know if there is a way to verify this now? I'll also try the migration to see if it gets me past this error.
I just tried it and it looks like I'm still running into the issue: `Heap exhausted during garbage collection: 32768 bytes available, 115984 requested.`
I was able to get past this error. At first I was specifying `DYNSIZE=8192`, and this got me the error above. As a test I upped it to `DYNSIZE=20000`, and that got me past my error.
You can also try to limit the amount of memory that's used by pgloader by tweaking the batch size parameters, starting with `prefetch rows`.
Can you please let me know how that can be done from the command line? I am facing the same issue while migrating my Confluence database to PostgreSQL.
Hi @appy2401, if you are building from source, you can run something like `make DYNSIZE=8192 pgloader`, which will build a binary that has the increased heap size. If you want to tweak the prefetch rows parameter, you can build a command file; see the sketch below.
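A minimal sketch of such a command file, assuming a MySQL-to-PostgreSQL migration; the connection strings are placeholders, not taken from this thread:

```
LOAD DATABASE
     FROM mysql://user:password@source-host/source_db
     INTO postgresql://user:password@target-host/target_db

WITH prefetch rows = 10000,
     batch rows = 10000;
```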
Sorry, I don't understand how this is usable with huge database imports when the garbage collector barfs. Given that I don't want to deal with the esoteric toolchain, what's the best option, trying the Python version (pgloader 2)? In addition: …
The latter is a type error and a bug in the logic; it warrants a separate ticket. We would need a backtrace to debug it properly. @dimitri, how does one produce a Lisp backtrace with pgloader?
`prefetch rows = 10000` worked for me.
Using `prefetch rows`, I could load more data, but the problem continues.
Using …
The problem persists even when using a smaller `prefetch rows` value.
Hi, I also faced this failure. Cheers, G. |
How did you reduce workers?
Strangely, I think that removing the … option is what made the difference.
Reduce the default `workers` setting.
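For example, a hedged sketch of a load file that lowers the parallelism settings (values and connection strings are placeholders, not from this thread):

```
LOAD DATABASE
     FROM mysql://user:password@source-host/dbname
     INTO postgresql://user:password@target-host/dbname

WITH workers = 2, concurrency = 1;
```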
If it affects the total heap usage, then that's possible.
@mecampbellsoup thank you. I removed the … option.
This one worked for me. Here is a sample command; see the sketch below.
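A sketch of such a command, using pgloader's `--with` option to set `prefetch rows` from the command line (connection strings are placeholders, not taken from this thread):

```
pgloader --with "prefetch rows = 10000" \
         mysql://user:password@source-host/dbname \
         postgresql://user:password@target-host/dbname
```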
I still got "Heap exhausted during garbage collection: 65536 bytes available, 75968 requested." when migrating about 1.5 GB of data, even after setting the options discussed above.
"64k available" is how many bytes left on the heap at the time the call was made. Simply, it comes out of the heap manager itself. |
Finally, setting … worked for me.
I'm currently also experiencing this heap exhaustion. I tried building pgloader (master branch) using … I'm running pgloader with 10 workers, a concurrency of 1, and 10k prefetch rows. The max memory usage on the system is less than 1 GB before heap exhaustion. Two different errors are reported, with no pattern as to when one error or the other occurs. First error: …
Second error:
The second error seems to report a … FYI: I'm trying to switch to a CCL-based build, but that fails due to #1479.
Quick followup: it seems that I can pass the dynamic-space-size option to the pgloader binary after all.
@MichaelAnckaert how did you pass that option to pgloader?
@MichaelAnckaert @philsmy It hurts my heart to be at this same point you all were, trying to set the dynamic-space-size to a larger number. Please come back and tell us how you did this. 🙏🏾
While trying to run a full new import using a fresh pgloader container, I stumbled upon the same error.
I can assert that nothing has significantly changed on my end; these are the exact same unaltered files I imported last time. The only change was in the loader script, and it was minimal: