borg uses too much memory (with misc. operations) #5202
Comments
Can you reproduce the issue again? Would be interesting whether these also use a lot of memory.
Even better, let me copy what I wrote on IRC after textshell and elho mentioned that this behaved like preload was being set to True, even though I checked it was being set to False:
so the problem: I tested that renaming the preload function fixes it (consuming less than 150 MB instead of ~34 GB and taking less than half the time to finish), but elho was faster to create a commit, so I guess he will submit it soon: elho@be64610
…same name

The locally defined preload() function overwrites the preload boolean keyword argument, always evaluating to true, so preloading is done even when not requested by the caller, causing a memory leak. Also move its definition outside of the loop. This issue was found by Antonio Larrosa in borg issue borgbackup#5202.
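To illustrate the bug class (a hypothetical minimal reproduction with made-up names, not borg's actual code): a `def` statement rebinds the local name, so once the loop body has run, `if preload:` tests the function object, which is always truthy.

```python
# Hypothetical minimal reproduction of the shadowing bug, not borg's code.
def fetch_many(ids, preload=False):
    for id_ in ids:
        # This def rebinds the local name "preload", shadowing the
        # boolean keyword argument for the rest of the function.
        def preload():
            print("preloading", id_)

        if preload:      # a function object is always truthy
            preload()    # runs even though the caller passed/kept False
        yield id_

list(fetch_many([1, 2, 3]))  # preloads despite preload defaulting to False
```

Renaming the inner function (and defining it once, outside the loop) removes the shadowing, which matches both fixes mentioned in the commit message.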
Yes, I built a borg 1.2 from that branch, plus a 1.1.11 with that patch on top; the testsuite passed. I will submit the actual PR after some real-world tests with them have completed successfully.
Oh, oops. Thanks for finding this! This affects pretty much everything using preload.
@LocutusOfBorg @FelixSchwarz as this affects many operations, likely including borg extract. Guess no one wants to have an "out of memory" surprise at full restore time...
crap, there seems to be a Cython code generation issue, so at least the pip package is broken on py38, see #5214.
Shouldn't the linter have caught this? I know PyCharm does, but I thought pylint would have as well...
@enkore the problem is that PyCharm warns about quite a lot in the borg source. Some of it is invalid, some is valid but harmless, and some is valid and not harmless. It would be quite some effort to clean up the code to remove the warnings about the harmless stuff.
@antlarr Thank you for spotting the problem :-) @ThomasWaldmann also thank you for making me aware of the problem. I built 1.1.13 for Fedora 31/32/rawhide and EPEL 7+8.
1.1.12 is available in Debian and Ubuntu.
Is this a BUG / ISSUE report or a QUESTION?
ISSUE
System information. For client/server mode post info for both machines.
Your borg version (borg -V).
client: 1.1.11 (openSUSE TW system package)
server: 1.1.11 (built locally from the Debian package)
Operating system (distribution) and version.
client: openSUSE Tumbleweed (20200516)
server: osmc (2020.03-1), a Debian-based distribution for media centers
Hardware / network configuration, and filesystems used.
client: x86_64
server: aarch64 (a Vero 4K media center with an external USB hard disk)
How much data is handled by borg?
Original size of all archives: 2.11 TB
Each archive's original size is between 31 and 60 GB.
There are 43 archives in total.
Full borg commandline that led to the problem (leave away excludes and passwords)
borg recreate -e 'home/user/large-directory' -e 'home/user/large-file' ssh://borgbackup@server/home/borgbackup/repositories/hostname::hostname-2019-09-30T00:01:57
Describe the problem you're observing.
Recreating the archive consumes a lot of memory. I checked that memory usage kept growing during the execution of the recreate command, peaking at the end at a VmSize of 34155864 kB (~34 GB).
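For reference, this is roughly how such growth can be watched from outside the process (a minimal sketch reading VmSize from /proc, so Linux only; point it at the borg client's pid):

```python
# Sample a process's VmSize every few seconds (Linux only).
import os
import time

def vmsize_kb(pid):
    # /proc/<pid>/status reports VmSize in kB.
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmSize:"):
                return int(line.split()[1])

# Demo on our own pid; substitute the borg client's pid instead.
for _ in range(3):
    print(vmsize_kb(os.getpid()))
    time.sleep(5)
```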
After some debugging, I observed that the problem lies in the self.responses dictionary in remote.py, which keeps all of the messages received during processing.
In the middle of a debugging session I got this:
(total_size is the function from https://code.activestate.com/recipes/577504/, which returns the total size in memory of an object and its contents)
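The checks looked roughly like these debugger-session lines (a sketch: total_size is the recipe function pasted into the session, and repo stands for the RemoteRepository instance under inspection):

```python
# Inspecting the response buffer from a debugging session; total_size
# comes from the ActiveState recipe linked above, and "repo" is the
# RemoteRepository instance being examined.
print(len(repo.responses))          # number of buffered responses
print(max(repo.responses.keys()))   # highest msgid received so far
print(total_size(repo.responses))   # approximate bytes retained
```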
It seems the only way for an item to be removed from the responses dictionary is for its msgid to first get into the waiting_for list. I checked (in a different debugging session than the commands above), and waiting_for doesn't seem to change at all during the execution; it kept the same single entry every time I looked:
Of course, the max msgid value in self.responses.keys() kept growing every time I checked, as did its length.
I don't know where waiting_for got that value from, but I'd say waiting_for is maybe waiting for the last msgid coming from the server, so every response just gets stored in self.responses and is never discarded.
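In other words, the buffering seems to behave like this (a simplified sketch of my reading of remote.py, not the actual code; the msgid values are made up):

```python
# Simplified model of the leak: responses are only popped when their
# msgid is being waited for; everything else is buffered indefinitely.
responses = {}        # msgid -> result, buffered until consumed
waiting_for = [9811]  # msgids a caller is actively waiting on

def handle(msgid, result):
    if msgid in waiting_for:
        waiting_for.remove(msgid)
        return result              # consumed immediately and freed
    responses[msgid] = result      # buffered "for later"; if nothing
                                   # ever waits on this msgid, it stays

for msgid in range(1, 10000):      # responses keep arriving...
    handle(msgid, b"chunk data")
print(len(responses))              # ...and nearly all of them linger
```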
Note that all this happens inside the self.repository.commit() call in Archive.save. This is the backtrace that can be seen by stopping the execution at almost any time during the recreate (a few seconds after starting). Please note that the line numbers in archive.py and remote.py might differ from yours, since I added some debug code to those files.
Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.
I can reproduce it every time.
Include any warnings/errors/backtraces from the system logs
The system logs don't show any problems on the client or the server. I also ran a
borg check
which didn't find any problems.