rclone using too much memory #2157
Comments
Are you using cache? If yes, and you're also using Plex, can you disable that feature and see if it does the same thing? It will be enough to just remove the configs from the section and start again. Later edit: even easier would be to simply run with
No, I'm not using cache, and it is running with -v normally. The older version didn't use cache either, as I don't think that was implemented yet. The new one is using the same remotes and setup as the old. The only change is the executable upgrade.
Hmm, ok. There was another report in the forum and that person uses cache. I was trying to establish a correlation, but it seems to be coming from somewhere else.
I'd like to replicate this if possible. Can you describe the activity that causes the problem more? Or maybe you've got a script I could run? Which provider are you using? And are you using crypt too?
So the batch script is actually just a set of rclone commands to sync stuff up against the mount points. SETUP: This is my mount:
BATCH:
I'm using Google as a provider and it is using crypt as well. Media2 is simply synced with Media1, but I don't delete, so it becomes a superset of what is copied to Media1. New files get put in Media1 and then replicated via this process.
I am experiencing this too; I only noticed when my IOWAIT shot up as I hit swap and rclone's pages were being pushed to disk. This wasn't the behaviour in the later betas of 1.39 - has anything changed in that respect?
I managed to replicate this... First I made a directory with 1000 files in it using
(this is one of the rclone tools) I then mounted up
and ran this little script to copy the files in and out of the mount
v1.39 uses a pretty constant 4 MB; v1.40 goes up and up!
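The reproduction steps above (make 1000 test files, mount, churn them through the mount) can be sketched roughly as below. This is a hedged sketch, not the script from the comment: it assumes the file-making helper is `rclone test makefiles` (the comment only says "one of the rclone tools") and that `remote:` is a configured remote; it skips the rclone steps if rclone is not installed.

```shell
#!/bin/sh
# Hedged sketch of the memory-growth reproduction. `remote:` and the
# use of `rclone test makefiles` are assumptions, not from the thread.
SRC=$(mktemp -d)
MNT=$(mktemp -d)
if command -v rclone >/dev/null 2>&1; then
    rclone test makefiles --files 1000 "$SRC"   # 1000 small test files
    rclone mount remote: "$MNT" &               # the mount under test
    MOUNT_PID=$!
    sleep 2
    for i in $(seq 1 20); do                    # churn files through the mount
        cp "$SRC"/* "$MNT"/
        cp "$MNT"/* "$SRC"/
    done
    ps -o rss= -p "$MOUNT_PID"                  # print RSS to watch it grow
    fusermount -u "$MNT"
else
    echo "rclone not installed, skipping repro"
fi
```

Watching the printed RSS between loop iterations is what distinguishes the constant ~4 MB of v1.39 from the unbounded growth of v1.40.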
https://i.imgur.com/Lvm01QB.png
OK, here is the pprof memory usage - rclone was at about 2 GB RSS when I took this. So it looks like it might be a bug in the fuse library... though I'm not 100% sure about that.
Was it updated just before the release? It's a bit weird that no one noticed this in the last betas. A simple listing of the vendor directory shows that it wasn't.
I have this issue on 1.39 though.
rclone v1.39-211-g572ee5ecβ
I used the magic of git bisect to narrow it down to this commit fc32fee
Which is both good and bad. Good because there is a workaround, which I tried. I suspect this is a bug in the fuse library, so I'll report a bug upstream in a moment...
I've not seen cmount mentioned before, what's the difference?
cmount is a libfuse-based mount - it works just the same. It isn't normally compiled in for Linux as it requires cgo, which would require a cross-compile environment for all the OSes. Build it in with
Upstream bug report: bazil/fuse#196
Are we likely to run into any issues at 1s, especially with vfs caching enabled and the cache layer?
I haven't had an OOM termination or the memory growth since I've added --attr-timeout 1s. @ncw, can you tell me what running with that timeout can negatively affect?
I imagine it just means that at any given moment there's a one-second window in which data on the remote can be out of sync with the local view, and thus streaming the file may go awry. At least that's how I read it.
I'll also enable cache and make sure it's stable. I tried this before and ran into the same issue before reverting to non-cached again.
@Daniel-Loader wrote:
Probably not. For previous versions of rclone the value was 60s (or 1s on Windows). What it means is that if you upload a new file not via the mount, there is a 1s window of opportunity for the kernel to cache stale data. In practice this means the kernel gets the length of the file wrong and you either get a truncated file or a file with corruption at the end. 1s is quite a small window, and the kernel would have to be caching the inode for it to be a problem, so I think you are very unlikely to see it. In fact I didn't have any reports of problems at 60s, though people may not have realised there was a problem, as retrying would have fixed it. @calisro Assuming this fixes your problem, I think I'll set the default back to 1s.
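The workaround under discussion is simply passing --attr-timeout to the mount. A minimal hedged example, where the remote and mount-point names are placeholders and not taken from this thread:

```shell
# Placeholder names: remote:Media and /mnt/media are illustrative only.
rclone mount remote:Media /mnt/media \
    --allow-other \
    --attr-timeout 1s &
```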
If rclone uses all of the RAM on your system, it'll crash the mounted drive, so I wrote a script to check and remount it.
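The check-and-remount script itself isn't shown in the thread, so here is a hypothetical sketch of that idea. MNT and REMOTE are illustrative names; a crashed FUSE mount stops answering `mountpoint`, which is what the check relies on.

```shell
#!/bin/sh
# Hypothetical check-and-remount watchdog (not the poster's script).
# MNT and REMOTE below are placeholder names.
MNT="${1:-/mnt/media}"
REMOTE="${2:-remote:Media}"
if mountpoint -q "$MNT" 2>/dev/null; then
    STATUS="healthy"
else
    STATUS="down"
    fusermount -uz "$MNT" 2>/dev/null || true           # clear a stale mount
    command -v rclone >/dev/null 2>&1 &&
        rclone mount "$REMOTE" "$MNT" --allow-other &   # remount in background
fi
echo "$MNT is $STATUS"
```

Run from cron every minute or so, this gives the remount-on-crash behaviour described above.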
Still working well on both cached and non-cached mounts.
I am also running into "out of memory" errors. The debug log unfortunately doesn't show much information.
Edit: Usually I do not use --attr-timeout 1s, but after reading that it solved the problems for calisro I gave it a try - without success, though.
@neik1 what is your access pattern to cause this problem? Can you show me how to replicate it? Can you post a log somewhere?
@ncw: Well, my setup is running rclone + Emby on Ubuntu (x64). I restarted both yesterday evening and today at about 11 AM it crashed with the OOM error. In between there was a library scan (to check for updated files) and 2 files were streamed successfully. On the third one it crashed. There was no more than one user streaming at a time. Log -> https://1drv.ms/t/s!AoPn9ceb766mgYspfoZqWaZExTlsEQ After analyzing the log a bit I was surprised that there are over 30 lines saying "open file" (see line 13 of the log). @hklcf, this probably isn't the right place, but anyway: what are you using to get such a nice graph of your memory usage?
@neik1 thanks for the log :-) You have 95 goroutines running, which seems like a fair number... I think you have 22 open files. Each of these may be using [...]. Could you try [...]? If you can run rclone with [...]
Here is my cheap and nasty memory monitoring tool #!/bin/sh
# Monitor memory usage of the given command
COMMAND="$1"
if [ "$COMMAND" = "" ]; then
echo "Syntax: $0 command"
exit 1
fi
headers=""
while [ 1 ]; do
ps $headers -C "${COMMAND}" -o pid,rss,vsz,cmd
sleep 1
headers="--no-headers"
done

Run it as
@ncw, how come I have 22 open files when I am just streaming one specific file? Stopping rclone right before it crashes will be difficult because my sister and my parents stream to their devices as well, and I can't monitor it 24/7. Is it possible to create a script (or alike) that monitors rclone and kills it if it exceeds, let's say, 500-600 MB, to create the memprofile? I restarted all the processes, put --attr-timeout 60s --memprofile /tmp/mem.prof, and removed --buffer-size 32M. Sorry for the stupid question, but what exactly do I need to do with that script? Just save it and run it like ./script.sh? memusage rclone shows me this -> -bash: memusage: command not found
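The kill-at-a-threshold monitor asked about here could be sketched as follows. This is a hedged sketch, not anything from the thread: the 600 MB limit and 5-second poll interval are illustrative choices, and it relies on rclone writing its --memprofile output on a clean SIGINT shutdown.

```shell
#!/bin/sh
# Sketch: send SIGINT to rclone once its RSS passes a limit, so a run
# started with --memprofile writes its profile on the way out.
# LIMIT_KB (~600 MB) and the 5 s poll interval are illustrative.
PROC="${1:-rclone}"
LIMIT_KB="${2:-614400}"                      # ps reports RSS in KB
while ps -C "$PROC" >/dev/null 2>&1; do      # loop ends when rclone exits
    rss=$(ps -C "$PROC" -o rss= | sort -n | tail -1)
    if [ "${rss:-0}" -gt "$LIMIT_KB" ]; then
        echo "RSS ${rss} KB over limit, stopping $PROC"
        pkill -INT -x "$PROC"                # SIGINT = clean rclone shutdown
        break
    fi
    sleep 5
done
```

SIGINT rather than SIGKILL matters here: a killed process never gets the chance to flush the memory profile.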
@neik1 save the script to memusage.sh and run
Worst case, if you can't get memory usage to work well within 2GB, try mounting with the
@hklcf, thanks, that helped. Set it up and it's now running. @Daniel-Loader: Hi Daniel,
--buffer-size is per open file on a mount
OK! But how is it possible that so many files were open, leading to that issue, when there was only one stream going? That's the point I do not understand yet.
Well, if there's a library scan and it's not doing it sequentially, it might be doing mediainfo lookups on multiple files at once. You could try turning the buffer size down to 8 MB and see if it retains enough performance. Alternatively, as said, write the 32 MB buffers to disk as a swap of sorts. I know on Plex you can opt out of chapter/thumbnail creation, which would read the whole file on scans - can Emby opt out too?
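A quick back-of-envelope check of the numbers in this exchange: with per-file buffers, worst-case buffer memory is roughly open files × --buffer-size. Using the 22 open files seen in the log above, and comparing the 32M buffer against the suggested 8M:

```shell
#!/bin/sh
# Worst-case buffer memory ≈ open files × --buffer-size.
# 22 open files is the count from the log discussed above.
open_files=22
for buf_mb in 32 8; do
    echo "buffer-size ${buf_mb}M -> ~$((open_files * buf_mb)) MB in buffers"
done
```

So buffers alone could account for roughly 704 MB at 32M versus about 176 MB at 8M, before counting any other rclone memory use.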
With this adapted mount command it crashed again:
Log -> https://1drv.ms/t/s!AoPn9ceb766mgYsqBGX5KjSKxFVeiA My next step will be to try avoiding the memory cache entirely with --cache-chunk-no-memory. @Daniel-Loader, I am not using chapter/thumbnail creation. The library scan for new files takes about 8 min (scraping of new files included), so it's actually pretty fast I suppose.
Yeah that's incredibly fast for a non cached remote media scan, depending on library size!
Yeah, well... In the end it doesn't seem to be an rclone issue. In my case it seems to be an Emby problem with a specific client. Edit: I am saying this because Emby crashed again but this time rclone didn't (probably because of the --cache-no-mem flag) and was only using 80 MB of memory.
I've committed a fix to change the default back to 1s, as discussed above. Hopefully upstream will come up with a fix eventually, but this will do for the moment.
This will be in https://beta.rclone.org/v1.40-018-g98a92460/ (uploaded in 15-30 mins)
rclone v1.39-211-g572ee5ecβ
I recently upgraded from rclone v1.38-235-g2a01fa9fβ to rclone v1.39-211-g572ee5ecβ
I'm using the following mount:
export RCLONE_CONFIG="/home/robert/.rclone.conf"
export RCLONE_BUFFER_SIZE=0M
export RCLONE_RETRIES=5
export RCLONE_STATS=0
export RCLONE_TIMEOUT=10m
export RCLONE_LOG_LEVEL=INFO
export RCLONE_DRIVE_USE_TRASH=false
/usr/sbin/rclone -vv --log-file /data/log/rmount-gs.log \
  mount robgs-cryptp:Media $GS_RCLONE \
  --allow-other \
  --default-permissions \
  --gid $gid --uid $uid \
  --max-read-ahead 1024k \
  --buffer-size 50M \
  --dir-cache-time=72h \
  --umask $UMASK 2>&1 > /data/log/debug &
This worked flawlessly before. In the most recent version it uses WAY too much memory, causing OOM issues on my Linux server. The above uses upwards of 7 GB of resident memory before my OOM killer terminates the mount while activity is being read/written through the mount. I used to be able to crank the buffer-size up to 150M without issues. There is NOTHING else different about the setup: I can restore my old version and re-run the batch process without issues, then replace it with the new version, and it consistently terminates due to OOM after consuming everything left.
If I restore the old version and rerun the exact same process, rclone consistently uses no more than 1.8 GB.
I can provide logs, but there is nothing in them. They simply show transfers. Even in the fuse debug there is nothing but normal activity. There seems to be either a leak, or the use of memory has changed DRASTICALLY between the versions, making things unusable. I've rolled back until this is sorted.