Memory and CPU usage is prohibitively high #468
The memory usage should not scale linearly with the repo size, precisely because the index is not kept in RAM. Perhaps there's something going on here related to the initial sync. The sync will use as much CPU as it can, for hashing, compressing and encrypting. You can limit this in various ways.
The CPU I can live with, even though it seems too high compared with ssh+rsync, which provides a similar primitive. The rsync protocol may be much cheaper though. Using librsync for sync would probably be a good bet, as you'd also get fast syncing of changes to large files. The memory usage is an actual bug then. Maybe it's leaking somehow (although Go has GC, I think).
I just tried restarting the syncthing instances and both of them went down to ~400MB and are now rising again. There's nothing suspicious in the logs.
The ~400MB is after it's already started syncing, so any indexing should be complete. Yours also shows ~1kB/item which seems much too high. At 400MB mine is at ~1.5kB per item so it's comparable. My data just has a lot more smaller files than yours.
The syncing also seems CPU bound, as the receiving node is constantly pegged at 150% while only sporadically receiving data. Most of the time the network connection is idle. I've left it on for 24 hours and it's only synced ~28GB of ~400GB across two repos.
No. It's not that simple.
That's exceedingly slow, indeed. Are these NAS-style servers very underpowered?
On average that's what your syncthing process is spending in RAM per item. The usage may be in other stuff though.
Not particularly. Both of them are AMD64 machines. The one currently online and syncing slowly is an "AMD Athlon(tm) II Neo N36L Dual-Core Processor". Not particularly fast, but also not the slow ARM stuff that's common in NAS servers.
Yeah, that's not very slow under the circumstances, so then I don't really know why it syncs that slowly. About the memory, I understand averages, but the fact is only very little memory is spent on file indexes since the index is not in RAM. Slices of it are, at times (up to a few thousand items), but the rest is buffers of various kinds (for holding blocks in transfer, compression, encryption), database cache, all other kinds of state kept everywhere, etc. Multiply by 1.5 or so for GC. Obviously there's something going on there with your memory usage though...
There doesn't seem to be anything particular about my case. Your machine is roughly in line. If this is all fixed cost and doesn't grow if the repository is 1 or 2 TB then it's less serious. It's still quite high though. I'll have to read the code to understand this better. |
I've given up on this for now. rsync is able to saturate the line speed with negligible memory usage and pretty low CPU usage (only SSH really uses CPU and not that much). Edit: Forgot to mention, after two days it's only been able to sync 50GB while pegging the CPU constantly and using 500MB of RAM on the sending side and 700MB on the receiving side. |
Yeah, no idea what's going on there, sorry. You mentioned the sync was running over a tinc vpn, perhaps there's some bad interaction there. If nothing else it's ssl-over-tcp-over-ip-over-ssl-over-tcp-over-ip, which is suboptimal for sure.
If you'd like to isolate the factor I can run a test without tinc. tinc adds a CPU cost because of its own encryption, and that penalty is also paid by rsync, which runs fine. It may be that the receiving side isn't able to hash fast enough to keep up with line speed though. Since rsync doesn't hash, it wouldn't hit that issue. As for memory usage, that shouldn't have anything to do with tinc. The memory is used even when not syncing, and your example shows similar memory usage. 1-2MB per GB of repo size is way too high. I see that you've closed the issue though, so let me know if you want help tracking down what's happening with the CPU and memory usage.
There is no memory cost of "1-2MB per GB of repo size"; it should stabilize around 100-150 megs or so maximum, irrespective of repo size. More than that is a bug, which is not reproducible in my installations. I'd like to get to the bottom of it though. (There's a bug in the memory reporting in the GUI in v0.9.0, it'll show a little more than the truth in some cases, so comparing actual RSS values from top or so would be nice, but I don't think it accounts for more than a few percent of the total.)
Some more info here, since I've done some testing. I created a test repo with 200 GB in 250,000 files and measured memory usage. With v0.9.1 at idle (syncthing having done nothing for several minutes; this is relevant, more about that later) syncthing uses about 110 MB of RAM. I'd say this is acceptable, if a bit on the high side.

The GUI, however, is a bad citizen. Actually loading or reloading the GUI drives the memory usage up to 520 MB. That's obviously crappy. The reason is that the GUI does a bunch of REST calls in parallel that all result in basically linear scans of the entire DB, counting all files and sizes etc. This drives up the peak usage a lot, and the memory is only slowly returned to the OS once the GC mechanism figures we're not going to need it again anytime soon. Hence the note about idle above.

I implemented some tweaks to trigger GC earlier when doing expensive database operations, which helped a bit with this. With these changes I get a peak of about 180 MB when reloading the GUI and an idle usage around 50 MB. There's more to be done here, particularly things that should be kept precalculated so that we don't need to scan the full DB to figure them out when we load the GUI...
Your example showed similar values per item (and not per GB) to what I was seeing. I mixed up the per-item values with the per-GB values. My repo has a lot of small files which apparently is a corner case here. 100-150MB fixed size would be fine for me.
Ah, that makes more sense. Since I was testing this out I always had the GUI open. Why is being idle relevant? Does syncing drive the memory usage up? I'll have to try to monitor the usage without the GUI to see what happens in my setup. Thanks for looking into it.
Actually syncing files will use a bit more memory than idle, yeah, but probably not as much as loading the GUI. Anyway, that feels a little more legitimate since it's actually doing something then. The idleness part is relevant because of how Go's garbage collector works. Basically, it keeps an area of memory for objects. The regular garbage collection cycle marks "free" memory as reusable for new objects, but doesn't return free memory to the OS. Only when a given chunk of memory has been unused for a few minutes is it returned to the OS. So in syncthing's case, you'll drive up memory usage by opening the GUI. But if syncthing isn't actually doing anything, then neither will the GUI (even while it's still open), and after a few minutes you'll see the memory usage reduce back to previous levels.
After thinking about this my opinion is that using ~100MB when idle, and going up to 300MB-600MB when scanning and using the GUI is 10x to 100x worse than it should be. All the cases for using that much memory and for using more memory when scanning should be fixable:
There shouldn't really be anything stopping syncthing from having a maximum memory usage of ~10MB irrespective of repository size and that would be really useful in multi-user multi-TB environments. |
Your pull request will be accepted with gratitude. |
On a Raspberry Pi with no repos it's fine, but with just one repo the average CPU usage is between 90 and 100%.
This should only happen during the initial scan, while syncing, or when the GUI is open. When the GUI is closed and nothing is happening I get almost zero usage on my Pi, and only a bit higher during the periodic rescan.
Syncthing seems to be leaking memory a lot recently. My 0.9.8 instance is up to 440MB memory usage, whereas with 0.9.2 it never got over 180MB. How can I debug this? |
It's a garbage collected language, so it's quite hard to leak stuff.
@seidler2547 Is that memory as reported by syncthing, or by some other method? |
I ran syncthing on a Pi, it really does max out the hardware. CPU was the problem for me. |
After restarting the two nodes it seems it's finally been able to sync the three nodes. The end result is the same as before:
The RAM usage seems to go up just by using the web interface. I can run the same tests monitoring only with top if that is helpful. The issue has been closed but it's definitely not fixed for me. Please let me know if you still want input or if this is really WONTFIX so I can move on. |
About the memory usage, I have finally let syncthing run for long enough to get some (hopefully) usable heap profiles, please find the tar.xz file here: https://yttr.co/o/vlt5q63r.xz I hope this will help to show that the memory usage continues to increase over time for no apparent reason. The initially good 40-50MB memory usage is at 216MB again on my NAS. To summarize heap increases: I'll keep on monitoring and will send more heap profiles if it increases more.
I know this issue is closed, but on my Raspberry Pi running syncthing armv6-0.10.9 it uses 75-90% CPU virtually all the time. There is only one repo and it only has 5 files making up around 500kB, and they are unchanged. In my config.xml file I have
Memory usage is not that high though. But I had to nice syncthing to even get access to the Pi over ssh, as it was eating all the CPU constantly.
Was this caused by the upgrade from 0.10.8 to 0.10.9? |
Actually, on further investigation, I think this might be related to a corrupt index database. I ran from the command line and saw an error message about the db, so I deleted it so syncthing would rebuild it. Now syncthing only uses a tiny amount of CPU when idle, as expected.
@calmh, we should try and recover from these ourselves as you suggested. |
Yeah. Just need a good spot to do the leveldb.Recover call. |
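A hypothetical sketch of what that could look like, using goleveldb's `RecoverFile` (this is not syncthing's actual code; the try-open-then-recover fallback and the helper name are my assumptions):

```go
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
)

// openIndexDB is a hypothetical helper: try a normal open first, and
// fall back to leveldb's recovery routine if the DB is corrupt.
func openIndexDB(path string) (*leveldb.DB, error) {
	db, err := leveldb.OpenFile(path, nil)
	if err == nil {
		return db, nil
	}
	log.Printf("index db open failed (%v); attempting recovery", err)
	// RecoverFile rebuilds whatever tables are still readable, which
	// beats deleting the index and rescanning everything from scratch.
	return leveldb.RecoverFile(path, nil)
}

func main() {
	db, err := openIndexDB("index.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```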
Syncthing is baking my CPU on my new quad core i7 Windows 8.1 laptop, with 400-600% load any time everything is not fully synced. I just made a .pprof file with the STCPUPROFILE environment variable. Does anyone want it?
Well, it will be hashing that's burning your CPU, so the pprof won't say much.
So it's normal that it's using so much CPU? For hours on end? |
Depends on which state it's in. It also hashes when downloading files, so CPU load might go up as well, but I guess the network speed should be the bottleneck in that case, causing the CPU to be underutilized. |
It's finished with the initial scan, so I guess it's hashing while downloading. But my download speed is generally in the B/s range (possibly because it's a lot of very small files?), so should the CPU still be getting so much use?
There is a known bug on some platforms causing slow speeds while Web UI is open (#867) hence you might want to check speeds via some external set of tools. I wasn't able to reproduce it, and I am not sure under what circumstances it happens exactly. |
The home computer with the high CPU usage I'm referring to above is a new i7 Win 8.1 laptop. Unrelated: Is there an easy way to have Syncthing write new log files each time using the -logfile switch, instead of just overwriting? I've been playing around on my work computer and noticed a few things: I'll try it with the GUI closed as well. |
May I ask which folders you are syncing? |
The main folder that seems to be having trouble is the folder for the Scrapbook plugin for Firefox. It saves websites in subfolders inside a data directory and has some .rdf index files that are infrequently updated. (I think it only rebuilds the index when you search it, which is infrequent) (I'm running the extension in Firefox right now and none of the core files have been modified since last week) |
I was curious if new releases had made this any better so tried 0.10.20 on the same set of files. I'm still seeing ~800MB RAM usage on ~430GB/400k files. The CPU (i5-3320M) is also completely pegged on the sending node even after all the scanning is done (so no more hashing should actually be needed). While this is happening the sync is making very little progress (~40kB/s transfer rates). |
Close the GUI and it should be better; there is already an issue about that, #867.
Sorry but crypto is not free. Read the FAQ. |
@AudriusButkevicius crypto isn't free, but it can be much cheaper: https://en.wikipedia.org/wiki/AES_instruction_set or is syncthing already using a library which takes advantage of AES-NI? |
No, because Go's crypto is implemented mostly in Go rather than ASM. If you can point me to a cross-platform, cross-arch, cross-compilable, statically linkable library based on the AES instruction set, we'll swap right away.
So, if Go's crypto is being used (the one mentioned here: golang/go#11929), then it should already take advantage of AES-NI, and according to that issue it should get even faster with 1.6. Or am I getting something wrong?
Potentially for some parts, as TLS is not the only thing we use; we also use SHA256, which obviously has a cost. Also, the CL only seems to relate to amd64, so it's not an improvement across the board, but probably for newer CPUs (post 2010), which aren't maxed out that much anyway.
Actually SHA256 already seems to be written in ASM, so perhaps that's the last leg of how good it could be ;) |
@AudriusButkevicius I actually came here because I had a huge problem with my syncthing running as a docker container. It was constantly eating 100% of the one CPU to which I restricted it (this went on for a few weeks even though syncthing had nothing to do). Turns out, if you limit syncthing to less memory than it needs, it goes berserk. I had my container limited to 512MiB of memory, because before that it was eating gigabytes of RAM. After removing the limitation and letting it run for a few hours, the container is using ~750MiB of memory and syncthing reports using only half as much, 360MiB; that part is OK since I have roughly 215k files in total. Anyhow, now the CPU even during syncing is back to max 20%, and usually <1% when nothing is going on. Any idea why it was constantly consuming 100% CPU when it was running out of memory? Syncing worked, just my computer was slightly unusable.
So I think there are two factors. First, cgroups does not change the amount of memory the machine reports, so things such as the database probably try to allocate more than the container is allowed, causing issues.
I've been testing syncthing across 3 machines, a laptop with 8GB of RAM and two NAS-style servers with 1GB and 2GB of RAM. My repositories have the following sizes:
To sync these three repositories syncthing 0.9.0 uses a bit over 700MB of RAM and while syncing continuously pegs the CPU at 150% on all nodes.
While I could tolerate the CPU usage during the initial sync, the memory usage is simply too high. A typical NAS server like the two I have holds >4TB of storage. At the current level of usage that would require ~8GB of memory just for syncthing.
Without looking at the code I assume an index is being kept in memory for all the repository contents. 700MB is 2.6kB per item on disk, which seems way too high. The index should only really need to store filename, parent, permissions, size and timestamp. On average (depending on filename sizes) that should only be 50-60 bytes per item, which would be just 13MB. Moving that index to disk would also make a lot more sense.
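To make that arithmetic concrete, a back-of-the-envelope sketch (the struct layout, the ~270,000 item count derived from 700MB at 2.6kB/item, and the per-item byte figure are my assumptions, not syncthing's actual index format):

```go
package main

import "fmt"

// indexEntry is a hypothetical minimal per-file record holding only
// what's argued above to be strictly needed.
type indexEntry struct {
	name    string // filename, ~30 bytes of payload on average
	parent  uint32 // reference to the parent directory's entry
	mode    uint32 // permissions
	size    int64  // file size in bytes
	modTime int64  // unix timestamp
}

func main() {
	const (
		items        = 270000 // ~700MB at 2.6kB/item
		bytesPerItem = 55     // ~25B fixed fields + ~30B average name
	)
	totalMB := float64(items*bytesPerItem) / (1 << 20)
	fmt.Printf("estimated index size: %.1f MB\n", totalMB) // ~14 MB
}
```

Even with generous overhead for pointers and map buckets, the estimate stays two orders of magnitude below the observed 700MB, which is the core of the complaint.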
I assume the CPU usage comes from hashing lots of files. There, using librsync might be a better option.