Sync performance slow for many files #691
Comments
This is still an issue. I need to sync a cryfs container (3.7 GB, 240,000 files, 4000 subfolders) across multiple machines, but the upload tops out at 20 KB/s and it would take days to upload everything. You could take a look at MegaSync's approach to parallel uploads/downloads; their client is open source too. |
I have the same issue and had a quick peek using Process Monitor. It seems like most of the time is spent accessing sync.db. To me it seems like it's re-read in 4K steps for each file. With a DB of >100 MB, and thus over 25K system calls, things must become slow. |
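As a rough check of the figure in the comment above, here is a small arithmetic sketch. It assumes "4K steps" means 4 KiB per read and that the database is exactly 100 MiB; the per-file rescan is the commenter's impression from Process Monitor, not something verified here, and the file count is simply borrowed from the first comment for scale.

```python
# Back-of-the-envelope check of the "over 25K system calls" estimate.
# Assumptions: 4 KiB per read, 100 MiB database file.
DB_SIZE_BYTES = 100 * 1024 * 1024
READ_SIZE_BYTES = 4 * 1024


def reads_per_full_scan(db_size: int, read_size: int) -> int:
    """Number of read calls needed to scan the file once (ceiling division)."""
    return -(-db_size // read_size)


reads = reads_per_full_scan(DB_SIZE_BYTES, READ_SIZE_BYTES)
print(reads)  # 25600

# Hypothetical worst case: one full scan per synced file.
FILE_COUNT = 240_000  # file count quoted in the first comment
print(f"{reads * FILE_COUNT:,} reads if every file triggered a full scan")
```

If the rescan really happened per file, the total would grow with both the database size and the number of files, which would match the "slower and slower" reports.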
I have just experienced the same issue. 67,000 files, 24 GB. Important observations:
So my guess is that there is some sort of array building up. The ability to get the thing going fast by simply restarting the service rules out the database, in my opinion. However, restarting the service regularly takes a big hit with initialising the sync algorithm, and so is not worthwhile until after about 8000 files have downloaded. Taking a look at the code, I think the "ActivityListModel" class is a strong candidate for the above issues. In particular "combineActivityLists()" and the "resetModel" followed by "beginInsertRows". I am familiar with QAbstractListModel, and the code in this method is very likely to exhibit the behaviour of all 5 of the above observations. I need to understand a bit more about how the ActivityList is built up, but my guess is that a little attention to this section of code would reap significant benefits. |
I am running 2.6.1git on Linux Mint Cinnamon 19.3 and it is so slow. I just set this machine up and I am syncing 12GB, and it is going at approx. 500k. It currently says 9 hours left, 333MB of 12GB, file 34 of 14492. I have 300MB internet and this has been running for approx. 2 hours now. This is not the first machine to do this. Another machine I set up yesterday had the same results. |
So slow that I have to leave the PC on 24/7 to complete the sync of my documents. Sometimes just 10 GB takes a few days. |
I don't know how relevant this is to most of you but I'm gonna leave this here just in case: Still not sure what caused the high CPU usage exactly, but I haven't seen the Nextcloud client use more than a few % ever since. Edit for clarification: When I add an entry to the hosts file so my server's domain resolves to its local IP I have really fast file transfers. As soon as I remove that entry, it goes back to being slow. This works on every client in my network, be it Linux, Windows, or Android - so I'm fairly certain it has something to do with my ISP provided router. Using a non-ISP router from ASUS I had fast file transfers either way. |
Just a guess: Is it possible that IPv6 is tried first, but it doesn't work, and after IPv6 has timed out, IPv4 is used and succeeds? Or vice versa? |
Thanks for those observations. I think we can use your observations too to support what I have seen inside the NextCloud code, but I would suggest that neither DNS nor ISP are likely to be related to the cause. Just the actions that took place when you made those modifications and I would suggest, some happy coincidences. Basically, what I can see is that the NextCloud activity list builds up over time with many entries and becomes VERY slow and CPU intensive to update. When you restart the computer or bounce the nextcloud sync service on the PC, the activity list empties and everything goes fast again. And as you say, once the big sync of files has completed and you reboot/restart you will go back to a few % ever after (until you set out to resync the whole folder again from scratch). I am quite confident that the root cause is a piece of the NextCloud client display update code that is also sitting inside the same file sync thread, and the display update has to complete before the sync continues for every single file, even if the client is not open on the screen! Not a particularly good code design. I am hoping to find some time to test a fix to that section of code for NextCloud shortly. I can see exactly where it is happening in their code. |
Theoretically yes, but I've deactivated IPv6 here (exactly because of this issue) and uploads are still as slow as always. By the way, do we know where the bottleneck is? I mean, the problem of uploading many small files also occurs on Android, so is this a client or a server problem? |
Not here. My Nextcloud server is dual stack and works perfectly on IPv6. It's just uploading one small file at a time... |
I have a Nextcloud external storage configured on my self hosted Nextcloud instance. When I am syncing a folder (which has many subdirectories) from remote nextcloud storage the sync client spends more time checking for changes than downloading the contents themselves. Moreover, the sync client sometimes even crashes when I try to expand a folder(with many subdirectories) on external storage #1959. |
I am running the Nextcloud Client in version 2.6.4git on Ubuntu 18.04 to sync 50,000+ directories. Starting the client takes at least 15 minutes until it has discovered all local files and created the 50,000 inotify watchers. Then it seems to ask the server for every file whether it has changed or not. It only handles a few files per second, which in total leads to over an hour until the frontend is finally visible. What about the 2.7.0 beta version? Does it help in this case? Has anyone tested it already? It's really annoying that the client checks every single file against the server. It's soooo slow! |
Yeah I have been trying with 2.6.4-1. Sorry to say this app is useless if this issue cannot be resolved. What a terrible oversight in what otherwise appears to be a great application. |
@WilliamHorner blames the client display code. Does anybody know if the command line client works better? |
I've switched back to Syncthing for syncing of the files because of this issue, and use Nextcloud just as a fancy web GUI and because of more easy to use iOS and Android clients. |
Same problem with only a few hundred small files across a VPN (30Mbps up) the speed is in bits per second and frequent stalls. |
I'm still on Nextcloud 17, will update soon. I can't confirm that the first 1000 files go quickly... I just synced 500 files (PDFs and GPX tracks) of about 200 MB and it took half an hour over my GB LAN. I'm not a Linux freak, but managed to install Nextcloud with MariaDB on my Synology DS412+, and I'm using Client 2.6.5 on Windows 10 (laptop, Core i7 8th gen). Will get a DS420+ soon and see (and hope) if that helps... If not, and there is no solution here, I think I will have to use Dropbox for small files; it's faster even over my 10MB/s upload via ISP!!! |
Do we have any ideas on how we can address this issue? Would it make sense instead of http to use multiple sockets for constant stream of data? Or continue to use the current http/webdav way but allow for parallel processing of http requests? |
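To illustrate the second option raised above (keeping HTTP/WebDAV but issuing requests in parallel), here is a minimal sketch. It does not talk to a real server: `fake_upload` and its fixed 20 ms delay are hypothetical stand-ins for the per-request overhead of one small file, and the worker count is an arbitrary choice.

```python
import time
from concurrent.futures import ThreadPoolExecutor

PER_REQUEST_OVERHEAD_S = 0.02  # hypothetical fixed cost of one small-file request


def fake_upload(name: str) -> str:
    """Stand-in for one HTTP PUT; only simulates the per-request latency."""
    time.sleep(PER_REQUEST_OVERHEAD_S)
    return name


def upload_sequential(names):
    """One request at a time, as the thread describes the current behaviour."""
    return [fake_upload(n) for n in names]


def upload_parallel(names, workers=8):
    """Several requests in flight at once; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_upload, names))


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


names = [f"file_{i}.txt" for i in range(40)]
seq_result, seq_time = timed(upload_sequential, names)
par_result, par_time = timed(upload_parallel, names)
print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

When the cost is dominated by latency rather than bandwidth, the parallel variant finishes in roughly the sequential time divided by the worker count. Whether a real server tolerates that many concurrent requests is a separate question.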
It just needs someone to rewrite the Qt Model/View code efficiently by
migrating the beginResetModel() and endResetModel() calls to
something more appropriate for high volumes of items. This appears to
be complicated by how the view is sorted, so I can see why they did it
that way. Fine for the first few thousand items but dog slow after that
because of the graphics update, even if the view is not visible
(because the model still has to do all the processing even if there is
nothing on the screen). But I am confident that is the issue.
Sorry folks, although I investigated and can see what is causing it, I
just haven't had time recently to see if I can fix it for them.
|
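The Model/View argument in the comment above can be sketched with a toy model that only counts work. This is not Qt code and not the Nextcloud implementation: `CountingModel` is a hypothetical class, and it assumes a full model reset makes a view re-process every row, while a single-row insert (the `beginInsertRows()`/`endInsertRows()` pattern in Qt) touches only the new row.

```python
class CountingModel:
    """Toy list model counting how many rows a (hypothetical) view re-processes."""

    def __init__(self):
        self.rows = []
        self.rows_processed = 0

    def add_with_full_reset(self, item):
        # Analogue of a full model reset: every existing row is re-read.
        self.rows.append(item)
        self.rows_processed += len(self.rows)

    def add_incrementally(self, item):
        # Analogue of inserting a single row: only the new row is processed.
        self.rows.append(item)
        self.rows_processed += 1


def total_work(n: int, incremental: bool) -> int:
    """Total rows processed while adding n items one at a time."""
    model = CountingModel()
    add = model.add_incrementally if incremental else model.add_with_full_reset
    for i in range(n):
        add(i)
    return model.rows_processed


for n in (1_000, 8_000):
    print(n, total_work(n, incremental=False), total_work(n, incremental=True))
```

With a reset per item the total work grows quadratically (n(n+1)/2 rows), which would fit the reports of the sync being fine for the first few thousand files and then degrading until the client is restarted.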
So the model is completely dependent on the view? That's a bummer. |
@WilliamHorner so this is a purely client sided issue? The server doesn't have any issues serving/retrieving large amounts of files? |
Just adding more use case to this issue: https://dl.dropboxusercontent.com/s/0xci50aox5myxek/SywW6g1ErD.png It's been over 2 days to sync 68gb of files on a 10gigabit network with nextcloud 100% on SSDs. |
@jonatino I'm pretty sure it's client side, as things get slower and slower. Restarting Nextcloud every now and then brings things up to speed for a short while... |
Same problem with the desktop sync client, while uploading many small files:
The client uploads only 1-2 files per second, which sometimes results in a speed of < 1 KB/s (!), which is insane. For bigger files, the speed reaches about 3 MB/s. Uploading 15 GB in 4 days is ridiculous and makes Nextcloud unusable for my case. (I assume the desktop client uploads each file one by one, instead of packing a bunch of small files together into one .zip and uploading them all together to max out the network speed. This results in a massive overhead for many small files, since they are all handled and uploaded separately. But, as I said, that's just what I assume.) |
Thanks for the tip, i will give it a shot. Yesterday i tried a bit with Syncthing, which seems also very promising for my demands. |
How would you do that for the macOS client? |
I have had to switch to OwnCloud's client to sync my folder with all my repos on my user profile of my wife's Macbook Air. It's only 3.1 GB, but the Nextcloud client sat for almost an hour scanning EVERY folder on the server, even though I only had it sync one folder and its subfolders. The client stopped responding and just had the spinning beach ball of death. I only use her laptop very rarely, so I have my Repo folder synced so I can pick up where I left off from my dev computer. I can't wait for hours to have the client just become ready to sync. When I moved to the OwnCloud client, it only scanned folders for about 2 minutes, then started syncing files very fast. If the Nextcloud client would limit the server scan to the user-selected folders, that would help tremendously to get the sync started. It also seems to be very single-minded and deal with one operation at a time. As others have stated, if it's syncing many small files, all other syncs simply don't happen until this is done. I'd rather it grab x number of small files to sync, then check the other folder, then get back to the original task at hand. Heck, have the client look at the folder size and, if it's under 10 MB, just transfer the whole folder, or gzip the folder, transfer the one gzip file and expand it on the other side rather than dealing with thousands of tiny files. It just seems there are so many options that could fix this rather than have this issue stay open for 2 years. I'll be happy to help out in any way that I can. |
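The "gzip the folder and expand it on the other side" idea can be sketched as follows. This is only an illustration of bundling, assuming both ends could agree on a single-archive transfer; it is not how the Nextcloud client or server works, and the helper names `pack_small_files` and `unpack_small_files` are made up for this example.

```python
import io
import tarfile


def pack_small_files(files: dict[str, bytes]) -> bytes:
    """Bundle many small in-memory files into one gzip-compressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack_small_files(archive: bytes) -> dict[str, bytes]:
    """Inverse of pack_small_files: recover the name -> content mapping."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
    return out


files = {f"repo/file_{i}.txt": f"content {i}\n".encode() for i in range(1000)}
archive = pack_small_files(files)
restored = unpack_small_files(archive)
print(len(files), "files -> 1 archive of", len(archive), "bytes")
```

The point of the sketch is that 1000 per-file requests collapse into one transfer. Conflict handling, partial failures and per-file metadata would still need a design of their own.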
I also experience this issue. I'd also be happy to help where I can. I am not an expert, but might it be possible that WebDAV is a performance bottleneck here? Has any one of the more experienced developers benchmarked, where in the sync process most of time gets lost? I also had to upload a large number of files and while it was not finished after multiple days via Nextcloud, it took only ~30 minutes with rsync. Especially the discovery phase (e.g. calculating the file list) only took a minute or so. I also guess, that to many users this is quite an important feature and a blocker, especially when setting the server up. Rsync should provide a decent measure for speed of light here. |
@TheMrAnderson Do I understand correctly, that the OwnCloud client is compatible with the Nextcloud server? Or did you also use an Owncloud server in your setup? |
Yes, the OwnCloud client is compatible with the Nextcloud server, even though the OwnCloud client will show in red that it's not and you can proceed at your own risk. And as you said the OwnCloud client scans very shortly and begins syncing very fast. This issue exist solely in the Nextcloud client, not in the server. |
Indeed, the owncloud client works and performs the discovery (of which files to sync) a lot faster than nextcloud client. But still, when uploading small files, the upload rate drops to a few KB, while for big files it is a lot faster. I think, there are two issues with the nextcloud client here:
@tflidd you investigated the database issue, right? Note on using the owncloud client: I think, this is not really something that one should do. While syncing, it threw a lot of errors and also created some metadata files I did not want and ignored some files I wanted to use. |
I did some tests a long time ago (owncloud/core#20967) and compared the speed of different versions and the related db-queries. Back then, I thought the RPi was a nice reference system since it is cheap, you can easily share images, and it's not difficult to stress it to its limits. You should probably split this topic a bit: one part for more client-related stuff (current bugs, enhancements in the sync protocol), and one for server-related things (better/faster database interaction). With the file discovery it has already gone way off-topic, and you really need to focus on a specific issue. |
FYI, my Nextcloud instance died and I didn't really feel like trying to diagnose what went wrong so I've moved to ownCloud. Just too many unresolved issues with Nextcloud and it doesn't seem that they are priority. Just wanted to update so if there are any more questions about my findings, I can no longer try any potential fixes. I am building my ownCloud instance from scratch the same way I built my Nextcloud instance and in 10 minutes my 6.3GB Repo folder has already synced over 2GB which is in stark contrast to my Nextcloud instance that was only in double digit MB after 10 minutes. |
FYI. It seems like a build from master takes only 5-10 minutes for the discovery phase while version 3.1.1 (on Mac) takes at least 30 minutes. I am wondering if there is a code change or if this is some wrong configuration of the client? |
master (to become 3.2) comes with major changes in the sync engine. So it wouldn't be surprising to see such changes in performances. |
Cool. Don't want to waste your time, but I tried to track down the commits introducing the relevant changes, but couldn't. Could you give me a hint, where to look at? |
Would be hard to pin point a commit to be honest. But lots has changed in the discovery and how the E2EE code plugs into it. |
Will the Changelog be updated? Because at the moment there are only changes tracked up to version 2.6. |
I'm having similar issues, on a Mac (10.15.7 Catalina) with the desktop client (Version 2.6.5). Can supply more information if needed, otherwise, I'll just lurk here to see if something is progressing. |
@HedvigS If you want, you can try a more recent version (3.2.1 at the moment). There happened a lot in the sync engine between the updates. |
I am trying to sync about 90GB of files both big and small and it goes incredibly slow with the smaller files and speeds up with larger ones. When syncing loads of small ones it syncs 1-2 files per second at like a few KB/s. I am on version 3.2.2 and the sync is trying to upload to the server. |
I'm facing this issue too with 3.3.1 (tried various old versions). I'm syncing a bit more than 3 million files (yes, that's a lot :), about 60GB of disk space. By setting the, undocumented, environment variable sqlite journal mode are explained here: https://sqlite.org/pragma.html#pragma_journal_mode Slowness might also come from the server, but in my case I don't have access to it and it's running an old and unsupported version. |
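For readers unfamiliar with the SQLite setting linked above, here is a minimal sketch of what changing the journal mode looks like at the SQLite level, using Python's standard `sqlite3` module on a throwaway database. It does not show how the Nextcloud client selects the mode; the helper `set_journal_mode` and the file name `example.db` are assumptions made for this example.

```python
import os
import sqlite3
import tempfile

# Journal modes listed in the SQLite PRAGMA documentation.
ALLOWED_MODES = {"DELETE", "TRUNCATE", "PERSIST", "MEMORY", "WAL", "OFF"}


def set_journal_mode(db_path: str, mode: str) -> str:
    """Request a journal mode and return the mode SQLite reports afterwards."""
    if mode.upper() not in ALLOWED_MODES:
        raise ValueError(f"unknown journal mode: {mode}")
    conn = sqlite3.connect(db_path)
    try:
        # The PRAGMA returns one row containing the resulting mode name.
        return conn.execute(f"PRAGMA journal_mode={mode}").fetchone()[0]
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.db")  # throwaway database file
    wal_result = set_journal_mode(path, "WAL")
    delete_result = set_journal_mode(path, "DELETE")
    print(wal_result, delete_result)
```

SQLite reports the mode actually in effect, which may differ from the requested one (for example, an in-memory database cannot use WAL), so checking the returned value is worthwhile.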
Hi all, The slow upload of small files should be addressed by Sync 2.0, released as part of the 3.4 version of the desktop client and the 23 version of the server. https://nextcloud.com/blog/nextcloud-sync-2-0-brings-10x-faster-syncing/ for more info. The feature will be further improved over the next releases as I believe it currently only works with upload, not download. |
I don't think this should be closed. I just tested it again and while the backend sync is faster, the UI on the desktop app whilst syncing a big folder is still completely unusable which I think is why this issue was originally made. |
This topic was an enhancement, so the discussion was about why it should be implemented and how. This has been done and it was implemented on the backend (and desktop client). You can submit specific reports or requests to the corresponding repositories (e.g. desktop client, mobile client, etc). If all related reports end up here, it will be a huge mess and impossible to get through. |
As @tflidd says - let me quote from the original report:
This has been resolved, or largely - we're still making some more changes there of course. You're right @jonatino that with 1 million files there are other bottlenecks, but after we addressed the UI one you mention there'll be another one, and another, and another - that is the nature of software development. We fixed this issue (mostly), let's create another then for the client UI, memory and discovery performance next. |
If I remember right you also have to enable HTTP/2 on your webserver. |
Same problem here |
In case you use the default settings of your database, you might want to adjust the caches:
|
@tflidd my post was from 3 months ago; the performance in NextCloud has been improved. I increased in 3GB the DB & the Web docker. |
Expected behaviour
Sync should be able to process many small files in an appropriate time.
Actual behaviour
I tested syncing 1000 files totalling 10 MB (bi-directional) and it took about 7 min. I guess this is an issue because the client uses WebDAV and starts an HTTP request for every single file.
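The numbers in this report can be turned into a per-file cost with a little arithmetic. The sketch below assumes "10MB" means 10^7 bytes and treats the 7 minutes as exact; the `estimated_sync_time` model, with its per-file overhead and bandwidth parameters, is a hypothetical simplification and not a measurement.

```python
FILES = 1000
TOTAL_BYTES = 10 * 1000 * 1000  # assumes "10MB" means 10^7 bytes
ELAPSED_S = 7 * 60              # "about 7 min"

seconds_per_file = ELAPSED_S / FILES            # 0.42 s per file
throughput_bytes_s = TOTAL_BYTES / ELAPSED_S    # roughly 23.8 kB/s
print(f"{seconds_per_file:.2f} s/file, {throughput_bytes_s / 1000:.1f} kB/s")


def estimated_sync_time(files, total_bytes, overhead_s, bandwidth_bytes_s):
    """Simple model: fixed per-file overhead plus pure transfer time."""
    return files * overhead_s + total_bytes / bandwidth_bytes_s


# Hypothetical inputs: 0.4 s overhead per file and a 10 MB/s link.
print(estimated_sync_time(FILES, TOTAL_BYTES, 0.4, 10_000_000))
```

Under these assumptions almost all of the time is per-file overhead and the transfer itself is negligible, which is consistent with the guess that one request per file is the bottleneck.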
Client configuration
Client version: 2.5.0 Beta 2
Operating system: ArchLinux x64
OS language: English
Qt version used by client package (Linux only, see also Settings dialog): 5.11.2
Client package (From ownCloud or distro) (Linux only): nextcloud-desktop-git (AUR)
Installation path of client: /