
MAC Client 1.4.1 very high CPU load #1073

Closed
rz0 opened this issue Oct 4, 2013 · 67 comments

@rz0

rz0 commented Oct 4, 2013

Since my update to 1.4.1, the CPU load of the ownCloud daemon is constantly around 100%.
owncloud-1-4-1_cpu_issue_mac

@jonmorrison99

I too have the same issue on OS X 10.8.5.

@dragotin
Contributor

dragotin commented Oct 8, 2013

Do you confirm that it's constantly around 100%? Any chance to strace it?

@rz0
Author

rz0 commented Oct 8, 2013

Yes, I can confirm it. The only way to bring the CPU load down is by quitting the ownCloud daemon...

@rakekniven

Reported in forum as well. See https://forum.owncloud.org/viewtopic.php?f=14&t=17420
On Ubuntu.

@lmu-zz

lmu-zz commented Oct 9, 2013

Same issue on Windows 7

@oedfors

oedfors commented Oct 10, 2013

I'm running Linux and it has been suggested to me that it may have to do with my kernel version, which does not seem to be the case ... since Mac and Win platforms are also affected.

I have tried logging the client (using "owncloud --logwindow"), but for each sync cycle (every 30 seconds), the log is truncated to 2.3 MB in size when saving. I only see the "last" part of what appears to be a, possibly much, longer log. (I have 10,000+ files of total size 7 GB in a sync folder.)

In the thread at https://forum.owncloud.org/viewtopic.php?f=14&t=17420 I have supplied more information about my setup, the periodicity of the CPU load from the ownCloud client, etc.

@srfreeman

Wait a minute, folks: I do not see anything near the noted loads on a Win 7 desktop, nor on the Ubuntu installation on the same machine (dual boot on a cheap, 5-year-old PC with a Pentium Dual 1.8 GHz processor).

Are you saying that the high load is the result of a large number of files in the local ownCloud folder, or that there is something wrong in the client itself?

A recent test, synchronizing ~2000 files/folders produced nothing more than 60% processor usage and that was just during momentary spikes. The local polling on my test system produces nothing more than 20% spikes with almost no duration. Certainly nothing here that affects the normal operation of the PC.

@dragotin
Contributor

We need to take care not to mix things up. This issue, as well as the forum thread, is about the load caused by the normal local walk through the directory tree. We are not talking about an error that locks up the client forever burning 100% CPU, which would be a deadlock.

I am wondering why that happens, because we switched the method of detecting local changes with client 1.4.0, IIRC. From that release on, the client relies on the file system change notifications of the underlying system and no longer does a local walk every 30 seconds as before. It now does one every 5 minutes to make sure that everything is caught, because the file system notifications are sometimes unreliable for various reasons.

So my first advice would be to check if everybody has updated to 1.4.1 please.

@oedfors

oedfors commented Oct 10, 2013

I have used ownCloud for a while, with almost constant folder size over time (10,000+ files and 7 GB). I have my own server and I sync the 7 GB folder on three different machines: two stationary ones and one laptop, all running Linux Mint (14 and 15).

When I first installed ownCloud (half a year ago), I had problems with the sync procedure being way too slow ... so I gave up. Over time I have upgraded to newer versions of both the server and the client, to see if things got better. Suddenly, when the 1.4 client appeared, things started to work much better: no more high CPU loads, and the sync speed was absolutely OK for me. Recently the server was upgraded to 5.0.12 (from 5.0.10, if I remember correctly) and the client to 1.4.1 (from 1.4.0). Since then I have unacceptable CPU loads on the client machines again, in a periodic pattern (60-second cycle with 20-25 seconds of 100% CPU load from the ownCloud client). The sync seems to work, but at a very high CPU load. Unfortunately, I can't say whether it started after the server upgrade or the client upgrade.

@srfreeman

I have certainly updated the clients to 1.4.1 (server to 5.0.12), and though I read that the 30-second-interval local walk was done away with, there still seems to be polling (as it is named in the log) at the same interval:

10-09 19:47:54:148 * event notification enabled
10-09 19:48:22:150 * Polling "ownCloud2" for changes. (time since next sync: 30 s)
10-09 19:48:22:150 Setting up host header: "coveru.com"
10-09 19:48:22:556 * Compare etag with previous etag: false
10-09 19:48:52:165 * Polling "ownCloud2" for changes. (time since next sync: 60 s)
10-09 19:48:52:165 Setting up host header: "coveru.com"
10-09 19:48:52:711 * Compare etag with previous etag: false
10-09 19:49:22:180 * Polling "ownCloud2" for changes. (time since next sync: 90 s)
10-09 19:49:22:180 Setting up host header: "coveru.com"
10-09 19:49:22:586 * Compare etag with previous etag: false
10-09 19:49:52:194 * Polling "ownCloud2" for changes. (time since next sync: 120 s)
10-09 19:49:52:194 Setting up host header: "coveru.com"
10-09 19:49:52:616 * Compare etag with previous etag: false
10-09 19:50:22:209 * Polling "ownCloud2" for changes. (time since next sync: 150 s)

A spike of ~20% CPU usage with almost no duration happens on my test system to coincide with each group of three events. No problem is noted here, but, they do happen.

@dragotin
Contributor

Yes, sure they do. That is the regular check (one HTTP request) if something has changed on the server. That is by design and as you say does not cause trouble.
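The 30-second check dragotin confirms here boils down to one cheap request whose result (the server-side ETag) is compared with the previous value; only a difference schedules an expensive sync run. A hedged Python sketch of that idea — `RemotePoller` and `fetch_etag` are hypothetical names, not the client's real API, which does this via WebDAV in C++:

```python
class RemotePoller:
    """Sketch of the 30-second remote check: one request per interval,
    compare the returned ETag with the previous one, and only schedule a
    (much more expensive) sync run when they differ."""

    def __init__(self, fetch_etag):
        self._fetch_etag = fetch_etag  # callable returning the root ETag
        self._last_etag = None

    def poll(self):
        etag = self._fetch_etag()
        if etag == self._last_etag:
            return False        # nothing changed on the server
        self._last_etag = etag  # remember the baseline for the next poll
        return True             # schedule a sync run
```

The "Compare etag with previous etag: false" lines in the log above correspond to the cheap no-change branch.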

@srfreeman

Thank you for the confirmation, dragotin.

I note that there is no problem on my test system; however, through conversation with oedfors, I learned that the same events cause one of the four threads on his system to spike to 100%, causing a heat and power issue.

Does the client design make use of multiple threads on a hyperthread capable processor?

While I am asking; Have the client / server been tested or perhaps built with Linux kernel 3.5 and / or above?

@dragotin
Contributor

OK, we seem to be approaching the core problem here. It seems @oedfors has a larger data set being handled. We can revisit the code that does the remote ETag check to see if we can optimize it further. @oedfors, can you tell us how many files and directories you have at the top level of the sync dir?

Threads: yes, we use a thread to do the syncing. But we have no parallel running threads yet, and it's questionable whether that would help, IMO.

Yes, kernels > 3.5 are in use. Why do you ask?

@srfreeman

My questions are just to help me rationalize the differences seen between oedfors' system and my test system.

While the threads on oedfors' system show great differences (image posted on the forum), i.e. 99.5%, 9%, 10% and 6% usage, the graphs of my CPU usage show two cores with relatively equal usage.

Since I have not upgraded beyond Linux kernel 2.6, could this be the reason that I see different load levels?

Thank you for the answers provided, now, back to your ideas on oedfors' system...

@oedfors

oedfors commented Oct 10, 2013

@dragotin,

The answer to your question is: On the top level of my sync directory I have 5 small files (~100 kB and below) and 24 directories (of which the largest contains almost 4 GB data and the smallest only a few kB).

Thank you again guys for assisting in this matter!

@dragotin
Contributor

This is the corresponding issue on the server side: owncloud/core#5255

@danimo
Contributor

danimo commented Oct 10, 2013

I can reproduce this when toggling between (manual and system-specified) proxy settings. As a consequence, mirall and csync get called again and again. Looking into it.

@ogoffart
Contributor

How many items (files or directories) are there in the top-level directory?

It would be nice to have mirall run through a profiler to find out what is really taking so much CPU.

@dcalacci

Hey guys - I just started using the ownCloud client on my OS X system and I've noticed that it uses around one entire core on my machine whenever it (ostensibly) checks for changes in the directory tree.

I saw you ask the other reporters about how many files they have in their owncloud directory:

$ find . -type f | wc -l
   13729

and here's a screenshot showing the load spikes on my system:
screenshot 2013-10-14 08 55 24

as far as I can tell, the load spikes seen in that graph are due to owncloud.

If you guys want any more information from me, just ask and I'll try to provide it.

@ogoffart
Contributor

I was interested in the number of files in the root directory, not including subdirectories (so ls -a | wc -l).

Does the problem occur when it checks for changes on the server (every 30 seconds), or when it does a full sync to detect changes locally (every 5 minutes)?

@dragotin
Contributor

I fixed the bug that was described by @danimo above (changing proxy settings frequently) with commit f841450. Ignore files were not read correctly.

It should be quickly described why that commit fixed the problem: the ignore file list was not read properly by mirall before. That made mirall not properly ignore changes to the database file, which in turn caused the folder to be rescheduled constantly. With commit d0d3626 I also made the database files always ignored, to avoid that.

I think that could easily have caused this problem. I consider this problem fixed, but let's leave it open and have people retest with 1.4.2.
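The fix described above hinges on the client always ignoring its own database files, so that its writes during a sync do not look like user changes and reschedule the folder forever. A rough Python sketch of that idea — the pattern names and helper functions are illustrative, not mirall's actual exclude handling:

```python
import fnmatch

# Patterns the client must always ignore regardless of the user's exclude
# list: its own journal/database files. These names are illustrative, not
# the exact files the real client writes.
BUILTIN_IGNORES = ["*.csync_journal.db*", "*.db"]

def load_patterns(lines):
    """Parse an exclude list, skipping blanks and comments. The 1.4.1 bug
    was essentially here: the list was not read correctly, so nothing was
    actually excluded."""
    patterns = []
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            patterns.append(stripped)
    return patterns

def is_ignored(name, user_patterns):
    """True if a change to `name` should NOT reschedule the folder."""
    for pattern in BUILTIN_IGNORES + user_patterns:
        if fnmatch.fnmatch(name, pattern):
            return True
    return False
```

With the built-in patterns in place, a broken or empty user exclude list can no longer cause the client to chase its own journal writes.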

@danimo
Contributor

danimo commented Oct 15, 2013

Fixed in 1.4.2

@danimo danimo closed this as completed Oct 15, 2013
@oedfors

oedfors commented Oct 17, 2013

Thanks for fixing the problem!

When is 1.4.2 expected?

@JonnyBot

I appreciate this is closed. But FWIW:

We have rolled out a test of OC for about 10 users -- we would like to have it serve 50+.

Server: Ubuntu 12.04.3 LTS / PHP 5.3.10-1ubuntu3.8 with Suhosin-Patch (cli) / ownCloud 6.0 beta
So far Win client 1.4.2 seems very stable, uses no CPU

Mac client 1.4.2 on OS X 10.8.2 used a tremendous amount of CPU with no break for hours until I killed and restarted it. It bounced between 90% and 30% every second in top, but mostly it stayed between 70% and 40%.

Our test sample has about 400 files in 30 directories totaling less than 1 GB.

I surmised that it was syncing non-stop 100% of the time for those hours.
Pause / Resume did not work when it was in this state.

This was from the first use (which I need to test thoroughly before rolling out to users).

OF NOTE:
There were and are many (20+) "invalid character" notices in the Detailed Sync Protocol view.
I do think many of those documents (PDFs) have Chinese characters. Not sure if this is an issue.

But I am not sure if there is some first-time-sync use case with these errors that causes constant syncing.

Restarted with --logwindow but it's too late. Everything is fine now.

@Chluz

Chluz commented Dec 20, 2013

Hi, I'm running OC 6.0.0a with mirall 1.5 and I'm seeing 25% CPU usage every 300 seconds when csync runs. The problem is that the csync message log shows "CSync run took 321764 Milliseconds", with no errors that I could see. Doing this every 300 seconds means my laptop is running quite hot. This is with 40,000 files, 40 GB.

Is this the expected behaviour? The forums seem to say that the csync walking procedure (csync opendir, _csync_merge_algorithm_visitor, etc.) should take around a second. This is not the case for me, but I do have more files.

@srfreeman

@Chluz Actually, the client checks your local ownCloud (sync) folder every 30 seconds. You are seeing it every 300 seconds simply because it takes 300 seconds to scan the 40,000 files. The speed of your drive could be an issue; however, attempting to keep 40,000 files in sync (or any operation involving continual access to 40,000 files) certainly gives your laptop a reason to run hot in any case.

Have you considered reducing the number of files (certainly there is no chance of 40,000 files changing in any short interval of time) or increasing the scan interval of the client?

@dragotin
Contributor

We are considering obsoleting the every-5-minutes check, but there is still some work to do. Still, I wonder why the 40,000-file check takes so long.

@Chluz

Chluz commented Dec 20, 2013

Hi srfreeman, and thanks for your answer. I do see a successful sync every 30 seconds, but these only take one or two seconds to complete. Every 300 seconds, though, it looks like csync travels through a few (if not all) directories, doing things that I don't really understand.
As dragotin is saying, I'm wondering if the 5-minute check is useful for anything. I will try to capture a full log for you.

@srfreeman

@Chluz Your perceived speed issue may be caused by any number of things, but, looking at just the first part of your log; How often would you think the content of the image and music files contained in the third or fourth copy of Dad's phone backup will change? If you surmise correctly that the answer is 'never'; Why would you subject these files to a process that checks them for changes, whether it be every thirty seconds or even, every three hours?

Another question would be; Are you running the Dropbox client on this same device?

@Chluz

Chluz commented Dec 26, 2013

@srfreeman, sorry for the late response; I was without internet.
The Android backup files from my dad indeed don't need to be synced; they just happened to be in the Dropbox sync folder when I transferred it over to ownCloud. From what I can see in the logs nothing is going wrong; it's only that each request takes a few milliseconds, and times 40,000 that adds up fast.
The Dropbox client was on the same machine syncing the same files. I think I should just change the remote sync check time to an hour, remove as many files as I can from the sync, and wait for the ownCloud team to do away with the slower, CPU-intensive sync completely.

@srfreeman

@Chluz the documentation for the ownCloud client provides a clear warning: "Syncing the same directory with ownCloud and other sync software such as Unison, rsync, Microsoft Windows Offline Folders or cloud services such as DropBox or Microsoft SkyDrive is not supported and should not be attempted. In the worst case, doing so can result in data loss."

I would imagine that once you have rectified the 'two client access' issue and reduced the number of files (from looking further in your log, very few files could benefit from the scrutiny and two way synchronization provided by the ownCloud client), you will see little need to adjust the interval.

For simple copying (one way synchronization, if you will) of files from the ownCloud server, I would recommend that you take a look at the many WebDAV clients available. This could provide you with a low resource intensive way of providing off site backup of files.

Of course, a decent backup strategy for the ownCloud server could make very little additional copying necessary.

From our tests and usage by hundreds of users (US, nationwide, through three interconnected data centers), the actual synchronization file load through ownCloud's desktop clients is so low that any question of software efficiency is a moot point.

@Chluz

Chluz commented Dec 27, 2013

Hi @srfreeman, just to clarify: Dropbox and ownCloud have the same files, but are running from two different folders (I copied the Dropbox folder content to the ownCloud folder).

I originally used webdav to transfer some content over, but you're right in saying I should clean up the sync folders. I will do that asap, thanks for your help.


@ser72

ser72 commented Feb 3, 2014

@dragotin: What is this "every 5 minute check" you speak of? Another user is seeing CPU spikes like that every 5 minutes with the 1.5.0 client on Mac OS X 10.8.5.

Is this check still in the 1.5.0 code? What is it checking? Will it be obsoleted? And if so, what client version will obsolete the check?

@ser72

ser72 commented Feb 5, 2014

@dragotin, any comments on the 5-minute check? It turns out we are seeing the high CPU every 5 minutes. In addition, when the client window is up (i.e. opened from the Settings menu), we also see 17-18% CPU...???

@dragotin
Contributor

dragotin commented Feb 6, 2014

What we do until 1.5.0 is: we run a full sync every five minutes that compares both the remote and the local side. This is what users may notice as high CPU even though nothing has changed. Normally, we trigger sync runs when the local file system watcher notifies us of a change. Unfortunately the file system watchers are not 100% reliable on any of our platforms. In rare cases they "lose" events, which would result in a change that is not synced for the user. We want to avoid that, and as a compromise we decided to do an extra sync every five minutes.

Now that we face the feedback that it's annoying, we were already discussing skipping that five-minute sync for the 1.5.1 release, but we weren't sure. @ser72 what do you think? @MTRichards?

@Chluz

Chluz commented Feb 6, 2014

@dragotin I'm hoping this sync can be avoided :)
From the previous issue I was having with this, a sync of 40,000 files took around 5 minutes. Now I have reduced it to a few thousand files and the sync takes 30 seconds. It is still not negligible.

@moscicki
Contributor

moscicki commented Feb 6, 2014

To add to this thread:

On some filesystems inotify does not work (e.g. AFS), so I would be careful with this optimization. Also, inotify events may be lost on a busy filesystem if you are not fast enough to grab them from the system queue. So unless you do something really smart, you'd have to stay on the safe side...

Here are limitations from box.com that someone recently pointed me to:

https://support.box.com/hc/en-us/articles/200521078-Box-Sync-3-x-Behavior-Limitations-and-Recommendations

3.1 Do not sync a large number of files and folders
Syncing a large number of files and folders can degrade your computer’s performance. For this reason, we do not allow users to exceed the limits below:

Optimum Performance: 10,000 files & folders maximum
Technical Limit: 40,000 files & folders maximum
All previous versions: No more than 10,000 files & folders
kuba
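The "lost events" failure mode kuba mentions can be modeled with a toy bounded queue: once the kernel-side queue is full, further events are dropped (on Linux the kernel signals this with IN_Q_OVERFLOW), which is why a periodic full walk remains a safety net. A small illustrative Python model, not real inotify code:

```python
from collections import deque

class BoundedEventQueue:
    """Toy model of the kernel's inotify event queue: events beyond the
    limit are dropped, so a consumer that drains the queue too slowly
    misses changes. This is the case the periodic full sync guards
    against."""

    def __init__(self, max_queued_events):
        self._q = deque()
        self._max = max_queued_events
        self.overflowed = False

    def push(self, event):
        if len(self._q) >= self._max:
            self.overflowed = True   # the kernel would emit IN_Q_OVERFLOW
            return False             # event is lost
        self._q.append(event)
        return True

    def pop_all(self):
        """Drain the queue, as the watcher does on each wakeup."""
        events, self._q = list(self._q), deque()
        return events
```

A watcher that sees the overflow flag set cannot know which events were lost, so its only safe recovery is a full rescan.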


@ser72

ser72 commented Mar 3, 2014

It is reported that even with the 1.5.2 client, the CPU usage of the Mac client is high.

See data on S3 at support/HighMacCPU

@danimo

@ser72 ser72 reopened this Mar 3, 2014
@dragotin
Contributor

dragotin commented Mar 7, 2014

@ser72 I checked the logfiles you provided again. This is the crucial part:

02-28 13:27:29:440 <===================================== sync finished for  "ownCloud"
02-28 13:27:29:641 XX slotScheduleFolderSync: folderQueue size:  0
02-28 13:27:35:835 !!! Mirall::CheckQuotaJob created for  QUrl( "https://drive.domain.tld" )  querying "/"
02-28 13:27:38:538 * Polling "ownCloud" for changes. (time since last sync: 62430 s)
02-28 13:27:38:539 ** Force Sync now, state is  "Success"
02-28 13:27:38:539 Schedule folder  "ownCloud"  to sync!
02-28 13:27:40:539 XX slotScheduleFolderSync: folderQueue size:  1
02-28 13:27:40:564 Folder in overallStatus Message:  Mirall::Folder(0x111c1c740)  with name  "ownCloud"
02-28 13:27:40:564 Sync state changed for folder  "ownCloud" :  "SyncPrepare"
02-28 13:27:40:564 SocketApi:  Sync state changed
02-28 13:27:40:564 SocketApi:  Broadcasting to 0 listeners:  "UPDATE_VIEW"
02-28 13:27:40:565 *** Start syncing
02-28 13:27:40:565   ==> returning exclude file path:  "/Applications/owncloud.app/Contents/Resources/sync-exclude.lst"

The sync is forced again way too early; I will investigate that. It looks buggy.

@ser72

ser72 commented Mar 7, 2014

@dragotin Thanks.

@dragotin
Contributor

dragotin commented Mar 7, 2014

The value of 62430 seconds for the time since the last sync is strange, and wrong, and the reason why the next sync is forced right away.

@ser72 could you ask if the user has custom values for remotePollInterval and/or forceSyncInterval in the config file? Thx.

@guruz
Contributor

guruz commented Mar 7, 2014

In @ser72's logfile, all the "time since last sync" values are monotonically increasing, although 3 syncs happen.
The timer _timeSinceLastSync is reset when a sync ends, but for some reason in this logfile that doesn't seem to happen.
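The behaviour described here — a "time since last sync" that only grows — would make the force-sync check fire on every poll. A minimal Python model of that scheduling logic (class and method names are hypothetical, not the mirall source):

```python
import time

class SyncScheduler:
    """Sketch of the force-sync logic under discussion: on every poll,
    force a sync if more than force_sync_interval seconds have passed
    since the last completed sync. The bug pattern in the log is
    consistent with the elapsed timer never being reset when a sync
    finishes, so "time since last sync" grows monotonically (62430 s)
    and every poll forces another sync."""

    def __init__(self, force_sync_interval, clock=time.monotonic):
        self._interval = force_sync_interval
        self._clock = clock
        self._last_sync = clock()

    def slot_sync_finished(self):
        # The crucial reset: without this line the scheduler behaves
        # exactly like the buggy logfile.
        self._last_sync = self._clock()

    def should_force_sync(self):
        return self._clock() - self._last_sync > self._interval
```

Injecting the clock makes the reset-vs-no-reset behaviour easy to verify with a fake timer.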

@ser72

ser72 commented Mar 7, 2014

Requested info from the user

@dragotin
Contributor

dragotin commented Mar 7, 2014

@guruz yes, the reset of _timeSinceLastSync does not seem to happen, yet the logging shows that the code goes exactly through the function that resets it. Is it possible that two clients are running at the same time, interfering with each other? I thought somebody reported that for Mac, but I'm not sure...

@guruz
Contributor

guruz commented Mar 10, 2014

@ser72 If the user restarts the client, does the issue happen again?
Did he/she do anything specific to trigger it (e.g. deleting/adding a sync folder in the settings)?

@ser72

ser72 commented Mar 10, 2014

@jcfischer Does this happen when you restart the client?

I believe nothing special occurs to trigger this. If my notes are correct, it occurred after a fresh install of the client. @jcfischer is that correct?

@guruz
Contributor

guruz commented Mar 11, 2014

Also, does this happen with 1.5.3? There's a small change in there which could influence this issue.

@jcfischer

Have not tried with 1.5.3 yet; busy this week with workshops. I will have time on Thursday.

cheers
jc

SWITCH
Jens-Christian Fischer, Peta Solutions


@ser72

ser72 commented Mar 25, 2014

@jcfischer Have you had the chance to test 1.5.3 client for the high CPU yet?

@ser72

ser72 commented Mar 25, 2014

@guruz

1.5.3 has behaved very well today…

But I have seen phases of near constant high CPU load

@dragotin
Contributor

@ser72 that is probably the inefficient code we still have in the update phase; that will be addressed in the upcoming 1.6.0 release.

@ser72

ser72 commented Mar 25, 2014

@dragotin That's what we thought as well. We are willing to wait for 1.6 and see the behavior there.

Just wanted to finish off the last note from @guruz, so it isn't in our ballpark.

@francesquini

I am still experiencing these problems with version 1.6.0 on Ubuntu, server version 6.0.3.
After 24 h, it is still using 100% CPU. According to "perf top", 30-40% of the CPU on the client side is being used by libsqlite3.so.0.8.6, by the thread named "CSync_Neon_Thre". The CPU usage on the server is negligible and there is practically no network traffic.

However, I do have a large number of files:

$ find . -type f | wc -l
113653
$ du -h |tail -n 1
21G .

@dragotin
Contributor

dragotin commented Jun 4, 2014

Can you

  • open the settings window
  • press F12 to make the log window appear

and check what it actually does, i.e. paste some logging?

Did it finish the first sync run to download/upload everything?

@francesquini

The problem happened during the first sync. I finally discovered the cause: the connection was unstable (over wifi with poor reception). Each time the connection was lost, the whole sync process started over! The client seems to discard everything it has already done when the connection is re-established. I fixed the connection issues and, after 10-12 hours, it finished.

@guruz
Contributor

guruz commented Jun 5, 2014

1.6 has a lot of improvements in this area. I'm closing this as it has been fixed (if you have a stable internet connection).

@guruz guruz closed this as completed Jun 5, 2014