Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.Sign up
Slow searches after upgrading from 6.0.6 to 7.0.4, Server CPU load peaks at 100% while searching #13237
Before start, as this is my first post, I would like to thank all the developers for the awesome work with Owncloud, it's a wonderful software.
I came across a weird issue after the migration from an stable and rock solid deployment based on OC 6.0.6 with 300 users and LDAP integration. The Search app in the web, which had always been blazing fast (I'm not using lucent search so far) is now sluggish (I.e: one search takes more than 7 seconds to draw results in a user account with less than 300 files on it), to the point that with those users with more files in their accounts the issue gets worse and no results are displayed and which is worst, during this search task, Apache processes takes up the whole CPU, up to 100% thus degrading the performance of the whole installation.
I did lot of testing on my own before posting this issue here: For example, the issue only appears with existent users migrated from 6.x version then, if I create a new user within OC 7.0.4 and copy over the same files from another existent user with lot of files, the search works as expected (within a second) and the CPU doesn't peak. Therefore, I assume that may have something to do with the upgrade process and something that is not well updated somewhere in the DB, causing the issue.
There is nothing relevant in the logs and the upgrade process completed successfully. I have also checked that downgrading to 6.0.6 solves the issue.
I would appreciate it if you could have a look at this issue and let me know of any possible solution for this.
If you need further information please let me know.
Thank you in advance.
Steps to reproduce
1.Login to OC with a user migrated from OC 6 and enter any pattern to search in the search box
As in OC 6, the search should complete within a second or two and should not affect severely the performance on the server, with the CPU at 100%.
The search, specially with the users with most files, takes too long to complete or never ends and while launching it the CPU on the server peaks due to Apache process.
Operating system: Ubuntu 14.04.1 LTS
Web server: Apache/2.4.7 (Ubuntu) OpenSSL/1.0.1f
Database: Mysql 5.5.40-0ubuntu0.14.04.1 (Ubuntu)
PHP version: 5.5.9-1ubuntu4.5
ownCloud version: OC 7.0.4
Updated from an older ownCloud or fresh install:
List of activated apps:
The content of config/config.php:
<?php $CONFIG = array ( 'instanceid' => 'XXXX-OMITTED-XXXXX, 'theme' => 'custom', 'passwordsalt' => 'XXXX-OMITTED-XXXXX, 'datadirectory' => '/owncloud/data', 'dbtype' => 'mysql', 'version' => '220.127.116.11', 'dbname' => 'XXXX-OMITTED-XXXXX, 'dbhost' => 'localhost', 'dbtableprefix' => 'oc_', 'dbuser' => '',XXXX-OMITTED-XXXXX 'dbpassword' => '', 'installed' => true, 'ldapIgnoreNamingRules' => false, 'mail_smtpmode' => 'smtp', 'mail_smtphost' => 'XXXX-OMITTED-XXXXX', 'mail_smtpauth' => false, 'loglevel' => '1', 'maintenance' => false, 'enable_previews' => true, 'preview_max_x' => NULL, 'preview_max_y' => NULL, 'enable_avatars' => true, 'knowledgebaseenabled' => false, 'trusted_domains' => array ( 0 => 'XXXX-OMITTED-XXXXX', ), 'secret' => 'XXXXX-OMITTED-XXXXXX' 'share_folder' => '/Shared',
@icewind1991 Sounds like a FS problem – did you happen to notice anything like that?
FS? Don't think so, because the newly created user with the very same files copied over from an already existing user who suffers from this issue, does not face this issue, and the filesystem were the files from either user are stored is the same. In short, the issue seems to affect only old users from the 6.0.6, not the new ones, no matter the amount of files or storage quota usage.
Thank you for your help.
If you want to do us a VERY big favour you could profile the request and so we can see right away where the problem possibly lies.
For that I'd recommend:
Please notice that blackfire is proprietary software but created from the guys behind Symfony so I consider it trustworthy enough ;-)
Such a trace would help us a lot since we see directly where most of the time was spent and can analyze it without needing to guess :)
Thank you for help, I will follow your directions as soon as possible and provide you with the required info.
Have a nice day!
I'm back again I've performed all the requiered profiling on my side and I'm providing you with screen captures of the results:
In order to test all the possible scenarios described for the issue I tested it with two users with issues (migrated from OC 6.0.6), one with fewer files where the search completes but takes lot of time to do (almost 17 seconds) and the other one, with much more files than the previous where the search never completes. The searched pattern is the same throughout all the testing.
To compare it, I also have a newly created user to whom I copied over the same files as the user with the large file list (exactly the one used below that never completes). In this case the search completes in just 137 ms.
Screen captures for all the tests so far:
User 1: User migrated from 6.0.6, very few files, search task ended after almost 17 seconds:
User 2: User migrated from 6.0.6, large file list, search task never ends:
User 3: Newly created user, Same files as user2. Search behaves as expected:
Comparision between User 1 (16 seconds) and User 3 (137 ms):
comparision between user 2 (never completes) and user 3:
I hope this will help you finding out the issue. However, if you need further details or any further testing on my side, please let me know.
Thank you in advance.
Do you happen to have an external SMB mount for some of the users or on the instance?
The dependency check for external SMB storage seems to take horribly long as can be seen at https://cloud.githubusercontent.com/assets/10474615/5692857/72d80e3a-9906-11e4-9a18-7a655fd9f504.png - 52.11% of 16.6s for
To test whether that is the actual problem you could modify
@DeepDiver1975 Pretty interesting case with a lot of debug informations. Please schedule the array memory cache replacement (#13211). I'd say that certainly warrants 8.0 if we take a look at the horrible performance here.
As suggested, I disabled external files app and the user whose search took 17 seconds improves a lot, I'll send you new profile screenshots later on. However, the user with no response at all continues with the issue ( no response & CPU load). I don't know whether has anything to do with the same issue, but further testing this problematic user, I found out that he has also issues while accessing picture gallery (takes also forever) or Documents (same here).
We do use SMB mounts for some users, in the meantime and while troubleshooting this issue, I think it's better to leave it off rather than taking the risk of searches badly affecting the performance of the whole instance.
Thank you very much for your help.
I performed further testing on mi side, I did the profiling once again with the files_external app disabled:
The search on the user that took 17 seconds now takes less than 6 seconds, although I still think is far more than expected. Here are the captures for this search:
And compared with the same search with same user before disabling files_external:
On the other hand, while testing the user with large amount of files, it still does not complete but this time 43 seconds later it ends with a PHP error:
[Sun Jan 11 10:24:08.768699 2015] [:error] [pid 9219] [client 172.18.X.X:50396] PHP Fatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 56061 bytes) in /var/www/owncloud/apps/files_sharing/lib/cache.php on line 164, referer: https://SERVER/index.php/apps/files/
Hope this would be helpful for you. If you need further testing please let me know.
Thank you in advance.
Actually I have already done it, but for unknown reason blackfire does not gather any info of it and in the slot it reads: Unknown Profile context.
I'll keep on trying.
Any ideas on why blackfire does not record it correctly?
For the user where searching now works I see the following performance enhancements possible:
I wonder whether the difference between the user where searching works fast and the user where it is slow is that one user has received some shares? – I see a lot of sharing related code paths called which in sum need again more than 80% (!) of the processing time. Can you confirm that the first user has active shares?
Seems like their module does not profile completely when the script times out before (or requires too much memory etc…). So nothing we can change.
About how many files are we talking here that the user has? - Hundreds, thousands, millions?
The following SQL commands should help you for that:
For the files list I don't have a handy SQL query available since I'm bad with Joins etc ;-) – I'd go the with the following command in the users
Would be great if you could complete the following list with the missing XXX:
In case of user 3 are the search results actually returning sane results?
I come back with the info, surprisingly the figures does not seem too high
Yesterday with user 2 I also tried to truncate oc_fscache table (delete
Finally, yes, the results of User 3, are correct, consistent and are quick
Hope his helps.
On Sun, Jan 11, 2015 at 1:00 PM, Lukas Reschke firstname.lastname@example.org
Ángel Barrio Martínez
I have been further testing with the new user created (user 3) with the same amount of files uploaded and then I share with him from another user a total of 20 folders and I can see that, as suggested the search time is much worse than when this user was first created, so as you suggested the issue may lay within the shared files list.
I would single out the user2 (didn't complete with memory error php) because I think that this user has another issue that is affecting the behaviour causing the memory error, not only while searching but also while browsing gallery or (this user happens to be a member of admin gropu) also in the user management web page as you can see down here:
Regarding this user in particular, I did further reserching within DB tables and this user has lots of references in oc_properties table like this one:
I can't see any references like this in other users, is something that is is use in OC 7.0.4 or could instead be safely truncated?
Take into account that this is an installation that started with OC 4.5, then upgraded to 5.0 and 6.0.X until get to 7.0.4 now, so could it be that something "legacy" in the DB is making noise to the point of causing this issue?
I will continue with the research on my own, however, if you need anything else on mi side please let me know.
Thank you very much for the superb support.
Edit: added triple backquotes for log
And that all for free – isn't a open-source community great :-)
Thanks for your patience and help in debugging this. Usually people are much less helpful and restrict themselves to posting things like "That crap doesn't work" ;-)
From what I saw in the code searching through folders as recipient is slow, because it is running the search query on every received shared folder separately. I'm not aware of any changes regarding that part. I believe 6.0.6 had that code path too.
Also I saw you talking abous
Absolutely Lukas, I think owncloud is not only one of the best softwares I've ever tried in my 14 years of IT experience, but also one the best supported so far, much better than the majority of propietary services for sure :-). Awesome, please carry on.
Regarding the unused tables on the DB, I supposed they were no longer in use, but now I will remove them at least for the most problematic user just in order to see whether any of this could affect this one. Actually, the most problematic user is my own user, I think it was the very first one user of the instance back in version 4.5. :-) Can I safely delete the whole oc_fscache? Do you have the names of tables of a freshly (not migrated) installation of OC 7.0.4 with the apps I currently have enabled so that I can compare?
@PVince81 I assure you that with 6.0.6 everything was working fine, nevertheless I'm quite convinced that this could be a legacy issue or something inherited from even past versions.
For the moment, with the files_external disabled, I am not aware of any other user, except mine, with the memory timeout issue, in every single user I tested the searches work, however, as I told you, they take much longer than in version 6.0.6, they should be in the ms range an not in the seconds range as they are.
Thank you Vince & Lukas.
referenced this issue
Jan 12, 2015
I did some more research on DB comparison and for that I installed from scratch a 7.0.4 in another demo server and enable the same apps on both instances, except from Calendar, the comparison between both instances: the old one and the newly deployed one shows the following differences:
As you can see on the picture of the left, apart from the already commented oc_fscache, there are other tables that exists in the upgraded install but does not exist in a newly deployed instance. Of course maybe any of these could be generated with the use, but there are quite suspicious in my humble opinion:
First, oc_oc_filecache_6dxxxxxxx as far as I know is a temporary table used during migration process, and it should have been deleted after completion, which in this case did not happen. I think is safe to drop this table completely ,so I will do so. I remind you that during the upgrade process from 6.0.6 no errors were triggered. The original oc_filecache table has a total of 1.5 million entries, I think is fine because every single file stored on the instance has an entry on this table.
Then, oc_permissions seems to match shared files with the numeric value of the permissions associated with them, but I am wondering whether this is of use as for 7.0.4 as in oc_share is also the same info.
Last, the table oc_queuedtask has zero rows so it doesn't seem to be in use at all.
I hope this information could shed light on this issue or at least be helpful to improve the installation process of Owncloud.
Thank you in advance.