
Upload performance OC 8.2.1 (50 mysql queries per file) #20967

Closed · tflidd opened this issue Dec 6, 2015 · 34 comments

@tflidd
Contributor

tflidd commented Dec 6, 2015

Upload of many small files is very slow. I tested on a Raspberry Pi 2, which is known to have bad I/O performance. During uploads, the MySQL operations create a significant I/O wait on the system. It was worse with ext4 journaling, so I disabled it (it's a test system).

Uploading a folder with 1180 files (18 MB) via the oc-client takes about 20-30 minutes (pictures and a lot of text files; take any PHP code of a project), and the number of SQL queries is extremely high (>50 queries per uploaded file!):

| command | number |
| --- | --- |
| select | 49,681 |
| update | 11,523 |
| insert select | 2,487 |
| set option | 1,526 |
| delete | 511 |
| insert | 423 |
| begin | 17 |
| commit | 17 |

A smaller number of queries can be obtained by uploading the files into an external folder via FTP (takes a few seconds) and then including that folder in ownCloud. The scanning process takes a few minutes (it's hard to tell, maybe 5?), and the query counts are as follows (some queries are due to handling the web interface):

| command | number |
| --- | --- |
| select | 11,054 |
| update | 3,040 |
| insert select | 3,004 |
| set option | 425 |
| delete | 138 |

I remember you worked on the deletion process for OC 8.0; that scales much better: 3000 files take 3500 update and 1400 select queries. Perhaps a similar improvement is possible for upload operations ;-)

System: RPi2, Debian Jessie
Webserver: nginx/php-fpm
Database: mariadb
ownCloud 8.2.1
Apps enabled:

  • activity: 2.1.3
  • calendar: 0.8.1
  • contacts: 0.5.0.0
  • files: 1.2.0
  • files_external: 0.3.0
  • files_pdfviewer: 0.7
  • files_sharing: 0.7.0
  • files_texteditor: 2.0
  • files_trashbin: 0.7.0
  • files_versions: 1.1.0
  • files_videoviewer: 0.1.3
  • firstrunwizard: 1.1
  • galleryplus: 14.3.0
  • provisioning_api: 0.3.0
  • templateeditor: 0.1

I haven't tested with a pure webdav-client yet.

@melroy89

melroy89 commented Dec 7, 2015

We indeed have a big MySQL issue here... I can confirm this.

@DeepDiver1975
Member

@icewind1991 @LukasReschke some more profiling? 🙊

DeepDiver1975 added this to the 9.0-current milestone Dec 7, 2015
@LukasReschke
Member

Is Redis enabled as the file locking provider? If not: pretty sure it's file locking. 🙊

@tflidd
Contributor Author

tflidd commented Dec 8, 2015

Indeed, I used only APCu. But I can't disable file locking; I tried 'filelocking.enabled' => 'false', but that didn't work.
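One possible reason (an assumption, not verified against the ownCloud code): config.php values are plain PHP, and the quoted string 'false' is truthy, so the option is only really turned off with a boolean. A minimal sketch:

```php
<?php
// config/config.php -- sketch only, assuming the option expects a PHP boolean
$CONFIG = array(
  // 'filelocking.enabled' => 'false',  // a non-empty string is truthy in PHP, so this would NOT disable locking
  'filelocking.enabled' => false,       // boolean false (sketch; disabling locking is generally not recommended)
);
```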

So I tried to enable file locking using a Redis server. I installed it from Debian (jessie), which ships version 2.8.17. I run more or less the standard config, except that I use the socket to connect to Redis. I disabled APCu (https://www.en0ch.se/how-to-configure-redis-cache-in-ubuntu-14-04-with-owncloud/), but the admin page now complains:

> No memory cache has been configured. To enhance your performance please configure a memcache if available. Further information can be found in our documentation.

Redis seems to be in use, as the warning that the database is used for file locking has disappeared. Before, I could see many queries on a file-locking table; those queries have disappeared as well. Unfortunately, the performance got worse:

| command | number |
| --- | --- |
| select | 114 k |
| update | 27,946 |
| set option | 4,465 |
| insert select | 1,313 |
| commit | 133 |
| begin | 133 |

I only used the standard Redis configuration; perhaps I need to tune some parameters to use memory more efficiently. Is this normal in the logs?

[7522] 08 Dec 00:07:05.004 * 10 changes in 300 seconds. Saving...
[7522] 08 Dec 00:07:05.006 * Background saving started by pid 11366
[11366] 08 Dec 00:07:05.484 * DB saved on disk
[11366] 08 Dec 00:07:05.487 * RDB: 0 MB of memory used by copy-on-write
[7522] 08 Dec 00:07:05.508 * Background saving terminated with success
[7522] 08 Dec 00:12:06.090 * 10 changes in 300 seconds. Saving...
[7522] 08 Dec 00:12:06.091 * Background saving started by pid 11883
[11883] 08 Dec 00:12:06.135 * DB saved on disk
[11883] 08 Dec 00:12:06.138 * RDB: 0 MB of memory used by copy-on-write

@LukasReschke
Member

@tflidd Would it be possible for you to install the free blackfire.io tool on your server?

Then just upload the file in your browser, open the developer tools, go to the network console, copy the request as cURL (the one to upload.php), and on the command line type `blackfire curl theRestOfTheCurlCommand`.

After that, sharing the trace (there is an option for public sharing) would help us a lot. (Ah, and technically there is also a 15-day trial of the paid version. You might try to enable that one as well, as it gives more info ;))

@ghost

ghost commented Dec 8, 2015

@tflidd Have you removed the memcache.local from your config? You need both, memcache.locking and memcache.local.
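For reference, a minimal config.php sketch with both keys pointing at Redis over a Unix socket (the socket path is just an example; adjust it to your distribution):

```php
<?php
$CONFIG = array(
  'memcache.local'   => '\\OC\\Memcache\\Redis',
  'memcache.locking' => '\\OC\\Memcache\\Redis',
  'redis' => array(
    'host'    => '/var/run/redis/redis.sock',  // example socket path
    'port'    => 0,
    'timeout' => 0,
  ),
);
```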

@tflidd
Contributor Author

tflidd commented Dec 8, 2015

Thanks for your ideas. I will have to verify my caching setup against the docs; it's possible that I missed something. If it doesn't improve, I will try to install blackfire.io on my RPi2.

@melroy89

melroy89 commented Dec 8, 2015

Why are we talking about Redis when the problem started with the MySQL database? Besides, MySQL is the default.

@ghost

ghost commented Dec 8, 2015

@Danger89 oC 8.2+ uses the database for file locking by default (previously this wasn't the case and Redis was only used when configured), which might cause more SQL queries than before. If oC is configured to use Redis for file locking, those queries are not done in MySQL.

@melroy89

melroy89 commented Dec 8, 2015

Well, I was using APCu with the same drawback: a large number of small files takes forever. PS: I was using Debian 8, which ships release 8.1.4, and previously I ran the Apache webserver.

I have now set up Redis as the memcache with Transactional File Locking enabled. I also switched to nginx and php-fpm. Let me check whether I see any improvements.

@tflidd
Contributor Author

tflidd commented Dec 12, 2015

OK, it took a bit longer. The problem was that blackfire.io does not work on ARM. So I installed OC 8.2.1 on a virtual machine (Debian 8 system, apache+mod_php instead of nginx+php-fpm). There are still a lot of SQL queries.

Now I ran blackfire.io as @LukasReschke described (hopefully I understood correctly):

@tflidd
Contributor Author

tflidd commented Dec 12, 2015

Small update: I uploaded directly via WebDAV and tried to trace it:
`blackfire curl -T /uploadfile http://oc-user:password@mydomain/remote.php/webdav/test/test/newfile`
The folders existed before, but this is a new file (no overwrite). The corresponding profile:
https://blackfire.io/profiles/64844ad7-c01b-45ac-a3b8-7fce9e547477/graph

@tflidd
Contributor Author

tflidd commented Dec 13, 2015

The OC\Files\Cache\Updater::update() function is responsible for a lot of queries (it updates the mtime and size of all parent folders). I commented out all the updates inside this function to get a baseline for the pure file transfer: all 1180 files (18 MB) are transferred in less than 90 s (before: >20 min).
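This is not the actual ownCloud code, just a rough sketch of why a single small upload fans out into several SELECT/UPDATE round trips: every ancestor folder is looked up, its size re-summed, and its mtime/etag/size rewritten. Table and column names are taken from the queries quoted in this thread; path_hash = md5(path), the PDO usage, and the uniqid() etag are assumptions for the illustration.

```php
<?php
// Sketch only: per uploaded file, one lookup + size recalculation + UPDATE
// for every ancestor folder in the oc_filecache table.
function propagateChange(PDO $db, int $storageId, string $path, int $mtime): void {
    $select = $db->prepare(
        'SELECT `fileid` FROM `oc_filecache` WHERE `storage` = ? AND `path_hash` = ?');
    $sum = $db->prepare(
        'SELECT COALESCE(SUM(`size`), 0) FROM `oc_filecache` WHERE `parent` = ?');
    $update = $db->prepare(
        'UPDATE `oc_filecache` SET `mtime` = ?, `etag` = ?, `size` = ? WHERE `fileid` = ?');

    // Walk up the ancestor chain:
    // files/test/upl/newfile -> files/test/upl -> files/test -> files -> ''
    while ($path !== '') {
        $path = (dirname($path) === '.') ? '' : dirname($path);
        $select->execute([$storageId, md5($path)]);   // path_hash assumed to be md5(path)
        $fileId = $select->fetchColumn();
        if ($fileId === false) {
            break;                                    // ancestor not found in the cache
        }
        $sum->execute([(int)$fileId]);                // re-sum the folder's children
        $update->execute([$mtime, uniqid(), (int)$sum->fetchColumn(), (int)$fileId]);
    }
}
```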

There were some changes to the update() function for OC 9.0, so I set up the current master (74de12c) and created new profiles:
for the curl WebDAV upload: https://blackfire.io/profiles/69ec773e-9a8e-485f-b4c6-0f6e4831fa83/graph
for update.php: https://blackfire.io/profiles/32e5d8ac-38f1-4488-92e1-43ea4ac1e2a9/graph
which look better.

So I uploaded the 1180 files again; it takes a bit less than 20 min, but that is still much longer than the 90 s.

@LukasReschke
Member

Helpful traces. Thanks a lot, @tflidd 🚀 😄 👍

@LukasReschke
Member

@icewind1991 Care to take a look? THX

@melroy89

Top 3 bottleneck update-queries:

~7x     327 ms  UPDATE `oc_filecache` SET `mtime` = ?, `etag`=? WHERE (`mtime` <> ? OR `etag` <> ? ) AND `fileid` = ?
3x  141 ms  UPDATE `oc_filecache` SET `size`=? WHERE (`size` <> ? ) AND `fileid` = ?
~2x     85.3 ms     UPDATE `oc_filecache` SET `mtime` = ?, `storage_mtime`=? WHERE (`mtime` <> ? OR `storage_mtime` <> ? ) AND `fileid` = ? 

@icewind1991
Contributor

9.0 should already have fewer `UPDATE oc_filecache SET mtime = ?, etag = ?` queries.

I have an idea for reducing the number of `UPDATE oc_filecache SET size = ?` queries, which I'll try to implement when I have the time.

@tflidd
Contributor Author

tflidd commented Dec 15, 2015

Are there plans for a bulk upload?
It is not efficient to recalculate the folder size several times per second just to upload the next tiny file. And the resulting upload time is too long.

Taking my 1200 files with a total size of 18 MB: it would take less than 1 minute to upload them directly via FTP, while currently it takes more than 20 minutes. For OC 9 there are some improvements, but it doesn't help much to shorten the time from, say, 22 minutes to 18 minutes. We should think about an upload process where this time drops to less than 5 minutes.

The client could use something like the zip streamer to combine all the files, so that all the filecache commands run only once on the server. Another idea is to just create the filecache entries for the new files and folders and move the size calculation to a cronjob (or add a command to the oc-client that can trigger such an update).
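A sketch of the second idea (purely illustrative, not an existing ownCloud API): insert all new cache rows in one transaction and recalculate each affected folder size once afterwards, instead of after every single file. Column names come from the queries quoted in this thread; the function name and the $files structure are made up for the example.

```php
<?php
// Hypothetical "batch then recalculate once" helper.
// $files is an array of ['path', 'parent', 'name', 'size', 'mtime'] entries.
function bulkInsert(PDO $db, int $storageId, array $files): void {
    $insert = $db->prepare(
        'INSERT INTO `oc_filecache` ' .
        '(`storage`, `path`, `path_hash`, `parent`, `name`, `size`, `mtime`) ' .
        'VALUES (?, ?, ?, ?, ?, ?, ?)');
    $sum = $db->prepare(
        'SELECT COALESCE(SUM(`size`), 0) FROM `oc_filecache` WHERE `parent` = ?');
    $resize = $db->prepare(
        'UPDATE `oc_filecache` SET `size` = ? WHERE `fileid` = ?');

    $db->beginTransaction();
    $dirtyFolders = [];
    foreach ($files as $file) {
        $insert->execute([
            $storageId, $file['path'], md5($file['path']),
            $file['parent'], $file['name'], $file['size'], $file['mtime'],
        ]);
        $dirtyFolders[$file['parent']] = true;   // remember folders whose size changed
    }
    // One size recalculation per affected folder, not one per uploaded file.
    // (Grandparent folders would still need propagation, e.g. from a cronjob.)
    foreach (array_keys($dirtyFolders) as $folderId) {
        $sum->execute([$folderId]);
        $resize->execute([(int)$sum->fetchColumn(), $folderId]);
    }
    $db->commit();
}
```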

@melroy89

@tflidd good point! Indeed, post-calculations should be done at the end or in a cronjob. The current implementation/protocol is not efficient enough; a redesign may be necessary.

@ghost

ghost commented Jan 26, 2016

Some discussion about short-term mitigations at the database level took place in owncloud/client#331, especially in owncloud/client#331 (comment).

@jospoortvliet

Also related (and perhaps one of the two should be considered a duplicate): #11588
#11588 (comment) for a benchmark
#11588 (comment) - this won't be solved unless we do batching.

@tflidd
Contributor Author

tflidd commented Feb 14, 2016

I checked again on my RPi2 with Debian 8 (mariadb, redis) and upgraded from 8.2.1 to 8.2.2 to 9.0beta1. For better comparison, I used the method from #11588 (comment) (10000 files). Here are the results:

| version | time (min) | select | update | set | insert select |
| --- | --- | --- | --- | --- | --- |
| OC 8.2.1 | 170 | 591 k | 115.5 k | 32.2 k | 10004 |
| OC 8.2.2 | 150 | 593 k | 116.0 k | 32.3 k | 10030 |
| OC 9.0beta1 | 167 | 672 k | 135.9 k | 32.2 k | 10061 |

The data were estimated after 40 min of uploading (50 min for OC 9.0beta1).

@tflidd
Contributor Author

tflidd commented Feb 14, 2016

I tried again on another machine, a vserver (Debian 8 + mod_php + redis + mysql); other stuff is running there, so I couldn't gather statistics on the DB queries. The results are based on 10 minutes of uploading:
OC 8.2.2: 137.2 files/min -> 70 min
OC 9.0beta1: 161.9 files/min -> 62 min (php 7.0.3: 172.4 files/min -> 58 min)
I'm still far away from the 20 min @jospoortvliet achieved (#11588 (comment)), but I don't have an SSD drive.

On the OC 9.0beta1 setup:
I just noticed that the upload usually starts quite fast (first 1-2 minutes) and then slows down:

| t (s) | files/min |
| --- | --- |
| 60 | 163 |
| 120 | 158 |
| 180 | 130 |
| 240 | 136 |
| 300 | 127 |

From the first minutes of the slower upload, I caught this in my slow-query.log (queries >2 s):

# Time: 160214 12:00:16
# User@Host: oc_admin2[oc_admin2] @ localhost []
# Query_time: 2.410065  Lock_time: 0.000046 Rows_sent: 0  Rows_examined: 0
SET timestamp=1455451216;
INSERT INTO `oc_filecache` (`mimepart`,`mimetype`,`mtime`,`size`,`etag`,`storage_mtime`,`permissions`,`parent`,`path_hash`,`path`,`name`,`storage`) SELECT '5','13','1455451214','100','6af6fd6a35059693af9a97b36a082e7a','1455451214','27','3848','9e26a4bad9294d22e14fd048a49649cf','files/test/upl/xaaaaaaagxb','xaaaaaaagxb','1' FROM `oc_filecache` WHERE `storage` = '1' AND `path_hash` = '9e26a4bad9294d22e14fd048a49649cf' HAVING COUNT(*) = 0;
# User@Host: oc_admin2[oc_admin2] @ localhost []
# Query_time: 2.353405  Lock_time: 0.000046 Rows_sent: 0  Rows_examined: 0
SET timestamp=1455451216;
INSERT INTO `oc_filecache` (`mimepart`,`mimetype`,`mtime`,`size`,`etag`,`storage_mtime`,`permissions`,`parent`,`path_hash`,`path`,`name`,`storage`) SELECT '5','13','1455451214','100','10cfdfc49e1193507a10e1abe68932f6','1455451214','27','3848','f9e93e667ed2e02e75216af789734988','files/test/upl/xaaaaaaagxc','xaaaaaaagxc','1' FROM `oc_filecache` WHERE `storage` = '1' AND `path_hash` = 'f9e93e667ed2e02e75216af789734988' HAVING COUNT(*) = 0;
# User@Host: oc_admin2[oc_admin2] @ localhost []
# Query_time: 2.563274  Lock_time: 0.000030 Rows_sent: 0  Rows_examined: 1
SET timestamp=1455451216;
UPDATE `oc_filecache` SET `mtime` = '1455451212', `etag`='56c06c4e4db4f' WHERE (`mtime` <> '1455451212' OR `etag` <> '56c06c4e4db4f' ) AND `fileid` = '1362';
# Time: 160214 12:03:39
# User@Host: oc_admin3[oc_admin3] @ localhost []
# Query_time: 2.211797  Lock_time: 0.000163 Rows_sent: 0  Rows_examined: 1
use oc9;
SET timestamp=1455451419;
UPDATE `oc_filecache` SET `mtime` = '1455451416', `etag` = '56c06d18e93d9', `size`='114332497' WHERE (`mtime` <> '1455451416' OR `etag` <> '56c06d18e93d9' OR `size` <> '114332497' OR `mtime` IS NULL OR `etag` IS NULL OR `size` IS NULL) AND `fileid` = '9';
# User@Host: oc_admin3[oc_admin3] @ localhost []
# Query_time: 2.212009  Lock_time: 0.000095 Rows_sent: 0  Rows_examined: 1
SET timestamp=1455451419;
UPDATE `oc_filecache` SET `mtime` = '1455451416', `etag` = '56c06d18e93c4', `size`='114332497' WHERE (`mtime` <> '1455451416' OR `etag` <> '56c06d18e93c4' OR `size` <> '114332497' OR `mtime` IS NULL OR `etag` IS NULL OR `size` IS NULL) AND `fileid` = '9';
# User@Host: oc_admin3[oc_admin3] @ localhost []
# Query_time: 2.222490  Lock_time: 0.000053 Rows_sent: 0  Rows_examined: 1
SET timestamp=1455451419;
UPDATE `oc_filecache` SET `mtime` = '1455451416', `etag` = '56c06d18f4116', `size`='163' WHERE (`mtime` <> '1455451416' OR `etag` <> '56c06d18f4116' OR `size` <> '163' OR `mtime` IS NULL OR `etag` IS NULL OR `size` IS NULL) AND `fileid` = '8';

Perhaps I messed something up with my configuration.

ghost modified the milestones: 8.2.3-current-maintenance, 9.0-current, 8.2.4-next-maintenance Feb 22, 2016
@tflidd
Contributor Author

tflidd commented Apr 4, 2016

With some MySQL tuning (https://forum.owncloud.org/viewtopic.php?f=31&t=30083#p95636) and these parameters in my.cnf:

innodb_buffer_pool_size = 512M
innodb_buffer_pool_instances = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 32M
innodb_max_dirty_pages_pct = 90

I was able to upload more than 1000 files/minute. The total upload time for all 10000 files was 9min 28s.

This is only a workaround. There is a risk of data loss when the system crashes. More reports on this modification (one user wasn't able to restore his database after a hard reset of his system):
owncloud/client#331 (comment)

@jospoortvliet

Considering the log from Feb 14, it mentions locking: are you sure that Redis takes care of the locking, and is it configured properly?

'memcache.locking' => '\\OC\\Memcache\\Redis',

has to be there.

Just curious!

Now, if Redis does properly take care of file locking, then the problem is in another area. Honestly, I'm amazed you get it under 10 min on a Pi.

While there's surely room for improvement (and #11588 (comment) will already have helped), I think we should keep in mind that the only real solution to this problem is batching.

@tflidd
Contributor Author

tflidd commented Apr 13, 2016

I did use Redis (#20967 (comment)); this is from my config.php:

  'filelocking.enabled' => 'true',
  'memcache.distributed' => '\\OC\\Memcache\\Redis',
  'memcache.local' => '\\OC\\Memcache\\Redis',
  'memcache.locking' => '\\OC\\Memcache\\Redis',
  'redis' => 
  array (
    'host' => '/tmp/redis.sock',
    'port' => 0,
    'timeout' => 0,
  ),

However, the big performance improvements were only possible on the virtual machine; on the Raspberry Pi 2, I didn't exceed 150-180 files/minute. The bottleneck on the Raspberry Pi is the database, which uses up all the I/O operations. I use the SD card, where I can write at ~10 MB/s (vserver: >100 MB/s).

ghost modified the milestones: 9.1-current, 8.2.4-current-maintenance Apr 14, 2016
@jospoortvliet

The fastest local memcache would be APCu; on the Pi it's a bit more complicated, as APCu eats more memory (two caches instead of one). But on the VM, APCu should speed things up a little more. Only for memcache.local!
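A config.php sketch of that split (class names assumed per the ownCloud memcache naming used earlier in this thread): APCu only for the local cache, Redis only for locking.

```php
<?php
$CONFIG = array(
  'memcache.local'   => '\\OC\\Memcache\\APCu',   // fast in-process cache
  'memcache.locking' => '\\OC\\Memcache\\Redis',  // transactional file locking
  'redis' => array(
    'host' => '/tmp/redis.sock',  // socket path as in the config quoted above
    'port' => 0,
  ),
);
```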

@melroy89

melroy89 commented Aug 9, 2016

Still not fixed?

@PVince81
Contributor

@Danger89 is this happening for you in a recent OC version?
Some queries have been optimized in the meantime.

@PVince81
Contributor

PVince81 commented Dec 8, 2016

@mrow4a what do you think ?

@mrow4a
Contributor

mrow4a commented Dec 8, 2016

@PVince81 You mean trying to reduce the number of inserts or optimizing the queries? This is probably yet another big project. I think I could have a look and analyse it as a next project.

However, I am a big fan of what the guys at CERN have with EOS.

@PVince81
Contributor

PVince81 commented Apr 6, 2017

@mrow4a is already working on optimizing the SQL queries for uploads and other file operations for 10.0; some fixes have already made it in. Please retry there.

@lock

lock bot commented Aug 2, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked as resolved and limited conversation to collaborators Aug 2, 2019