Very slow access (copy, ls, etc.) to files in folders with lots of files through WebDAV + proposed fix #8962

Open
Baltix opened this issue Mar 23, 2018 · 10 comments

@Baltix

commented Mar 23, 2018

Accessing (copy, ls, etc.) files in folders with thousands of files through WebDAV is very slow (10-20 seconds or more for a simple ls command):
time ls /home/sketis/docs.mini-maxi.lt/2_Nuotrauku_katalogas/imported/RB00097504_1_Joico-Color-Endure-Conditioner-1000-ml.jpg
/home/sketis/docs.mini-maxi.lt/2_Nuotrauku_katalogas/imported/RB00097504_1_Joico-Color-Endure-Conditioner-1000-ml.jpg
real 0m18.519s
user 0m0.000s
sys 0m0.000s

(/home/sketis/docs.mini-maxi.lt/ is mounted through davfs (davfs2 package); the folder /home/sketis/docs.mini-maxi.lt/2_Nuotrauku_katalogas/imported/ contains ~9000 files of 100-600 KB each.)

I've tried lots of optimizations, including Redis and database buffer tuning, and almost nothing helped. But after 3 days of hard work I've found where the main issue is:

Every operation on any file in a folder with thousands of files through the WebDAV protocol generates thousands of queries (one or two queries for every file in that folder):
SELECT * FROM "oc_properties" WHERE "userid" = $1 AND "propertypath" = $2 AND "propertyname" in ($3)
I noticed this with the "pg_activity" tool. Each query takes about 2-3 milliseconds, but for every operation Nextcloud generates thousands of them :(
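
To make the pattern concrete, here is a minimal sketch of one lookup per file versus a single lookup for the whole folder. It uses an in-memory SQLite table as a stand-in for oc_properties. The propertyvalue column, the helper names and the sample data are assumptions for illustration; this is not Nextcloud's actual code, only the shape of the problem.

```python
import sqlite3


def make_db(n_files):
    """Build an in-memory stand-in for oc_properties (column set is assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE oc_properties ("
        "userid TEXT, propertypath TEXT, propertyname TEXT, propertyvalue TEXT)"
    )
    rows = [
        ("sketis", f"imported/file{i}.jpg", "executable", "T")
        for i in range(n_files)
    ]
    conn.executemany("INSERT INTO oc_properties VALUES (?, ?, ?, ?)", rows)
    return conn


def per_file_lookups(conn, user, paths, name):
    """One SELECT per file: the pattern seen in pg_activity."""
    queries = 0
    found = {}
    for path in paths:
        row = conn.execute(
            "SELECT propertyvalue FROM oc_properties "
            "WHERE userid = ? AND propertypath = ? AND propertyname IN (?)",
            (user, path, name),
        ).fetchone()
        queries += 1
        if row is not None:
            found[path] = row[0]
    return found, queries


def folder_lookup(conn, user, folder, name):
    """Hypothetical alternative: one SELECT for every path under the folder.

    The prefix match also returns files in subfolders of `folder`.
    """
    rows = conn.execute(
        "SELECT propertypath, propertyvalue FROM oc_properties "
        "WHERE userid = ? AND substr(propertypath, 1, ?) = ? "
        "AND propertyname = ?",
        (user, len(folder), folder, name),
    ).fetchall()
    return dict(rows), 1


if __name__ == "__main__":
    conn = make_db(1000)
    paths = [f"imported/file{i}.jpg" for i in range(1000)]
    _, n_slow = per_file_lookups(conn, "sketis", paths, "executable")
    _, n_fast = folder_lookup(conn, "sketis", "imported/", "executable")
    print(n_slow, "queries per file vs", n_fast, "query per folder")
```

Both helpers return the same data; only the number of round trips to the database differs, and that number is what grows with the folder size.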

This is a bug, because I access (for example, copy) only one file, not all files in the folder. The operation takes 2x longer when I try to access a file that doesn't exist:
time ls /home/sketis/docs.mini-maxi.lt/2_Nuotrauku_katalogas/imported/B00002
ls: cannot access '/home/sketis/docs.mini-maxi.lt/2_Nuotrauku_katalogas/imported/B00002': No such file or directory
real 0m45.274s
user 0m0.000s
sys 0m0.000s

I activated the "pg_stat_statements" module in PostgreSQL and then found the main issue:
SELECT query,total_time,max_time,min_time,calls FROM pg_stat_statements ORDER BY calls DESC fetch first 2 rows only;
query | total_time |max_time|min_time| calls
----------------------------------------------------------------------------------------+------------+--------+--------+------
SELECT * FROM oc_properties WHERE "userid"=$1 AND "propertypath"=$2 AND "propertyname" in($3) | 107636.46 | 3.54 | 1.48 |67532

67532 queries after just ~2 minutes! The "oc_properties" table contains ~12000 rows.
I found that this table only contains info about the executable bit of files, so it's not useful for the majority of users!
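
As a rough cross-check of these numbers, the sketch below derives the mean query time from the pg_stat_statements row and estimates the database time for one operation on the ~9000-file folder. It assumes total_time is reported in milliseconds and reuses the "one or two queries per file" observation; the variable names are made up for illustration.

```python
# Values copied from the pg_stat_statements output above.
total_time_ms = 107636.46
calls = 67532

# Approximate folder size from the description (~9000 files).
files_in_folder = 9000

# Mean time per query and total database time spent on this one statement.
mean_ms = total_time_ms / calls
db_seconds_total = total_time_ms / 1000

# Estimated database time for a single operation, at 1 or 2 queries per file.
low_s = files_in_folder * 1 * mean_ms / 1000
high_s = files_in_folder * 2 * mean_ms / 1000

print(f"mean query time: {mean_ms:.2f} ms")
print(f"total DB time:   {db_seconds_total:.1f} s")
print(f"per operation:   {low_s:.1f} s to {high_s:.1f} s")
```

That works out to roughly 1.6 ms per query, about 108 seconds of query time inside the ~2 minute window, and roughly 14 to 29 seconds per operation, which is in the same range as the ls timings measured above.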

So there is one simple workaround: I increased query speed 10 times by creating an index on the oc_properties table covering both columns, userid and propertypath:

CREATE INDEX properties_path_index ON oc_properties USING btree (userid, propertypath);
(this is PostgreSQL syntax)

Please add this index to the Nextcloud database creation file.
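
For anyone who wants to see the effect of such an index without touching a live database, here is a small sketch using SQLite from Python's standard library. SQLite is only a stand-in: its planner and its CREATE INDEX syntax (no USING btree) differ from PostgreSQL, and the table layout and sample parameters are assumptions. On a real PostgreSQL server you would run EXPLAIN on the query instead.

```python
import sqlite3

# The query Nextcloud issues, rewritten with SQLite-style placeholders.
QUERY = (
    "SELECT * FROM oc_properties "
    "WHERE userid = ? AND propertypath = ? AND propertyname IN (?)"
)
# Sample parameters, made up for illustration.
PARAMS = ("sketis", "imported/file1.jpg", "executable")


def query_plan(conn):
    """Return SQLite's plan for QUERY as one string (last column is the detail)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, PARAMS).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE oc_properties ("
    "userid TEXT, propertypath TEXT, propertyname TEXT, propertyvalue TEXT)"
)

# Without the index the planner has to scan the whole table.
plan_before = query_plan(conn)

# Same columns as the proposed PostgreSQL index.
conn.execute(
    "CREATE INDEX properties_path_index ON oc_properties (userid, propertypath)"
)

# With the index the planner can search on userid and propertypath directly.
plan_after = query_plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

The plan text changes from a full-table SCAN to a SEARCH that names properties_path_index, which is the same change the PostgreSQL index is meant to achieve.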

But creating the index only partially solves the performance issues with lots of files in one folder. It would be wise to add a setting such as "Ignore executable bit in WEBDAV" to Nextcloud's config.php and not query the oc_properties table at all when this setting is set to true.

Steps to reproduce

  1. Create a folder and upload 5000-9000 files with one user through WebDAV
  2. Mount that folder with davfs
  3. Set the executable bit for all files
  4. Try to ls, cp or rm one or several files in the mounted folder (or use a file manager that supports WebDAV)

Expected behaviour

ls, cp or rm one or several files from folder should take 1 or 2 seconds

Actual behaviour

ls, cp or rm one or several files from folder takes 10-40 seconds

Server configuration

Operating system: Ubuntu 16.04.4 64-bit

Web server: apache (also tested with nginx - almost identical performance)

Database: PostgreSQL

PHP version: 7.0.28

Nextcloud version: 13.0.1 (also tested with all 12.x versions)

Updated from an older Nextcloud/ownCloud or fresh install: Updated from 12.x

Where did you install Nextcloud from: from nextcloud.org downloads

Nextcloud configuration:

Config report
'ocy77dpvulpn',
'passwordsalt' => 'good_password',
'secret' => 'no_secrets',
'trusted_domains' => array (
  0 => '192.168.199.53',
),
'datadirectory' => '/srv/nextcloud-duomenys',
'overwrite.cli.url' => 'https://192.168.199.53/',
'htaccess.RewriteBase' => '/',
'dbtype' => 'pgsql',
'version' => '13.0.1.1',
'dbname' => 'nextcloud',
'dbhost' => 'localhost',
'dbport' => '',
'dbtableprefix' => 'oc_',
'dbuser' => 'nextuser',
'dbpassword' => 'db-password',
'installed' => true,
'mail_smtpmode' => 'smtp',
'mail_smtpauthtype' => 'LOGIN',
'mail_smtpsecure' => 'ssl',
'mail_from_address' => 'docs',
'mail_domain' => 'my.mail.lt',
'mail_smtpauth' => 1,
'mail_smtphost' => 'my.mail..lt',
'mail_smtpport' => '465',
'mail_smtpname' => 'docs@my.mail.lt',
'mail_smtppassword' => 'mail_password',
'mail_smtpdebug' => true,
'auth.bruteforce.protection.enabled' => false,
'maintenance' => false,
'theme' => '',
'loglevel' => 2,
'memcache.local' => '\\OC\\Memcache\\APCu',
'redis' => array (
  'host' => 'localhost',
  'port' => 6379,
),
);

Are you using external storage, if yes which one: local

Are you using encryption: no

Are you using an external user-backend, if yes which one: no

Client configuration

Operating system: Ubuntu 16.04 64-bit

@MorrisJobke

Member

commented Apr 16, 2018

CREATE INDEX properties_path_index ON oc_properties USING btree (userid, propertypath);
(this is PostgreSQL syntax)

Please add this index to the Nextcloud database creation file.

@nickvergessen Makes sense or not?

@MorrisJobke MorrisJobke added this to the Nextcloud 14 milestone Apr 16, 2018

@nickvergessen

Member

commented Apr 17, 2018

Sounds familiar

@nextcloud-bot nextcloud-bot added the stale label Jun 20, 2018

@MorrisJobke

Member

commented Jun 25, 2018

@rullzer Another one where an index could help a lot

@nextcloud-bot nextcloud-bot removed the stale label Jun 25, 2018

@MorrisJobke

Member

commented Jul 24, 2018

We had a look at this, and adding the index is not that easy: it might work fine in Postgres, but MySQL has a limitation on the index length in default setups. :/

So we will move this to 15 as there is not enough time left to look into this properly.
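
A rough sketch of the arithmetic that is probably behind this, under assumptions that are not stated in this thread: that the limitation meant is the InnoDB index key prefix limit (767 bytes per column with the older COMPACT/REDUNDANT row formats, 3072 bytes with DYNAMIC/COMPRESSED, and 3072 bytes for the whole key), and that userid is VARCHAR(64) and propertypath is VARCHAR(255). The function and constant names are made up for illustration.

```python
# Maximum bytes per character for the two common MySQL character sets.
BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}

# Assumed InnoDB limits: per-column key prefix by row format, and whole-key limit.
PER_COLUMN_LIMIT = {"compact": 767, "dynamic": 3072}
TOTAL_KEY_LIMIT = 3072

# Assumed column widths in characters (not confirmed in this thread).
INDEX_COLUMNS = {"userid": 64, "propertypath": 255}


def check_index(columns, charset, row_format):
    """Return (bytes per column, columns over the per-column limit, total fits?)."""
    widths = {name: chars * BYTES_PER_CHAR[charset] for name, chars in columns.items()}
    too_wide = [
        name for name, width in widths.items()
        if width > PER_COLUMN_LIMIT[row_format]
    ]
    total_fits = sum(widths.values()) <= TOTAL_KEY_LIMIT
    return widths, too_wide, total_fits


if __name__ == "__main__":
    for charset in ("utf8", "utf8mb4"):
        for row_format in ("compact", "dynamic"):
            widths, too_wide, total_fits = check_index(INDEX_COLUMNS, charset, row_format)
            print(charset, row_format, widths, "too wide:", too_wide, "total fits:", total_fits)
```

Under these assumptions the index fits on PostgreSQL-like setups and on MySQL with the DYNAMIC row format, but a utf8mb4 propertypath (255 x 4 = 1020 bytes) exceeds the 767-byte prefix limit of the older default row formats, so a full-width index on that column would be rejected there.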

@artemanufrij

Member

commented Oct 21, 2018

I have a similar setup on my Ubuntu 16.04 server (I use MariaDB), and I also mounted WebDAV on my client. The ls command takes ~60-120 seconds. cp (10 files of 300k each) takes more than 10 minutes.

@MorrisJobke MorrisJobke removed this from the Nextcloud 15 milestone Nov 5, 2018

@srkunze

Member

commented Dec 3, 2018

Same issue here.

I actually intend to use the personal nextcloud server as a backup storage. Unfortunately, it's very slow. :-/

Is there some update on this one?

@grantbunyan

commented Jan 23, 2019

👍 for an update on this please. Transfers via webdav are incredibly slow. 5mins for 300k.

@MarioPerini

commented Mar 12, 2019

Extremely slow WebDav confirmed also from my side

@BrianLeishman

commented Mar 20, 2019

Something I've noticed is that it runs the "getScanner" method on the OC\Files\View class for every file, every time you do anything with the local storage. Even when I'm 3 or 4 folders deep, if I reload that page, it scans every folder from the root down to where I am.

I've been trying to profile the folder loading (trying to figure out which function is the culprit), and it looks like if I'm at a location like this

/Local Storage Root/Brian Leishman/Pictures/Food

Then it scans every file/folder in "Local Storage Root", then "Brian Leishman", "Pictures", and finally "Food", so when "Local Storage Root" has 11k folders in it, they all get scanned again at this point, which means that accessing that last folder with 1 item in it takes about a minute to load (even on a very, very fast machine with all the performance tuning enabled).

I'm trying to fix this myself for our use case, but I'm getting stuck trying to figure out if it's the Sabre/DAV 3rd party tool or NextCloud itself
