Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PROPFIND bug prevents larger number of files per directory (and causes connection timeouts) #5084

Closed
moscicki opened this issue Oct 2, 2013 · 19 comments

Comments

@moscicki
Copy link

moscicki commented Oct 2, 2013

Hello,

Owncloud 5.0.11, local mysql, redhat enterprise 6.

The ocsync client upload is very slow because PROPFIND execution time is prohibitive as the number of files per directory grows large. The bug is triggered when a file is simply added to the local folder for the first time: the PROPFIND request performed by the sync client triggers a scan of all files in the directory. This should not happen because the local folder was synced already. This results in an avalanche of SELECTs on the database, stats on the storage and opens on the storage because the scan involves also the determination of file mime type (by reading first 256K bytes of each file). The chunked upload may also be a victim of this bug. At a certain point php script is timing out while handling PROPFIND request.

image

strace of the httpd worker process (all files are filled with null bytes) which keeps on repeating over 1300 times during the PROPFIND handling:

00:35:20.935106 write(17, "\353\0\0\0\3SELECT fileid, storage,"..., 239) = 239
00:35:20.935227 read(17, "\1\0\0\1\fB\0\0\2\3def\10owncloud\foc_fileca"..., 16384) = 985
00:35:20.935722 access("/boxstorage/manual001/files/mdir3/zzz.zero1M.166", R_OK) = 0
00:35:20.935858 access("/boxstorage/manual001/files/mdir3/zzz.zero1M.166", R_OK) = 0
00:35:20.936017 stat("/boxstorage/manual001/files/mdir3/zzz.zero1M.166", {st_mode=S_IFREG|0644, st_size=1024000, ...}) = 0
00:35:20.936211 stat("/var/www/html/owncloud/lib/mimetypes.list.php", {st_mode=S_IFREG|0640, st_size=3048, ...}) = 0
00:35:20.936540 lstat("/boxstorage/manual001/files/mdir3/zzz.zero1M.166", {st_mode=S_IFREG|0644, st_size=1024000, ...}) = 0
00:35:20.936681 lstat("/boxstorage/manual001/files/mdir3", {st_mode=S_IFDIR|0755, st_size=49152, ...}) = 0
00:35:20.936814 lstat("/boxstorage/manual001/files", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
00:35:20.936947 open("/boxstorage/manual001/files/mdir3/zzz.zero1M.166", O_RDONLY) = 16
00:35:20.937074 fstat(16, {st_mode=S_IFREG|0644, st_size=1024000, ...}) = 0
00:35:20.937199 lseek(16, 0, SEEK_CUR) = 0
00:35:20.937291 fstat(16, {st_mode=S_IFREG|0644, st_size=1024000, ...}) = 0
00:35:20.937426 fstat(16, {st_mode=S_IFREG|0644, st_size=1024000, ...}) = 0
00:35:20.937550 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.937672 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.937792 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.937912 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.938045 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.938165 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.938284 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.938402 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.938522 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.938640 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.938778 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.938895 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.939037 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.939159 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.939275 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.939392 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.939509 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.939628 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.939747 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.939867 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.939992 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.940119 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.940239 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.940358 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.940477 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.940597 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.940717 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.940835 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.940952 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.941085 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.941204 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.941321 read(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
00:35:20.944737 lseek(16, 262144, SEEK_SET) = 262144

@DeepDiver1975
Copy link
Member

@moscicki Thanks a lot for your detailed report - much appreciated!

What files are you using for testing? Looks like ownCloud cannot determine the mime type properly.

@moscicki
Copy link
Author

moscicki commented Oct 2, 2013

Hello
This particular test was done with files created by dd if=/dev/zero -- so nothing fancy but just NULL bytes everywhere. We are also testing with files created from /dev/urandom.
kuba

@aarononeal
Copy link

This bug looks like the cause of a significant performance issue I'm seeing.

I see this has been moved to v7. I don't know the codebase well enough to understand what the best fix might be, but is there a short-term fix that could be put in place for v6?

Specifically, I'm wondering how necessary this particular piece of code is.

$fp = popen("file -b --mime-type $path 2>/dev/null", "r");
$reply = fgets($fp);

If the MIME type can't be determined by file extension, spawning a shell process and reading the first part of the file is a pretty expensive way to get at it during a sync. Is this an edge case we're paying for and is application/octet-stream good enough?

@moscicki
Copy link
Author

moscicki commented Nov 7, 2013

Hi,

I am a bit puzzled myself what is the status because milestone is set to 7 but the issue is closed referencing #5516 in 6-alpha2. However #5516 is not really related to this issue (it is a different bug cause by something else on another platform).

For this issue (#5084) the problem is

  1. architecture: adding a new file should not trigger any actions on existing files in this directory (it does not make sense and ultimately makes it not scalable)
  2. implementaton: the cost of determining the mime-type of individual file - as you describe above
  3. architecture 2: I would also think that probably the mime-type should be calculated the first time it is needed and not always at every upload

kuba

@DeepDiver1975
Copy link
Member

@moscicki this issue is still open and we will target it with oc7 as marked.
#5516 was referenced because the fix in #5601 has a performance impact on the scenarios described in here.

Having a look at your call graph with #5601 applied there will be no time wasted in mime_content_type, popen and fgets. A big win from my point of view.

But still the issues you described above are not solved - we will take care of this in oc7.

I hope this clarifies the overall situation here - feel free to comment if no.

Take care

@moscicki
Copy link
Author

moscicki commented Nov 7, 2013

Ok, many thanks for this explanation. I will give it a try with 6 and report back.

@DeepDiver1975
Copy link
Member

I will give it a try with 6 and report back.

Thanks a lot - some more nice graphs are very welcome! 😉

@moscicki
Copy link
Author

What is the time estimate to address these kind of server side performance issues?

@sleinen
Copy link

sleinen commented Jul 31, 2014

This still says "Milestone: ownCloud 7". ownCloud 7 has been released, right? Has this issue been addressed in that release? Inquiring minds (and impatient customers) want to know...

@DeepDiver1975
Copy link
Member

Has this issue been addressed in that release?

a partial solution has been released with OC7 - see #5084 (comment)

@craigpg craigpg added Performance and removed Bug labels Sep 2, 2014
@craigpg craigpg modified the milestones: ownCloud 8, ownCloud 7 Sep 2, 2014
@MTRichards
Copy link
Contributor

@moscicki Is this still an issue for oC 7.0.2? I know it is partly improved, but interested on if the result is improved.

@DeepDiver1975 DeepDiver1975 modified the milestones: 8.1-next, ownCloud 8 Jan 9, 2015
@DeepDiver1975 DeepDiver1975 self-assigned this Jan 9, 2015
@DeepDiver1975
Copy link
Member

@icewind1991 somewhat related to #14531

@icewind1991
Copy link
Contributor

#14531 is for recursive propfinds, unrealated to this

@MTRichards
Copy link
Contributor

Is this still an issue?

@MTRichards MTRichards modified the milestones: 8.2-next, 8.1-current May 4, 2015
@PVince81
Copy link
Contributor

@moscicki are you still observing this issue with 8.1 ?

@PVince81
Copy link
Contributor

@DeepDiver1975 move to 9.0 ? (probably an ongoing task anyway)

@ghost
Copy link

ghost commented Sep 30, 2015

@PVince81 @DeepDiver1975 I'm moving it to 9.0

@ghost ghost modified the milestones: 9.0-next, 8.2-current Sep 30, 2015
@ghost
Copy link

ghost commented Jan 6, 2016

closing due to lack of feedback

@ghost ghost closed this as completed Jan 6, 2016
@lock
Copy link

lock bot commented Aug 7, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Aug 7, 2019
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants