Chunked Files: Optimize checking if all chunks are there #22601

Closed
guruz opened this Issue Feb 23, 2016 · 6 comments

Projects

None yet

6 participants

@guruz
Contributor
guruz commented Feb 23, 2016

(not proven 100%)

We've seen benchmarks (TODO: where) where we have seen that for chunked files the later chunks take longer and longer than the first chunks. We're thinking isComplete of OC_FileChunking is to be blamed because it checks the file system for the existance of all chunks for the file. This gets slower (? probably because a succesful stat takes more time)

https://github.com/owncloud/core/blob/c5a200c4198de951dabdd3f783ade655b8ea6434/lib/private/filechunking.php#L76

One optimization idea: Have a fast path to check for currentChunk+1 (if it has that many chunks) and if that does not exist then return immediatly as false instead of counting all chunks.

FYI @dragotin @DeepDiver1975 @jturcotte @ogoffart @rullzer

@guruz guruz added the performance label Feb 23, 2016
@rullzer rullzer was assigned by guruz Feb 23, 2016
@ogoffart

Client issue showing that the more chunk, the slower: owncloud/client#4354

@rullzer rullzer added a commit that referenced this issue Feb 23, 2016
@rullzer rullzer Do not check all chunks of a chunked upload if we do not need to
Fixes #22601

Before we did a full test on all chunks to verify if a chunked upload
was completed. This is unneeded since if we are missing one chunk we can
already fail.

Also we look from back to front since it is much more likely that we
find a missing chunk thus can error out early.
4d353ef
@PVince81 PVince81 added this to the 9.1-next milestone Feb 25, 2016
@PVince81 PVince81 added the app:dav label Feb 25, 2016
@PVince81
Collaborator

One optimization idea: Have a fast path to check for currentChunk+1 (if it has that many chunks) and if that does not exist then return immediatly as false instead of counting all chunks.

Sounds like a plan.

@rullzer
Contributor
rullzer commented Feb 25, 2016

PR already in #22602

@rullzer
Contributor
rullzer commented Feb 25, 2016

One optimization idea: Have a fast path to check for currentChunk+1 (if it has that many chunks) and if that does not exist then return immediatly as false instead of counting all chunks.

Sounds like a plan.

After thinking about it we now just traverse the list from back to front (instead of front to back) and error out the moment we find 1 missing chunk.. But the idea is the same :)

@rullzer
Contributor
rullzer commented Feb 25, 2016

@PVince81 @cmonteroluque just FYI this helps with a blue ticket on the client.. so would 9.0.1 be an option? (PR is already available).

@rullzer rullzer added a commit that referenced this issue Mar 2, 2016
@rullzer rullzer Do not check all chunks of a chunked upload if we do not need to
Fixes #22601

Before we did a full test on all chunks to verify if a chunked upload
was completed. This is unneeded since if we are missing one chunk we can
already fail.

Also we look from back to front since it is much more likely that we
find a missing chunk thus can error out early.
47acc59
@PVince81 PVince81 added the blue-ticket label Mar 2, 2016
@PVince81
Collaborator
PVince81 commented Mar 2, 2016

done

@rullzer rullzer added a commit that referenced this issue Mar 7, 2016
@rullzer rullzer Do not check all chunks of a chunked upload if we do not need to
Fixes #22601

Before we did a full test on all chunks to verify if a chunked upload
was completed. This is unneeded since if we are missing one chunk we can
already fail.

Also we look from back to front since it is much more likely that we
find a missing chunk thus can error out early.
e08f980
@rullzer rullzer added a commit that referenced this issue Mar 9, 2016
@rullzer rullzer Do not check all chunks of a chunked upload if we do not need to
Fixes #22601

Before we did a full test on all chunks to verify if a chunked upload
was completed. This is unneeded since if we are missing one chunk we can
already fail.

Also we look from back to front since it is much more likely that we
find a missing chunk thus can error out early.
313b881
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment