S3 uploads do not appear to run in parallel #8343

Closed · storcium opened this issue Feb 1, 2024 · 13 comments

storcium commented Feb 1, 2024

Describe the bug

Using s3ng, uploads to S3 commence after files leave quarantine, but they do not appear to be parallelized, resulting in far lower throughput than expected.

Steps to reproduce

  1. Run the Wasabi s3-benchmark against the same S3 backend with 10 threads and 16 MB objects, e.g.:
    s3-benchmark -a -s -b -t 10 -z 16M -u
  2. Throughput on the S3 side shows 381 MiB/s = 3 Gbps at 28 requests/sec.
  3. Upload a 5 GB folder (35 x 140 MB files of random data) to an oCIS space, using the same network, compute and storage resources.
  4. Throughput shows 60 MiB/s = 480 Mbps at 5 requests/sec.

Expected behavior

Similar performance between the parallel benchmark and oCIS.

Actual behavior

oCIS is about 6 times slower

Setup

Note that we are measuring peak sustained throughput, not the timing of quarantine, postprocessing, etc. This has been tested both as a binary and in a Kubernetes setup, with similar results.

STORAGE_USERS_OCIS_ASYNC_UPLOADS is set to true

micbar (Contributor) commented Feb 2, 2024

What exactly is the problem you are trying to solve?

We deliberately designed oCIS to place uploads in a quarantine area on the local disk to boost upload performance.
Our clients close the connection after the last byte has been uploaded to oCIS; this keeps the frontend upload speed as fast as possible.

As a second step, we do all kinds of asynchronous postprocessing on that file: virus scanning, content indexing, OCR on images, and so on.

After the postprocessing, the files are finally uploaded to S3.

storcium (Author) commented Feb 2, 2024

Hi Michael, thank you for the response. Please see the graphs attached. Loading 10 parallel streams into S3 maxes out throughput at a much higher rate than oCIS achieves during its S3 upload stage. Does oCIS parallelize the uploads to the S3 back end, or does it run them sequentially? Is there a way to parallelize them?

The graph below shows max load with everything else being the same: OS, network, storage.

[graph: s3 load test]

micbar (Contributor) commented Feb 2, 2024

I still don’t understand the problem you are trying to solve 🤔

tbsbdr (Contributor) commented Feb 6, 2024

@nicholas-wilson-au

Once a user has uploaded multiple large files and the files have completed postprocessing, the upload to S3 appears to run sequentially rather than using multipart uploading, which significantly delays the availability of the files. This was implemented in OC10 as listed above. Can we investigate whether this is possible to implement in oCIS?

butonic (Member) commented Feb 7, 2024

oCIS copies the blob into S3 whenever the postprocessing event is received. Depending on the configured concurrency, multiple goroutines may run in parallel, each trying to upload a blob.

For every upload, the github.com/minio/minio-go/v7 package we use makes a single PutObject call, which internally decides whether to do a multipart upload or not. The default size that switches to multipart uploads is 16 MB.
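For illustration, here is a minimal sketch of that call; the client construction, bucket, and key are placeholders, not the actual ocis wiring:

package main

import (
	"context"
	"os"

	"github.com/minio/minio-go/v7"
)

// uploadBlob sketches the single PutObject call: minio-go internally
// decides between a plain PUT and a multipart upload based on the
// given size and the configured part size.
func uploadBlob(client *minio.Client, bucket, key, path string, size int64) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	_, err = client.PutObject(context.Background(), bucket, key, f, size,
		minio.PutObjectOptions{ContentType: "application/octet-stream"})
	return err
}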

I can see three multipart put implementations that might be called from putObjectCommon:

  • if c.overrideSignerType.IsV2() -> putObjectMultipart; we should be using V4, so I will not follow this
  • if size < 0 (size is unknown) -> putObjectMultipartStreamNoLength; ocis always sets a size -> irrelevant
  • finally putObjectMultipartStream -> internally checks whether the reader implements ReadAt; if so, it uses a parallel uploader via putObjectMultipartStreamFromReadAt, otherwise it does a sequential upload with putObjectMultipartStreamOptionalChecksum

ocis passes an os.File as the Reader, which does implement ReadAt, so we should be doing parallel uploads. putObjectMultipartStreamFromReadAt internally calls getNumThreads which, if not given as an option (we currently don't set one), defaults to totalWorkers = 4.
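A quick compile-time check (a sketch, not code from the repo) confirms the ReadAt part:

package main

import (
	"io"
	"os"
)

// *os.File satisfies io.ReaderAt, which is the precondition for
// minio-go's parallel putObjectMultipartStreamFromReadAt path.
var _ io.ReaderAt = (*os.File)(nil)

func main() {}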

We could make PutObjectOptions.NumThreads configurable, but I wonder how we can verify that parallel multipart uploads are actually being started.

👀 it seems we are not using parallel uploads because we set opts.SendContentMd5 to true...
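The relevant branch in minio-go's putObjectMultipartStream looks roughly like this (paraphrased from the v7 sources, so treat the exact conditions as approximate):

// The parallel ReadAt-based uploader is only chosen when SendContentMd5
// is NOT set; with MD5 enabled each part's checksum is computed while
// reading sequentially instead.
if !isObject(reader) && isReadAt(reader) && !opts.SendContentMd5 {
	// parallel uploader with getNumThreads workers (default 4)
	info, err = c.putObjectMultipartStreamFromReadAt(ctx, bucketName, objectName,
		reader.(io.ReaderAt), size, opts)
} else {
	// sequential upload
	info, err = c.putObjectMultipartStreamOptionalChecksum(ctx, bucketName, objectName,
		reader, size, opts)
}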

butonic (Member) commented Feb 7, 2024

ok so this diff will use concurrent uploads:

diff --git a/pkg/storage/fs/s3ng/blobstore/blobstore.go b/pkg/storage/fs/s3ng/blobstore/blobstore.go
index 9c744e754..4fa391ebb 100644
--- a/pkg/storage/fs/s3ng/blobstore/blobstore.go
+++ b/pkg/storage/fs/s3ng/blobstore/blobstore.go
@@ -71,7 +71,7 @@ func (bs *Blobstore) Upload(node *node.Node, source string) error {
        }
        defer reader.Close()
 
-       _, err = bs.client.PutObject(context.Background(), bs.bucket, bs.path(node), reader, node.Blobsize, minio.PutObjectOptions{ContentType: "application/octet-stream", SendContentMd5: true})
+       _, err = bs.client.PutObject(context.Background(), bs.bucket, bs.path(node), reader, node.Blobsize, minio.PutObjectOptions{ContentType: "application/octet-stream"})
 
        if err != nil {
                return errors.Wrapf(err, "could not store object '%s' into bucket '%s'", bs.path(node), bs.bucket)

Let me see what SendContentMd5 is used for exactly.

butonic (Member) commented Feb 7, 2024

IIRC we needed that parameter for a specific S3 implementation and just hardcoded it. So we need to make it configurable. 🥳 Probably together with NumThreads.
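As a hedged sketch of what the configurable version of the Upload call could look like (the Blobstore fields sendContentMd5 and uploadThreads are hypothetical names, not actual ocis config keys):

// Hypothetical wiring: keep SendContentMd5 available for bucket policies
// such as S3 Object Lock, and expose NumThreads for the parallel multipart
// workers (minio-go falls back to 4 workers when it is left at 0).
_, err = bs.client.PutObject(context.Background(), bs.bucket, bs.path(node), reader, node.Blobsize,
	minio.PutObjectOptions{
		ContentType:    "application/octet-stream",
		SendContentMd5: bs.sendContentMd5, // hypothetical field
		NumThreads:     bs.uploadThreads,  // hypothetical field
	})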

butonic (Member) commented Feb 7, 2024

Adding this to the project board so someone can prioritize.

micbar (Contributor) commented Feb 7, 2024

The mentioned parameter is needed to fulfil the bucket policy.

It was brought up by @wkloucek due to an incident.

butonic (Member) commented Feb 7, 2024

yup:

The Content-MD5 header is required for any request to upload an object with a retention period configured using Amazon S3 Object Lock. For more information about Amazon S3 Object Lock, see Amazon S3 Object Lock Overview in the Amazon S3 User Guide.

micbar (Contributor) commented Feb 19, 2024

Needs dependency bump and re-testing.

butonic (Member) commented Feb 19, 2024

needs

butonic closed this as completed Feb 21, 2024
micbar added this to the Release 5.0.0 milestone Feb 28, 2024