AWS .NET SDK S3 Append

About

S3 Append provides an AppendObjectAsync extension method for the AWS SDK for .NET S3 client, capable of appending data to existing objects hosted on S3.

Implementation

Since AWS S3 is not a block storage system, the content of persisted objects cannot be altered in situ. One would have to download, update and upload (overwrite) the object again, but such an approach suffers from multiple problems (network throughput dependency, high memory requirements, etc.) that make it practically unusable. For this reason, the S3 Append implementation relies internally on AWS S3 multipart copy, which avoids the need to download the data first:

1. Get the to-be-updated object's metadata to determine its size
2. Create a multipart upload through which data is to be copied/uploaded
3. Copy the existing data on the server side
4. Upload the new data from the client side
5. Complete the multipart upload, overwriting the original object
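
The part layout implied by these steps can be sketched without touching the network. The snippet below is a minimal illustration, not the library's actual code: CopyRange, AppendPlan, Build and UploadPartNumber are hypothetical names, and the sketch ignores S3's minimum part size rule for brevity. It computes the inclusive byte ranges that would be copied on the server side and the part number left for the newly uploaded data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper types; neither S3 Append nor the AWS SDK defines them.
public readonly record struct CopyRange(int PartNumber, long FirstByte, long LastByte);

public static class AppendPlan
{
    // 5 GiB, the most data a single server-side part copy can move.
    public const long DefaultPartMaxBytes = 5L * 1024 * 1024 * 1024;

    // Splits an existing object of objectSize bytes into inclusive byte ranges,
    // one per copy part. Part numbers start at 1, as in S3 multipart uploads.
    public static List<CopyRange> Build(long objectSize, long partMaxBytes = DefaultPartMaxBytes)
    {
        if (objectSize < 0) throw new ArgumentOutOfRangeException(nameof(objectSize));
        if (partMaxBytes <= 0) throw new ArgumentOutOfRangeException(nameof(partMaxBytes));

        var ranges = new List<CopyRange>();
        var partNumber = 1;
        for (long first = 0; first < objectSize; first += partMaxBytes)
        {
            // The last range is cut short at the end of the object.
            long last = Math.Min(first + partMaxBytes, objectSize) - 1;
            ranges.Add(new CopyRange(partNumber++, first, last));
        }
        return ranges;
    }

    // The appended data goes into the part right after the last copied one.
    public static int UploadPartNumber(List<CopyRange> ranges) => ranges.Count + 1;
}
```

For a 12-byte object and a 5-byte part size (tiny numbers chosen for readability), the plan covers bytes 0-4, 5-9 and 10-11, and the new data would be uploaded as part 4.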

Implications

Thanks to a high degree of parallelism and almost unbounded network bandwidth, AWS S3 copy operations are considerably faster than the naive download-update-upload approach. Moreover, internal AWS data transfers are free of charge, making the proposed solution a no-brainer in situations where the client logic resides outside of the AWS cloud. Still, cost is the key aspect to consider, as at least five AWS S3 requests must be issued for each AppendObjectAsync call (see Implementation for details).

Note that a single UploadPartCopy operation can copy at most 5 GiB of data. Consequently, appending to a 5 TiB object would result in at least 1028 requests issued by S3 Append: 1024 part copies plus the four remaining requests.
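
The request count can be estimated ahead of time. The following is a rough sketch under the five-step flow described in Implementation; AppendCost and RequestCount are hypothetical names and are not part of S3 Append. It assumes one request per copy part plus four size-independent requests.

```csharp
using System;

// Hypothetical estimator; S3 Append does not expose such a helper.
public static class AppendCost
{
    // Requests that do not depend on object size in the five-step flow:
    // metadata lookup, multipart creation, upload of new data, completion.
    public const int FixedRequests = 4;

    public static long RequestCount(long objectSize, long partMaxBytes)
    {
        if (objectSize < 0) throw new ArgumentOutOfRangeException(nameof(objectSize));
        if (partMaxBytes <= 0) throw new ArgumentOutOfRangeException(nameof(partMaxBytes));

        // Ceiling division without risking overflow on large sizes.
        long copyParts = objectSize / partMaxBytes + (objectSize % partMaxBytes == 0 ? 0 : 1);

        // Assumption: at least one copy request is always issued,
        // matching the five-request minimum stated above.
        return Math.Max(1, copyParts) + FixedRequests;
    }
}
```

With the default 5 GiB part size, a 12 GiB object needs three copy parts and therefore seven requests in total.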

Usage

Once imported into scope, AppendObjectAsync can be used in a straightforward fashion. Consider an S3 bucket 109a6d191b67 hosting an object fa5ec9042bc3 with the plain text content Hello. The following code would, when executed,

using Amazon.S3;
using S3Append.Extensions;
using S3Append.Models;
using System.Threading.Tasks;

public static class Program
{
	public static async Task Main(string[] args)
	{
		// Credentials and region are resolved from the environment.
		var client = new AmazonS3Client();

		// Describes which object to extend and the data to append to it.
		var request = new AppendObjectRequest
		{
			BucketName = "109a6d191b67",
			Key = "fa5ec9042bc3",
			ContentBody = " world!"
		};

		await client.AppendObjectAsync(request);
	}
}

result in the same fa5ec9042bc3 object containing the proverbial Hello world!.

Performance

A server-side copy will generally outperform a client-mediated copy. Nonetheless, even AWS's network is subject to the laws of physics, so at some point, for objects that are gigabytes in size, the copy operation can become unsatisfactorily slow. The problem can be partially mitigated by decreasing the copy part size, which is 5 GiB by default, and thus increasing copy parallelism:

var request = new AppendObjectRequest
{
	BucketName = "109a6d191b67",
	Key = "fa5ec9042bc3",
	ContentBody = " world!",
	PartMaxBytes = 128L * 1024 * 1024 // 128 MiB (2^27 bytes)
};

Keep in mind, however, that the smaller the copy part size, the more requests are generated and, consequently, the higher the cost of the append operation.
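
One way to pick a part size is to start from the number of copy parts you are willing to pay for. The helper below is a sketch only: PartSizing and ForTargetParts are hypothetical names, and the bounds are the general S3 multipart limits (5 MiB minimum for every part except the last, 5 GiB maximum), not values taken from S3 Append itself.

```csharp
using System;

// Hypothetical helper for choosing a PartMaxBytes value; not part of S3 Append.
public static class PartSizing
{
    public const long MinPartBytes = 5L * 1024 * 1024;        // 5 MiB
    public const long MaxPartBytes = 5L * 1024 * 1024 * 1024; // 5 GiB

    // Returns a part size that splits objectSize into roughly targetParts
    // copy parts, clamped to the S3 multipart part size limits.
    public static long ForTargetParts(long objectSize, int targetParts)
    {
        if (objectSize < 0) throw new ArgumentOutOfRangeException(nameof(objectSize));
        if (targetParts <= 0) throw new ArgumentOutOfRangeException(nameof(targetParts));

        // Ceiling division so that targetParts parts always cover the object.
        long size = objectSize / targetParts + (objectSize % targetParts == 0 ? 0 : 1);
        return Math.Clamp(size, MinPartBytes, MaxPartBytes);
    }
}
```

For instance, splitting a 10 GiB object into 80 copy parts yields 134,217,728 bytes, i.e. the 128 MiB used in the request above. When the clamp applies, the resulting number of parts differs from the target.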
