seziCZ/S3Append

AWS .NET SDK S3 Append

About

S3 Append provides an AppendObjectAsync extension method for the .NET AWS SDK S3 client, capable of appending data to existing objects hosted in S3.

Implementation

Since AWS S3 is not a block storage system, the content of persisted objects cannot be altered in situ. One would have to download, update and upload (overwrite) the object again, but such an approach suffers from multiple problems (network throughput dependency, high memory requirements, etc.) that make it practically unusable. For this reason, the S3 Append implementation relies on AWS S3 multipart copy internally, which avoids the need to download the data first.
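To make the multipart copy idea more concrete, here is a minimal sketch of how an existing object could be split into byte ranges, one per UploadPartCopy request. It is not the library's actual code: the `CopyPlan` class and its `Ranges` method are hypothetical names introduced for illustration, and the sketch makes no AWS calls.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of S3 Append.
public static class CopyPlan
{
    // Splits an object of objectSize bytes into inclusive byte ranges of at most
    // partMaxBytes each. Every range would map to one UploadPartCopy request.
    public static List<(long First, long Last)> Ranges(long objectSize, long partMaxBytes)
    {
        if (objectSize < 0) throw new ArgumentOutOfRangeException(nameof(objectSize));
        if (partMaxBytes <= 0) throw new ArgumentOutOfRangeException(nameof(partMaxBytes));

        var ranges = new List<(long First, long Last)>();
        for (long first = 0; first < objectSize; first += partMaxBytes)
        {
            // The last range may be shorter than partMaxBytes.
            long last = Math.Min(first + partMaxBytes, objectSize) - 1;
            ranges.Add((first, last));
        }
        return ranges;
    }
}
```

For example, `CopyPlan.Ranges(12L << 30, 5L << 30)` yields three ranges for a 12 GiB object copied in 5 GiB parts. The appended data would then presumably be uploaded as one further part before the multipart upload is completed.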

Implications

Thanks to a high degree of parallelism and almost unbounded network bandwidth, AWS S3 copy operations are considerably faster than the naive download-update-upload approach. Moreover, internal AWS data transfers are free of charge, making the proposed solution a no-brainer in situations where the client logic resides outside of the AWS cloud. Still, cost is the key aspect to consider, as at least five AWS S3 requests must be issued for each AppendObjectAsync call (see Implementation for details).

Note that a single UploadPartCopy operation can copy at most 5 GiB of data. Appending to an object 5 TB in size would therefore result in at least 1004 requests issued by the S3 Append logic.
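The request count can be reproduced with a short calculation. It assumes that each append issues one UploadPartCopy request per copy part plus four further requests. That breakdown is inferred from the five-request minimum mentioned above and is not stated explicitly by the library. With $S$ the size of the existing object and $P$ the copy part size:

$$
N = \left\lceil \frac{S}{P} \right\rceil + 4, \qquad \left\lceil \frac{5\ \text{TB}}{5\ \text{GB}} \right\rceil + 4 = 1000 + 4 = 1004
$$

The 1004 figure corresponds to a 1000:1 ratio between object size and part size. With binary units throughout (a 5 TiB object copied in 5 GiB parts), the same formula gives $1024 + 4 = 1028$.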

Usage

Once imported into scope, AppendObjectAsync can be used in a straightforward fashion. Consider an S3 bucket 109a6d191b67 hosting an object fa5ec9042bc3 with the plain-text content Hello. The following code would, when executed,

using Amazon.S3;
using S3Append.Extensions;
using S3Append.Models;
using System.Threading.Tasks;

public static class Program
{
	public static async Task Main(string[] args)
	{
		// Credentials and region are resolved from the default AWS configuration.
		var client = new AmazonS3Client();
		var request = new AppendObjectRequest
		{
			BucketName = "109a6d191b67",
			Key = "fa5ec9042bc3",
			ContentBody = " world!" // data to append to the existing object
		};

		await client.AppendObjectAsync(request);
	}
}

result in the same fa5ec9042bc3 object containing the proverbial Hello world!

Performance

Server-side copy will always outperform client-mediated copy. Nonetheless, even AWS's network is subject to the laws of physics, so at some point, for objects that are gigabytes in size, the copy operation may become unsatisfactorily slow. The problem can be partially mitigated by decreasing the copy part size (and thus increasing copy parallelism), which is 5 GiB by default:

var request = new AppendObjectRequest
{
	BucketName = "109a6d191b67",
	Key = "fa5ec9042bc3",
	ContentBody = " world!",
	PartMaxBytes = 128L * 1024 * 1024 // 128 MiB
};

Keep in mind, however, that the smaller the copy part size, the more requests are generated and, consequently, the higher the cost of the respective append operation.
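The trade-off can be estimated before choosing a part size. The sketch below is a rough estimate under stated assumptions, not part of the library: `AppendCost`, `FixedRequests` and `RequestCount` are hypothetical names, and the four fixed requests per append are inferred from the five-request minimum mentioned under Implications.

```csharp
using System;

// Hypothetical estimator, not part of S3 Append.
public static class AppendCost
{
    // Assumption: requests issued per append besides the UploadPartCopy calls.
    public const int FixedRequests = 4;

    // Estimates the number of S3 requests needed to append to an object of
    // objectSize bytes when the existing data is copied in parts of partMaxBytes.
    public static long RequestCount(long objectSize, long partMaxBytes)
    {
        if (objectSize < 0) throw new ArgumentOutOfRangeException(nameof(objectSize));
        if (partMaxBytes <= 0) throw new ArgumentOutOfRangeException(nameof(partMaxBytes));

        // Ceiling division; at least one copy part is assumed.
        long copyParts = Math.Max(1, (objectSize + partMaxBytes - 1) / partMaxBytes);
        return copyParts + FixedRequests;
    }
}
```

Under these assumptions, appending to a 10 GiB object takes an estimated 6 requests with the default 5 GiB parts (`AppendCost.RequestCount(10L << 30, 5L << 30)`) and 84 requests with 128 MiB parts (`AppendCost.RequestCount(10L << 30, 1L << 27)`).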
