Skip to content

Comments

S3 Client Side Encryption#1395

Merged
kstich merged 34 commits intoaws:masterfrom
kstich:s3_client_side_encryption
Nov 8, 2017
Merged

S3 Client Side Encryption#1395
kstich merged 34 commits intoaws:masterfrom
kstich:s3_client_side_encryption

Conversation

@kstich
Copy link
Contributor

@kstich kstich commented Oct 13, 2017

Adds the S3EncryptionClient and supporting implementations. This implements envelope encryption and utilizes OpenSSL to provide CBC and GCM (on PHP >= 7.1) support . Uses pluggable strategies (via MetadataStrategyInterface) for handling a MetadataEnvelope in conjunction with a MaterialsProvider. A KmsMaterialsProvider is provided for key management . Supports ->putObject[Async] and ->getObject[Async] operations. This is cross compatible with Java, Ruby, Go, and C++ for the provided feature set. A service guide is also included.

/cc @jeskew @mtdowling @cjyclaire

Resolves #831

kstich and others added 25 commits October 12, 2017 15:28
… KMS to provide abstract pieces of MaterialsProvider.
…strategies for handling a MetadataEnvelope in conjunction with a MaterialsProvider. Supports putObject[Async] and getObject[Async] operations.
@codecov-io
Copy link

codecov-io commented Oct 13, 2017

Codecov Report

Merging #1395 into master will decrease coverage by 0.33%.
The diff coverage is 95.91%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1395      +/-   ##
============================================
- Coverage      91.9%   91.56%   -0.34%     
- Complexity     2472     2521      +49     
============================================
  Files           142      156      +14     
  Lines          7545     7045     -500     
============================================
- Hits           6934     6451     -483     
+ Misses          611      594      -17
Impacted Files Coverage Δ Complexity Δ
src/TraceMiddleware.php 96.77% <ø> (+0.62%) 47 <0> (ø) ⬇️
src/S3/Crypto/HeadersMetadataStrategy.php 100% <100%> (ø) 5 <5> (?)
src/S3/Crypto/InstructionFileMetadataStrategy.php 100% <100%> (ø) 6 <6> (?)
src/Crypto/KmsMaterialsProvider.php 100% <100%> (ø) 8 <8> (?)
src/Crypto/Cipher/Cbc.php 100% <100%> (ø) 10 <10> (?)
src/Crypto/MetadataEnvelope.php 100% <100%> (ø) 7 <7> (?)
src/Crypto/MaterialsProvider.php 100% <100%> (ø) 3 <3> (?)
src/Crypto/DecryptionTrait.php 92.18% <92.18%> (ø) 11 <11> (?)
src/Crypto/AesGcmDecryptingStream.php 93.1% <93.1%> (ø) 7 <7> (?)
src/Crypto/AesGcmEncryptingStream.php 93.54% <93.54%> (ø) 8 <8> (?)
... and 137 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1ba6da5...29bfb98. Read the comment docs.

@kstich
Copy link
Contributor Author

kstich commented Oct 13, 2017

Codecov report of decreased coverage is incorrect. Details show that 95.99% of the diff is covered which is higher than the current 91.90% coverage.

@kstich kstich mentioned this pull request Oct 13, 2017
.travis.yml Outdated

after_success:
- bash <(curl -s https://codecov.io/bash)
- 'if [ $(phpenv version-name) == "7.1" ]; then bash <(curl -s https://codecov.io/bash); fi'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this may be to blame for the purported coverage drop. There may be code paths not taken in PHP 7.1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it looks like CodeCov is taking the first report and using that, which explains the wild shifts as well. I'm going to set this up to use 7.1 with latest dependencies only, putting us on stable grounds that we can update with further PHP releases.

Feature: S3 Client Side Encryption

Scenario: Upload PHP's GCM encrypted fixtures
When I get all fixtures for "aes_gcm" from "aws-s3-shared-tests"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the name of a particular bucket? Including a resource identifier here seems fragile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a bucket name and it is fragile. This is how the AWS SDK for Go handles it and we do something similar for the other full integration tests as well. These tests need to talk to other SDKs and don't get a getResourcePrefix added. I'm open to suggestions on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The S3 test should be creating and deleting buckets via BeforeSuite and AfterSuite hooks. I understand that this test suite has to work this way to integrate with other SDK tests, but maybe in the future there could a test that works on multiple accounts.

case 'gcm':
$cipherOptions['TagLength'] = 16;

$taggedStream = new AesTaggedGcmEncryptingStream(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't AesTaggedGcmEncryptingStream be removed if you just used an AppendStream here?

@@ -0,0 +1,109 @@
<?php

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nit but passim: I believe SDK style requires no newline between the file opener and the namespace declaration.


return new KmsMaterialsProvider(
$this->kmsClient,
$materialsDescription['kms_cmk_id']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's a cmk? Could abbreviations be avoided in this parameter name, or is the expanded version too long to reasonably type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'kms_cmk_id' is how the key is indexed in the materials description, this is consistent with the Java (and other) SDK implementations. The 'KeyId' naming on Encrypt is used for the $kmsKeyId variable.

*/
public function save(MetadataEnvelope $envelope, array $args)
{
$this->client->putObject([
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if saving this instruction file fails? Is the ciphertext blob deleted from S3?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The $strategy->save() call is made before the PutObject call for the ciphertext. The instruction file will remain on S3 if there is a failure after it has been uploaded, not automatically deleted. I'll add a note to the guide and update the doc block for this file.


$envelope = new MetadataEnvelope();

return Promise\promise_for($this->encrypt(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encrypt isn't returning a promise, is it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not returning a promise, but an AesStreamInterface. This is why it's the only piece wrapped in the Promise\promise_for call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's more of a style issue, but why does this need to be in an async context at all? With the exception of $this->client->putObjectAsync, this chain of functions seems to be operating synchronously and then returning non-promise values. Wouldn't it be easier to read if it was just iterative statements that returned the promise from putObjectAsync at the end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thought on async basing the manipulation was if someone invokes $s3EncryptionClient->putObjectAsync(), let's defer the heavy lifting that makes up the encryption work and other calls until they are executed. If someone invokes $s3EncryptionClient->putObject(), the work is immediately waited for anyways.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gotcha that makes sense

public function plainTextProvider() {
return [
[Psr7\stream_for('The rain in Spain falls mainly on the plain.')],
[Psr7\stream_for('دست‌نوشته‌ها نمی‌سوزند')],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is a copy-paste error. IIRC, this case (and the following one) were meant to exercise round-tripping of multibyte characters, e.g.:

[Psr7\stream_for('دست‌نوشته‌ها نمی‌سوزند')],
[Psr7\stream_for('Рукописи не горят')],

$provider = new KmsMaterialsProvider($client, $keyId);

$this->assertEquals(
['kms_cmk_id' => $keyId],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From context, I'm guessing that CEK stands for "content encryption key" and CMK stands for "content master key"; why not just use EncryptionKey and MasterKey as the parameter names?

@@ -0,0 +1,42 @@
<?php

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test classes should probably be in a namespace corresponding to the directory structure.

Copy link
Contributor

@cjyclaire cjyclaire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't gone through all yet, post my stashed comments before I forget them : )

*/
abstract class AbstractCryptoClient
{
private static $supportedCiphers = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why a hash? why not a array of supported ciphers?

* Dependency to provide an interface for building a decryption stream for
* cipher text given metadata and materials to do so.
*
* @param string $cipherText Plain-text data to be encrypted using the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo, decrypted text data?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

switch ($args['@MetadataStrategy']) {
case HeadersMetadataStrategy::class:
return new HeadersMetadataStrategy();
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm "return" is sufficient without "break"?

);
}

return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm PHP will return null if there is no return

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wanted to be explicit here though since there are other return statements in this function.

* @throws \InvalidArgumentException Thrown when arguments above are not
* passed or are passed incorrectly.
*/
public function putObjectAsync(array $args) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the failed scenario look like? Especially for "getObject"? It will be great if some error scenario (besides invalid arguments) can be tested as well

Feature: S3 Client Side Encryption

Scenario: Upload PHP's GCM encrypted fixtures
When I get all fixtures for "aes_gcm" from "aws-s3-shared-tests"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The S3 test should be creating and deleting buckets via BeforeSuite and AfterSuite hooks. I understand that this test suite has to work this way to integrate with other SDK tests, but maybe in the future there could a test that works on multiple accounts.

@kstich kstich merged commit a2f90a9 into aws:master Nov 8, 2017
@kstich kstich deleted the s3_client_side_encryption branch January 17, 2018 05:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants