Encrypted Repository with AES GCM Packets #48221

Conversation

albertzaharovits (Contributor)

This is the first PR to a new feature branch that will implement client-side encryption for snapshot repositories. It is a follow-up of #46170.

This introduces two main classes: EncryptedRepository and GCMPacketsCipherInputStream.

EncryptedRepository is the simpler one and is mostly unchanged since #46170. It ultimately instantiates a BlobContainer that delegates to another BlobContainer and wraps the InputStreams returned or consumed by the delegated BlobContainer#readBlob and BlobContainer#writeBlob methods inside a CipherInputStream, namely the GCMPacketsCipherInputStream. Every blob is encrypted with a randomly generated data encryption secret key. This key is "wrapped" (also encrypted) by a master secret key which is derived from a password stored in the Elasticsearch keystore. The password setting name relates to the repository name (see below). This class also explicitly instantiates and uses the BC FIPS security provider in all JCA crypto APIs.
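As a rough illustration of that delegation pattern, here is a minimal sketch. The SimpleBlobContainer interface, the single in-memory key, and the IV-prepending layout are assumptions for brevity; the actual PR uses per-blob wrapped keys, packet-wise GCM, and the BC FIPS provider.

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Simplified stand-in for the real BlobContainer interface (illustrative only).
interface SimpleBlobContainer {
    InputStream readBlob(String name) throws IOException;
    void writeBlob(String name, InputStream data) throws IOException;
}

// Every read/write stream of the wrapped container is decorated with a decrypting/encrypting stream.
final class EncryptingBlobContainer implements SimpleBlobContainer {
    private static final int GCM_TAG_BITS = 128;
    private static final int GCM_IV_BYTES = 12;

    private final SimpleBlobContainer delegate;
    private final SecretKey dataKey; // per-blob, randomly generated and wrapped by the master key in the real design
    private final SecureRandom random = new SecureRandom();

    EncryptingBlobContainer(SimpleBlobContainer delegate) throws GeneralSecurityException {
        this.delegate = delegate;
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);
        this.dataKey = keyGen.generateKey();
    }

    @Override
    public void writeBlob(String name, InputStream data) throws IOException {
        try {
            byte[] iv = new byte[GCM_IV_BYTES];
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, dataKey, new GCMParameterSpec(GCM_TAG_BITS, iv));
            // Prepend the IV so readBlob can reconstruct the cipher parameters.
            InputStream encrypted = new SequenceInputStream(
                    new ByteArrayInputStream(iv), new CipherInputStream(data, cipher));
            delegate.writeBlob(name, encrypted);
        } catch (GeneralSecurityException e) {
            throw new IOException(e);
        }
    }

    @Override
    public InputStream readBlob(String name) throws IOException {
        try {
            InputStream encrypted = delegate.readBlob(name);
            byte[] iv = encrypted.readNBytes(GCM_IV_BYTES);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, dataKey, new GCMParameterSpec(GCM_TAG_BITS, iv));
            return new CipherInputStream(encrypted, cipher);
        } catch (GeneralSecurityException e) {
            throw new IOException(e);
        }
    }
}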

GCMPacketsCipherInputStream is the crux of the PR. It does encryption and decryption using the AES/GCM mode in a packet fashion. This divides the byte stream to be processed into packets that are encrypted with AES/GCM independently (each has its own authentication tag and is encrypted with the same key but a different IV). This segmentation is done so that the CipherInputStream supports mark/reset without buffering (re-encryption is OK). Therefore a mark followed by a reset will re-encrypt all the packets in between the two calls. However, if the mark call occurred during the processing of a packet, all processed bytes until the next packet boundary is encountered will be buffered.
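A hedged sketch of the packet idea follows. The packet length, the 8-byte nonce + 4-byte counter IV layout, and the absence of associated data are illustrative assumptions, not the actual on-disk format of GCMPacketsCipherInputStream.

import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Packet-wise AES/GCM: split the plaintext into fixed-size packets and encrypt each one
// independently with the same key but a distinct IV (random nonce + packet counter),
// so any packet can be re-encrypted or re-read without touching the others.
public class GcmPacketsSketch {
    static final int PACKET_SIZE = 64 * 1024; // illustrative, not the real packet length
    static final int TAG_BITS = 128;

    public static byte[] encrypt(byte[] plain, SecretKey key, byte[] nonce8) throws GeneralSecurityException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        int packetIndex = 0;
        for (int off = 0; off < plain.length || packetIndex == 0; off += PACKET_SIZE, packetIndex++) {
            int len = Math.min(PACKET_SIZE, plain.length - off);
            // 12-byte IV = 8-byte random nonce || 4-byte packet counter: same key, unique IV per packet.
            byte[] iv = ByteBuffer.allocate(12).put(nonce8).putInt(packetIndex).array();
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            out.writeBytes(cipher.doFinal(plain, off, len)); // packet ciphertext + its own 16-byte tag
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256);
        byte[] nonce = new byte[8];
        new SecureRandom().nextBytes(nonce);
        byte[] cipherText = encrypt("hello packets".getBytes(), kg.generateKey(), nonce);
        System.out.println("ciphertext+tags length: " + cipherText.length);
    }
}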

@bizybot the code in #46170 already had the password setting in the keystore; I was mistaken when I said it was a repository setting.

Here is how I run/test this:

diff --git a/distribution/build.gradle b/distribution/build.gradle
index 0a630bc44d1..337116c3560 100644
--- a/distribution/build.gradle
+++ b/distribution/build.gradle
@@ -433,6 +433,8 @@ testClusters {
       setting 'xpack.monitoring.enabled', 'true'
       setting 'xpack.sql.enabled', 'true'
       setting 'xpack.rollup.enabled', 'true'
+         setting 'path.repo', '/tmp/repo'
+         keystore 'repository.encrypted.encrypted-fs.password', 'passwordpasswordpassword'
       keystore 'bootstrap.password', 'password'
       user username: 'elastic-admin', password: 'elastic-password', role: 'superuser'
     }
curl -X PUT "localhost:9200/_snapshot/encrypted-fs?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "encrypted",
  "settings": {
    "delegate_type": "fs",
    "location": "/tmp/repo/encrypted_repo"
  }
}
'

This definitely needs more testing! I have a list of tests to be added, as well as an Integ test.
But as you might notice, GCMPacketsCipherInputStreamTests is hard-core exhaustive (it uses fors instead of randoms) on most encrypt/decrypt + mark/reset scenarios.
That's because GCMPacketsCipherInputStream has A LOT of state and the interplay between so many internal state variables is difficult to cover with randoms. Also, I secretly worry about off-by-one errors, and exhaustive tests give me peace of mind.
I am very open to suggestions on how/what tests we should add next.
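For illustration, here is a tiny example of the exhaustive for-loop pattern. The stream under test is a plain ByteArrayInputStream rather than GCMPacketsCipherInputStream, just to show the shape of iterating over every mark position and read-ahead length instead of sampling them randomly.

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Arrays;

// Exhaustive mark/reset check: every mark position combined with every read-ahead length.
public class ExhaustiveMarkResetSketch {
    public static void main(String[] args) throws Exception {
        byte[] data = new byte[257]; // deliberately straddles a power-of-two boundary
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;

        for (int markAt = 0; markAt < data.length; markAt++) {
            for (int readAhead = 0; readAhead <= data.length - markAt; readAhead++) {
                InputStream in = new ByteArrayInputStream(data);
                in.readNBytes(markAt);         // advance to the mark position
                in.mark(Integer.MAX_VALUE);
                byte[] first = in.readNBytes(readAhead);
                in.reset();
                byte[] second = in.readNBytes(readAhead);
                if (!Arrays.equals(first, second)) {
                    throw new AssertionError("mismatch at mark=" + markAt + " readAhead=" + readAhead);
                }
            }
        }
        System.out.println("all mark/reset combinations verified");
    }
}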

Note: There are many pieces left in the air, keep in mind this is a PR against a feature branch.

@elasticmachine (Collaborator)

Pinging @elastic/es-security (:Security/Security)

}

thirdPartyAudit {
ignoreViolations (
Contributor

we should add a comment (or a TODO to do it) to say why it's ok to ignore those

Contributor Author

I will add integ tests when everything is put together; we won't merge to master without integ tests for sure. I will add a TODO if you think we should.


@Override
public long length() {
return GCMPacketsCipherInputStream.getDecryptionSizeFromCipherSize(entry.getValue().length());
Member

Will this actually return the exact unencrypted size? It doesn't look like it does from the implementation. It isn't so important that it does anymore, since we changed the way we handle file sizes in the incremental snapshotting logic away from listing and towards using the metadata we store, but I'd still like to understand if this breaks being able to rely on these numbers going forward :)

Contributor Author

Thanks for the feedback Armin!
Yes it does. This implementation can compute the plain size from the cipher size, and vice versa, without reading any information from the metadata blob, because the GCM packets have a fixed size.

I have another alternative implementation inbound that must read the metadata blob to determine the decryption size. Arguably this implementation must also read the metadata blob if we ever want to change the packet size.

I will try not to break the requirement that the returned size is the correct one; it's not a difficult requirement.
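A back-of-the-envelope sketch of why fixed-size packets make the conversion possible in both directions. The 64 KiB packet size and the 28-byte per-packet overhead (16-byte tag + 12-byte IV) are assumptions for illustration, not the constants the PR actually uses.

// Each full (or final partial) packet carries a constant overhead, so plain and cipher
// sizes determine each other without consulting the metadata blob.
public class PacketSizeMath {
    static final long PACKET_PLAIN_SIZE = 64 * 1024;
    static final long PER_PACKET_OVERHEAD = 16 + 12;          // GCM tag + IV (assumed layout)
    static final long PACKET_CIPHER_SIZE = PACKET_PLAIN_SIZE + PER_PACKET_OVERHEAD;

    static long cipherSizeFromPlainSize(long plainSize) {
        long packets = Math.max(1, (plainSize + PACKET_PLAIN_SIZE - 1) / PACKET_PLAIN_SIZE);
        return plainSize + packets * PER_PACKET_OVERHEAD;
    }

    static long plainSizeFromCipherSize(long cipherSize) {
        long packets = Math.max(1, (cipherSize + PACKET_CIPHER_SIZE - 1) / PACKET_CIPHER_SIZE);
        return cipherSize - packets * PER_PACKET_OVERHEAD;
    }

    public static void main(String[] args) {
        for (long plain : new long[] {0, 1, PACKET_PLAIN_SIZE, PACKET_PLAIN_SIZE + 1, 5_000_000}) {
            long cipher = cipherSizeFromPlainSize(plain);
            System.out.println(plain + " -> " + cipher + " -> " + plainSizeFromCipherSize(cipher));
        }
    }
}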

Member

I will try not to break the requirement that the returned size is the correct one; it's not a difficult requirement.

Ok, don't worry about it too much though. We can adjust the Javadoc to account for this and be fine with that if exact sizes turn out to be a hassle. We don't really need this to be exact any more.

Member

Sorry, I just realised something important here :) Actually it would be best if you just returned the encrypted size here (i.e. did nothing and left it as is). We are only using this number (the blob size) for the output of the cleanup endpoint. Other than that, it has no functional impact. Unless we want to hide the size of the unencrypted blob (which seems like it's no secret anyway with the way things work), we're good to simply remove this override I think :)

Contributor Author

To me it feels "more correct" to return the size on disk for the blob, which in this case is the encrypted size. But we also pride ourselves that the encryption is completely transparent, which is not strictly the case if an encrypted repository returns different sizes than a plain one.

I get that you're also on my "more correct" side, so I'll make the change suggested (i.e. don't wrap it).

@original-brownbear (Member) left a comment

A few thoughts on the mechanics of handling the metadata. I'll try to give the stream-implementation a better read soon as well.

}

@Override
public DeleteResult delete() throws IOException {
Member

We should definitely override org.elasticsearch.common.blobstore.BlobContainer#deleteBlobsIgnoringIfNotExists, and maybe org.elasticsearch.common.blobstore.BlobContainer#deleteBlobIgnoringIfNotExists as well.
Otherwise we break bulk delete operations and potentially slow down deletes on e.g. S3 by a factor of 100-1k.
Also, we can nicely optimize things here: take the list of blobs passed to the bulk delete and, instead of running individual deletes for the meta and data blobs, run a single bulk delete on a list of twice the size of the given one.

For the single-file deleteBlobIgnoringIfNotExists we need to do the same, and make sure not to fail deleting the data if the metadata is missing, IMO.
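A minimal sketch of that bulk-delete shape, assuming a simplified interface and a hypothetical ".meta" companion-blob naming convention (neither is the actual Elasticsearch API or on-disk layout).

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Stand-in for the bulk-delete method of the real BlobContainer (illustrative only).
interface BulkDeletingContainer {
    void deleteBlobsIgnoringIfNotExists(List<String> blobNames) throws IOException;
}

final class EncryptedBulkDelete implements BulkDeletingContainer {
    private final BulkDeletingContainer delegate;

    EncryptedBulkDelete(BulkDeletingContainer delegate) {
        this.delegate = delegate;
    }

    @Override
    public void deleteBlobsIgnoringIfNotExists(List<String> blobNames) throws IOException {
        // Twice the size of the input: every data blob plus its metadata blob,
        // deleted in one bulk request so S3-style backends keep their batching.
        List<String> dataAndMeta = new ArrayList<>(blobNames.size() * 2);
        for (String name : blobNames) {
            dataAndMeta.add(name);           // the encrypted data blob
            dataAndMeta.add(name + ".meta"); // hypothetical name of the wrapped-key blob
        }
        delegate.deleteBlobsIgnoringIfNotExists(dataAndMeta);
    }
}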

Member

I would also stress here that bulk deletes aren't an optional performance optimisation (at least IMO, given how poorly non-bulk deletes used to perform in practice ...). Otherwise we're creating a situation where a snapshot delete against S3 might take a minute with a bare S3 repo and multiple hours with client-side encryption.

import java.util.Map;
import java.util.function.Function;

public class EncryptedRepository extends BlobStoreRepository {
Member

Just a note for next steps here: I think we need to override the cleanup method on the blob store repository. The current implementation might leak data blobs for which the metadata has been deleted already (and vice versa). If we don't have that, bulk deletes (see my other comment) will be a nightmare in terms of leaking data.
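As a rough sketch of such a cleanup pass, assuming listings are available as plain maps and a hypothetical ".meta" suffix for the metadata blobs (both placeholders, not the PR's actual layout).

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Compare the metadata listing against the data listing and report dangling metadata
// blobs, i.e. wrapped-key blobs whose data blob is gone (remnants of a half-failed delete).
public class DanglingBlobCleanupSketch {
    static List<String> danglingMetadata(Map<String, Long> dataBlobs, Map<String, Long> metaBlobs) {
        List<String> toDelete = new ArrayList<>();
        for (String metaName : metaBlobs.keySet()) {
            String dataName = metaName.substring(0, metaName.length() - ".meta".length());
            if (!dataBlobs.containsKey(dataName)) {
                toDelete.add(metaName); // metadata left behind after its data blob was deleted
            }
        }
        return toDelete;
    }

    public static void main(String[] args) {
        Map<String, Long> data = Map.of("blob-1", 100L);
        Map<String, Long> meta = Map.of("blob-1.meta", 1L, "blob-2.meta", 1L);
        System.out.println(danglingMetadata(data, meta)); // prints [blob-2.meta]
    }
}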


}

@Override
public Map<String, BlobMetaData> listBlobs() throws IOException {
Member

I think we should base this method and listBlobsByPrefix on the metadata instead of the data. We are using listing for two purposes:

  1. Finding the latest index-N blob (that will change but for now it is what it is and should work correctly)
  2. Finding unreferenced blobs before cleaning them up

The former will only work correctly for blobs that have their metadata intact. If we delete the encryption metadata first, we must not list out dangling data blobs for which the metadata is gone, IMO. Otherwise we fundamentally change the behaviour of the blob container and the guarantees we're assuming for its operations.

The fact that this will break the cleanup should be addressed by overriding the cleanup logic accordingly (see my other comment on that).

Contributor Author

From #48221 (comment) I see that what you expect is that if the metadata exists then the encrypted data exists as well.

But I wrote it the other way around: if the encrypted data exists then the metadata exists as well.

This is the perspective from which write, delete and listBlobsByPrefix are written; please correct me if I got it wrong. For the read operation, the relation is not important because both must be available for the operation to succeed.

I have chosen this relation and not the converse you suggested because encrypted data without the decryption key is a scary thing. But also, practically, it is much more efficient to leak encryption metadata than the actual data, given the size discrepancy between the two (and my choice tolerates orphaned encryption metadata).
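A small sketch of the ordering that maintains the relation "if the encrypted data blob exists, its metadata blob exists too": write the metadata before the data, and delete the data before the metadata. The store interface and the ".meta" suffix are illustrative placeholders, not the PR's actual API or naming.

import java.io.IOException;
import java.io.InputStream;

// Simplified stand-in for a blob store (illustrative only).
interface SimpleStore {
    void write(String name, InputStream content) throws IOException;
    void delete(String name) throws IOException;
}

final class OrderedEncryptedWrites {
    private final SimpleStore store;

    OrderedEncryptedWrites(SimpleStore store) {
        this.store = store;
    }

    void writeBlob(String name, InputStream wrappedKey, InputStream encryptedData) throws IOException {
        store.write(name + ".meta", wrappedKey); // 1. metadata first: a half-failed write leaves only metadata
        store.write(name, encryptedData);        // 2. data second: its presence implies the key is readable
    }

    void deleteBlob(String name) throws IOException {
        store.delete(name);                      // 1. data first: a half-failed delete leaves only metadata
        store.delete(name + ".meta");            // 2. metadata last
    }
}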

} catch (NoSuchAlgorithmException | NoSuchPaddingException | InvalidKeyException | IllegalBlockSizeException e) {
throw new IOException(e);
}
try (InputStream stream = new ByteArrayInputStream(wrappedDataEncryptionKey)) {
Member

I think we have to be careful with the ordering here.
We currently delete the metadata first. That means we must write it last; otherwise, we lose all consistency guarantees. Finding a metadata blob should imply that a data blob exists, IMO. If it doesn't, we can't do a safe listing operation on either the metadata or the data blobs (because a data blob could be the remnant of a half-failed delete and a meta blob could be the remnant of a half-failed write).
If we write meta last and delete it first, then meta blobs can be used as a safe proxy for listing out data blobs and it all works out nicely, right? (see my comment on the list methods also)

Contributor Author

See my previous comment. It depends which direction of the relation between the two blobs we pick, so that we tolerate half-writes and half-deletes. I think I've been consistent with my choice, but let's discuss it if you disagree.

Member

Big sorry here for the noise! I read the delete method the wrong way around (I thought you deleted the meta before the data) and all the other comments resulted from that mistake.
-> I think your solution is just fine :)

There is one potential upside though that I see to basing it all on the meta (that made me suggest going with the inverse order instead of this one):

Currently, adding one meta blob for each blob we write is quite a bit of overhead. A solution that groups the metadata in some way may be a bit tricky as a step 1 (with the current APIs we have for the blobstore repo), but I think it's not something we should rule out in subsequent steps. Once the work on metadata handling consistency to fix the remaining S3 issues and #45736 finishes, we'll have more options here I think.
If we use the metadata as the source of truth and allow dangling data blobs so long as they're not referenced by other snapshot-level metadata, I think that leaves us a lot more options in lowering the number of puts and deletes, without the tricky (BwC-wise) move from data blobs to meta blobs going forward. Admittedly, that's a bit of a random and non-concrete point, and if you think it's easier to go with the current order I'm sure we'll find a BwC solution down the road when we get to optimising things :)

albertzaharovits added a commit that referenced this pull request Jan 6, 2020
This adds a new bare snapshot repository project which contains the classes implementing encryption (and decryption) input stream decorators that support mark and reset.

Relates #48221 , #46170
@albertzaharovits (Contributor Author)

@original-brownbear I've raised #50846 to replace this one. Conceptually there is not a big difference, but since this PR brings a lot of things together under the same roof, the changes amounted to modifications on every line, so I decided to start from a clean slate. I hope this is not too much of an inconvenience to you.

albertzaharovits added a commit that referenced this pull request Nov 25, 2020
albertzaharovits added a commit to albertzaharovits/elasticsearch that referenced this pull request Nov 30, 2020
albertzaharovits added a commit that referenced this pull request Dec 23, 2020
The client-side encrypted repository is a new type of snapshot repository that
internally delegates to the regular variants of snapshot repositories (of types
Azure, S3, GCS, FS, and maybe others but not yet tested). After the encrypted
repository is set up, it is transparent to the snapshot and restore APIs (i.e. all
snapshots stored in the encrypted repository are encrypted, no other parameters
required).
The encrypted repository is protected by a password stored on every node's
keystore (which must be the same across the nodes).
The password is used to generate a key encryption key (KEK), using the PBKDF2
function, which is used to encrypt (using the AES Wrap algorithm) other
symmetric keys (referred to as DEK - data encryption keys), which themselves
are generated randomly, and which are ultimately used to encrypt the snapshot
blobs.
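
A minimal sketch of this KEK/DEK scheme, using the default JCA provider; the PBKDF2 PRF, iteration count, and salt handling below are illustrative assumptions, not the PR's actual parameters.

import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Derive a KEK from the repository password with PBKDF2, then AES-wrap a random DEK with it.
public class KekDekSketch {
    public static void main(String[] args) throws Exception {
        char[] password = "passwordpasswordpassword".toCharArray();
        byte[] salt = new byte[32];
        new SecureRandom().nextBytes(salt);

        // 1. Derive the KEK from the keystore password with PBKDF2.
        SecretKeyFactory pbkdf2 = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA512");
        byte[] kekBytes = pbkdf2.generateSecret(new PBEKeySpec(password, salt, 10_000, 256)).getEncoded();
        SecretKey kek = new SecretKeySpec(kekBytes, "AES");

        // 2. Generate a random DEK that actually encrypts the snapshot blobs.
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);
        SecretKey dek = keyGen.generateKey();

        // 3. Wrap the DEK with the KEK (AES key wrap); the wrapped bytes are what gets stored.
        Cipher wrapCipher = Cipher.getInstance("AESWrap");
        wrapCipher.init(Cipher.WRAP_MODE, kek);
        byte[] wrappedDek = wrapCipher.wrap(dek);
        System.out.println("wrapped DEK length: " + wrappedDek.length + " bytes");

        // 4. Unwrapping on restore recovers the original DEK.
        Cipher unwrapCipher = Cipher.getInstance("AESWrap");
        unwrapCipher.init(Cipher.UNWRAP_MODE, kek);
        SecretKey recovered = (SecretKey) unwrapCipher.unwrap(wrappedDek, "AES", Cipher.SECRET_KEY);
        System.out.println("DEK recovered: " + Arrays.equals(dek.getEncoded(), recovered.getEncoded()));
    }
}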

For example, here is how to set up an encrypted FS repository:
------
 1) make sure that the cluster runs under at least a "platinum" license
(simplest test configuration is to put `xpack.license.self_generated.type: "trial"`
in the elasticsearch.yml file)
 2) identical to the un-encrypted FS repository, specify the mount point of the
shared FS in the elasticsearch.yml conf file (on all the cluster nodes),
e.g. `path.repo: ["/tmp/repo"]`
 3) store the repository password inside the elasticsearch.keystore, *on every cluster node*.
In order to support changing the password on an existing repository (implemented in a follow-up),
the password itself must be named, e.g. for the "test_enc_pass" repository password name:
`./bin/elasticsearch-keystore add repository.encrypted.test_enc_pass.password`
*type in the password*
4) start up the cluster and create the new encrypted FS repository, named "test_enc", by calling:
`
curl -X PUT "localhost:9200/_snapshot/test_enc?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "encrypted",
  "settings": {
    "location": "/tmp/repo/enc",
    "delegate_type": "fs",
    "password_name": "test_enc_pass"
  }
}
'
`
5) the snapshot and restore APIs work unmodified when they refer to this new repository, e.g.
` curl -X PUT "localhost:9200/_snapshot/test_enc/snapshot_1?wait_for_completion=true"`


Related: #49896 #41910 #50846 #48221 #65768
albertzaharovits added a commit to albertzaharovits/elasticsearch that referenced this pull request Dec 23, 2020
albertzaharovits added a commit that referenced this pull request Dec 28, 2020
Labels: >feature, :Security/Security