Encrypted Repository with AES GCM Packets #48221

Conversation

albertzaharovits (Contributor)

This is the first PR to a new feature branch that will implement client-side encryption for snapshot repositories. It is a follow-up of #46170.

This introduces two main classes: EncryptedRepository and GCMPacketsCipherInputStream.

EncryptedRepository is the simpler one and is mostly unchanged since #46170. It ultimately instantiates a BlobContainer that delegates to another BlobContainer and wraps the InputStreams returned or consumed by the delegated BlobContainer#readBlob and BlobContainer#writeBlob methods inside a CipherInputStream, namely the GCMPacketsCipherInputStream. Every blob is encrypted with a randomly generated data encryption secret key. This key is "wrapped" (also encrypted) by a master secret key which is derived from a password stored in the Elasticsearch keystore. The password setting name relates to the repository name (see below). This class also explicitly instantiates and uses the BC FIPS security provider in all JCA crypto APIs.
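As a rough illustration of that delegation pattern, here is a minimal sketch. The SimpleBlobContainer interface, the single in-memory key, and the IV-prepending layout are assumptions for brevity; the actual PR uses per-blob wrapped keys, packet-wise GCM, and the BC FIPS provider.

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Simplified stand-in for the real BlobContainer interface (illustrative only).
interface SimpleBlobContainer {
    InputStream readBlob(String name) throws IOException;
    void writeBlob(String name, InputStream data) throws IOException;
}

// Every read/write stream of the wrapped container is decorated with a decrypting/encrypting stream.
final class EncryptingBlobContainer implements SimpleBlobContainer {
    private static final int GCM_TAG_BITS = 128;
    private static final int GCM_IV_BYTES = 12;

    private final SimpleBlobContainer delegate;
    private final SecretKey dataKey; // per-blob, randomly generated and wrapped by the master key in the real design
    private final SecureRandom random = new SecureRandom();

    EncryptingBlobContainer(SimpleBlobContainer delegate) throws GeneralSecurityException {
        this.delegate = delegate;
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);
        this.dataKey = keyGen.generateKey();
    }

    @Override
    public void writeBlob(String name, InputStream data) throws IOException {
        try {
            byte[] iv = new byte[GCM_IV_BYTES];
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, dataKey, new GCMParameterSpec(GCM_TAG_BITS, iv));
            // Prepend the IV so readBlob can reconstruct the cipher parameters.
            InputStream encrypted = new SequenceInputStream(
                    new ByteArrayInputStream(iv), new CipherInputStream(data, cipher));
            delegate.writeBlob(name, encrypted);
        } catch (GeneralSecurityException e) {
            throw new IOException(e);
        }
    }

    @Override
    public InputStream readBlob(String name) throws IOException {
        try {
            InputStream encrypted = delegate.readBlob(name);
            byte[] iv = encrypted.readNBytes(GCM_IV_BYTES);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, dataKey, new GCMParameterSpec(GCM_TAG_BITS, iv));
            return new CipherInputStream(encrypted, cipher);
        } catch (GeneralSecurityException e) {
            throw new IOException(e);
        }
    }
}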

GCMPacketsCipherInputStream is the crux of the PR. It does encryption and decryption using the AES/GCM mode in a packet fashion. This divides the byte stream to be processed into packets that are encrypted with AES/GCM independently (each has its own authentication tag and is encrypted with the same key but a different IV). This segmentation is done so that the CipherInputStream supports mark/reset without buffering (re-encryption is OK). Therefore a mark followed by a reset will re-encrypt all the packets in between the two calls. However, if the mark call occurred during the processing of a packet, all processed bytes until the next packet boundary is encountered will be buffered.
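A hedged sketch of the packet idea follows. The packet length, the 8-byte nonce + 4-byte counter IV layout, and the absence of associated data are illustrative assumptions, not the actual on-disk format of GCMPacketsCipherInputStream.

import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Packet-wise AES/GCM: split the plaintext into fixed-size packets and encrypt each one
// independently with the same key but a distinct IV (random nonce + packet counter),
// so any packet can be re-encrypted or re-read without touching the others.
public class GcmPacketsSketch {
    static final int PACKET_SIZE = 64 * 1024; // illustrative, not the real packet length
    static final int TAG_BITS = 128;

    public static byte[] encrypt(byte[] plain, SecretKey key, byte[] nonce8) throws GeneralSecurityException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        int packetIndex = 0;
        for (int off = 0; off < plain.length || packetIndex == 0; off += PACKET_SIZE, packetIndex++) {
            int len = Math.min(PACKET_SIZE, plain.length - off);
            // 12-byte IV = 8-byte random nonce || 4-byte packet counter: same key, unique IV per packet.
            byte[] iv = ByteBuffer.allocate(12).put(nonce8).putInt(packetIndex).array();
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            out.writeBytes(cipher.doFinal(plain, off, len)); // packet ciphertext + its own 16-byte tag
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256);
        byte[] nonce = new byte[8];
        new SecureRandom().nextBytes(nonce);
        byte[] cipherText = encrypt("hello packets".getBytes(), kg.generateKey(), nonce);
        System.out.println("ciphertext+tags length: " + cipherText.length);
    }
}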

@bizybot the code in #46170 already had the password setting in the keystore; I was mistaken when I said it was a repository setting.

Here is how I run/test this:

diff --git a/distribution/build.gradle b/distribution/build.gradle
index 0a630bc44d1..337116c3560 100644
--- a/distribution/build.gradle
+++ b/distribution/build.gradle
@@ -433,6 +433,8 @@ testClusters {
       setting 'xpack.monitoring.enabled', 'true'
       setting 'xpack.sql.enabled', 'true'
       setting 'xpack.rollup.enabled', 'true'
+         setting 'path.repo', '/tmp/repo'
+         keystore 'repository.encrypted.encrypted-fs.password', 'passwordpasswordpassword'
       keystore 'bootstrap.password', 'password'
       user username: 'elastic-admin', password: 'elastic-password', role: 'superuser'
     }
curl -X PUT "localhost:9200/_snapshot/encrypted-fs?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "encrypted",
  "settings": {
    "delegate_type": "fs",
    "location": "/tmp/repo/encrypted_repo"
  }
}
'

This definitely needs more testing! I have a list of tests to be added, as well as an Integ test.
But as you might notice, GCMPacketsCipherInputStreamTests is hard-core exhaustive (it uses fors instead of randoms) on most encrypt/decrypt + mark/reset scenarios.
That's because GCMPacketsCipherInputStream has A LOT of state and the interplay between so many internal state variables is difficult to cover with randoms. Also, I secretly worry about off-by-one errors, and exhaustive tests give me peace of mind.
I am very open to suggestions on how/what tests we should add next.
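For illustration, here is a tiny example of the exhaustive for-loop pattern. The stream under test is a plain ByteArrayInputStream rather than GCMPacketsCipherInputStream, just to show the shape of iterating over every mark position and read-ahead length instead of sampling them randomly.

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Arrays;

// Exhaustive mark/reset check: every mark position combined with every read-ahead length.
public class ExhaustiveMarkResetSketch {
    public static void main(String[] args) throws Exception {
        byte[] data = new byte[257]; // deliberately straddles a power-of-two boundary
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;

        for (int markAt = 0; markAt < data.length; markAt++) {
            for (int readAhead = 0; readAhead <= data.length - markAt; readAhead++) {
                InputStream in = new ByteArrayInputStream(data);
                in.readNBytes(markAt);         // advance to the mark position
                in.mark(Integer.MAX_VALUE);
                byte[] first = in.readNBytes(readAhead);
                in.reset();
                byte[] second = in.readNBytes(readAhead);
                if (!Arrays.equals(first, second)) {
                    throw new AssertionError("mismatch at mark=" + markAt + " readAhead=" + readAhead);
                }
            }
        }
        System.out.println("all mark/reset combinations verified");
    }
}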

Note: There are many pieces left in the air, keep in mind this is a PR against a feature branch.

@elasticmachine (Collaborator)

Pinging @elastic/es-security (:Security/Security)

}

thirdPartyAudit {
ignoreViolations (
Contributor

we should add a comment (or a TODO to do it) to say why it's ok to ignore those

Contributor Author

I will add integ tests when everything is put together; we won't merge to master without integ tests for sure. I will add a TODO if you think we should.


@Override
public long length() {
return GCMPacketsCipherInputStream.getDecryptionSizeFromCipherSize(entry.getValue().length());
Member

Will this actually return the exact unencrypted size? It doesn't look like it does from the implementation. It isn't so important that it does anymore, since we changed the way we handle file sizes in the incremental snapshotting logic away from listing and towards using the metadata we store, but I'd still like to understand if this breaks being able to rely on these numbers going forward :)

Contributor Author

Thanks for the feedback Armin!
Yes it does. This implementation can compute the plain size from the cipher size, and vice versa, without reading any information from the metadata blob, because the GCM packets have a fixed size.

I have another alternative implementation inbound that must read the metadata blob to determine the decryption size. Arguably this implementation must also read the metadata blob if we ever want to change the packet size.

I will try not to break the requirement that the returned size is the correct one; it's not a difficult requirement.
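A back-of-the-envelope sketch of why fixed-size packets make the conversion possible in both directions. The 64 KiB packet size and the 28-byte per-packet overhead (16-byte tag + 12-byte IV) are assumptions for illustration, not the constants the PR actually uses.

// Each full (or final partial) packet carries a constant overhead, so plain and cipher
// sizes determine each other without consulting the metadata blob.
public class PacketSizeMath {
    static final long PACKET_PLAIN_SIZE = 64 * 1024;
    static final long PER_PACKET_OVERHEAD = 16 + 12;          // GCM tag + IV (assumed layout)
    static final long PACKET_CIPHER_SIZE = PACKET_PLAIN_SIZE + PER_PACKET_OVERHEAD;

    static long cipherSizeFromPlainSize(long plainSize) {
        long packets = Math.max(1, (plainSize + PACKET_PLAIN_SIZE - 1) / PACKET_PLAIN_SIZE);
        return plainSize + packets * PER_PACKET_OVERHEAD;
    }

    static long plainSizeFromCipherSize(long cipherSize) {
        long packets = Math.max(1, (cipherSize + PACKET_CIPHER_SIZE - 1) / PACKET_CIPHER_SIZE);
        return cipherSize - packets * PER_PACKET_OVERHEAD;
    }

    public static void main(String[] args) {
        for (long plain : new long[] {0, 1, PACKET_PLAIN_SIZE, PACKET_PLAIN_SIZE + 1, 5_000_000}) {
            long cipher = cipherSizeFromPlainSize(plain);
            System.out.println(plain + " -> " + cipher + " -> " + plainSizeFromCipherSize(cipher));
        }
    }
}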

Member

I will try not to break the requirement that the returned size is the correct one; it's not a difficult requirement.

Ok, don't worry about it too much though. We can adjust the Javadoc to account for this and be fine with that if exact sizes turn out to be a hassle. We don't really need this to be exact any more.

Member

Sorry, I just realised something important here :) Actually it would be best if you just returned the encrypted size here (i.e. did nothing and left it as is). We are only using this number (the blob size) for the output of the cleanup endpoint. Other than that, it has no functional impact. Unless we want to hide the size of the unencrypted blob (which seems like it's no secret anyway with the way things work), we're good to simply remove this override I think :)

Contributor Author

To me it feels "more correct" to return the size on disk for the blob, which in this case is the encrypted size. But we also pride ourselves that the encryption is completely transparent, which is not strictly the case if an encrypted repository returns different sizes than a plain one.

I get that you're also on my "more correct" side, so I'll make the change suggested (i.e. don't wrap it).

@original-brownbear (Member) left a comment

A few thoughts on the mechanics of handling the metadata. I'll try to give the stream-implementation a better read soon as well.

}

@Override
public DeleteResult delete() throws IOException {
Member

We should definitely override org.elasticsearch.common.blobstore.BlobContainer#deleteBlobsIgnoringIfNotExists, and maybe org.elasticsearch.common.blobstore.BlobContainer#deleteBlobIgnoringIfNotExists as well.
Otherwise we break bulk delete operations and potentially slow down deletes on e.g. S3 by a factor of 100-1k.
Also, we can nicely optimize things here: take the list of blobs passed to the bulk delete and, instead of running individual deletes for the meta and data blobs, run a single bulk delete on a list of twice the size of the given one.

For the single-file deleteBlobIgnoringIfNotExists we need to do the same, and make sure not to fail deleting the data if the metadata is missing, IMO.
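A minimal sketch of that bulk-delete shape, assuming a simplified interface and a hypothetical ".meta" companion-blob naming convention (neither is the actual Elasticsearch API or on-disk layout).

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Stand-in for the bulk-delete method of the real BlobContainer (illustrative only).
interface BulkDeletingContainer {
    void deleteBlobsIgnoringIfNotExists(List<String> blobNames) throws IOException;
}

final class EncryptedBulkDelete implements BulkDeletingContainer {
    private final BulkDeletingContainer delegate;

    EncryptedBulkDelete(BulkDeletingContainer delegate) {
        this.delegate = delegate;
    }

    @Override
    public void deleteBlobsIgnoringIfNotExists(List<String> blobNames) throws IOException {
        // Twice the size of the input: every data blob plus its metadata blob,
        // deleted in one bulk request so S3-style backends keep their batching.
        List<String> dataAndMeta = new ArrayList<>(blobNames.size() * 2);
        for (String name : blobNames) {
            dataAndMeta.add(name);           // the encrypted data blob
            dataAndMeta.add(name + ".meta"); // hypothetical name of the wrapped-key blob
        }
        delegate.deleteBlobsIgnoringIfNotExists(dataAndMeta);
    }
}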

Member

I would also stress here that bulk deletes aren't an optional performance optimisation (at least IMO, given how poorly non-bulk deletes used to perform in practice ...). Otherwise we're creating a situation where a snapshot delete against S3 might take a minute with a bare S3 repo and multiple hours with client-side encryption.

import java.util.Map;
import java.util.function.Function;

public class EncryptedRepository extends BlobStoreRepository {
Member

Just a note for next steps here: I think we need to override the cleanup method on the blob store repository. The current implementation might leak data blobs for which the metadata has been deleted already (and vice versa). If we don't have that, bulk deletes (see my other comment) will be a nightmare in terms of leaking data.
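As a rough sketch of such a cleanup pass, assuming listings are available as plain maps and a hypothetical ".meta" suffix for the metadata blobs (both placeholders, not the PR's actual layout).

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Compare the metadata listing against the data listing and report dangling metadata
// blobs, i.e. wrapped-key blobs whose data blob is gone (remnants of a half-failed delete).
public class DanglingBlobCleanupSketch {
    static List<String> danglingMetadata(Map<String, Long> dataBlobs, Map<String, Long> metaBlobs) {
        List<String> toDelete = new ArrayList<>();
        for (String metaName : metaBlobs.keySet()) {
            String dataName = metaName.substring(0, metaName.length() - ".meta".length());
            if (!dataBlobs.containsKey(dataName)) {
                toDelete.add(metaName); // metadata left behind after its data blob was deleted
            }
        }
        return toDelete;
    }

    public static void main(String[] args) {
        Map<String, Long> data = Map.of("blob-1", 100L);
        Map<String, Long> meta = Map.of("blob-1.meta", 1L, "blob-2.meta", 1L);
        System.out.println(danglingMetadata(data, meta)); // prints [blob-2.meta]
    }
}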


}

@Override
public Map<String, BlobMetaData> listBlobs() throws IOException {
Member

I think we should base this method and listBlobsByPrefix on the metadata instead of the data. We are using listing for two purposes:

  1. Finding the latest index-N blob (that will change but for now it is what it is and should work correctly)
  2. Finding unreferenced blobs before cleaning them up

The former will only work correctly for blobs that have their metadata intact. If we delete the encryption metadata first, we must not list out dangling data blobs for which the metadata is gone, IMO. Otherwise we fundamentally change the behaviour of the blob container and the guarantees we're assuming for its operations.

The fact that this will break the cleanup should be addressed by overriding the cleanup logic accordingly (see my other comment on that).

Contributor Author

From #48221 (comment) I see that what you expect is that if the metadata exists then the encrypted data exists as well.

But I wrote it the other way around: if the encrypted data exists then the metadata exists as well.

This is the perspective from which write, delete and listBlobsByPrefix are written; please correct me if I got it wrong. For the read operation, the relation is not important because both must be available for the operation to succeed.

I have chosen this relation and not the converse you suggested because encrypted data without the decryption key is a scary thing. But also, practically, it is much more efficient to leak encryption metadata than the actual data, given the size discrepancy between the two (and my choice tolerates orphaned encryption metadata).
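A small sketch of the ordering that maintains the relation "if the encrypted data blob exists, its metadata blob exists too": write the metadata before the data, and delete the data before the metadata. The store interface and the ".meta" suffix are illustrative placeholders, not the PR's actual API or naming.

import java.io.IOException;
import java.io.InputStream;

// Simplified stand-in for a blob store (illustrative only).
interface SimpleStore {
    void write(String name, InputStream content) throws IOException;
    void delete(String name) throws IOException;
}

final class OrderedEncryptedWrites {
    private final SimpleStore store;

    OrderedEncryptedWrites(SimpleStore store) {
        this.store = store;
    }

    void writeBlob(String name, InputStream wrappedKey, InputStream encryptedData) throws IOException {
        store.write(name + ".meta", wrappedKey); // 1. metadata first: a half-failed write leaves only metadata
        store.write(name, encryptedData);        // 2. data second: its presence implies the key is readable
    }

    void deleteBlob(String name) throws IOException {
        store.delete(name);                      // 1. data first: a half-failed delete leaves only metadata
        store.delete(name + ".meta");            // 2. metadata last
    }
}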

} catch (NoSuchAlgorithmException | NoSuchPaddingException | InvalidKeyException | IllegalBlockSizeException e) {
throw new IOException(e);
}
try (InputStream stream = new ByteArrayInputStream(wrappedDataEncryptionKey)) {
Member

I think we have to be careful with the ordering here.
We currently delete the metadata first. That means we must write it last; otherwise, we lose all consistency guarantees. Finding a metadata blob should imply that a data blob exists, IMO. If it doesn't, we can't do a safe listing operation on either the metadata or the data blobs (because a data blob could be the remnant of a half-failed delete and a meta blob could be the remnant of a half-failed write).
If we write meta last and delete it first, then meta blobs can be used as a safe proxy for listing out data blobs and it all works out nicely, right? (see my comment on the list methods also)

Contributor Author

See my previous comment. It depends which direction of the relation between the two blobs we pick, so that we tolerate half-writes and half-deletes. I think I've been consistent with my choice, but let's discuss it if you disagree.

Member

Big sorry here for the noise! I read the delete method the wrong way around (I thought you deleted the meta before the data) and all the other comments resulted from that mistake.
-> I think your solution is just fine :)

There is one potential upside though that I see to basing it all on the meta (that made me suggest going with the inverse order instead of this one):

Currently, adding one meta blob for each blob we write is quite a bit of overhead. A solution that groups the metadata in some way may be a bit tricky as a step 1 (with the current APIs we have for the blobstore repo), but I think it's not something we should rule out in subsequent steps. Once the work on metadata handling consistency to fix the remaining S3 issues and #45736 finishes, we'll have more options here I think.
If we use the metadata as the source of truth and allow dangling data blobs so long as they're not referenced by other snapshot-level metadata, I think that leaves us a lot more options in lowering the number of puts and deletes, without the tricky (BwC-wise) move from data blobs to meta blobs going forward. Admittedly, that's a bit of a random and non-concrete point, and if you think it's easier to go with the current order I'm sure we'll find a BwC solution down the road when we get to optimising things :)

albertzaharovits added a commit that referenced this pull request Jan 6, 2020
This adds a new bare snapshot repository project which contains the classes implementing encryption (and decryption) input stream decorators that support mark and reset.

Relates #48221 , #46170
@albertzaharovits (Contributor Author)

@original-brownbear I've raised #50846 to replace this one. Conceptually there is not a big difference, but since this PR brings a lot of things together under the same roof, the changes amounted to modifications on every line, so I decided to start from a clean slate. I hope this is not too much of an inconvenience to you.

albertzaharovits added a commit that referenced this pull request Nov 25, 2020
albertzaharovits added a commit to albertzaharovits/elasticsearch that referenced this pull request Nov 30, 2020
albertzaharovits added a commit that referenced this pull request Dec 23, 2020
The client-side encrypted repository is a new type of snapshot repository that
internally delegates to the regular variants of snapshot repositories (of types
Azure, S3, GCS, FS, and maybe others but not yet tested). After the encrypted
repository is set up, it is transparent to the snapshot and restore APIs (i.e. all
snapshots stored in the encrypted repository are encrypted, no other parameters
required).
The encrypted repository is protected by a password stored on every node's
keystore (which must be the same across the nodes).
The password is used to generate a key encryption key (KEK), using the PBKDF2
function, which is used to encrypt (using the AES Wrap algorithm) other
symmetric keys (referred to as DEK - data encryption keys), which themselves
are generated randomly, and which are ultimately used to encrypt the snapshot
blobs.
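
A minimal sketch of this KEK/DEK scheme, using the default JCA provider; the PBKDF2 PRF, iteration count, and salt handling below are illustrative assumptions, not the PR's actual parameters.

import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Derive a KEK from the repository password with PBKDF2, then AES-wrap a random DEK with it.
public class KekDekSketch {
    public static void main(String[] args) throws Exception {
        char[] password = "passwordpasswordpassword".toCharArray();
        byte[] salt = new byte[32];
        new SecureRandom().nextBytes(salt);

        // 1. Derive the KEK from the keystore password with PBKDF2.
        SecretKeyFactory pbkdf2 = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA512");
        byte[] kekBytes = pbkdf2.generateSecret(new PBEKeySpec(password, salt, 10_000, 256)).getEncoded();
        SecretKey kek = new SecretKeySpec(kekBytes, "AES");

        // 2. Generate a random DEK that actually encrypts the snapshot blobs.
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);
        SecretKey dek = keyGen.generateKey();

        // 3. Wrap the DEK with the KEK (AES key wrap); the wrapped bytes are what gets stored.
        Cipher wrapCipher = Cipher.getInstance("AESWrap");
        wrapCipher.init(Cipher.WRAP_MODE, kek);
        byte[] wrappedDek = wrapCipher.wrap(dek);
        System.out.println("wrapped DEK length: " + wrappedDek.length + " bytes");

        // 4. Unwrapping on restore recovers the original DEK.
        Cipher unwrapCipher = Cipher.getInstance("AESWrap");
        unwrapCipher.init(Cipher.UNWRAP_MODE, kek);
        SecretKey recovered = (SecretKey) unwrapCipher.unwrap(wrappedDek, "AES", Cipher.SECRET_KEY);
        System.out.println("DEK recovered: " + Arrays.equals(dek.getEncoded(), recovered.getEncoded()));
    }
}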

For example, here is how to set up an encrypted FS repository:
------
 1) make sure that the cluster runs under at least a "platinum" license
(simplest test configuration is to put `xpack.license.self_generated.type: "trial"`
in the elasticsearch.yml file)
 2) identical to the un-encrypted FS repository, specify the mount point of the
shared FS in the elasticsearch.yml conf file (on all the cluster nodes),
e.g. `path.repo: ["/tmp/repo"]`
 3) store the repository password inside the elasticsearch.keystore, *on every cluster node*.
In order to support changing the password on an existing repository (implemented in a follow-up),
the password itself must be named, e.g. for the "test_enc_pass" repository password name:
`./bin/elasticsearch-keystore add repository.encrypted.test_enc_pass.password`
*type in the password*
4) start up the cluster and create the new encrypted FS repository, named "test_enc", by calling:
`
curl -X PUT "localhost:9200/_snapshot/test_enc?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "encrypted",
  "settings": {
    "location": "/tmp/repo/enc",
    "delegate_type": "fs",
    "password_name": "test_enc_pass"
  }
}
'
`
5) the snapshot and restore APIs work unmodified when they refer to this new repository, e.g.
` curl -X PUT "localhost:9200/_snapshot/test_enc/snapshot_1?wait_for_completion=true"`


Related: #49896 #41910 #50846 #48221 #65768
albertzaharovits added a commit to albertzaharovits/elasticsearch that referenced this pull request Dec 23, 2020
albertzaharovits added a commit that referenced this pull request Dec 28, 2020
Labels: >feature, :Security/Security