
Client Side Encrypted Snapshot Repositories #41910

Open
albertzaharovits opened this issue May 7, 2019 · 8 comments

@albertzaharovits
Contributor

commented May 7, 2019

This concerns the encryption of snapshot data before it leaves the nodes.

We have three cloud snapshot repository types: Google Cloud Storage, Azure Storage, and Amazon S3. Amazon and Azure support client-side encryption in their Java clients, but Google does not.

Amazon and Azure, which support client-side encryption, allow the keys to be managed either by the client (us) or by their Key Management Service (a Vault-like service). Both use the Envelope Encryption method: each blob is individually AES-256 encrypted with a locally generated random key, and this key (the Data/Content Encryption Key, DEK/CEK) is in turn encrypted with a Master Key (locally or by the Vault service) and stored alongside the blob in its metadata. Envelope encryption facilitates Master Key rotation because only the small DEK/CEK has to be re-encrypted, rather than the complete blob.
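
To make the scheme concrete, here is a minimal Java sketch of that flow. The class and method names are mine, and AES/GCM plus AES key wrap are illustrative choices, not necessarily the SDKs' exact modes:

```java
import java.security.SecureRandom;

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Minimal sketch of envelope encryption as described above; illustrative only.
public final class EnvelopeEncryptionSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    /** The ciphertext plus the wrapped DEK that is stored in the blob's metadata. */
    public record EncryptedBlob(byte[] ciphertext, byte[] iv, byte[] wrappedDek) {}

    public static EncryptedBlob encrypt(byte[] plaintext, SecretKey masterKey) throws Exception {
        // 1. Generate a fresh random AES-256 Data Encryption Key (DEK) for this blob.
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256, RANDOM);
        SecretKey dek = keyGen.generateKey();

        // 2. Encrypt the blob contents with the DEK.
        byte[] iv = new byte[12];
        RANDOM.nextBytes(iv);
        Cipher blobCipher = Cipher.getInstance("AES/GCM/NoPadding");
        blobCipher.init(Cipher.ENCRYPT_MODE, dek, new GCMParameterSpec(128, iv));
        byte[] ciphertext = blobCipher.doFinal(plaintext);

        // 3. Wrap the DEK with the Master Key (locally, or via the Vault service).
        //    On Master Key rotation only this small value is re-encrypted.
        Cipher wrapCipher = Cipher.getInstance("AESWrap");
        wrapCipher.init(Cipher.WRAP_MODE, masterKey);
        byte[] wrappedDek = wrapCipher.wrap(dek);

        return new EncryptedBlob(ciphertext, iv, wrappedDek);
    }
}
```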

On the ES side we discussed having a single, fixed URN key reference at the repository settings level.
This URN identifies the Master Key; for example, it could point to a key on the Amazon Vault Service or to the keys in each node's keystore. In this alternative it is not possible to rotate keys via the repository API (it might be possible to do it outside ES, which is obviously preferable, but see below).

I believe this is the rough picture of the puzzle that we need to put together.

We oscillated between implementation alternatives, and I will lay out the one I think is favorable. Whatever solution we initially implement, given that the Master Key identifier is a URN, we can multiplex multiple implementations for the same repository type.
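
For illustration, the multiplexing could look like this; the URN schemes, the interface, and the helper methods below are invented for the sketch, not an existing API:

```java
import javax.crypto.SecretKey;

// Hypothetical multiplexing of the repository-level URN to different Master
// Key providers; all names here are made up for illustration.
interface MasterKeyProvider {
    SecretKey masterKey();
}

final class MasterKeyProviders {

    static MasterKeyProvider forUrn(String urn) {
        if (urn.startsWith("urn:es:keystore:")) {
            // e.g. "urn:es:keystore:repo_key" -> secret held in each node's keystore
            String entry = urn.substring("urn:es:keystore:".length());
            return () -> loadFromNodeKeystore(entry);
        } else if (urn.startsWith("urn:aws:kms:")) {
            // e.g. an AWS KMS key identifier, resolved by the Vault-like service
            return () -> fetchFromKms(urn);
        }
        throw new IllegalArgumentException("Unsupported master key URN: " + urn);
    }

    private static SecretKey loadFromNodeKeystore(String entry) {
        throw new UnsupportedOperationException("sketch only");
    }

    private static SecretKey fetchFromKms(String urn) {
        throw new UnsupportedOperationException("sketch only");
    }
}
```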

We mirror the Envelope Encryption algorithm employed by Amazon and Azure at the BlobContainer level. The key is stored in each node's keystore (and is pointed to by the repository-level URN reference).
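
A rough decorator sketch of this idea, written against a simplified stand-in for the real org.elasticsearch.common.blobstore.BlobContainer interface (the actual signatures differ) and reusing the MasterKeyProvider interface from the sketch above:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Simplified stand-in for org.elasticsearch.common.blobstore.BlobContainer;
// the real interface has different signatures. Sketch only.
interface SimpleBlobContainer {
    InputStream readBlob(String name) throws IOException;
    void writeBlob(String name, InputStream data) throws IOException;
}

final class EncryptedBlobContainer implements SimpleBlobContainer {

    private final SimpleBlobContainer delegate;   // S3-, Azure-, GCS- or FS-backed container
    private final MasterKeyProvider keyProvider;  // resolved from the repository-level URN

    EncryptedBlobContainer(SimpleBlobContainer delegate, MasterKeyProvider keyProvider) {
        this.delegate = delegate;
        this.keyProvider = keyProvider;
    }

    @Override
    public void writeBlob(String name, InputStream data) throws IOException {
        // Envelope-encrypt before the bytes leave the node; the wrapped DEK
        // travels alongside the ciphertext (e.g. as a blob header).
        byte[] encrypted = envelopeEncrypt(data.readAllBytes());
        delegate.writeBlob(name, new ByteArrayInputStream(encrypted));
    }

    @Override
    public InputStream readBlob(String name) throws IOException {
        byte[] encrypted = delegate.readBlob(name).readAllBytes();
        return new ByteArrayInputStream(envelopeDecrypt(encrypted));
    }

    // The crypto itself would follow the EnvelopeEncryptionSketch above,
    // keyed by keyProvider.masterKey(); elided here.
    private byte[] envelopeEncrypt(byte[] plaintext) { throw new UnsupportedOperationException("sketch"); }
    private byte[] envelopeDecrypt(byte[] ciphertext) { throw new UnsupportedOperationException("sketch"); }
}
```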

Advantages:

  • Implement once for all cloud repository types, and it will even work for the file system repository type!
  • Testing! We can have unit tests for the base implementation of the (tentatively named) EncryptedBlobContainer sketched above, and end-to-end integration tests in only one of the implementations (Amazon or FS), where we can decrypt the data on the service fixture.

Disadvantages:

  • Duplicates code in one SDK
  • We're in the open with Key Rotation. We might need to implement our own command-line tool to rotate keys (download each object's metadata, decrypt the wrapped key and re-encrypt it; see the sketch after this list). The tool would be "easy" to implement.
  • Does not support Vault keys.
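
For reference, the core per-blob work of such a rotation tool would only re-wrap the small per-blob key. A hedged sketch, assuming the AES key-wrap scheme from the earlier sketch:

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;

// Hedged sketch of the per-blob work of a key rotation tool: only the small
// wrapped DEK from the blob's metadata is rewritten, never the blob itself.
final class MasterKeyRotator {

    static byte[] rewrapDek(byte[] wrappedDek, SecretKey oldMasterKey, SecretKey newMasterKey)
            throws Exception {
        // Unwrap the per-blob Data Encryption Key with the old Master Key ...
        Cipher unwrap = Cipher.getInstance("AESWrap");
        unwrap.init(Cipher.UNWRAP_MODE, oldMasterKey);
        SecretKey dek = (SecretKey) unwrap.unwrap(wrappedDek, "AES", Cipher.SECRET_KEY);

        // ... and re-wrap it with the new Master Key. The blob's ciphertext,
        // still encrypted with the unchanged DEK, is left untouched.
        Cipher wrap = Cipher.getInstance("AESWrap");
        wrap.init(Cipher.WRAP_MODE, newMasterKey);
        return wrap.wrap(dek);
    }
}
```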

In the opposite corner, there is this alternative:
We use the AWS cloud library facility to implement it only for the S3 repository type. The key is stored either in the node's keystore or in the AWS Key Management Service.

Advantages:

  • Easiest to implement
  • Supports AWS's Vault Service
  • We might get support for key rotation, using Amazon's command-line tooling

Disadvantages:

  • Only S3 repository type is supported
  • Testing. We either mock the client and check that the code indeed calls the "crypto" APIs, or we do an end-to-end integration test where we decrypt the data on the fixture. Either way we'd essentially be "testing the library" rather than our code. This is pointless and brittle, but we need testing because the risks are too great.

Relates #34454

@albertzaharovits
Contributor Author

commented May 9, 2019

We discussed this today in our weekly team meeting, but went into extra time pondering the alternatives.

Still, we settled that we don't need to support moving snapshots between repositories.

I would like to kindly ask the distributed team for any input.
In addition, I plan to do the work, but I would need one review volunteer from the distributed team.

@original-brownbear
Member

commented May 9, 2019

@albertzaharovits

What do you mean by

Duplicates code in one SDK

It seems to me that for the first option we could "simply" pass a secure setting for the current encryption key to org.elasticsearch.repositories.blobstore.BlobStoreRepository and then wrap all the write and read operations that are initiated from there with the crypto logic, completely agnostic of the underlying implementation of the blob store?
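
Something along these lines, say (the setting name is made up, and the real plumbing into BlobStoreRepository would differ):

```java
import org.elasticsearch.common.settings.SecureSetting;
import org.elasticsearch.common.settings.SecureString;
import org.elasticsearch.common.settings.Setting;

final class EncryptedRepositorySettings {
    // Hypothetical secure setting holding the master key material, read from
    // each node's keystore and handed to BlobStoreRepository at repo creation.
    static final Setting<SecureString> ENCRYPTION_KEY =
        SecureSetting.secureString("repository.encryption.key", null);
}
```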

That said, I like the first option much better than doing some SDK-specific thing just for S3. In the end, it seems like that is probably less effort maintenance-wise long term, since relying on the SDKs' implementations completely puts us at the mercy of whatever changes happen to them. Plus, as you point out, working with the SDKs only will be tricky to test and will not cover the FS repository.

I would point out one thing though (sorry if this was already discussed, just ignore this if it was :)):

The snapshot mechanism uses blob names as part of its logic somewhat extensively. Even if we client-side encrypt every blob, we'd still be leaking the following information:

  • Number of snapshots in the repository
  • Number of indices in all snapshots
  • Number of shards in each index
  • Number of snapshots (and an anonymous id for each) that each shard is part of (and vice versa, the number of indices and shards in each snapshot)
  • Roughly the number of segments in each shard in some cases

Not sure if that's a compliance problem, but it would certainly be challenging not to leak this information via the blob names.

That's all I have for now. Happy to help review your work though :)

@albertzaharovits
Contributor Author

commented May 9, 2019

@original-brownbear
Thank you very much for the prompt response!

It seems to me that for the first option we could "simply" pass a secure setting for the current encryption key to org.elasticsearch.repositories.blobstore.BlobStoreRepository and then wrap all the write and read operations that are initiated from there with the crypto logic, completely agnostic of the underlying implementation of the blob store?

Yes, that's the first option I was trying to describe. What I mean when I say we duplicate code is that the "crypto logic" (the envelope encryption, the AES algorithm, all that) will most likely be very similar (on purpose) to what the SDKs already do.

Not sure if that's a compliance problem, but that would certainly be something that would be challenging to not leak via the blob names.

I think that's a very thoughtful observation, and it should definitely go in the docs. I don't believe there are regulations covering that, and we are not aiming for a specific compliance target, but I'm no expert either. Maybe @joshbressers is more knowledgeable in this regard? I propose we clearly acknowledge this limitation in the docs and act on it only if we get specific requests.

That said, I like the first option much better than doing some SDK-specific thing just for S3.
Happy to help review your work though

Glad to hear! Thank you!

@joshbressers

commented May 10, 2019

Ideally we don't want to leak any metadata, but I know sometimes it's unavoidable.

We probably won't run afoul of any compliance standards here. We could see some interest from certain sensitive customers, but generally their concern revolves around leaking names more than this sort of metadata.

@albertzaharovits
Contributor Author

commented May 10, 2019

Thank you for the answer @joshbressers! I merely wish to reinforce this position by highlighting that leaking this type of metadata reveals details of the cluster configuration, but no information about the actual data.

@tvernum
Contributor

commented May 16, 2019

I think it would be preferable to implement this ourselves and not rely on the blob-store libraries to do it.

Ultimately, we need this for multiple repository types. We could use the cloud SDKs for it, but we would still need to build and verify it for each provider, which wouldn't gain us much over just building it ourselves.
