Client Side Encrypted Snapshot Repositories #41910

Open · 24 tasks
albertzaharovits opened this issue May 7, 2019 · 16 comments

@albertzaharovits
Contributor

albertzaharovits commented May 7, 2019

This concerns the encryption of snapshot data before it leaves the nodes.

We have three cloud snapshot repository types: Google Cloud Storage, Azure Storage and Amazon S3. Amazon and Azure support client-side encryption in their Java clients, but Google does not.

Amazon and Azure, which support client side encryption, allow the keys to be managed by the client (us) or by their Key Management Service (Vault-like). They both use the Envelope Encryption method: each blob is individually AES-256 encrypted with a locally, randomly generated key, and this key (the Data/Content Encryption Key) is in turn encrypted with a Master Key (locally or by the Vault service) and stored alongside the blob in its metadata. Envelope encryption facilitates Master Key rotation because only the small (D/C)EK has to be re-encrypted, rather than the complete blob.
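For illustration, here is a minimal Java sketch of the envelope scheme just described. The GCM mode, the IV size and the helper structure are assumptions made for the sketch, not the cloud SDKs' actual APIs:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;

public class EnvelopeEncryptionSketch {

    // Everything that must be stored for one blob: the ciphertext itself plus,
    // in the blob's metadata, the IV and the wrapped (encrypted) DEK.
    record EncryptedBlob(byte[] ciphertext, byte[] iv, byte[] wrappedDek) {}

    static EncryptedBlob encryptBlob(byte[] plaintext, SecretKey masterKey) throws Exception {
        // 1. Generate a fresh, random 256-bit DEK for this blob only.
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);
        SecretKey dek = keyGen.generateKey();

        // 2. Encrypt the blob with the DEK (AES-256; GCM is assumed here).
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher dataCipher = Cipher.getInstance("AES/GCM/NoPadding");
        dataCipher.init(Cipher.ENCRYPT_MODE, dek, new GCMParameterSpec(128, iv));
        byte[] ciphertext = dataCipher.doFinal(plaintext);

        // 3. Wrap (encrypt) the DEK with the long-lived Master Key. Rotating the
        //    Master Key only needs to redo this small step, never step 2.
        Cipher wrapCipher = Cipher.getInstance("AESWrap");
        wrapCipher.init(Cipher.WRAP_MODE, masterKey);
        return new EncryptedBlob(ciphertext, iv, wrapCipher.wrap(dek));
    }
}
```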

On the ES side we discussed having a single fixed URN key handler at the repository settings level.
This URN identifies the Master Key; for example, it could point to a key on the Amazon Vault Service or to a key in each node's keystore. In this alternative it is not possible to rotate the keys via the repository API (it might be possible to do it outside ES, which is obviously preferable, but see below).

I believe this is the rough picture of the puzzle that we need to put together.

We oscillated between implementation alternatives, and I will lay out the one which I think is favorable. Whatever solution we initially implement, given that the Master Key identifier is a URN, we can multiplex multiple implementations for the same repository type.

We mirror the Envelope Encryption algorithm, employed by Amazon and Azure, at the BlobContainer level. The key is stored on each node's keystore (and is pointed to by the repository level URN reference).
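Structurally, the wrapper could look roughly like the following sketch. The `BlobContainer` interface below is a simplified stand-in for Elasticsearch's real blob container abstraction, and the `encrypt`/`decrypt` hooks are placeholders for the envelope logic sketched above:

```java
import java.io.IOException;
import java.io.InputStream;
import javax.crypto.SecretKey;

// Simplified stand-in for the real blob container abstraction.
interface BlobContainer {
    InputStream readBlob(String blobName) throws IOException;
    void writeBlob(String blobName, InputStream data, long length) throws IOException;
}

// Wraps any concrete container (fs, s3, azure, gcs) and applies envelope
// encryption before delegating, so it is implemented once for all repository types.
abstract class EncryptedBlobContainer implements BlobContainer {

    private final BlobContainer delegate;   // the real cloud/FS container
    private final SecretKey masterKey;      // resolved from the node keystore via the repository-level URN

    EncryptedBlobContainer(BlobContainer delegate, SecretKey masterKey) {
        this.delegate = delegate;
        this.masterKey = masterKey;
    }

    @Override
    public final void writeBlob(String blobName, InputStream data, long length) throws IOException {
        // Encrypt with a fresh DEK, wrap the DEK with the master key, then delegate.
        EncryptedPayload payload = encrypt(data, masterKey);
        delegate.writeBlob(blobName, payload.stream(), payload.length());
    }

    @Override
    public final InputStream readBlob(String blobName) throws IOException {
        // Unwrap the DEK with the master key and return a decrypting stream.
        return decrypt(delegate.readBlob(blobName), masterKey);
    }

    // The envelope-encryption details (see the sketch above) live behind these hooks.
    protected abstract EncryptedPayload encrypt(InputStream plaintext, SecretKey masterKey) throws IOException;
    protected abstract InputStream decrypt(InputStream ciphertext, SecretKey masterKey) throws IOException;

    record EncryptedPayload(InputStream stream, long length) {}
}
```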

Advantages:

  • Implement once for all cloud repository types and it will even work for the file system repository type!
  • Testing! We can have unit tests for the base implementation of the ~EncryptedBlobContainer and end-to-end integration tests in only one of the implementations (Amazon or FS), where we can decrypt the data on the service fixture.

Disadvantages:

  • Duplicates code in one SDK
  • Key rotation is on us. We might need to implement our own command line tool to rotate keys (download the objects' metadata, decrypt the key and re-encrypt it). The tool would be "easy" to implement.
  • Does not support Vault keys.

In the opposite corner, there could be this alternative:
We use the AWS cloud library facility to implement it only for the S3 repository type. The key is stored either on the node's keystore or on the AWS Key Management Service.

Advantages:

  • Easiest to implement
  • Supports AWS's Vault Service
  • We might have support for key rotation, using Amazon's command line tool

Disadvantages:

  • Only S3 repository type is supported
  • Testing. We either mock the client and check that the code indeed calls the "crypto" APIs or we do an end-to-end integration test, where we decrypt the data on the fixture. Either way we kinda "test the library" rather than our code. This is pointless and brittle, but we need testing because the risks are too great.

Relates #34454 #40416


EDITED 28.01.2021 Backlog:

  • Add the new API that changes the password for a given encrypted repository.
    The encrypted repository must already be configured with the correct password. The API iterates over all the associated
    wrapped DEKs (contained inside a dedicated blob container under the repository's base path) and unwraps and re-wraps every DEK with the new password.
    Finally, the old wrapped DEKs are removed, so that the old password can no longer be used (a sketch of the re-wrap step follows this list).
  • implement searchable and encrypted repositories.
    This mainly requires implementing the AbstractBlobContainer#readBlob interface. This is slightly problematic because the association id between an encrypted blob and its DEK is prepended at the beginning of the blob, so that decryption at an internal position currently requires a seek at the beginning. Double check that the definition structure is reasonable (ie. is it searchable and encrypted or vice-versa, ping David about this).
  • ensure compressed & encrypted repositories work
  • investigate metered and encrypted repositories
  • thoroughly test failure scenarios where IOExceptions are thrown. Generally speaking (although not really true in practice), reads and writes consist of
    two operations that can fail independently. Make sure testing covers this.
  • create benchmarks (distributed team is working on something that measures the throughput of a repository)
  • permit (and test) encrypted HDFS repositories (it should work)
  • double check the case where AbstractBlobContainer#blobExists returns true, or EncryptedBlobContainer#listBlobs/listBlobsByPrefix lists a blob, and then reading it fails because of decryption problems
  • double check that we're not relying on system encoding (that strings are always written UTF-8 encoded, and reads are always decoded with the same UTF-8)
  • ensure that there's no problem with EncryptedBlobContainer#writeBlobAtomic not being atomic (in general it cannot be, because a write might also generate and write the DEK, so there are two operations that can fail independently)
  • ensure that it's alright for EncryptedBlobContainer#listBlobs/listBlobsByPrefix to return the encrypted blob size (which is larger), instead of the expected blob size of the decrypted blob
  • think about possibly renaming "password name" to "password label"
  • investigate repository password situation on cloud.
    On-premise repository passwords are cached in memory when the node starts, usually requiring a node restart when configuring a new snapshot repository. The secure settings implementation on cloud is different, so maybe we can read repository passwords immediately after they've been added or changed, without requiring a restart.
  • repository password min-length limit (reject an encrypted repository configured with a too-short password)
  • make KDF parameters configurable
  • investigate the naming of the delegating and delegated repositories (they are the same currently, is this a problem?)
  • make crypto provider selectable for operations on the client side-encrypted repo
  • test that encrypted repos can share the bucket (but with different base paths; otherwise we already test that the passwords must be the same) and can also share the repository client
  • Revisit definition of password name in repository settings (see: Encrypted blob store reuse DEK #53352 (comment) )
  • Settle on the specification for encrypted and searchable snapshots (ping David about it)
  • Investigate if versioning of individual encrypted blobs is necessary https://github.com/elastic/elasticsearch/pull/53352/files#r444383568
  • Investigate if we can guarantee that DEKs do not change inside a given shard
  • Report encryption stats (from Encrypted blob store reuse DEK #53352 (comment))
  • test FIPS negative behaviour (that a short password doesn't crash the node or something).
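For the first backlog item, here is a hedged sketch of the re-wrap step. Deriving the KEKs from the old and new passwords is out of scope here; `oldKek` and `newKek` simply stand for keys derived from them, and the blob-container I/O is reduced to a map for brevity:

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import java.security.Key;
import java.util.HashMap;
import java.util.Map;

public class RewrapDeks {

    // Re-wrap every DEK so the old password can no longer unwrap them.
    // wrappedDeks maps a DEK name to its wrapped (encrypted) bytes, as read
    // from the dedicated blob container under the repository's base path.
    static Map<String, byte[]> rewrapAll(Map<String, byte[]> wrappedDeks,
                                         SecretKey oldKek, SecretKey newKek) throws Exception {
        Map<String, byte[]> rewrapped = new HashMap<>();
        for (Map.Entry<String, byte[]> entry : wrappedDeks.entrySet()) {
            Cipher unwrap = Cipher.getInstance("AESWrap");
            unwrap.init(Cipher.UNWRAP_MODE, oldKek);
            Key dek = unwrap.unwrap(entry.getValue(), "AES", Cipher.SECRET_KEY);

            Cipher wrap = Cipher.getInstance("AESWrap");
            wrap.init(Cipher.WRAP_MODE, newKek);
            rewrapped.put(entry.getKey(), wrap.wrap(dek));
        }
        // The caller would persist the re-wrapped DEKs first, then delete the old ones,
        // so a failure mid-way never leaves DEKs unreachable.
        return rewrapped;
    }
}
```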
@albertzaharovits albertzaharovits added >enhancement :Security/Security Security issues without another label 7x labels May 7, 2019
@albertzaharovits albertzaharovits self-assigned this May 7, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-security

@albertzaharovits albertzaharovits mentioned this issue May 7, 2019
3 tasks
@albertzaharovits albertzaharovits added the :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs label May 9, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@albertzaharovits
Contributor Author

We discussed this today in our weekly team meeting, but went into extra time pondering the alternatives.

Still, we settled that we don't need to support moving snapshots between repositories.

I would like to kindly ask the distributed team for any input.
In addition, I plan to do the work, but I would need one review volunteer from the distributed team.

@original-brownbear
Member

@albertzaharovits

what do you mean by

Duplicates code in one SDK

It seems to me for the first option we could "simply" pass a secure setting for the current encryption key to org.elasticsearch.repositories.blobstore.BlobStoreRepository and then wrap all the write and read operations that are initiated from there with the crypto logic completely agnostic to the underlying implementation of the blob store?

That said, I like the first option much better than doing some SDK-specific thing just for S3. In the end it is probably less effort maintenance-wise long term, since relying entirely on the SDKs' implementations of this puts us at the mercy of whatever changes happen there. Plus, as you point out, working with the SDKs only will be tricky to test and won't cover the FS repository.

I would point out one thing though (sorry if this was already discussed, just ignore this if it was :)):

The snapshot mechanism uses blob names as part of its logic somewhat extensively. Even if we client-side encrypt every blob, we'd still be leaking the following information:

  • Number of snapshots in the repository
  • Number of indices in all snapshots
  • Number of shards in each index
  • Number of snapshots (and anonymous id of each of them) that each shard is part of (and vice versa, number of indices and shards in each snapshot)
  • Roughly the number of segments in each shard in some cases

Not sure if that's a compliance problem, but that would certainly be something that would be challenging to not leak via the blob names.

That's all I have for now. Happy to help review your work though :)

@albertzaharovits
Contributor Author

@original-brownbear
Thank you very much for the prompt response!

It seems to me for the first option we could "simply" pass a secure setting for the current encryption key to org.elasticsearch.repositories.blobstore.BlobStoreRepository and then wrap all the write and read operations that are initiated from there with the crypto logic completely agnostic to the underlying implementation of the blob store?

Yes, that's the first option I was trying to describe. What I mean when I say we duplicate code is that the "crypto logic" (the envelope encryption, AES algorithm, all that) will most likely be very similar (on purpose) to what the SDK already does.

Not sure if that's a compliance problem, but that would certainly be something that would be challenging to not leak via the blob names.

I think that's a very thoughtful observation, and it should definitely go in the docs. I don't believe there are regulations for that, and we are not aiming for a specific compliance target, but I'm no expert either. Maybe @joshbressers is more knowledgeable in this regard? I propose we clearly acknowledge this limitation in the docs and act on it only if we get specific requests.

That said, I like the first option much better than doing some SDK specific thing just for S3.
Happy to help review your work though

Glad to hear! Thank you!

@joshbressers

Ideally we don't want to leak any metadata, but I know sometimes it's unavoidable.

We probably won't run afoul of any compliance standards here. We could see some interest from certain sensitive customers, but generally their concern revolves around leaking names more than this sort of metadata.

@albertzaharovits
Contributor Author

Thank you for the answer @joshbressers! I merely wish to reinforce this position by highlighting that leaking this type of metadata reveals cluster configuration, but nothing about the actual data.

@tvernum
Contributor

tvernum commented May 16, 2019

I think it would be preferable to implement this ourselves and not rely on the blob-store libraries to do it.

Ultimately, we need this for multiple repository types, and we could use the cloud SDKs for it, but we would still need to build & verify it for each provider, which wouldn't gain us very much over just building it ourselves.

@albertzaharovits
Contributor Author

Here is the 10,000-foot view of the currently favored approach.

We create a new type of blob store repository, the encrypted type, that wraps and delegates read/write blob operations to an internally contained blob store repository. In essence, when the administrator creates an encrypted repository the "delegated" type must be specified (one of the blob store repository types already available: fs, azure, s3, gcs, possibly others) and the creation will instantiate a private internal repository of that type. It will then use this internal repository's read/write-blob operations to store the same data, but encrypted.

Encryption uses the "envelope" strategy. This means that there is a data encryption key (DEK) and a key encryption key (KEK). The DEK is generated randomly for each "blob", and it encrypts the blob (the actual data). The KEK is a secret parameter of the "encrypted" repository, and it encrypts every DEK. In this case, the encrypted repository is comprised of "blobs" encrypted with DEKs, and of DEKs encrypted with the KEK. The KEK is not stored publicly.

The encryption algorithm for the DEK encrypting blobs is AES in GCM mode with a 256-bit key. For encrypting the DEK with the KEK the same AES algorithm is used, but in ECB mode (this is intrinsic to the AESWrap Cipher from SunJCE). GCM offers authenticated encryption, which prevents attacks in which ciphertext manipulations trigger predictable alterations of the decoded text. The plan is to use the BouncyCastle crypto provider (possibly the FIPS variant) because the default SunJCE is very weak with respect to decryption performance: the SunJCE provider will not release the decrypted text until the authentication tag has been verified, and hence it resorts to expensive in-memory buffering. It should be noted that it is not possible to have the code truly independent of the crypto provider, because the Cipher initialization is unfortunately (slightly) different between the SunJCE and BC implementations.

One un-encrypted blob "generates" one encrypted blob plus another blob containing the encrypted DEK. Ideally we should have a versioned format for an "encryption metadata" blob which contains the encrypted DEK and other descriptive data, such as the IV. The metadata blob should be integrity-protected but stored in the clear, aside from the encrypted DEK. The plan is to use the additional authenticated data of the GCM mode to include the metadata blob in the authentication tag computation. It should be noted that the write/read operation for blobs is no longer atomic (because it is translated into two such operations). The code aims to make sure that a blob cannot exist without its associated metadata (by writing the metadata first, deleting the blob first, etc.).
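As a minimal illustration of the GCM additional-authenticated-data idea above (the metadata layout and method names here are invented for the sketch):

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class AadBinding {

    // Encrypt a blob so that tampering with either the ciphertext or the
    // associated plaintext metadata fails authentication on decryption.
    static byte[] encryptWithMetadata(byte[] plaintext, SecretKey dek,
                                      byte[] iv, byte[] metadataBytes) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, dek, new GCMParameterSpec(128, iv));
        cipher.updateAAD(metadataBytes);   // metadata stays plaintext but is covered by the tag
        return cipher.doFinal(plaintext);  // ciphertext + 16-byte authentication tag
    }

    static byte[] decryptWithMetadata(byte[] ciphertext, SecretKey dek,
                                      byte[] iv, byte[] metadataBytes) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, dek, new GCMParameterSpec(128, iv));
        cipher.updateAAD(metadataBytes);   // must match exactly, or doFinal throws AEADBadTagException
        return cipher.doFinal(ciphertext);
    }
}
```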

We have a plethora of options for KEK storage/generation. The code in the current POC generates it from a text password in the keystore. Other alternatives include storing the binary key in the keystore or a separate file, sourcing it via KMIP or Amazon KMS, or delegating the "key unwrap" operation via KMIP. We don't have to limit it to a single method, but we need to decide what the method is for the first iteration (CC @bytebilly).

Key rotation happens via an ES API. Key rotation implies using the old KEK to decrypt all the DEKs and re-encrypt them with the new KEK. Key rotation depends on the way we store the KEK. With the current approach of storing a text password in the keystore, the keystore reload call could also rotate keys, but it might not be desirable to hog the reload API for the rotation operation. Another approach is to defer the rotation until the next snapshot (which could be an empty one) rather than creating a new API for it. In any case, because of the failure situations we would have to juggle both the old and the new KEKs inside the keystore at least for some time, but the precise flow is yet TBD.

Here is the POC where I've explored these choices: #46170.

@albertzaharovits
Contributor Author

Here is what still needs to be done/investigated; the order is somewhat important:

  • NIST SP800-38D: investigate any faux pas (for example, do not interpret deciphered data before authentication; we should be assured that a throw on stream close, when the authentication tag is verified, will abort the full operation and remove the faulting blob; see FileRestoreContext)
  • add the FIPS BC provider and check that it all works nicely in a FIPS JVM
  • decide on a way to store the KEK for the minimum viable product (Is a text password in the ES KeyStore, optionally "consistent" across nodes, sufficient?)
  • pin down the protocol to rotate keys. I think the invariant we should maintain is to not have two different snapshots in the same repository using different KEKs.
  • make the encryption metadata blob versioned and HMACed (investigate using the authenticated data of the GCM mode; it might require another key)
  • test that the authenticated encryption works, and that whatever exception pops up will abort the restore operation correctly (there were glaring errors in the JDK in the past).
  • investigate the AES 256 against our JDK compatibility matrix (does the oldest JDK 8 we support require the export trick to work?)

@albertzaharovits
Contributor Author

albertzaharovits commented Sep 4, 2019

We discussed the list above and prioritized some items as follows, so as to be sure we resolve all unknowns as soon as possible:

  • move the repository plugin from Encrypted blob store repository - take I #46170 under the x-pack folder and license
  • investigate the FIPS BC provider in a FIPS JVM. Can we make the plugin work in FIPS mode?
  • test the plugin against a cloud repository (eg S3), only the fs type has been tested so far
  • decide on the format of the KEK for the first iteration

This last point requires input from the product team, @bytebilly.
As developers, we think that a textual password inside the elasticsearch keystore on every node, for which we can check consistency across nodes, ought to be sufficient. The password is used to generate the master AES KEK using a well-known algorithm (PBKDF2WithHmacSHA256). In the long run we believe it's almost certain that we will need to integrate with key management services like Vault (via its newly added KMIP protocol) or Amazon's or Google's particular Key Management Services (KMS). Therefore, later in the development of this first iteration we will think about how this would precisely work, but first we need to agree that a textual password in the keystore is a good candidate that the cloud and potential clients asking for client-side encryption would use (they might have other preferences, but at least this option does not scare every potential client away). The decision for how we store/generate the KEK has a big implication on how key rotation works as well, and it is the biggest open point of this feature.
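For reference, here is a sketch of deriving a 256-bit AES KEK from the keystore password with PBKDF2WithHmacSHA256; the salt handling and the iteration count are placeholders, not what the implementation settles on:

```java
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

public class KekFromPassword {

    // Derive a 256-bit AES key-encryption-key from the repository password.
    static SecretKey deriveKek(char[] password, byte[] salt) throws Exception {
        SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
        // 10_000 iterations is a placeholder; the real KDF parameters are a backlog item above.
        PBEKeySpec spec = new PBEKeySpec(password, salt, 10_000, 256);
        byte[] keyBytes = factory.generateSecret(spec).getEncoded();
        spec.clearPassword();
        return new SecretKeySpec(keyBytes, "AES");
    }
}
```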

@bytebilly
Contributor

@albertzaharovits I totally agree with the client-side management of encryption keys.
This solves two problems:

  1. support the encrypted flow for generic storages (not just for some predefined cloud providers)
  2. comply with security requirements where customers need to provide encryption keys on request

We can consider using a password to seed the KEK, and store it in the node keystore. This is a viable first iteration, and I don't see blockers in current enhancement requests to suggest something different. Further support for cloud-specific key management systems could be added in a second step.

Storing the binary KEK in the keystore is not very useful in my opinion, since it doesn't increase security (the KEK can be used directly to decrypt the DEKs, so it's not a safer option). The text password is easier to use in command line tools we can eventually provide to manually manage encrypted snapshots.

A possible flow could have a flexible configuration that defines the password source. It can define if the password is in the keystore, or if it should be retrieved from an external source. The first part is what we can ship first.

Password rotation could occur automatically with every new snapshot, and in addition we can provide a specific API for that. I expect customers may need to guarantee rotation within a well-defined range for regulations. I'd rather avoid coupling rotation with keystore reload.

With this approach, we provide an out-of-the-box key rotation for everyone (every new snapshot), but we also allow rotation on-demand for customers with specific needs.
In the future, it would be awesome to support cloud-based keys to be rotated transparently with the same API, making the entire flow decoupled from the underlying implementation.

What I'm still missing is who defines this password. Is it user defined, or automatically generated by the system? In the first case, how do we deal with key rotation, since it would replace the user-defined value?
Another point I'm still not clear on is whether we need to define the password on each node, and whether it should be the same everywhere.

@albertzaharovits
Contributor Author

Thanks for looking into it @bytebilly !

We can consider using a password to seed the KEK, and store it in the node keystore. This is a viable first iteration, and I don't see blockers in current enhancement requests to suggest something different.

Good to hear that.

A possible flow could have a flexible configuration that defines the password source. It can define if the password is in the keystore, or if it should be retrieved from an external source.

As far as "passwords" are concerned I think they should reside in the keystore only. Subsequent iterations on this feature could "source" the secret to seed the KEK from external systems, but in this case I think it makes more sense to get the actual KEK, and not do any alterations to that.

Password rotation could occur automatically every new snapshot, and in addition we can provide a specific API for that. I expect customers may need to guarantee rotation within a well-defined range for regulations. I'd rather avoid coupling rotation with keystore reload.

Note that the current design aims for a single KEK per repository, not per snapshot. Adding a new API to perform the rotation is better than coupling this operation with the keystore reload. If the old and the new KEKs during rotation are self-descriptive (meaning the rotation API can tell which one supersedes the other, e.g. by the last modified date of the keystore entries), then it is possible to do without a new API and trigger the rotation of the whole repository on the next snapshot (which could be empty). But I'm getting ahead of myself; I'll plan for the API, not for self-descriptive keys, where you explicitly name the old and the new keys in the API (this assumes all keys are "nameable").

In the future, it would be awesome to support cloud-based keys to be rotated transparently with the same API, making the entire flow decoupled from the underlying implementation.

If keys are nameable it should work with the API that's to be introduced in the first iteration. Names will look like URIs; I think this is how they are referred to by all cloud providers.

What I'm still missing, is who defines this password. Is it user defined, or automatically generated by the system?

User defined. I'm open to suggestions to have it seeded by a random value in the keystore; it doesn't sound too useful to me, but if it enhances UX I'm open to it. Either way, it's not consequential at this stage.

In the first case, how do we deal with key rotation, since it would replace the user-defined value?

Yeah, both values should be available in the keystore simultaneously. They should have "similar" names (under the same namespace).

Another point that I still don't have clear is if we need to define the password in each node, and if they should be the same.

Yes, on every node. We have infrastructure to ensure that they are all equal on all nodes (not used as of right now, but the infra is there).

@albertzaharovits
Contributor Author

In #53352 I've raised a PR with an implementation of the encrypted BlobStoreRepository which follows the "DEK reuse" strategy discussed over at #50846 (comment). I would like to try to explain the whole of the functionality as it currently stands.

Encrypted snapshots are implemented as a new repository type, under a module of the x-pack plugin. Encrypted snapshots are available for the following existing repository types: S3, Azure, GCS and FS. An encrypted snapshot encrypts all the data that is part of the snapshot before it is uploaded to the storage service. Snapshots are encrypted when an ordinary snapshot operation is performed against a repository of the new encrypted type. It is not possible to encrypt the snapshots in an existing regular repository.

An encrypted repository is created similarly to a regular repository of the same type (e.g. S3). The same APIs are used but, in addition, creating an encrypted repository requires a new repository setting (in cluster state) which names the secure setting holding the repository password that is used to derive the encryption keys (see the example at #53352 (comment) for how to create an encrypted FS repository).
The repository password must be stored in the keystore on every master and data node. A wrong or missing password on one of the data nodes will prevent snapshotting (and restoring) shards on that node.

More technically, encrypted snapshots work by encrypting (AES-256) all data at the blob level before it is uploaded to the storage service. The data encryption keys (DEKs) are generated randomly by every node. A generated DEK is reused locally by the node, at most for the lifetime of the repository (until the repository is deleted or the node is shut down), but the exact details of when a DEK is reused are an implementation detail (deliberated starting at #50846 (comment)). The association of an encrypted blob to its DEK is realized by prefixing the DEK name to the encrypted blob. The DEKs themselves are encrypted (AES Wrap) and stored in the storage service as well, under a location which contains the DEK name. The key encryption keys (KEKs), used to encrypt the DEKs, are generated from the repository password using the PBKDF2 algorithm. The association between a KEK and the DEKs it wraps is realized by storing the wrapped DEK under a path location that requires knowledge of the password (again an implementation detail). Theoretically, there could be only one KEK in existence ever (for the lifetime of the repository password), but it is cumbersome to ensure all the participants use the same KEK, so a relaxed approach has been adopted which derives the KEK by using the DEK name as a salt in the PBKDF2 function (the DEK name is generated randomly for uniqueness anyway).
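To make the association concrete, here is an illustrative sketch (not the actual implementation) of the scheme just described: the DEK name doubles as the PBKDF2 salt for the KEK that wraps that particular DEK, and the DEK name is prepended to each encrypted blob. The iteration count is a placeholder value:

```java
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class DekAssociationSketch {

    // A DEK gets a random, unique name. Using a fixed-length name (a UUID string
    // here) makes parsing the blob prefix trivial.
    static String newDekName() {
        return UUID.randomUUID().toString();
    }

    // The DEK name is the PBKDF2 salt, so each DEK gets its own derived KEK.
    static SecretKey kekForDek(char[] repositoryPassword, String dekName) throws Exception {
        SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
        PBEKeySpec spec = new PBEKeySpec(repositoryPassword,
                dekName.getBytes(StandardCharsets.UTF_8), 10_000, 256); // placeholder iteration count
        return new SecretKeySpec(factory.generateSecret(spec).getEncoded(), "AES");
    }

    // The encrypted blob is laid out as [DEK name bytes][ciphertext]; a reader first
    // parses the DEK name, fetches and unwraps that DEK, then decrypts the rest.
    static byte[] prependDekName(String dekName, byte[] ciphertext) {
        byte[] nameBytes = dekName.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[nameBytes.length + ciphertext.length];
        System.arraycopy(nameBytes, 0, out, 0, nameBytes.length);
        System.arraycopy(ciphertext, 0, out, nameBytes.length, ciphertext.length);
        return out;
    }
}
```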

@rjernst rjernst added Team:Distributed Meta label for distributed team Team:Security Meta label for security team labels May 4, 2020
albertzaharovits added a commit that referenced this issue Dec 23, 2020
The client-side encrypted repository is a new type of snapshot repository that
internally delegates to the regular variants of snapshot repositories (of types
Azure, S3, GCS, FS, and maybe others but not yet tested). After the encrypted
repository is set up, it is transparent to the snapshot and restore APIs (i.e. all
snapshots stored in the encrypted repository are encrypted, no other parameters
required).
The encrypted repository is protected by a password stored on every node's
keystore (which must be the same across the nodes).
The password is used to generate a key encryption key (KEK), using the PBKDF2
function, which is used to encrypt (using the AES Wrap algorithm) other
symmetric keys (referred to as DEK - data encryption keys), which themselves
are generated randomly, and which are ultimately used to encrypt the snapshot
blobs.

For example, here is how to set up an encrypted  FS repository:
------
 1) make sure that the cluster runs under at least a "platinum" license
(simplest test configuration is to put `xpack.license.self_generated.type: "trial"`
in the elasticsearch.yml file)
 2) identical to the un-encrypted FS repository, specify the mount point of the
shared FS in the elasticsearch.yml conf file (on all the cluster nodes),
e.g. `path.repo: ["/tmp/repo"]`
 3) store the repository password inside the elasticsearch.keystore, *on every cluster node*.
In order to support changing the password on an existing repository (implemented in a follow-up),
the password itself must be named, e.g. for the "test_enc_pass" repository password name:
`./bin/elasticsearch-keystore add repository.encrypted.test_enc_pass.password`
*type in the password*
4) start up the cluster and create the new encrypted FS repository, named "test_enc", by calling:
```
curl -X PUT "localhost:9200/_snapshot/test_enc?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "encrypted",
  "settings": {
    "location": "/tmp/repo/enc",
    "delegate_type": "fs",
    "password_name": "test_enc_pass"
  }
}
'
```
5) the snapshot and restore APIs work unmodified when they refer to this new repository, e.g.
` curl -X PUT "localhost:9200/_snapshot/test_enc/snapshot_1?wait_for_completion=true"`


Related: #49896 #41910 #50846 #48221 #65768
albertzaharovits added a commit to albertzaharovits/elasticsearch that referenced this issue Dec 23, 2020
albertzaharovits added a commit that referenced this issue Dec 28, 2020
@albertzaharovits
Contributor Author

albertzaharovits commented Jan 28, 2021

I've created the following two diagrams on how keys are generated and used internally:
ES encrypted snapshots.pdf

They are sketchy and not very professional, but they are accurate as the code currently stands, and, I hope, informative.
When everything is wrapped up, I plan to have something standardized (like "UML for encryption protocols") and vectorized.

alyokaz pushed a commit to alyokaz/elasticsearch that referenced this issue Mar 10, 2021
@barebu

barebu commented Jul 8, 2022

@albertzaharovits hello. The idea of client-side encrypted repositories is very cool. Can you tell me if there are Elasticsearch builds with the "type": "encrypted" repository?
