Packet-based enc/dec cipher streams #49896

@albertzaharovits albertzaharovits commented Dec 6, 2019

This adds a new bare snapshot repository project which contains the classes implementing encryption (and decryption) input stream decorators that support mark and reset.

Relates #48221 , #46170

Edit:
Extract from javadocs explaining how encryption works:

An {@code EncryptionPacketsInputStream} wraps another input stream and encrypts its contents. The method of encryption is AES/GCM/NoPadding, which is a type of authenticated encryption. The encryption works packet-wise, i.e. the stream is segmented into fixed-size byte packets which are encrypted separately, each with its own {@link Cipher} instance. Only the last packet may have a different size, possibly zero. Note that the encrypted packets are larger than the plaintext packets, because each contains a trailing 16-byte authentication tag. The resulting encrypted and authenticated packets are assembled back into the resulting stream.
The packets are encrypted using the same {@link SecretKey} but a different initialization vector (IV) for each packet. The IV is 12 bytes wide and consists of an integer {@code nonce}, which is the same for every packet in a stream but MUST NOT otherwise be repeated for the same {@code SecretKey} across other streams, and a monotonically increasing long counter. When assembling the resulting stream, the IV is prepended to the corresponding packet's ciphertext.
The packet size is preferably a large multiple of the AES block size (128 bits, i.e. 16 bytes), but any positive
integer value smaller than {@link EncryptedRepository#MAX_PACKET_LENGTH_IN_BYTES} is valid.
This input stream supports the {@code mark} and {@code reset} operations, but only if the wrapped stream supports them as well. A {@code mark} call will trigger the in-memory buffering of the current packet and will also trigger a {@code mark} call on the wrapped input stream at the next packet boundary. Upon a {@code reset} call, the buffered packet will be replayed and new packets will be generated starting from the marked packet boundary on the wrapped stream.
The {@code close} call will close the encryption input stream and any subsequent {@code read},
{@code skip}, {@code available} and {@code reset} calls will throw {@code IOException}s.
This is NOT thread-safe; multiple threads sharing a single instance must synchronize access.
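
To make the packet format described above concrete, here is a minimal sketch that encrypts a whole byte array packet by packet. It is not the code from this PR: the class and method names are hypothetical, the big-endian IV layout (4-byte nonce followed by 8-byte counter) is an assumption, and always emitting a final packet shorter than `packetLength` (possibly empty) is just one reading of "possibly zero". It also ignores streaming, `mark` and `reset`.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;

// Hypothetical illustration only; not the EncryptionPacketsInputStream from this PR.
public final class PacketEncryptionSketch {
    static final int IV_LENGTH = 12;   // 4-byte nonce + 8-byte packet counter
    static final int TAG_LENGTH = 16;  // GCM authentication tag, in bytes

    // Assumed layout: nonce first, then counter, both big-endian.
    static byte[] packetIv(int nonce, long counter) {
        return ByteBuffer.allocate(IV_LENGTH).putInt(nonce).putLong(counter).array();
    }

    static byte[] encrypt(byte[] plaintext, SecretKey key, int nonce, int packetLength) {
        if (packetLength <= 0) {
            throw new IllegalArgumentException("packetLength must be positive");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long counter = 0;
        int offset = 0;
        try {
            while (true) {
                int len = Math.min(packetLength, plaintext.length - offset);
                byte[] iv = packetIv(nonce, counter++);
                // A fresh Cipher per packet: same key, different IV.
                Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
                cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_LENGTH * 8, iv));
                byte[] sealed = cipher.doFinal(plaintext, offset, len); // ciphertext + tag
                out.write(iv, 0, iv.length);          // IV is prepended to the packet
                out.write(sealed, 0, sealed.length);
                offset += len;
                if (len < packetLength) {
                    break; // the short (possibly empty) packet ends the stream
                }
            }
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("packet encryption failed", e);
        }
        return out.toByteArray();
    }
}
```

Under these assumptions every packet adds 28 bytes (12 for the IV, 16 for the tag), so the output length is the plaintext length plus 28 times the number of packets.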

and how decryption works:

A {@code DecryptionPacketsInputStream} wraps an encrypted input stream and decrypts
its contents. This is designed (and tested) to decrypt only the encryption format that
{@link EncryptionPacketsInputStream} generates. No decrypted bytes are returned before
they are authenticated.
The same parameters, namely {@code secretKey}, {@code nonce} and {@code packetLength},
that have been used during encryption must also be used for decryption, otherwise
decryption will fail.
This implementation buffers the encrypted packet in memory. The maximum packet size it can
accommodate is {@link EncryptedRepository#MAX_PACKET_LENGTH_IN_BYTES}.
This implementation does not support {@code mark} and {@code reset}.
The {@code close} call will close the decryption input stream and any subsequent {@code read},
{@code skip}, {@code available} and {@code reset} calls will throw {@code IOException}s.
This is NOT thread-safe; multiple threads sharing a single instance must synchronize access.
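
The decryption side can be sketched in the same spirit. The following is a hypothetical, array-based illustration rather than the `DecryptionPacketsInputStream` from this PR. It assumes each packet is laid out as a 12-byte IV (4-byte nonce, 8-byte counter, big-endian), then the ciphertext, then a 16-byte tag, and that the stream ends with a packet shorter than `packetLength`. The `sealPacket` helper exists only so the sketch is self-contained.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;

// Hypothetical illustration only; not the DecryptionPacketsInputStream from this PR.
public final class PacketDecryptionSketch {
    static final int IV_LENGTH = 12;
    static final int TAG_LENGTH = 16;

    static byte[] decrypt(byte[] encrypted, SecretKey key, int nonce, int packetLength) {
        int fullPacket = IV_LENGTH + packetLength + TAG_LENGTH;
        ByteArrayOutputStream plaintext = new ByteArrayOutputStream();
        long counter = 0;
        int offset = 0;
        boolean sawFinalPacket = false;
        while (offset < encrypted.length) {
            int encLen = Math.min(fullPacket, encrypted.length - offset);
            if (encLen < IV_LENGTH + TAG_LENGTH) {
                throw new IllegalStateException("truncated packet");
            }
            // The IV must carry the expected nonce and the next counter value.
            ByteBuffer iv = ByteBuffer.wrap(encrypted, offset, IV_LENGTH);
            if (iv.getInt() != nonce || iv.getLong() != counter) {
                throw new IllegalStateException("unexpected packet IV");
            }
            try {
                Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
                cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_LENGTH * 8, encrypted, offset, IV_LENGTH));
                // doFinal throws if the tag does not verify, so no bytes of a
                // packet are appended before that packet is authenticated.
                byte[] packet = cipher.doFinal(encrypted, offset + IV_LENGTH, encLen - IV_LENGTH);
                plaintext.write(packet, 0, packet.length);
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException("packet failed authentication", e);
            }
            sawFinalPacket = encLen < fullPacket;
            counter++;
            offset += encLen;
        }
        if (!sawFinalPacket) {
            throw new IllegalStateException("missing final packet");
        }
        return plaintext.toByteArray();
    }

    // Helper used only to produce input for the sketch above.
    static byte[] sealPacket(byte[] data, SecretKey key, int nonce, long counter) {
        try {
            byte[] iv = ByteBuffer.allocate(IV_LENGTH).putInt(nonce).putLong(counter).array();
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_LENGTH * 8, iv));
            byte[] sealed = cipher.doFinal(data);
            return ByteBuffer.allocate(iv.length + sealed.length).put(iv).put(sealed).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("packet encryption failed", e);
        }
    }
}
```

In this sketch a wrong key fails the tag check, a wrong nonce fails the IV check, and a wrong packet length misaligns the packet boundaries so that one of the two checks fails, which mirrors the requirement that all three parameters match those used for encryption.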

@albertzaharovits albertzaharovits added >feature :Security/Security Security issues without another label labels Dec 6, 2019
@albertzaharovits albertzaharovits self-assigned this Dec 6, 2019
@elasticmachine

Pinging @elastic/es-security (:Security/Security)

@albertzaharovits

Thank you for another thorough review @tvernum !
I have addressed all the issues you've pointed out. Please take another look.

@tvernum tvernum self-requested a review January 6, 2020 04:50

@tvernum tvernum left a comment

LGTM

@albertzaharovits

@elasticmachine update branch

@albertzaharovits albertzaharovits merged commit a863f76 into elastic:repository-encrypted-client-side Jan 6, 2020
@albertzaharovits albertzaharovits deleted the packet-based-cipherstream-2 branch January 10, 2020 09:51
albertzaharovits added a commit that referenced this pull request Nov 25, 2020
albertzaharovits added a commit to albertzaharovits/elasticsearch that referenced this pull request Nov 30, 2020
albertzaharovits added a commit that referenced this pull request Dec 2, 2020
This builds upon the data encryption streams from #49896
to create an encrypted snapshot repository.
The repository encryption works with the following existing repository types:
FS, Azure, S3, GCS (possibly works with HDFS and URL, but these are not tested).
The encrypted repository is protected by a password stored on every node's keystore.
The repository keys (KEK - key encryption key) are generated from the password
using the PBKDF2 function, and are used to encrypt (using the AES Wrap algorithm)
other symmetric keys (referred to as DEK - data encryption keys) which are themselves
used to encrypt the blobs of the regular snapshot.

The platinum or enterprise licenses are required to snapshot to the encrypted repository,
but no license is required to list or restore already encrypted snapshots.
albertzaharovits added a commit that referenced this pull request Dec 23, 2020
The client-side encrypted repository is a new type of snapshot repository that
internally delegates to the regular variants of snapshot repositories (of types
Azure, S3, GCS, FS, and maybe others but not yet tested). After the encrypted
repository is set up, it is transparent to the snapshot and restore APIs (i.e. all
snapshots stored in the encrypted repository are encrypted, no other parameters
required).
The encrypted repository is protected by a password stored on every node's
keystore (which must be the same across the nodes).
The password is used to generate a key encryption key (KEK), using the PBKDF2
function, which is used to encrypt (using the AES Wrap algorithm) other
symmetric keys (referred to as DEK - data encryption keys), which themselves
are generated randomly, and which are ultimately used to encrypt the snapshot
blobs.
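
As a rough illustration of the key hierarchy described above, the sketch below derives a KEK from a password with PBKDF2 and uses it to wrap and unwrap a randomly generated DEK with AES Wrap. Everything specific here is an assumption for illustration: the class and method names, the `PBKDF2WithHmacSHA512` variant, the 256-bit key sizes, and the caller-supplied salt and iteration count are not taken from the commit.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;

// Hypothetical illustration only; parameters are assumptions, not the repository's values.
public final class KeyWrappingSketch {

    // KEK: derived deterministically from the repository password.
    static SecretKey deriveKek(char[] password, byte[] salt, int iterations) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 256);
            byte[] raw = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA512")
                .generateSecret(spec).getEncoded();
            return new SecretKeySpec(raw, "AES");
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("KEK derivation failed", e);
        }
    }

    // DEK: generated randomly; this is the key that would encrypt blob data.
    static SecretKey newDek() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("DEK generation failed", e);
        }
    }

    // The DEK is stored only in wrapped (encrypted) form.
    static byte[] wrap(SecretKey kek, SecretKey dek) {
        try {
            Cipher cipher = Cipher.getInstance("AESWrap");
            cipher.init(Cipher.WRAP_MODE, kek);
            return cipher.wrap(dek);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("DEK wrapping failed", e);
        }
    }

    // Unwrapping fails if the KEK (and hence the password) is wrong.
    static SecretKey unwrap(SecretKey kek, byte[] wrapped) {
        try {
            Cipher cipher = Cipher.getInstance("AESWrap");
            cipher.init(Cipher.UNWRAP_MODE, kek);
            return (SecretKey) cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("DEK unwrapping failed", e);
        }
    }
}
```

One consequence of this layering is that changing the password only requires re-wrapping the DEKs under a new KEK, not re-encrypting the blobs themselves.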

For example, here is how to set up an encrypted FS repository:
------
1) make sure that the cluster runs under at least a "platinum" license
(the simplest test configuration is to put `xpack.license.self_generated.type: "trial"`
in the elasticsearch.yml file)
2) as for the unencrypted FS repository, specify the mount point of the
shared FS in the elasticsearch.yml conf file (on all the cluster nodes),
e.g. `path.repo: ["/tmp/repo"]`
3) store the repository password inside the elasticsearch.keystore, *on every cluster node*.
In order to support changing the password of an existing repository (implemented in a follow-up),
the password itself must be named, e.g. for the "test_enc_pass" repository password name:
`./bin/elasticsearch-keystore add repository.encrypted.test_enc_pass.password`
*type in the password*
4) start up the cluster and create the new encrypted FS repository, named "test_enc", by calling:
`
curl -X PUT "localhost:9200/_snapshot/test_enc?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "encrypted",
  "settings": {
    "location": "/tmp/repo/enc",
    "delegate_type": "fs",
    "password_name": "test_enc_pass"
  }
}
'
`
5) the snapshot and restore APIs work unmodified when they refer to this new repository, e.g.
` curl -X PUT "localhost:9200/_snapshot/test_enc/snapshot_1?wait_for_completion=true"`


Related: #49896 #41910 #50846 #48221 #65768
albertzaharovits added a commit to albertzaharovits/elasticsearch that referenced this pull request Dec 23, 2020
albertzaharovits added a commit that referenced this pull request Dec 28, 2020