Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CVE-2016-6225: xtrabackup encryption is not setting the IV correctly #266

Merged
merged 1 commit into from
Nov 22, 2016

Conversation

gl-sergei
Copy link
Contributor

Blueprint checksum-unencrypted-chunk

[https://blueprints.launchpad.net/percona-xtrabackup/+spec/checksum-unencrypted-chunk]

This patch changes xbcrypt format as following:

  1. Bump XBCRYPT header version number, current is "XBCRYP03"

  2. Append 32-byte SHA256 hash of the plaintext to the payload of each
    chunk

  3. Encrypt plaintext payload and hash all together

  4. Both original length and encrypted length fields of the chunk header
    are calculated including these extra 32 bytes.

@@ -36,7 +36,9 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA

#include "xbcrypt.h"

#if !defined(GCRYPT_VERSION_NUMBER) || (GCRYPT_VERSION_NUMBER < 0x010600)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not rely on the GCRYPT_VERSION_NUMBER to be always available and shouldn't we error out if this is not available ? (I presume this is likely to be backwards compatibility)

Also why do we require the version to be less than 0x010600 (67072)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. I added this check for compatibility. Older libgrypt versions required to use

GCRY_THREAD_OPTION_PTHREAD_IMPL;

and

gcry_control(GCRYCTL_SET_THREAD_CBS, &gcry_threads_pthread);

to work correctly in multithreaded environment. This code no longer needed and deprecated since libgrypt 1.6. Compiler gives warnings for this code. Unfortunately I cannot just remove it, because we support CentOS 5 and 6. Version number is declared in the gcrypt.h as follows:

/* The version of this header should match the one of the library. It
   should not be used by a program because gcry_check_version() should
   return the same version.  The purpose of this macro is to let
   autoconf (using the AM_PATH_GCRYPT macro) check that this header
   matches the installed library.  */
#define GCRYPT_VERSION "1.7.3"

/* The version number of this header.  It may be used to handle minor
   API incompatibilities.  */
#define GCRYPT_VERSION_NUMBER 0x010703

0x010703 is 1.7.3. So, hexadecimal number in fact should be read as decimal. Ask libgrcypt maintainers why :)

@@ -132,6 +133,7 @@ encrypt_init(const char *root)

/* Acording to gcrypt docs (and my testing), setting up the threading
callbacks must be done first, so, lets give it a shot */
#if !defined(GCRYPT_VERSION_NUMBER) || (GCRYPT_VERSION_NUMBER < 0x010600)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as per above comments

@@ -552,6 +555,12 @@ encrypt_worker_thread_func(void *arg)
if (thd->cancelled)
break;

assert(gcry_md_get_algo_dlen(XB_CRYPT_HASH) ==
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May want to add some comments in here that we want to force an error to write to stderr then abort() the process where the length of the hash XB_CRYPT_HASH does not match the expected length XB_CRYPT_HASH please correct me if I am wrong at all here, it's been a long time since I looked at C/C++

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defined XB_CRYPT_HASH_LEN manually. It must match the length of the output given by chosen hash function (XB_CRYPT_HASH). I use it for buffer allocations and so on. This assert helps to ensure that if I change the hash function (XB_CRYPT_HASH), I also change XB_CRYPT_HASH_LEN and avoid buffer overflow errors. It helps to keep the code maintainable.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I understand that, I just meant that a source code comment would have been appreciated here e.g.

#Ensure the returned value of gcry_md_get_algo_dlen(XB_CRYPT_HASH) is equal to XB_CRYPT_HASH_LEN or throw an assertion.
assert(gcry_md_get_algo_dlen(XB_CRYPT_HASH) ==
               XB_CRYPT_HASH_LEN);

Entirely depends on your coding standard though, so feel free to not include something like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment added

assert(gcry_md_get_algo_dlen(XB_CRYPT_HASH) ==
XB_CRYPT_HASH_LEN);

memcpy(thd->to, thd->from, thd->from_len);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thd->to char* pointer containing my_malloc(XB_CRYPT_CHUNK_SIZE + XB_CRYPT_HASH_LEN, MYF(MY_FAE))
thd->from contains (const char *) buf presumable this is the plaintext ?
thd->from_len contains chunk_len = (len > XB_CRYPT_CHUNK_SIZE) ? XB_CRYPT_CHUNK_SIZE : len;

So this confirms we are running a memcpy to memory of the plaintext right before it is encrypted ?
There's no routine I can see to run GC so we're not purging memory after completion of the backup process ? this could lead to artifacts which contain the unencrypted content.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an interesting part actually. I want to encrypt buffer pointed by thd->from and its hash and put the result into thd->to. I have two options:

  1. allocate hash_buffer, put hash there, then invoke gcry_cipher_encrypt(thd->from, thd->to); gcry_cipher_encrypt(hash_buffer, thd->to + thd->from_len). I.e. encrypt the buffer and the hash separately (it is streaming mode, so we are good) and put the result into the thd->to.
  2. copy thd->from into thd->to, append hash to thd->to, encrypt entire chunk with single gcry_cipher_encrypt(thd->to, thd->to). Libgrypt can work with the buffer inplace.

The caveat is that option 1 doesn't work for me in multithreaded environments in CentOS 5 and 6 (sigh), so I went with option 2.

As for plaintext chunks left over in the memory... Yes, we don't scrape the buffer, but it will be overwritten with the next chunk, and so on. At the moment when xtrabackup exists, all its memory will be given back to the system and it's kernels trouble to fill them with random garbage if needed. Isn't it?

Copy link

@Oneiroi Oneiroi Nov 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disclaimer is I may be wrong here, but C/C++ does not have any Garbage Collection as I noted above, the Linux Kernel does not (be default) run any GC / Purge against free memory.

You are literally waiting for a process to come along and overwrite the same address in memory, or the system is shutdown (must be powered off) and allowed to cold boot after some time (even then this doesn't address physical memory artifacts).

Perhaps I am overthinking this, however I think that right after we are done with this buffer writing an equivalent length of random data would ensure the artifact is removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometime the key copies in memory are overwritten once they are not used anymore (http://security.stackexchange.com/questions/74280/would-it-be-good-secure-programming-practice-to-overwrite-a-sensitive-variable etc). If you choose to do so, make sure not to use memset which can be optimized away (memset_s if available is OK)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably makes sense to wipe out buffers for long living processes when keys or data are no longer used may still reside in processes memory and even be accessible to remote attacker via buffer overflow or other potential bugs.

I think xtrabackup is not the case. Encryption thread working as follows:

  1. Copy plaintext from thd->from to thd->to (thd->from is not
    needed anymore it will be reused to read the next chunk of
    plaintext)
  2. Append hash to thd->to
  3. Encrypt thd->to in place

Yes, 16 kbytes of plaintext are left in the thd->from and we don't
erase it (actually the whole 1M of plaintext can be read into memory
at once and we process it in 16k chunks). We also keep secret key in
memory for the entire life of xtrabackup process. I don't think it
is an issue because:

  • we reuse the same buffer to read the next chunk of plaintext
    immediately after we wrote down encrypted data
  • if the chunk was the last one, xtrabackup is going to exit.
    xtrabackup is a simple tool, it exits once copying is done.
  • after xtrabackup exited, its memory goes back to Linux kernel.
    Linux zeroes the memory before giving it to another process.

Lets say attacker has an access to xtrabackup memory. xtrabackup
isn't available via network, so attacker likely already has an access
to the filesystem. Then he will just read the data from the
filesystem, it is much easier option for him.

Am I missing something here?


memcpy(thd->to, thd->from, thd->from_len);
gcry_md_hash_buffer(XB_CRYPT_HASH, thd->to + thd->from_len,
thd->from, thd->from_len);
thd->to_len = thd->from_len;

if (encrypt_algo != GCRY_CIPHER_NONE) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only run the encryption where it is not set to the algorithm GCRY_CIPHER_NONE can we think of any conditions where this may occur ? (not a priority to do this, just for my curiosity and education on PXB)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible values for encrypt-algo option are "NONE", "AES128", "AES192" and "AES256". It is user settable. One could run xtrabackup --backup --encrypt --encrypt-algo=NONE and get the unencrypted output. The only use case I can imagine for it is debugging.

@@ -233,6 +223,9 @@ xb_crypt_read_chunk(xb_rcrypt_t *crypt, void **buf, size_t *olen, size_t *elen,

crypt->offset += *elen;
*buf = crypt->buffer;

*hash_appended = version > 2;

goto exit;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -35,6 +35,7 @@ if [ -n "${data_decrypt_cmd:=""}" ] || [ -n "${data_decompress_cmd:=""}" ]; then
vlog "###################################"
vlog "# DECRYPTING AND/OR DECOMPRESSING #"
vlog "###################################"
test -n "${data_decrypt_cmd_wrond_passphrase:=""}" && run_cmd_expect_failure bash -c "$data_decrypt_cmd_wrond_passphrase"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo ? wrond should be wrong ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indeed is typo. Will fix it.

@@ -35,6 +35,7 @@ if [ -n "${data_decrypt_cmd:=""}" ] || [ -n "${data_decompress_cmd:=""}" ]; then
vlog "###################################"
vlog "# DECRYPTING AND/OR DECOMPRESSING #"
vlog "###################################"
test -n "${data_decrypt_cmd_wrond_passphrase:=""}" && run_cmd_expect_failure bash -c "$data_decrypt_cmd_wrond_passphrase"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo ? wrond should be wrong ?


innobackupex_options="--encrypt=$encrypt_algo --encrypt-key=$encrypt_key --encrypt-threads=4 --encrypt-chunk-size=8K"
data_decrypt_cmd="innobackupex --decrypt=${encrypt_algo} --encrypt-key=${encrypt_key} ./"
data_decrypt_cmd_wrond_passphrase="innobackupex --decrypt=${encrypt_algo} --encrypt-key=${wrong_encrypt_key} ./"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo ? wrond should be wrong ?


xtrabackup_options="--encrypt=$encrypt_algo --encrypt-key=$encrypt_key --encrypt-threads=4 --encrypt-chunk-size=8K"
data_decrypt_cmd="xtrabackup --decrypt=${encrypt_algo} --encrypt-key=${encrypt_key} --target-dir=./"
data_decrypt_cmd_wrond_passphrase="xtrabackup --decrypt=${encrypt_algo} --encrypt-key=${wrong_encrypt_key} --target-dir=./"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo ? wrond should be wrong ?

Blueprint checksum-unencrypted-chunk

[https://blueprints.launchpad.net/percona-xtrabackup/+spec/checksum-unencrypted-chunk]

This patch changes xbcrypt format as following:

1. Bump XBCRYPT header version number, current is "XBCRYP03"

2. Append 32-byte SHA256 hash of the plaintext to the payload of each
   chunk

3. Encrypt plaintext payload and hash all together

4. Both original length and encrypted length fields of the chunk header
   are calculated including these extra 32 bytes.
@gl-sergei gl-sergei merged commit 04936ee into percona:2.3 Nov 22, 2016
@gl-sergei gl-sergei deleted the 2.3-xb-pxb-186 branch November 24, 2016 01:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants