[WIP][SPARK-NNNNN] Updating AES-CBC support to not use OpenSSL's KDF #40903

Closed
sweisdb wants to merge 2 commits into apache:master from sweisdb:SPARK-NNNNN

sweisdb commented Apr 21, 2023

What changes were proposed in this pull request?

The current implementation of AES-CBC mode, called via aes_encrypt and aes_decrypt, uses a key derivation function (KDF) based on OpenSSL's EVP_BytesToKey. That function is intended for deriving keys from passwords, and OpenSSL's documentation discourages its use: "Newer applications should use a more modern algorithm".
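
For context, the derivation that OpenSSL documents for EVP_BytesToKey can be sketched roughly as follows. This is a minimal illustration of the documented algorithm, not Spark's code: the class name `EvpBytesToKeySketch` and its method are hypothetical, and the digest, salt, and iteration count are left as parameters because the description above does not say which values the earlier implementation used.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration of the EVP_BytesToKey scheme; not Spark's implementation.
final class EvpBytesToKeySketch {

    // Returns { key, iv }. Blocks are chained as
    //   D_1 = H^count(password || salt), D_i = H^count(D_{i-1} || password || salt)
    // and the key and IV are cut from D_1 || D_2 || ... in that order.
    static byte[][] derive(String digestName, byte[] password, byte[] salt,
                           int count, int keyLen, int ivLen) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance(digestName);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unknown digest: " + digestName, e);
        }
        byte[] material = new byte[keyLen + ivLen];
        byte[] block = new byte[0]; // D_0 is empty
        int filled = 0;
        while (filled < material.length) {
            md.reset();
            md.update(block);
            md.update(password);
            md.update(salt);
            block = md.digest();
            // Extra hashing rounds only when count > 1.
            for (int i = 1; i < count; i++) {
                block = md.digest(block);
            }
            int n = Math.min(block.length, material.length - filled);
            System.arraycopy(block, 0, material, filled, n);
            filled += n;
        }
        return new byte[][] {
            Arrays.copyOfRange(material, 0, keyLen),
            Arrays.copyOfRange(material, keyLen, keyLen + ivLen)
        };
    }
}
```

With a count of 1 the key is a single hash of the password and salt, which is cheap to brute-force for low-entropy inputs and adds nothing when the caller already supplies a full-strength key.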

aes_encrypt and aes_decrypt should use the key directly in CBC mode, as they do for both GCM and ECB modes. The output should then be the initialization vector (IV) prepended to the ciphertext, as is done with GCM mode:
[16-byte randomly generated IV | AES-CBC encrypted ciphertext]
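
The proposed layout could be produced and consumed roughly as in the sketch below, which uses the JDK's javax.crypto API. It is an illustration under assumptions rather than the patch itself: the class `AesCbcIvPrefixSketch` and its methods are hypothetical names, and PKCS5Padding is assumed because the description does not state the padding scheme.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

// Hypothetical helper illustrating the [IV | ciphertext] layout; not Spark's ExpressionImplUtils.
final class AesCbcIvPrefixSketch {
    private static final int IV_LEN = 16; // AES block size in bytes
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding"; // padding is an assumption
    private static final SecureRandom RANDOM = new SecureRandom();

    // Encrypts with the caller's key used directly (16, 24, or 32 bytes), no KDF.
    // Returns [16-byte random IV | AES-CBC ciphertext].
    static byte[] encrypt(byte[] plaintext, byte[] key) {
        byte[] iv = new byte[IV_LEN];
        RANDOM.nextBytes(iv);
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            byte[] ciphertext = cipher.doFinal(plaintext);
            byte[] out = new byte[IV_LEN + ciphertext.length];
            System.arraycopy(iv, 0, out, 0, IV_LEN);
            System.arraycopy(ciphertext, 0, out, IV_LEN, ciphertext.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES-CBC encryption failed", e);
        }
    }

    // Splits the leading 16 bytes off as the IV and decrypts the remainder.
    static byte[] decrypt(byte[] input, byte[] key) {
        if (input.length < IV_LEN) {
            throw new IllegalArgumentException("input is shorter than the IV");
        }
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(input, 0, IV_LEN));
            return cipher.doFinal(input, IV_LEN, input.length - IV_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES-CBC decryption failed", e);
        }
    }
}
```

Because the IV is fresh for every call, encrypting the same plaintext twice with the same key yields different outputs, and the decryptor needs nothing beyond the key and the output bytes.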

Why are the changes needed?

We want the ciphertext output to have a consistent layout across modes. OpenSSL's EVP_BytesToKey is effectively deprecated, and OpenSSL's own documentation advises against using it. Instead, CBC mode will generate a random IV.

Does this PR introduce any user-facing change?

AES-CBC output generated in the previous format will be incompatible with this change. The previous format landed only recently, and we want to land this change before CBC mode is used in practice.

How was this patch tested?

A new unit test was added to DataFrameFunctionsSuite to test both GCM and CBC modes. A new standalone unit test suite, ExpressionImplUtilsSuite, was also added to test all the modes and various key lengths.

CBC values can be verified with openssl enc using a command of the following form, shown first as a template and then with concrete values:

echo -n "[INPUT]" | openssl enc -a -e -aes-256-cbc -iv [HEX IV] -K [HEX KEY]
echo -n "Spark" | openssl enc -a -e -aes-256-cbc -iv f8c832cc9c61bac6151960a58e4edf86 -K 6162636465666768696a6b6c6d6e6f7031323334353637384142434445464748
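
The same check can be approximated on the JVM side. The sketch below encrypts with a fixed IV and key given as hex and Base64-encodes the ciphertext only (no IV prefix), which is what openssl enc emits when -K and -iv are supplied. The class `OpensslEncCrossCheck` and its methods are hypothetical names, and PKCS5Padding is assumed as the counterpart of openssl's default block padding.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;

// Hypothetical cross-check helper; a fixed IV is for test vectors only, never for production use.
final class OpensslEncCrossCheck {

    // Decodes a hex string such as "f8c8..." into bytes.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Base64 of the AES-CBC ciphertext of `input` under the given hex IV and hex key.
    static String encryptBase64(String input, String hexIv, String hexKey) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding"); // padding is an assumption
            cipher.init(Cipher.ENCRYPT_MODE,
                new SecretKeySpec(fromHex(hexKey), "AES"),
                new IvParameterSpec(fromHex(hexIv)));
            byte[] ciphertext = cipher.doFinal(input.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES-CBC encryption failed", e);
        }
    }
}
```

Calling `OpensslEncCrossCheck.encryptBase64("Spark", "f8c832cc9c61bac6151960a58e4edf86", "6162636465666768696a6b6c6d6e6f7031323334353637384142434445464748")` should produce the same Base64 string as the second openssl command above, apart from the trailing newline that openssl prints.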

github-actions bot added the SQL label Apr 21, 2023

sweisdb commented Apr 26, 2023

Closing this; it will be replaced by SPARK-43286 and SPARK-43290.

sweisdb closed this Apr 26, 2023
sweisdb deleted the SPARK-NNNNN branch April 26, 2023 22:49