[SPARK-43038][SQL] Support the CBC mode by `aes_encrypt()`/`aes_decrypt()`

### What changes were proposed in this pull request?
In the PR, I propose a new AES mode for the `aes_encrypt()`/`aes_decrypt()` functions - `CBC` ([Cipher Block Chaining](https://www.ibm.com/docs/en/linux-on-systems?topic=operation-cipher-block-chaining-cbc-mode)) with the padding `PKCS7(5)`. The `aes_encrypt()` function returns a binary value which consists of the following fields:
1. The salt magic prefix `Salted__` with the length of 8 bytes.
2. A salt generated for every `aes_encrypt()` call using `java.security.SecureRandom`. Its length is 8 bytes.
3. The encrypted input.

The encrypt function derives the secret key and initialization vector (16 bytes) from the salt and user's key using the same algorithm as OpenSSL's `EVP_BytesToKey()` (versions >= 1.1.0c).

The `aes_decrypt()` function assumes that its input has the fields shown above.

For example:
```sql
spark-sql> SELECT base64(aes_encrypt('Apache Spark', '0000111122223333', 'CBC', 'PKCS'));
U2FsdGVkX1/ERGxwEOTDpDD4bQvDtQaNe+gXGudCcUk=
spark-sql> SELECT aes_decrypt(unbase64('U2FsdGVkX1/ERGxwEOTDpDD4bQvDtQaNe+gXGudCcUk='), '0000111122223333', 'CBC', 'PKCS');
Apache Spark
```

### Why are the changes needed?
To achieve feature parity with other systems/frameworks, and make the migration process from them to Spark SQL easier. For example, the `CBC` mode is supported by:
- BigQuery: https://cloud.google.com/bigquery/docs/reference/standard-sql/aead-encryption-concepts#block_cipher_modes
- Snowflake: https://docs.snowflake.com/en/sql-reference/functions/encrypt.html

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
By running new checks:
```
$ build/sbt "sql/testOnly *QueryExecutionErrorsSuite"
$ build/sbt "sql/test:testOnly org.apache.spark.sql.expressions.ExpressionInfoSuite"
$ build/sbt "test:testOnly org.apache.spark.sql.MiscFunctionsSuite"
$ build/sbt "core/testOnly *SparkThrowableSuite"
```
and checked compatibility with LibreSSL/OpenSSL:
```
$ openssl version
LibreSSL 3.3.6
$ echo -n 'Apache Spark' | openssl enc -e -aes-128-cbc -pass pass:0000111122223333 -a
U2FsdGVkX1+5GyAmmG7wDWWDBAuUuxjMy++cMFytpls=
```
```sql
spark-sql (default)> SELECT aes_decrypt(unbase64('U2FsdGVkX1+5GyAmmG7wDWWDBAuUuxjMy++cMFytpls='), '0000111122223333', 'CBC');
Apache Spark
```
and decrypted Spark's output with OpenSSL:
```sql
spark-sql (default)> SELECT base64(aes_encrypt('Apache Spark', 'abcdefghijklmnop12345678ABCDEFGH', 'CBC', 'PKCS'));
U2FsdGVkX1+maU2vmxrulgxXuQSyZ3ODnlHKqnt2fDA=
```
```
$ echo 'U2FsdGVkX1+maU2vmxrulgxXuQSyZ3ODnlHKqnt2fDA=' | openssl aes-256-cbc -a -d -pass pass:abcdefghijklmnop12345678ABCDEFGH
Apache Spark
```

Closes #40704 from MaxGekk/aes-cbc.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
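
The output layout described under "What changes were proposed" (magic prefix, salt, ciphertext) can be illustrated with a small standalone sketch. This is not the code from this commit: the class and method names are hypothetical, and it assumes that the `EVP_BytesToKey()`-style derivation uses SHA-256 with a single iteration and that the derived AES key has the same length as the user's key (16, 24 or 32 bytes).

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical illustration only; names and derivation details are assumptions.
final class AesCbcSketch {
    static final byte[] MAGIC = "Salted__".getBytes(StandardCharsets.US_ASCII); // 8 bytes
    static final int SALT_LEN = 8;
    static final int IV_LEN = 16;

    // Assumed derivation: D_i = SHA-256(D_{i-1} || userKey || salt), one iteration,
    // blocks concatenated until keyLen + IV_LEN bytes are available.
    static byte[] deriveKeyAndIv(byte[] userKey, byte[] salt, int keyLen)
            throws GeneralSecurityException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] out = new byte[keyLen + IV_LEN];
        byte[] block = new byte[0];
        int filled = 0;
        while (filled < out.length) {
            md.update(block);
            md.update(userKey);
            md.update(salt);
            block = md.digest(); // digest() also resets the MessageDigest
            int n = Math.min(block.length, out.length - filled);
            System.arraycopy(block, 0, out, filled, n);
            filled += n;
        }
        return out;
    }

    private static Cipher cipher(int opmode, byte[] userKey, byte[] salt)
            throws GeneralSecurityException {
        byte[] keyAndIv = deriveKeyAndIv(userKey, salt, userKey.length);
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(opmode,
            new SecretKeySpec(keyAndIv, 0, userKey.length, "AES"),
            new IvParameterSpec(keyAndIv, userKey.length, IV_LEN));
        return c;
    }

    // Returns: "Salted__" (8 bytes) || salt (8 bytes) || encrypted input.
    static byte[] encrypt(byte[] input, byte[] userKey) throws GeneralSecurityException {
        byte[] salt = new byte[SALT_LEN];
        new SecureRandom().nextBytes(salt); // fresh salt per call
        byte[] body = cipher(Cipher.ENCRYPT_MODE, userKey, salt).doFinal(input);
        byte[] out = new byte[MAGIC.length + SALT_LEN + body.length];
        System.arraycopy(MAGIC, 0, out, 0, MAGIC.length);
        System.arraycopy(salt, 0, out, MAGIC.length, SALT_LEN);
        System.arraycopy(body, 0, out, MAGIC.length + SALT_LEN, body.length);
        return out;
    }

    // Expects the layout produced by encrypt(); rejects input without the magic prefix.
    static byte[] decrypt(byte[] input, byte[] userKey) throws GeneralSecurityException {
        int header = MAGIC.length + SALT_LEN;
        if (input.length < header
                || !Arrays.equals(Arrays.copyOfRange(input, 0, MAGIC.length), MAGIC)) {
            throw new IllegalArgumentException("input does not start with Salted__");
        }
        byte[] salt = Arrays.copyOfRange(input, MAGIC.length, header);
        return cipher(Cipher.DECRYPT_MODE, userKey, salt)
            .doFinal(input, header, input.length - header);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "0000111122223333".getBytes(StandardCharsets.UTF_8);
        byte[] enc = encrypt("Apache Spark".getBytes(StandardCharsets.UTF_8), key);
        System.out.println(new String(decrypt(enc, key), StandardCharsets.UTF_8));
    }
}
```

Because the salt is random, two calls with the same input and key produce different outputs, which is why the base64 strings in the examples above differ between runs even for the same plaintext. For the 12-byte input `Apache Spark`, the result is 32 bytes: 8 for the prefix, 8 for the salt, and one padded 16-byte AES block.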
Showing 6 changed files with 141 additions and 25 deletions.