Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support encrypting binary columns #50920

Merged
merged 3 commits into from Feb 15, 2024
Merged

Conversation

djmb
Copy link
Contributor

@djmb djmb commented Jan 30, 2024

Motivation / Background

ActiveRecord Encryption doesn't prevent you from encrypting binary columns but it doesn't have proper support for it either.

When the data is fed through encrypt/decrypt it is converted to a String. This means that the the encryption layer is not transparent to binary data - which should be passed as Type::Binary::Data.

As a result the data is not properly escaped in the SQL queries or deserialized correctly after decryption.

However it just happens to work fine for MySQL and SQLite because the MessageSerializer doesn't use any characters that need to be encoded. However if you try to use a custom serializer that does then it breaks.

PostgreSQL on the other hand does not work - because the Bytea type is passed a String rather than a Type::Binary::Data to deserialize, it attempts to unescape the data and either mangles it or raises an error if it contains null bytes.

Detail

The commit fixes the issue, by reserializing the data after encryption and decryption. For text data that's a no-op, but for binary data we'll convert it back to a Type::Binary::Data.

Copy link
Contributor

@jorgemanrubia jorgemanrubia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great one @djmb. Left a minor suggestion for your consideration.

I'd also link this PR with the PostgreSQL issue. Will this fix it out of the box?

@djmb djmb force-pushed the encrypt-binary-columns branch 2 times, most recently from 1b296d6 to 16330f1 Compare February 2, 2024 13:04
@djmb
Copy link
Contributor Author

djmb commented Feb 2, 2024

Thanks @jorgemanrubia!

There isn't a Rails issue for this - one was raised on the SolidCache repo though (rails/solid_cache#102)

@@ -150,6 +158,14 @@ def decryption_options
def clean_text_scheme
@clean_text_scheme ||= ActiveRecord::Encryption::Scheme.new(downcase: downcase?, encryptor: ActiveRecord::Encryption::NullEncryptor.new)
end

def reserialize(value)
if cast_type.binary?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the lone case needing reserialization?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Text and binary columns are the only ones able to store the encrypted JSON blobs.

I checked whether we could always serialize here instead, in the hope that it would be a no-op for String columns, but type serialization is much more complicated than that - it might be converting serialized or store values for instance.

I've changed this in 38b7df7, now we just wrap strings as Binary::Data, as that's actually reversing what the encryptor does. I've also renamed the method as "serialize" is overloaded with meanings here.

ActiveRecord Encryption doesn't prevent you from encrypting binary
columns but it doesn't have proper support for it either.

When the data is fed through encrypt/decrypt it is converted to a
String. This means that the the encryption layer is not transparent
to binary data - which should be passed as Type::Binary::Data.

As a result the data is not properly escaped in the SQL queries or
deserialized correctly after decryption.

However it just happens to work fine for MySQL and SQLite because the
MessageSerializer doesn't use any characters that need to be encoded.
However if you try to use a custom serializer that does then it breaks.

PostgreSQL on the other hand does not work - because the Bytea type
is passed a String rather than a Type::Binary::Data to deserialize, it
attempts to unescape the data and either mangles it or raises an error
if it contains null bytes.

The commit fixes the issue, by reserializing the data after
encryption and decryption. For text data that's a no-op, but for binary
data we'll convert it back to a Type::Binary::Data.
Calling `serialize` is not always possible, because the column type
might not expect to be serializing a String, for example when declared
as serialzed or store attribute.

With binary data the encryptor was passed an
`ActiveModel::Type::Binary::Data`` and returned a `String``. In order to
remain transparent we need to turn the data back into a
`ActiveModel::Type::Binary::Data` before passing it on.

We'll also rename `serialize`` to `text_to_database_type` to be a bit
more descriptive.
@jeremy jeremy merged commit 7d4a39c into rails:main Feb 15, 2024
4 checks passed
@jeremy jeremy added this to the 7.2.0 milestone Feb 15, 2024
Ridhwana pushed a commit to Ridhwana/rails that referenced this pull request Mar 7, 2024
* Support encrypting binary columns

ActiveRecord Encryption doesn't prevent you from encrypting binary
columns but it doesn't have proper support for it either.

When the data is fed through encrypt/decrypt it is converted to a
String. This means that the the encryption layer is not transparent
to binary data - which should be passed as Type::Binary::Data.

As a result the data is not properly escaped in the SQL queries or
deserialized correctly after decryption.

However it just happens to work fine for MySQL and SQLite because the
MessageSerializer doesn't use any characters that need to be encoded.
However if you try to use a custom serializer that does then it breaks.

PostgreSQL on the other hand does not work - because the Bytea type
is passed a String rather than a Type::Binary::Data to deserialize, it
attempts to unescape the data and either mangles it or raises an error
if it contains null bytes.

The commit fixes the issue, by reserializing the data after
encryption and decryption. For text data that's a no-op, but for binary
data we'll convert it back to a Type::Binary::Data.

* Extract decrypt_as_text/encrypt_as_text

* Handle serialized binary data in encrypted columns

Calling `serialize` is not always possible, because the column type
might not expect to be serializing a String, for example when declared
as serialzed or store attribute.

With binary data the encryptor was passed an
`ActiveModel::Type::Binary::Data`` and returned a `String``. In order to
remain transparent we need to turn the data back into a
`ActiveModel::Type::Binary::Data` before passing it on.

We'll also rename `serialize`` to `text_to_database_type` to be a bit
more descriptive.
viralpraxis pushed a commit to viralpraxis/rails that referenced this pull request Mar 24, 2024
* Support encrypting binary columns

ActiveRecord Encryption doesn't prevent you from encrypting binary
columns but it doesn't have proper support for it either.

When the data is fed through encrypt/decrypt it is converted to a
String. This means that the the encryption layer is not transparent
to binary data - which should be passed as Type::Binary::Data.

As a result the data is not properly escaped in the SQL queries or
deserialized correctly after decryption.

However it just happens to work fine for MySQL and SQLite because the
MessageSerializer doesn't use any characters that need to be encoded.
However if you try to use a custom serializer that does then it breaks.

PostgreSQL on the other hand does not work - because the Bytea type
is passed a String rather than a Type::Binary::Data to deserialize, it
attempts to unescape the data and either mangles it or raises an error
if it contains null bytes.

The commit fixes the issue, by reserializing the data after
encryption and decryption. For text data that's a no-op, but for binary
data we'll convert it back to a Type::Binary::Data.

* Extract decrypt_as_text/encrypt_as_text

* Handle serialized binary data in encrypted columns

Calling `serialize` is not always possible, because the column type
might not expect to be serializing a String, for example when declared
as serialzed or store attribute.

With binary data the encryptor was passed an
`ActiveModel::Type::Binary::Data`` and returned a `String``. In order to
remain transparent we need to turn the data back into a
`ActiveModel::Type::Binary::Data` before passing it on.

We'll also rename `serialize`` to `text_to_database_type` to be a bit
more descriptive.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants