New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support encrypting binary columns #50920
Conversation
activerecord/lib/active_record/encryption/encrypted_attribute_type.rb
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great one @djmb. Left a minor suggestion for your consideration.
I'd also link this PR with the PostgreSQL issue. Will this fix it out of the box?
activerecord/lib/active_record/encryption/encrypted_attribute_type.rb
Outdated
Show resolved
Hide resolved
1b296d6
to
16330f1
Compare
Thanks @jorgemanrubia! There isn't a Rails issue for this - one was raised on the SolidCache repo though (rails/solid_cache#102) |
@@ -150,6 +158,14 @@ def decryption_options | |||
def clean_text_scheme | |||
@clean_text_scheme ||= ActiveRecord::Encryption::Scheme.new(downcase: downcase?, encryptor: ActiveRecord::Encryption::NullEncryptor.new) | |||
end | |||
|
|||
def reserialize(value) | |||
if cast_type.binary? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the lone case needing reserialization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Text and binary columns are the only ones able to store the encrypted JSON blobs.
I checked whether we could always serialize here instead, in the hope that it would be a no-op for String columns, but type serialization is much more complicated than that - it might be converting serialized or store values for instance.
I've changed this in 38b7df7, now we just wrap strings as Binary::Data, as that's actually reversing what the encryptor does. I've also renamed the method as "serialize" is overloaded with meanings here.
ActiveRecord Encryption doesn't prevent you from encrypting binary columns but it doesn't have proper support for it either. When the data is fed through encrypt/decrypt it is converted to a String. This means that the the encryption layer is not transparent to binary data - which should be passed as Type::Binary::Data. As a result the data is not properly escaped in the SQL queries or deserialized correctly after decryption. However it just happens to work fine for MySQL and SQLite because the MessageSerializer doesn't use any characters that need to be encoded. However if you try to use a custom serializer that does then it breaks. PostgreSQL on the other hand does not work - because the Bytea type is passed a String rather than a Type::Binary::Data to deserialize, it attempts to unescape the data and either mangles it or raises an error if it contains null bytes. The commit fixes the issue, by reserializing the data after encryption and decryption. For text data that's a no-op, but for binary data we'll convert it back to a Type::Binary::Data.
684d86f
to
085af11
Compare
Calling `serialize` is not always possible, because the column type might not expect to be serializing a String, for example when declared as serialzed or store attribute. With binary data the encryptor was passed an `ActiveModel::Type::Binary::Data`` and returned a `String``. In order to remain transparent we need to turn the data back into a `ActiveModel::Type::Binary::Data` before passing it on. We'll also rename `serialize`` to `text_to_database_type` to be a bit more descriptive.
085af11
to
38b7df7
Compare
* Support encrypting binary columns ActiveRecord Encryption doesn't prevent you from encrypting binary columns but it doesn't have proper support for it either. When the data is fed through encrypt/decrypt it is converted to a String. This means that the the encryption layer is not transparent to binary data - which should be passed as Type::Binary::Data. As a result the data is not properly escaped in the SQL queries or deserialized correctly after decryption. However it just happens to work fine for MySQL and SQLite because the MessageSerializer doesn't use any characters that need to be encoded. However if you try to use a custom serializer that does then it breaks. PostgreSQL on the other hand does not work - because the Bytea type is passed a String rather than a Type::Binary::Data to deserialize, it attempts to unescape the data and either mangles it or raises an error if it contains null bytes. The commit fixes the issue, by reserializing the data after encryption and decryption. For text data that's a no-op, but for binary data we'll convert it back to a Type::Binary::Data. * Extract decrypt_as_text/encrypt_as_text * Handle serialized binary data in encrypted columns Calling `serialize` is not always possible, because the column type might not expect to be serializing a String, for example when declared as serialzed or store attribute. With binary data the encryptor was passed an `ActiveModel::Type::Binary::Data`` and returned a `String``. In order to remain transparent we need to turn the data back into a `ActiveModel::Type::Binary::Data` before passing it on. We'll also rename `serialize`` to `text_to_database_type` to be a bit more descriptive.
* Support encrypting binary columns ActiveRecord Encryption doesn't prevent you from encrypting binary columns but it doesn't have proper support for it either. When the data is fed through encrypt/decrypt it is converted to a String. This means that the the encryption layer is not transparent to binary data - which should be passed as Type::Binary::Data. As a result the data is not properly escaped in the SQL queries or deserialized correctly after decryption. However it just happens to work fine for MySQL and SQLite because the MessageSerializer doesn't use any characters that need to be encoded. However if you try to use a custom serializer that does then it breaks. PostgreSQL on the other hand does not work - because the Bytea type is passed a String rather than a Type::Binary::Data to deserialize, it attempts to unescape the data and either mangles it or raises an error if it contains null bytes. The commit fixes the issue, by reserializing the data after encryption and decryption. For text data that's a no-op, but for binary data we'll convert it back to a Type::Binary::Data. * Extract decrypt_as_text/encrypt_as_text * Handle serialized binary data in encrypted columns Calling `serialize` is not always possible, because the column type might not expect to be serializing a String, for example when declared as serialzed or store attribute. With binary data the encryptor was passed an `ActiveModel::Type::Binary::Data`` and returned a `String``. In order to remain transparent we need to turn the data back into a `ActiveModel::Type::Binary::Data` before passing it on. We'll also rename `serialize`` to `text_to_database_type` to be a bit more descriptive.
Motivation / Background
ActiveRecord Encryption doesn't prevent you from encrypting binary columns but it doesn't have proper support for it either.
When the data is fed through encrypt/decrypt it is converted to a String. This means that the the encryption layer is not transparent to binary data - which should be passed as Type::Binary::Data.
As a result the data is not properly escaped in the SQL queries or deserialized correctly after decryption.
However it just happens to work fine for MySQL and SQLite because the MessageSerializer doesn't use any characters that need to be encoded. However if you try to use a custom serializer that does then it breaks.
PostgreSQL on the other hand does not work - because the Bytea type is passed a String rather than a Type::Binary::Data to deserialize, it attempts to unescape the data and either mangles it or raises an error if it contains null bytes.
Detail
The commit fixes the issue, by reserializing the data after encryption and decryption. For text data that's a no-op, but for binary data we'll convert it back to a Type::Binary::Data.