Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New enforcement of String over LargeBinary on EncryptedType breaks existing implementations #444

Open
leosussan opened this issue May 7, 2020 · 6 comments

Comments

@leosussan
Copy link

leosussan commented May 7, 2020

See my comment on #426, which is when this was introduced.

This seems to have broken things for me & probably will for others.

In my implementation (PostgreSQL 11, using FernetEngine), up to this point, encrypted values are stored as a bytea column, which is serialized into a bytes in Python prior to encryption. bytes has no .encode() function and subsequently breaks here.

This would conceivably break on any non-string object when extracted from the database (e.g. the sa.Unicode implementation in the docs for EncryptedType).

Moving .encode() to the str() step & removing from value fixes extraction.

        if isinstance(value, six.text_type):
            value = str(value).encode()
        decrypted = self.fernet.decrypt(value)

This doesn't solve the problem of encryption into the DB, which expects bytea, not a string.

@leosussan
Copy link
Author

leosussan commented May 11, 2020

Continuing the conversation from #426 (comment):

I personally believe that your change is a good one, for the reasons you've outlined when you started this PR. To add to this, at least in Postgres, bytea is generally less performative than text when storing encoded data.

But - at the very least, the documentation / changelog should make clear that the change will break existing implementation.

For the sake of continuity, here's my current workaround. It works by subclassing EncryptedType, redefining impl, and reversing the adjustments:

class CustomEncryptedType(EncryptedType):
    impl = LargeBinary

    def process_bind_param(self, value, dialect):
        value = super().process_bind_param(value=value, dialect=dialect)
        if isinstance(value, str):
            value: bytes = value.encode()
        return value

    def process_result_value(self, value, dialect):
        if isinstance(value, bytes):
            value: str = value.decode()
        value: Optional[Any] = super().process_result_value(
            value=value, dialect=dialect
        )
        return value

It might make sense to introduce this as a legacy option, e.g. LegacyEncryptedType, but urge users of EncryptedType to write a migration for their respective databases.

@villebro
Copy link

The change introduced by #426 breaks alembic migration code that works on MySQL on sqlalchemy-utils==0.36.5:

import sqlalchemy as sa
from alembic import op
from sqlalchemy_utils import EncryptedType

...

def upgrade():
    op.add_column(
        "dbs", sa.Column("password", EncryptedType(sa.String(1024)), nullable=True)
    )
  File "/opt/hostedtoolcache/Python/3.6.10/x64/lib/python3.6/site-packages/sqlalchemy/dialects/mysql/base.py", line 2047, in visit_VARCHAR
    "VARCHAR requires a length on dialect %s" % self.dialect.name
sqlalchemy.exc.CompileError: VARCHAR requires a length on dialect mysql

@instantlinux
Copy link

instantlinux commented May 18, 2020

Ditto this for me: this is a breaking change that would require a semver update and doc on what to do instead. I get the following stacktrace on a migration that works fine with 0.36.4 on MariaDB versions 10.3.x and 10.4.x:

Traceback (most recent call last):
  File "/home/richb/apicrud/python_env/lib/python3.6/site-packages/sqlalchemy/sql/compiler.py", line 2897, in visit_create_table
    create_column, first_pk=column.primary_key and not first_pk
  File "/home/richb/apicrud/python_env/lib/python3.6/site-packages/sqlalchemy/sql/compiler.py", line 350, in process
    return obj._compiler_dispatch(self, **kwargs)
  File "/home/richb/apicrud/python_env/lib/python3.6/site-packages/sqlalchemy/sql/visitors.py", line 95, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/richb/apicrud/python_env/lib/python3.6/site-packages/sqlalchemy/sql/compiler.py", line 2930, in visit_create_column
    text = self.get_column_specification(column, first_pk=first_pk)
  File "/home/richb/apicrud/python_env/lib/python3.6/site-packages/sqlalchemy/dialects/mysql/base.py", line 1588, in get_column_specification
    column.type, type_expression=column
  File "/home/richb/apicrud/python_env/lib/python3.6/site-packages/sqlalchemy/sql/compiler.py", line 400, in process
    return type_._compiler_dispatch(self, **kw)
  File "/home/richb/apicrud/python_env/lib/python3.6/site-packages/sqlalchemy/sql/visitors.py", line 95, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/richb/apicrud/python_env/lib/python3.6/site-packages/sqlalchemy/sql/compiler.py", line 3440, in visit_type_decorator
    return self.process(type_.type_engine(self.dialect), **kw)
  File "/home/richb/apicrud/python_env/lib/python3.6/site-packages/sqlalchemy/sql/compiler.py", line 400, in process
    return type_._compiler_dispatch(self, **kw)
  File "/home/richb/apicrud/python_env/lib/python3.6/site-packages/sqlalchemy/sql/visitors.py", line 95, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/richb/apicrud/python_env/lib/python3.6/site-packages/sqlalchemy/sql/compiler.py", line 3418, in visit_string
    return self.visit_VARCHAR(type_, **kw)
  File "/home/richb/apicrud/python_env/lib/python3.6/site-packages/sqlalchemy/dialects/mysql/base.py", line 2061, in visit_VARCHAR
    "VARCHAR requires a length on dialect %s" % self.dialect.name
sqlalchemy.exc.CompileError: VARCHAR requires a length on dialect mysql

The column that triggers this error is defined thus:

    sa.Column('password', EncryptedType(sa.String(length=77)),
              nullable=False),

Please revert the change, fix it in a way that doesn't break this straightforward case, or provide doc on what I need to change this to.

@villebro
Copy link

villebro commented May 18, 2020

I agree with @instantlinux , this really should be reverted in 0.36.x and reintroduced in 0.37, preferably with some additional updating notes.

@rushilsrivastava
Copy link
Contributor

rushilsrivastava commented May 18, 2020

Here to iterate the same thing as @instantlinux, this has broken our existing migrations and it would be great to have some updating notes before adding such a change. @villebro Have you had any luck fixing the migration problems for MySQL?

@Lawouach
Copy link

Lawouach commented May 25, 2020

Hello,

I was surprised by this change but I appreciate this was quickly addressed. However, 0.36.6 is still broken compared to 0.36.4 on my code base when using (PostgreSQL 11).

Reverting to 0.36.4 works fine.

So the question is, should I pin to 0.36.4 for now or was 0.36.6 meant to work?

Thanks :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants