New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use _binary prefix for bytes/bytearray parameters #106

Merged
merged 1 commit into from Sep 11, 2016

Conversation

Projects
None yet
7 participants
@vtermanis
Copy link
Contributor

vtermanis commented Jul 27, 2016

PyMySQL already does this via escape_bytes() converters.py. For compatibility maybe this is a good idea? (I think this would behave the same as with PyMySQL in that nothing it only affects bytes~~/bytearray~~ + Python 3 and bytearray for both versions.)

I think it's not possible to add to Connection.encoders (via constructor) because e.g. self.encoders[bytes] is overridden after the conv argument is parsed.

@methane

This comment has been minimized.

Copy link
Member

methane commented Jul 28, 2016

It's not compatible with PyMySQL.
In PyMySQL, bytearray has _binary prefix on Python 2 too.

@vtermanis vtermanis force-pushed the vtermanis:bytes_binary_prefix branch from 8ab1e6f to 121af51 Jul 28, 2016

@vtermanis vtermanis changed the title Use _binary prefix for bytes/bytearray parameters (Python 3) Use _binary prefix for bytes/bytearray parameters Jul 28, 2016

@vtermanis

This comment has been minimized.

Copy link
Contributor

vtermanis commented Jul 28, 2016

Ah yes, you're right! How about now?

@vtermanis

This comment has been minimized.

Copy link
Contributor

vtermanis commented Jul 29, 2016

Sorry, I don't quite follow: Are you saying the Binary(x) function should also be available in MySQLdb, e.g. put here?
Is there something else I've missed?

@methane

This comment has been minimized.

Copy link
Member

methane commented Jul 29, 2016

First of all, module.Binary(str) is defined in PEP 249.
https://www.python.org/dev/peps/pep-0249/#binary

So, cur.execute("INSRET INTO blob_data (data) VALUES (%s)", (MySQLdb.Binary(x),)) should insert binary data without any warnings.

Currently, MySQLdb.Binary() is just calls bytes(). To add _binary prefix, it should be bytearray() on PY2.

@vtermanis vtermanis force-pushed the vtermanis:bytes_binary_prefix branch from 121af51 to ac63076 Jul 29, 2016

@vtermanis

This comment has been minimized.

Copy link
Contributor

vtermanis commented Jul 29, 2016

Ah ok, I hadn't realised Binary was part of the PEP. It makes sense now.

Edit: Forgot to say I've updated the commit to include the Binary() change - hopefully this is now compatible.

@ushuz

This comment has been minimized.

Copy link

ushuz commented Sep 9, 2016

what's the status of this PR? we encountered lots of warnings after upgrading MySQL, this cloud fix it

ping @methane

@methane methane merged commit 2617620 into PyMySQL:master Sep 11, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@methane

This comment has been minimized.

Copy link
Member

methane commented Sep 11, 2016

Sorry. I'm very busy for other projects. (like new dict impl in Python 3.6).

@vtermanis vtermanis deleted the vtermanis:bytes_binary_prefix branch Sep 12, 2016

@timgraham

This comment has been minimized.

Copy link

timgraham commented Sep 29, 2016

It seems this is backwards-incompatible or buggy. It broke some tests for Django on Python 3.

edit: I found this is caused by Django's redundant handling of the issue: django/django#7324.

@claudep

This comment has been minimized.

Copy link

claudep commented Sep 30, 2016

I have created issue #123 to track the regression.

@claudep

This comment has been minimized.

Copy link

claudep commented Sep 30, 2016

Apart from the regression tracked in #123, this commit is suddenly breaking all client apps who were adding the binary_ prefix themselves until now.
For those who were adding the prefix in the args parameters of execute, it should be possible to test for an existing prefix in _get_bytes_literal to keep backwards compatibility.
However for those (like Django) who were adding the prefix in the query part, the compatibility path is more difficult to achieve (if possible at all).

@methane

This comment has been minimized.

Copy link
Member

methane commented Sep 30, 2016

For those who were adding the prefix in the args parameters of execute

It can't possible.

bindata = b'xyzzy'
cursor.execute("SELECT %s", (b'_binary' + bindata))  # => SELECT "_binaryxyzzy"
@methane

This comment has been minimized.

Copy link
Member

methane commented Sep 30, 2016

However for those (like Django) who were adding the prefix in the query part, the compatibility path is more difficult to achieve (if possible at all).

Perfect compatibility is not possible, I think.

I want to consult with developer of SQLAlchemy (@zzzeek) and Django ORM (who?)

@methane

This comment has been minimized.

Copy link
Member

methane commented Sep 30, 2016

I'm thinking about revert this and release 1.3.9 soon.
This type of change should not be included in micro version. 1.4.0 or 2.0.0 maybe.

@claudep

This comment has been minimized.

Copy link

claudep commented Sep 30, 2016

This would be much appreciated (from Django's point of view).
A compatibility plan would still be great for the future, like for example a connection property telling if _binary is automatically prepended or not, so we can check that property now when preparing a query.

@vtermanis

This comment has been minimized.

Copy link
Contributor

vtermanis commented Sep 30, 2016

Sorry - I originally suggested this change so am responsible for it not working properly - many apologies!

I personally think that clients should not themselves add the _binary. So, would #124 not provide a valid solution (based on what @claudep suggested in #123) since it now should be in line with what PyMySQL does?

@claudep

This comment has been minimized.

Copy link

claudep commented Sep 30, 2016

Vilnis, that would only fix one part of the regression. In Django, we are currently adding the _binary prefix ourselves when appropriate, so with 1.3.8 it suddenly breaks with a double _binary.

@methane

This comment has been minimized.

Copy link
Member

methane commented Sep 30, 2016

I personally think that clients should not themselves add the _binary.

I don't have right answer about this.
But there are some person (or frameworks like Django) adding _binary prefix in query already.
micro version should not break backward compatibility for such users.

@vtermanis

This comment has been minimized.

Copy link
Contributor

vtermanis commented Sep 30, 2016

so with 1.3.8 it suddenly breaks with a double _binary.

Could new versions of Django not require a minimum mysqlclient version to address this?

Anyway, I do understand that of course it should be backwards compatible. Please do feel free to revert. (At the time of me suggesting the change I was trying to make PyMySQL and mysqlclient be more compatible with each other, that's all. (I assume Django doesn't have PyMySQL as a library option?)

@claudep

This comment has been minimized.

Copy link

claudep commented Sep 30, 2016

If you set now a property like connection.auto_prepend_binary_prefix = False, we can change our code to conditionally adding the prefix depending on that property. Then when you automatically add the prefix in a later major revision (1.4 or 2.0), our code will continue to work.

@claudep

This comment has been minimized.

Copy link

claudep commented Sep 30, 2016

Could new versions of django not require a minimum mysqlclient version to address this?

We could do that for new versions, but all older projects will break if they set mysqlclient >= 1.3.7 in their requirements, for example.

@vtermanis

This comment has been minimized.

Copy link
Contributor

vtermanis commented Sep 30, 2016

Ah yes, I see now - that all makes sense. I'll shut up and let @methane handle this since I'm just a minor contributor.

.. if everyone likes the new property approach (with default being old behaviour), as suggested by @claudep, I'm happy to turn it into a review-able pull request. (It might be nice if users from #81 and #106 could at least choose to continue to use the new behaviour, in line with PyMySQL)

@zzzeek

This comment has been minimized.

Copy link

zzzeek commented Sep 30, 2016

I'm not sure just by looking at this PR because I don't know the internal flow of these symbols, on the outside we only refer to dbapi.Binary() when we are passing in a binary value to a binary data type. Someone would have to run SQLA's test suite against this version to know for sure, i may or may not have time to do that today.

@methane

This comment has been minimized.

Copy link
Member

methane commented Sep 30, 2016

If you set now a property like connection.auto_prepend_binary_prefix = False, we can change our code to conditionally adding the prefix depending on that property. Then when you automatically add the prefix in a later major revision (1.4 or 2.0), our code will continue to work.

While I don't like options, I agree that adding binary_prefix=False option.

But for now, to fix backward incompatibility quickly, I just revert prefix and released 1.3.9.

@timgraham

This comment has been minimized.

Copy link

timgraham commented Sep 30, 2016

Thank you for resolving it quickly.

@benhoyt

This comment has been minimized.

Copy link

benhoyt commented Oct 20, 2016

@vtermanis It'd be great to re-open this with the right fix. I think mysqlclient should do it automatically, but I get the backwards-compatibility concerns, so makes sense to have a binary_prefix config option. Just for reference (for others who come here), I'm working around it in my SQLAlchemy/mysqlclient application with the following SQLAlchemy event hook to override the connection's encoders[bytes] value (this is Python 3-only code).

def bytes_encoder(b, dummy=None):
    return b'_binary' + string_literal(b)

def str_encoder(s, dummy=None):
    return string_literal(str(s).encode('utf8'))

@sqlalchemy.event.listens_for(Pool, 'connect')
def set_encoders(dbapi_conn, conn_record):
    dbapi_conn.encoders[bytes] = bytes_encoder
    dbapi_conn.encoders[bytearray] = bytes_encoder
    dbapi_conn.encoders[str] = str_encoder
@vtermanis

This comment has been minimized.

Copy link
Contributor

vtermanis commented Oct 21, 2016

I'm happy to make another attempt with config option, unless someone devs/maintainers of this project would prefer me not to. @methane: Should I:

  1. Make a new pull request for this or
  2. Use this pull request and add additional commit(s) or
  3. Squash commits in this pull request with changes

vtermanis added a commit to vtermanis/mysqlclient-python that referenced this pull request Jan 7, 2017

Use _binary prefix for bytes/bytearray parameters
- Based on PyMySQL#106 but now disabled by default
- Can be enabled via 'binary_prefix' connection parameter
- Added unit tests to verify behaviour

fix falsely prefixing strings with _binary type identifier (PyMySQL#123)

vtermanis added a commit to vtermanis/mysqlclient-python that referenced this pull request Jan 7, 2017

Use _binary prefix for bytes/bytearray parameters
- Based on PyMySQL#106 but now disabled by default
- Can be enabled via 'binary_prefix' connection parameter
- Added unit tests to verify behaviour

methane added a commit that referenced this pull request Feb 10, 2017

Use _binary prefix for bytes/bytearray parameters (#140)
- Based on #106 but now disabled by default
- Can be enabled via 'binary_prefix' connection parameter
- Added unit tests to verify behaviour
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment