Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider adding suport to MariaDB's official Connector #5459

Closed
honglei opened this issue Jul 14, 2020 · 22 comments
Closed

Consider adding suport to MariaDB's official Connector #5459

honglei opened this issue Jul 14, 2020 · 22 comments
Assignees
Labels
external driver issues the issue involves a misbehavior on the part of the DBAPI itself, probably not SQLAlchemy feature mariadb mysql PRs (with tests!) welcome a fix or feature which is appropriate to be implemented by volunteers
Milestone

Comments

@honglei
Copy link

honglei commented Jul 14, 2020

@honglei honglei added the requires triage New issue that requires categorization label Jul 14, 2020
@zzzeek
Copy link
Member

zzzeek commented Jul 14, 2020

hey there -

Is there some behavior or feature that the current mysqlclient does not work for? we've already had a very difficult time supporting MySQL-connector. what does the project gain by supporting another driver?

@zzzeek zzzeek added external driver issues the issue involves a misbehavior on the part of the DBAPI itself, probably not SQLAlchemy feature PRs (with tests!) welcome a fix or feature which is appropriate to be implemented by volunteers mysql and removed requires triage New issue that requires categorization labels Jul 14, 2020
@zzzeek zzzeek added this to the 1.x.xx milestone Jul 14, 2020
@zzzeek
Copy link
Member

zzzeek commented Jul 14, 2020

also it's likely this could be achieved with a simple subclass around the existing mysql-connector dialect. Feel free to provide a PR.

@honglei
Copy link
Author

honglei commented Jul 14, 2020

@CaselIT
Copy link
Member

CaselIT commented Jul 14, 2020

Judging by the note on the existing mysql connector, maybe it's more appropriate if an external dialect for mariadb and/or mysql connectors were created?

@zzzeek
Copy link
Member

zzzeek commented Jul 14, 2020

Maybe this can give some explain why they write a new connector:
https://mariadb.com/resources/blog/mariadb-connector-python-beta-now-available/

Benchmarks:
https://github.com/mariadb-corporation/mariadb-connector-python/tree/master/benchmarks

sure, they benched against mysql-connector, which in my experience is not a very good driver. Do they have benchmarks against mysqlclient ? this is the driver that everyone uses. This whole "mysql-connector-python" thing was not actually necessary except oracle / mariadb are trying to compete with each other.

@zzzeek
Copy link
Member

zzzeek commented Jul 14, 2020

Judging by the note on the existing mysql connector, maybe it's more appropriate if an external dialect for mariadb and/or mysql connectors were created?

I think this horse left the barn years ago and we're sort of stuck with them. I try every few years to get the test suite to pass for mysql-connector and while it improves, it still remains problematic.

@LinuxJedi
Copy link

LinuxJedi commented Jul 15, 2020

Hi,

To answer one of the questions here: Whilst for now the connectors are practically interchangeable, MySQL and MariaDB are heading in very different directions for protocol and API. MariaDB and MySQL themselves are no longer drop-in compatible and have not been for a while, they as wider projects are heading in very different directions with different target audiences.

If there is anything else you need help from us on please let me know.

Edit: we are tracking this here: https://jira.mariadb.org/browse/CONPY-77

(Andrew Hutchings, Engineering Manager for MariaDB Connectors)

@9EOR9
Copy link
Contributor

9EOR9 commented Jul 15, 2020

Attached is a benchmark mariadb vs. pymysql/mysqlclient (0.9.3)
pymysql_mariadb_bench.txt

It was running on my local machine, difference for bulk should be much higher when running client and server on different machines.

(Georg Richter, Author and Maintainer of MariaDB Connector/Python)

@zzzeek
Copy link
Member

zzzeek commented Jul 15, 2020

Hi,

To answer one of the questions here: Whilst for now the connectors are practically interchangeable, MySQL and MariaDB are heading in very different directions for protocol and API. MariaDB and MySQL themselves are no longer drop-in compatible and have not been for a while, they as wider projects are heading in very different directions with different target audiences.

I work for Red Hat with Openstack so I work with MariaDB extensively. Does MariaDB have on its roadmap that it would no longer be compatible with common Python MySQL drivers like mysqlclient and pymysql? If support for the latter driver is being dropped, then I would as part of my job take on getting this to be supported fully as we use pymysql right now. it would also impact other elements of RH Openstack as I assume we would no longer be able to deploy with eventlet monkeypatching, or otherwise mariadb-connector would need some "async" option that works with gevent/eventlet.

If there is anything else you need help from us on please let me know.

A short PR can provide a working version of this dialect pretty easily, it would be a subclass of one of the existing dialects as you see for mysql-connector (slightly longer as it subclasses the base dialect): https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/dialects/mysql/mysqlconnector.py or pymysql (very short because it acts the same as mysqlclient): https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/dialects/mysql/pymysql.py

the next step is to get tests to pass. For third party dialects, they need to pass a subset of tests known as the "dialect compliance suite" which is documented at https://github.com/sqlalchemy/sqlalchemy/blob/master/README.dialects.rst, however when the dialect is part of SQLAlchemy we at least run it on a larger subset of tests known as "backend", which covers the dialect compliance suite and then about 20% of the other tests for SQLAlchemy Core / ORM. Core/ORM tests that aren't part of "backend" are generally testing SQLAlchemy internals itself and are not exercising anything within a dialect that isn't covered by the "backend" subset.

Background on running the test suite is at https://github.com/sqlalchemy/sqlalchemy/blob/master/README.unittests.rst . The command for example to run against pymysql, only running the "backend" tests:

pytest --dburi "mysql+pymsql://user:pass@host/test?charset=utf8mb4" --backend-only -n4

Getting mysql-connector to pass tests ultimately was never completed because the driver was buggy for many years, and while they made improvements, my most recent attempts a couple of years ago I was still having issues with it crashing (e.g. at the C code level). So in the case of mariadb-connector, I hope the architecture is off to a better start, and certainly mariadb-connector would benefit also from getting SQLAlchemy's test suite to pass, not just because it grants SQLAlchemy compatibility but also because we have thousands of tests and we find bugs in DBAPIs all the time that then get fixed.

If you can clarify whether or not MariaDB is going to continue supporting the protocols that PyMySQL is using, or if the protocol changes will be small enough that PyMySQL will be able to accommodate or if indeed RHOS will have to change to this driver in any case, that would be helpful.

Edit: we are tracking this here: https://jira.mariadb.org/browse/CONPY-77

OK that is great!

@zzzeek
Copy link
Member

zzzeek commented Jul 15, 2020

Attached is a benchmark mariadb vs. pymysql/mysqlclient (0.9.3)
pymysql_mariadb_bench.txt

It was running on my local machine, difference for bulk should be much higher when running client and server on different machines.

OK there are two drivers that are part of the "pymysql" project, there is mysqlclient which is at version 2.0.1 right now and is native: https://pypi.org/project/mysqlclient/ it's a thin wrapper over the mysql native client library so it tends to be as fast as you can get, and there is pymysql which is at 0.9.3 and is pure Python. The latter is not actually very fast. it's popular because it can be monkeypatched to work with gevent/eventlet whereas the former cannot.

@gordthompson
Copy link
Member

PyMySQL may also gain some popularity because mysqlclient only has wheels for Python 3.6/7/8 on 64-bit Windows. Compiling it on Linux would not likely be a big deal — although I've never tried — but in any case lots of people might not want to be bothered if a SQLA-supported pure Python alternative is available (and they aren't aware of the performance difference).

Downloads last month (from pypistats.org):
pymysql -- 5,041,269
mysqlclient -- 2,439,657

@LinuxJedi
Copy link

@zzzeek We cannot guarantee that MySQL won't break protocol/API compatibility and we are not intending to support Protocol X. I don't envisage this breakage happening but any newer extensions either of us make will likely not be compatible with the other. You may already find issues with certain MySQL 8.0 auth types. I don't see us breaking protocol compatibility with at least the 5.x series where we can help it.

We do intend to maintain API compatibility with what is there now to aid with migration. But both MySQL and MariaDB are likely to make extensions (such as our bulk insert prepared statement API).

When I worked on OpenStack we were using pymysql for monkey patching so completely understand the popularity.

Thanks for all the info.

@9EOR9
Copy link
Contributor

9EOR9 commented Jul 24, 2020

I'm just playing with a prototype dialect (I added mariadb.py into dialect/mysql) and have a few quesitions:

  1. I wonder about the definition of server side cursors: In MariaDB Connector/Python you can have 3 types of cursors:
  • unbuffered: data is on the wire and will be read into memory when fetching
  • buffered: entire data is read into memory
  • server side cursor: server sends data for each fetch() command. This will allow you to use several cursors per connection.

According to the code in pymysql and MySQLdb the interpretation seems to be different, since they interpret a server side cursor as an unbuffered result set. So what is the correct behavior for SQLAlchemy?

  1. What is the correct syntax for running the dialect test? I tried
    pytest --dburi "mysql+mariadb://root:@127.0.0.1/test" ./test/dialect/mysql/test_dialect.py --backend-only but I'm getting 4 errors due to missing tables t and t_default.

@CaselIT
Copy link
Member

CaselIT commented Jul 24, 2020

I can only answer 2. Refer to README.unittests.rst for details regarding unittests. In general for mysql this command is required to setup the db environment

GRANT ALL ON *.* TO scott@'%'; -- only if using the default scott:tiger as logins
CREATE DATABASE test_schema CHARSET utf8mb4;
CREATE DATABASE test_schema_2 CHARSET utf8mb4;

I'll leave 1 to Mike

@zzzeek
Copy link
Member

zzzeek commented Jul 25, 2020

@9EOR9 SQLAlchemy does not expose the cursor explicitly so there's not much that happens with a so-called "server side" cursor other than there is a natural unbuffered effect with the rows themselves. since this is what the SSCursor in mysqlclient and pymysql does, it would be least surprising to get as close to that behavior right now. if someone wanted to use multiple server side cursors simultaneously they'd need to use the mariadb-connector connection directly for that in any case.

@9EOR9
Copy link
Contributor

9EOR9 commented Aug 14, 2020

Hi,

for testing purposes I forked the SQLalchemy repo and added a mariadb dialect.

Current status (testing against MariaDB Connector/Python 1.0.1 master branch from github):

backend-only:
3 failed, 2353 passed, 809 skipped in 181.47s (0:03:01)

running entire test suite:
8 failed, 11403 passed, 1089 skipped in 1262.79s (0:21:02)

1 failing test affects MariaDB Server (MDEV-23481).

@zzzeek
Copy link
Member

zzzeek commented Aug 14, 2020

that's very good! I am revising the system by which our CI can name drivers to be tested so we will be able to add this connector to CI.

@9EOR9
Copy link
Contributor

9EOR9 commented Aug 14, 2020

Would anyone be able to help me to fix the remaining tests? For --backend-only these are

FAILED test/engine/test_execute.py::ExecuteTest_mysql+mariadb_10_5_4_MariaDB::test_stmt_exception_pickleable_plus_dbapi
FAILED test/dialect/test_suite.py::TimeTest_mysql+mariadb_10_5_4_MariaDB::test_round_trip
FAILED test/dialect/mysql/test_types.py::TypeRoundTripTest_mysql+mariadb_10_5_4_MariaDB::test_time_roundtrip

@zzzeek
Copy link
Member

zzzeek commented Aug 14, 2020

absolutely, if you can send a PR I can integrate it into CI and fix those up. the "pickleable" thing is probably not needed, need to look at what that is.

@9EOR9
Copy link
Contributor

9EOR9 commented Aug 15, 2020

Done!

@gordthompson
Copy link
Member

I notice that the PyPI package name for "MariaDB Connector/Python" is simply "mariadb", suggesting that the URI prefix would be mariadb+mariadb://. That is, the "dialect" and the "driver" would have the same name.

I don't think that would be a problem technically, but some folks might find it a bit confusing. Perhaps make "mariadb" the default driver for the dialect so people could just use mariadb:// ...?

@gordthompson gordthompson linked a pull request Aug 17, 2020 that will close this issue
@sqla-tester
Copy link
Collaborator

Georg Richter has proposed a fix for this issue in the master branch:

MariaDB dialect implementation https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/2166

@gordthompson gordthompson self-assigned this Aug 20, 2020
@gordthompson gordthompson modified the milestones: 1.x.xx, 1.4 Aug 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
external driver issues the issue involves a misbehavior on the part of the DBAPI itself, probably not SQLAlchemy feature mariadb mysql PRs (with tests!) welcome a fix or feature which is appropriate to be implemented by volunteers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants