Create RedshiftDialectMixin class. Add Psycopg2CFFIRedshiftDialect #231
jklukas merged 21 commits into sqlalchemy-redshift:main
jklukas
left a comment
From a first read-through, this looks pretty comprehensive. I'd like to see some documentation about the existence of the CFFI variant, and I left some comments about naming. I wish I were better acquainted with modern sqlalchemy conventions to make stronger suggestions on naming.
Perhaps @zzzeek would be willing to do a quick drive-by about how the class naming and registration looks here.
registry.register("redshift", "sqlalchemy_redshift.dialect", "RedshiftDialect")
registry.register(
    "redshift.psycopg2", "sqlalchemy_redshift.dialect", "RedshiftDialect"
Do we have any idea whether removing RedshiftDialect could affect existing user code? It's not clear to me where this name would get referenced.
For users directly using the RedshiftDialect class, this would be a breaking change. Searching GitHub, I found a number of projects that reference this class directly and would be impacted (see here). So we probably want to take a different approach here.
One option to avoid a breaking change could be to re-add RedshiftDialect and have it inherit from Psycopg2RedshiftDialect. Or we simply rename Psycopg2RedshiftDialect -> RedshiftDialect.
I do like modernizing the names here, so I think it's good to add some shim for compatibility rather than staying stuck with the old name.
I think we can get away with something like this in dialect.py:
# Add RedshiftDialect synonym for backwards compatibility.
RedshiftDialect = RedshiftDialect_psycopg2
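To make the effect of that synonym concrete, here is a minimal sketch. It assumes a stand-in `RedshiftDialect_psycopg2` class rather than the real dialect, and `LegacyUserDialect` is a hypothetical example of downstream code; neither is taken from the PR.

```python
class RedshiftDialect_psycopg2:
    """Hypothetical stand-in for the renamed psycopg2 dialect class."""

    driver = "psycopg2"


# Add RedshiftDialect synonym for backwards compatibility.
# This is a plain alias: both names refer to the same class object.
RedshiftDialect = RedshiftDialect_psycopg2


class LegacyUserDialect(RedshiftDialect):
    """Hypothetical downstream code that still subclasses the old name."""


legacy = LegacyUserDialect()
print(RedshiftDialect is RedshiftDialect_psycopg2)
print(isinstance(legacy, RedshiftDialect_psycopg2))
```

One consequence of an alias, as opposed to a subclass shim, is that `RedshiftDialect.__name__` reports the new name, so code that inspects the class name would see `RedshiftDialect_psycopg2`.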
sqlalchemy_redshift/dialect.py
Outdated
    return all_constraints


class PsycopgRedshiftDialectMixin(RedshiftDialectMixin):
- class PsycopgRedshiftDialectMixin(RedshiftDialectMixin):
+ class Psycopg2RedshiftDialectMixin(RedshiftDialectMixin):
I assume that the missing "2" here is a typo, but let me know if there's additional nuance here.
good catch -- will fix this
sqlalchemy_redshift/dialect.py
Outdated
class PsycopgRedshiftDialectMixin(RedshiftDialectMixin):
    """
    Define Psycopg specific behavior.
-     Define Psycopg specific behavior.
+     Define behavior specific to ``psycopg2``.
sqlalchemy_redshift/dialect.py
Outdated
from sqlalchemy.dialects.postgresql.psycopg2 import PGDialect_psycopg2
from sqlalchemy.dialects.postgresql import (
    psycopg2, psycopg2cffi
Are PGDialect_psycopg2 and psycopg2 synonyms? Do we know why the longer name exists?
Are PGDialect_psycopg2 and psycopg2 synonyms?
From the SQLAlchemy source (PGDialect_psycopg2, sqlalchemy.dialects.postgresql.psycopg2.dialect, and sqlalchemy.dialects.postgresql.__init__.py), I believe so.
Do we know why the longer name exists?
The name still exists. There is an equivalent name for PGDialect_psycopg2cffi. I can modify this import to be the following if it's preferred:
from sqlalchemy.dialects.postgresql.psycopg2 import PGDialect_psycopg2
from sqlalchemy.dialects.postgresql.psycopg2cffi import PGDialect_psycopg2cffi
I see now that you inherit from psycopg2.dialect later on, and indeed that's a synonym of PGDialect_psycopg2.
I would prefer, though, that we use the class names. It feels more natural for expressing class inheritance. And there will be some more obvious symmetry, too, with our own class names assuming we update to use RedshiftDialect_psycopg2, etc.
sqlalchemy_redshift/dialect.py
Outdated
    return cargs, default_args


class Psycopg2RedshiftDialect(
I'd like to follow existing naming conventions in SQLAlchemy where possible. From the postgresql dialect code, it looks like this would be called RedshiftDialect_psycopg2. Does that seem like a reasonable choice here?
Yes, sounds good. I will make this change
I haven't looked at everything, but what stands out to me is the test suite refactoring. Not sure how feasible this is, but assuming redshift is using SQLAlchemy's test harness, we don't generally have dialect names hardcoded in our test suite, as this would not scale, and the commandline runner can automatically run the tests against any number of dialects. Instead of targeting "dialect = XDialect()" everywhere, you instead look at the global "sqlalchemy.testing.config.db.dialect" for the current dialect.
Directives associated with test classes like our commandline to run the tests for postgresql looks like: which will then run a pytest like this: the suite will generate URLs given each driver and run the test suite against all the above DBAPIs.
Just something to think about, as sqlalchemy-redshift is definitely going to want to support other DBAPIs, at least one of asyncpg or psycopg3 for async support, for example.
…shiftDialectMixin
…2, Psycopg2CFFIRedshiftDialect -> RedshiftDialect_psycopg2cffi
…g class inheritance
thank you both for the feedback!
As for documentation regarding the CFFI variant, from what I saw in the sqlalchemy source, PGDialect_psycopg2cffi seems very similar to PGDialect_psycopg2, save for some methods having logic around extras/extensions and the package's versioning.
It looks like @zzzeek reviewed a PR associated with PGDialect_psycopg2cffi, sqlalchemy #3052. I see some mentions of differences in unicode bind parameter names and floating point values, but I am far from knowledgeable on this subject and this PR is from quite a few years ago, so things have probably changed.
Regarding the test suite, I don't believe sqlalchemy-redshift uses the sqlalchemy test harness at this point in time, but after taking a quick look I agree--it seems like a much more scalable (and clean) approach going forward. This could be worth looking into more in the future.
Thanks for all the updates, @Brooke-white. I have it on my to-do list to give this a full review, but it probably won't happen until Friday.
Even more things came up last week, so apologies that this will need to be further delayed.
Thanks for the update, @jklukas. Do you have an ETA for when you'll be able to give a full review?
jklukas
left a comment
Changes look good, including the new names and the shims for compatibility with RedshiftDialect. I've kicked off integration tests and will merge assuming those all come back clean.
Builds upon graingert's work in #100 to refactor the dialect code to use a RedshiftDialectMixin class for defining driver-specific dialects.
Additionally, the PsycopgRedshiftDialectMixin class is defined and used as a base class for psycopg-flavored dialects. It holds the implementation of the create_connect_args() method, which previously lived in the RedshiftDialect class. My thoughts here are that this class could also be utilized in supporting a psycopg3 dialect in the future.
While graingert's work in #100 added support for additional dialects (pg8000, pypostgresql, zxjdbc), I believe supporting these drivers would require some additional effort. Support for these drivers is not included in this PR.
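The layering described above can be sketched roughly as follows, using the class names settled on in review. The `PGDialect_*` classes here are simplified stand-ins, not SQLAlchemy's real classes; URLs are modeled as plain dicts, and `example_option` is a made-up default used only for illustration.

```python
class PGDialect_psycopg2:
    """Simplified stand-in for SQLAlchemy's class of the same name."""

    name = "postgresql"
    driver = "psycopg2"

    def create_connect_args(self, url):
        # Placeholder behavior: pass the URL's fields through unchanged.
        return [], dict(url)


class PGDialect_psycopg2cffi(PGDialect_psycopg2):
    """Simplified stand-in for the CFFI variant."""

    driver = "psycopg2cffi"


class RedshiftDialectMixin:
    """Driver-independent Redshift behavior would live here."""

    name = "redshift"


class Psycopg2RedshiftDialectMixin(RedshiftDialectMixin):
    """Define behavior specific to psycopg2-flavored drivers."""

    def create_connect_args(self, url):
        # Delegate to the driver dialect, then layer defaults on top.
        cargs, cparams = super().create_connect_args(url)
        cparams.setdefault("example_option", "default")  # hypothetical default
        return cargs, cparams


# The mixin is listed first so its attributes and methods win in the MRO.
class RedshiftDialect_psycopg2(Psycopg2RedshiftDialectMixin, PGDialect_psycopg2):
    pass


class RedshiftDialect_psycopg2cffi(
    Psycopg2RedshiftDialectMixin, PGDialect_psycopg2cffi
):
    pass


cargs, cparams = RedshiftDialect_psycopg2().create_connect_args({"host": "example"})
print(cargs, cparams)
```

Because the shared logic sits in the mixin, adding another psycopg-style driver in this sketch only requires one more two-base class.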
Tests are parameterized using pytest.fixture to run on each defined dialect (i.e. redshift, redshift+psycopg2, redshift+psycopg2cffi). Unit tests utilize a parameterized "stub" Redshift dialect pytest.fixture to avoid needing an Amazon Redshift cluster. Unsure if we want to run all tests using both redshift and redshift+psycopg2, as they are using the same dialect, so I'll leave this up to the reviewers :)
All tests (including ones requiring a Redshift cluster) have been run with nominal results.
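The idea of running one test body against every dialect name, with a stub in place of a real cluster, can be illustrated with a dependency-free sketch. Note the swap: the PR uses parameterized pytest fixtures, whereas this example uses the standard library's `unittest` with `subTest`. `StubDialect` and its `quote` method are hypothetical and not part of sqlalchemy-redshift.

```python
import unittest


class StubDialect:
    """Hypothetical stub; stands in for a dialect that needs no cluster."""

    def __init__(self, driver):
        self.driver = driver

    def quote(self, identifier):
        # Toy behavior so the test has something to check.
        return '"{}"'.format(identifier)


# URL-style names from the description, mapped to the driver each resolves to.
# "redshift" and "redshift+psycopg2" share the same dialect.
DIALECT_DRIVERS = {
    "redshift": "psycopg2",
    "redshift+psycopg2": "psycopg2",
    "redshift+psycopg2cffi": "psycopg2cffi",
}


class QuotingTest(unittest.TestCase):
    def test_quote_on_every_dialect(self):
        # Each dialect name gets its own sub-test, reported separately on failure.
        for name, driver in DIALECT_DRIVERS.items():
            with self.subTest(dialect=name):
                dialect = StubDialect(driver)
                self.assertEqual(dialect.quote("col"), '"col"')


def run_suite():
    """Load and run QuotingTest, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(QuotingTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` executes the single test method once, with three sub-tests inside it.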
Todos