Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support SQLAlchemy dialect with psycopg2 #100

Closed
amotl opened this issue Dec 4, 2020 · 7 comments
Closed

Support SQLAlchemy dialect with psycopg2 #100

amotl opened this issue Dec 4, 2020 · 7 comments
Labels
question Further information is requested

Comments

@amotl
Copy link
Member

amotl commented Dec 4, 2020

Hi there,

coming from orchestracities/ngsi-timeseries-api#407 (comment), I am asking myself whether it would be possible to use the CrateDB SQLAlchemy dialect together with the Python DB API 2.0 compatible psycopg2 driver.

Maybe @mfussenegger, @chaudum or @autophagy can add a comment about this?

With kind regards,
Andreas.

@amotl amotl added the question Further information is requested label Dec 4, 2020
@mfussenegger
Copy link
Member

mfussenegger commented Dec 4, 2020

I think the sqlalchemy dialect doesn't have too many concrete dependencies on crate-python.
Could try monkey-patching the driver name.

One exception from the client is used: https://github.com/crate/crate-python/blob/f2617dd789f0eb70b042ac77be4b7c8bd936a7c6/src/crate/client/sqlalchemy/dialect.py#L34

but I don't think that's a real issue.

We could also consider to de-couple it completely, but not sure what's the benefit of using psycopg2 over the crate-python http based client?

We also have to keep in mind that psycopg2 is LGPL licensed, which is afaik not compatible with Apache licensed projects. So we wouldn't be able to ship anything with psycopg2 included.

@amotl
Copy link
Member Author

amotl commented Dec 4, 2020

Hi Mathias,

thanks for looking into this. I was just trying to figure out if there are any other intrinsic couplings which would prevent this in general.

I would not de-couple it but instead provide an option to invoke it like outlined at [1].

# default
engine = create_engine('crate://localhost:4200')

# psycopg2
engine = create_engine('crate+psycopg2://localhost:5432')

Not sure what's the benefit of using psycopg2 over the crate-python HTTP-based client?

Wouldn't the communication be a bit more efficient regarding binary serialization on the wire and without the overhead of HTTP? Also, things like how to properly configure a shared connection pool (#90) would not need any special attention.

We also have to keep in mind that psycopg2 is LGPL licensed.

Thanks for bringing that up. SQLAlchemy is MIT licensed and references psycopg2 within an optional "extra" dependency [2,3]. So I believe it would be fine doing it in the same way?

With kind regards,
Andreas.

[1] https://docs.sqlalchemy.org/en/13/core/engines.html#postgresql
[2] https://github.com/sqlalchemy/sqlalchemy/blob/rel_1_4_0b1/setup.cfg#L60
[3] https://github.com/sqlalchemy/sqlalchemy/blob/rel_1_4_0b1/setup.cfg#L65

@mfussenegger
Copy link
Member

mfussenegger commented Dec 4, 2020

Wouldn't the communication be a bit more efficient regarding binary serialization on the wire and without the overhead of HTTP?

The PostgreSQL wire communication supports text and binary serialization. I think psycopg2 is using the text serialization. I didn't verify that, but I think it is one of the reasons that asyncpg is a lot faster - because they use binary serialization. And before they merged MagicStack/asyncpg#295 our crate-python executemany was significantly faster for inserts (orders of magnitude) than asyncpg. I haven't tested again since they merged the batch improvement.

SQLAlchemy is MIT licensed and references psycopg2 within an optional "extra" dependency [2,3]. So I believe it would be fine doing it in the same way?

Yes it works as optional dependency. I think the license only prevents us from providing a bundle that would include psycopg2 code out of the box.

@amotl
Copy link
Member Author

amotl commented Mar 17, 2021

I just wanted to add that both @proddata and @SStorm signaled interest in having this feature available.

@mfussenegger
Copy link
Member

I just wanted to add that both @proddata and @SStorm signaled interest in having this feature available.

Could you elaborate on the background? What's the advantage of using psycopg2?

@mfussenegger
Copy link
Member

SQLAlchemy 1.4 also supports asyncpg. I think that may be more worthwhile to support than psycopg2 as it has performance advantages.

@amotl
Copy link
Member Author

amotl commented May 18, 2022

Hi again,

I fully agree that the focus should be on asyncpg instead. I will add a corresponding test case to crate/crate-python#391 and close this issue. Feel free to reopen or create another one as soon as psycopg3 is ready and qualifies further evaluation.

With kind regards,
Andreas.

@amotl amotl closed this as completed May 18, 2022
@amotl amotl transferred this issue from crate/crate-python Jun 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants