Skip to content
This repository was archived by the owner on Aug 19, 2025. It is now read-only.

Aiopg#124

Merged
gvbgduh merged 9 commits intoencode:masterfrom
gvbgduh:aiopg
Mar 16, 2020
Merged

Aiopg#124
gvbgduh merged 9 commits intoencode:masterfrom
gvbgduh:aiopg

Conversation

@gvbgduh
Copy link
Copy Markdown
Contributor

@gvbgduh gvbgduh commented Jul 10, 2019

No description provided.

Copy link
Copy Markdown
Contributor

@lovelydinosaur lovelydinosaur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤩 Let me know at whatever point you need my input on this. Looking good!

@gvbgduh
Copy link
Copy Markdown
Contributor Author

gvbgduh commented Jul 10, 2019

Hi @tomchristie, I was finally able to finish the first turn around and there're some issues appeared.

  1. postgresql+aiopg doesn't work for sqlalchemy as it doesn't recognise that driver (yet?) https://docs.sqlalchemy.org/en/13/core/engines.html#postgresql
    so, I took the postgresql+psycopg2 one for start.
    I have in mind a couple more ideas how to approach:
  • transform +aiopg to +psycopg in the __repr or some cast, but it doesn't feel right
  • pass/specify a particular backed to the Database.__init__, it might allow more flexibility later, but it's not a small architectural change and I'm not fully sure how it's aligned.

Also there might be some confusion as asyncpg is implicit choice for Postgres (even though I believe a good one :)).

Update: All in all, we just need to distinguish asyncpg and aiopg at the database class and supply a url sqlalchemy can recognise (for alembic or some other things).
Or open as issue in sqlalchemy to add an alias for aiopg, but in fact it uses psycopg2 behind the scenes utilising the async api.

  1. A bit different handling for JSON fields values, but I need to triple check.

  2. lastrowid seems to be not exactly the record id, but a local op id - https://aiopg.readthedocs.io/en/stable/core.html#aiopg.Cursor.lastrowid. But I'd rather triple check it as well.

  3. A bit different approach for transactions, basically only raw queries can allow this.

In overall, it's pretty close to mysql backend, but there're some differences in details.
I also spotted a couple of places for optional refactoring - mostly moving some stuff to conftest for unittests, but there're a couple of methods that are some across all backend implementations.

Yeah, it would great to have any feedback, recommendations, complains.

@lovelydinosaur
Copy link
Copy Markdown
Contributor

I suppose sqlalchemy should only see it as postgresql. We're not using their drivers, so that's the best information we can give it.

@gvbgduh gvbgduh marked this pull request as ready for review July 12, 2019 17:00
@gvbgduh
Copy link
Copy Markdown
Contributor Author

gvbgduh commented Jul 12, 2019

Yes, I see, it clarifies and makes it easier.
Ok, now I would really appreciate a review if it could be possible.

@gvbgduh gvbgduh mentioned this pull request Sep 2, 2019
@mpyatishev
Copy link
Copy Markdown

@tomchristie, @gvbgduh when do you plan to merge this PR?

@eien
Copy link
Copy Markdown

eien commented Sep 20, 2019

Yup, it'd be great to get this in!

Comment thread tests/test_connection_options.py Outdated


def test_aiopg_ssl():
backend = AiopgBackend("postgres+aiopg://localhost/database?ssl=true")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we're expecting postgres+aiopg in some places and postgresql+aiopg in others.
I'd be okay with us accepting either or both.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, yeah, it's a typo, but we only expect postgresql+aiopg as of now, at least there's only this mapping. it's not triggered in these tests as don't make any use of it. I'll update urls to avoid confusion.

Comment thread tests/test_databases.py
# Apparently for `aiopg` it's OID that will always 0 in this case
# As it's only one action within this cursor life cycle
# It's recommended to use the `RETURNING` clause
# For obtaining the record id
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good example of where we ought to be returning a more structured type from .execute - that way we could have different drivers either support or not support result.row_id.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change our query here to make it consistently return the pk across all our implementations maybe?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sounds reasonable indeed.
There're two ways though as I can see:

  • provide pure driver experience with minimal overhead, hence it'll vary from driver to driver, but will keep it as simple as possible if it can matter, or
  • provide unified API and implement the missing bits providing the missing functionality, but allocating the additional overhead

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, just 25 secs later, ok, let's follow the second path, all in all, there's always a way to jump to the raw driver and get the full manual control over it.

Copy link
Copy Markdown
Contributor Author

@gvbgduh gvbgduh Oct 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomchristie I'm thinking quite hard about it. There's a function in postgres - lastval() that can be executed to retrieve the value, but it's not reliable as there're circumstances where it can return the wrong value. The returning clause is the best way to do it, but we can't assume how the PK column is called and returning * won't be consistent with other back-ends.

Comment thread tests/test_databases.py Outdated
values = {"data": data}

if database.url.scheme == "postgresql+aiopg":
await database.execute(query, {"data": json.dumps(data)})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we handle this automatically for our users?
Would that be a sensible thing to do here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure it's possible in all cases, as I would need to know that the column of a type json, so it'll require either a lookup to the sqlalchemy's metadata or to the DB schema. But I'll have a look at how easy it can be.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it'll require either a lookup to the sqlalchemy's metadata or to the DB schema

If it means a lookup then let's not do it. I'm just not clear why JSONField isn't already handling the serialization for us in this case. Might need some more looking into?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I'm dealing with it now, there're several facets to approach it.
it can work canonically, but the field has to be sqlalchemy.dialects.postgresql.JSON, but aiopg returns back (over select) the decoded value and it breaks the RowProxy as it expects string.
so, i'm looking for compromise/settings to get the balance

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, it was the matter of the dialect class, I reused the one that's in use for the integration with sa part of aoipg it got back to normal.

Comment thread tests/test_databases.py
await cursor.executemany(insert_query, values)
elif database.url.scheme == "postgresql+aiopg":
cursor = await raw_connection.cursor()
# No async support for `executemany`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, in these kinds of cases we should still support the .executemany API, even if all it actually does is for value in values: ...execute... under the hood.

(We can document limitations of particular drivers, but it'd def. be a benefit if we can keep the API the same, even if there's different performance characteristics in some cases.)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I see and agree, will try my best.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it's the test for the raw driver across different backends, but aiopg has the implementation for execute_many and yeah it's with the for loop.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grand, let's just hook our backend implementation up to that then, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All should be in place! I have a unittest to test it for database.execute_many.

Copy link
Copy Markdown
Contributor Author

@gvbgduh gvbgduh Oct 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this case is the battle test of the raw driver, so it replicates backend implementations in some places.

Comment thread tests/test_integration.py Outdated
database_url = DatabaseURL(url)
if database_url.dialect == "mysql":
url = str(database_url.replace(driver="pymysql"))
elif database_url.driver == "aiopg":
Copy link
Copy Markdown
Contributor

@lovelydinosaur lovelydinosaur Sep 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's check the .scheme here instead in these conditionals.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@lovelydinosaur
Copy link
Copy Markdown
Contributor

lovelydinosaur commented Sep 30, 2019

Yes yes yes! Let's make this a priority. 🚀
Some review comments above - @gvbgduh feel free to keep nudging me until we've got this one in!

@gvbgduh
Copy link
Copy Markdown
Contributor Author

gvbgduh commented Sep 30, 2019

Awesome, thanks @tomchristie!

@gvbgduh
Copy link
Copy Markdown
Contributor Author

gvbgduh commented Oct 3, 2019

@tomchristie I think it's in good shape now. There's only one open question - lastrowid, but I'm not sure about the universal, consistent and reliable solution for that. Happy to follow up on it in the next PR if it's more reasonable.

@gvbgduh
Copy link
Copy Markdown
Contributor Author

gvbgduh commented Oct 29, 2019

hi @tomchristie! A gentle nudge here, I think it's mergeable. Happy to follow up on any appeared issues anyway.

@ghost
Copy link
Copy Markdown

ghost commented Jan 17, 2020

What would be the point of this if databases already works with asyncpg?

@gvbgduh
Copy link
Copy Markdown
Contributor Author

gvbgduh commented Feb 5, 2020

@StephenCarboni There's a bunch of different reasons, but the main point is asyncpg and aiopg are totally different, including underlying lower-level drivers, internal and external APIs, and even query syntax details.
asyncpg is written from scratch without dependencies, while aiopg is based on psycopg2 which is based on libpq and shares the same (or very similar) API with aiomysql/aiosqllite/aioodbc.
Despite personal preferences, it can also cover different migration strategies between DBs, or from sync to async, when the time is the critical factor for example.

@gvbgduh gvbgduh merged commit 7417461 into encode:master Mar 16, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants