Allow asynchronous queries #263

sametmax · 2013-11-28T14:29:52Z

Hard one, as peewee has been built on top of synchronous DB drivers up until now, but with Python 3.4 coming and shipping with asyncio + yield from, this could be interesting for Python 3 users.

coleifer · 2013-11-29T04:42:36Z

This is an interesting request, but it's so big I'm not sure how to address it. Since I intend to keep compatibility with 2.6 for quite some time, I think it's not really possible. Also, my experience with "async" in python has been limited to gevent and I don't feel very comfortable with the new APIs (yield from, Task, coroutine, etc).

Do you have any more thoughts or information to add?

sametmax · 2013-11-29T10:56:45Z

I don't, and I'm not expecting peewee to change any time soon. I am
very aware that these things are hard and take time, therefore I'm
opening this ticket now so the community can start the process slowly
but early.

I am not yet comfortable with asyncio myself, but since I'm forced to
start thinking about it in my own work, if any idea comes to mind, I'll
come back here with a proposal.

coleifer · 2013-11-29T16:20:15Z

Thanks for the message @sametmax. Based on my understanding of the new asyncio library, I'm going to go out on a limb here and say that I don't think peewee will be adding support.

A new "asyncio-aware" driver would need to be written for each database.
All code that connects to the db and makes queries in peewee would need to be rewritten, i.e. yield from db.connect(...), yield from db.execute() and yield from cursor.fetchXXX().
Any application using peewee would need to be rewritten where it makes queries: yield from Model.select().where(...)

Have a look at the examples (https://code.google.com/p/tulip/source/browse/examples/fetch1.py) to see how the structure of the application changes when you start using the yield from syntax. Using callbacks is no better, IMO.
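
To illustrate the point about call sites changing, here is a minimal sketch using a made-up in-memory driver. FakeAsyncDriver and its methods are hypothetical and are not part of peewee or any real driver. The sketch uses the async/await syntax that later replaced yield from-based coroutines; the structural effect is the same: every function that touches the database becomes a coroutine, and so does every caller above it.

```python
import asyncio


class FakeAsyncDriver:
    """Hypothetical stand-in for an asyncio-aware database driver."""

    def __init__(self):
        self.connected = False

    async def connect(self):
        await asyncio.sleep(0)  # pretend to wait on the network
        self.connected = True

    async def execute(self, sql):
        if not self.connected:
            raise RuntimeError("not connected")
        await asyncio.sleep(0)  # pretend to wait for the server
        return [("row for", sql)]


async def fetch_users(db):
    # The query itself must be awaited...
    return await db.execute("SELECT * FROM users")


async def handler(db):
    # ...and so must every function that calls it, all the way up.
    await db.connect()
    rows = await fetch_users(db)
    return len(rows)


result = asyncio.run(handler(FakeAsyncDriver()))
```

A synchronous version would have the same shape with async and await removed, which is why supporting both styles tends to mean maintaining two copies of every call path.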

sametmax · 2013-12-01T08:54:12Z

I understand. The best solution would be to have the common code, such as
the query builder, in one module (well, wrapper, since peewee is a
one-file lib), the sync API code in another module, and the async API code
in a third module.

And allow something like:

from peewee import blabla # sync
from peewee.async import blabla # async

Or make peewee decoupled enough that it's possible to build an async
lib on top of it, along the lines of async_peewee.

But I get it, it's a lot of work, and when you have already coded the
ORM, where would you find the time to do this? I'm not even offering to
do it because I'm well aware of how huge it is.

Anyway, thanks for answering.

csytan · 2014-04-04T02:34:00Z

Just to add to this discussion, I've had positive experiences with the way Guido's NDB handles async operations.

If in the future you are ever in need of any ideas, here are the docs:
https://developers.google.com/appengine/docs/python/ndb/async

soasme · 2014-04-04T02:44:20Z

We can use Trollius for Py2/3 when needed. It is much like asyncio, but has a friendly syntax for Py2 (yield From(do_something()), raise Return(value)).

coleifer · 2014-04-04T14:54:31Z

Thanks a bunch for the links - I will read up on them.

cpbotha · 2014-05-13T13:29:42Z

I've just tried a flask+peewee app with uwsgi+gevent+psycogreen.gevent.patch_psycopg -- in theory, this should patch psycopg so that calling code (such as peewee) can use it as if it were still blocking, see https://bitbucket.org/dvarrazzo/psycogreen

When I try this, peewee gives me:

peewee.ProgrammingError: execute cannot be used while an asynchronous query is underway

Am I expecting too much?

coleifer · 2014-05-13T14:02:52Z

Strange...I've used psycopg2 with gevent and not had any issues. The code I used was similar to the monkeypatch you linked up @cpbotha . One thing you might check is that the gevent monkeypatch needs to be the first thing that happens.

So the entry-point to your application would look like this:

from gevent import monkey
monkey.patch_all()
from psycopg2_green_monkeypatch import whatever
whatever()

# here begins your actual code...

cpbotha · 2014-05-13T14:33:31Z

Thanks for helping me with this!

I already have in my wsgi.py (entry point for uwsgi):

import gevent
import gevent.monkey
gevent.monkey.patch_all()

import psycogreen.gevent
psycogreen.gevent.patch_psycopg()

import main
main.init()
from cnids.app import fapp

With "normal" DB access (RESTful via browser), I see no issues. However, I do see the ProgrammingError exception when I do ab -c 3 -n 10 URL (3 concurrent requesters)

Anything else I could look at? (this is with peewee 2.2.3)

coleifer · 2014-05-13T14:35:08Z

Strange... I'm not sure what might be happening.

cpbotha · 2014-05-13T15:02:08Z

Seems it has happened before: https://bitbucket.org/dvarrazzo/psycogreen-hg/issue/1/databaseerror-execute-used-with

The explanation offered is that the issue occurs when two queries are done using two different cursors on the same database connection. Does that make sense?

The reporter of that bug then wrote a blog post with a solution http://www.manasupo.com/2012/03/geventpsycopg2-execute-cannot-be-used.html where they simply do the psycopg2 monkey-patching before the fork.

I don't understand why that fixed the problem in their case. In my case, the monkey patching is done before forking, if I may judge by the uwsgi log file (my "MONKEY PATCHING YEAH!" output appears once, before the three worker processes report for duty):

mapped 4368256 bytes (4265 KB) for 300 cores
*** Operational MODE: preforking+async ***
MONKEY PATCHING YEAH!
Database migration not required.
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x10035d0 pid: 30745 (default app)
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 30745)
spawned uWSGI worker 1 (pid: 30751, cores: 100)
spawned uWSGI worker 2 (pid: 30752, cores: 100)
spawned uWSGI worker 3 (pid: 30753, cores: 100)
*** running gevent loop engine [addr:0x485620] ***

I'm putting this here as a log for future travellers. Any tips would be welcome of course!

cpbotha · 2014-05-13T15:21:20Z

In Momoko (Tornado wrapping of psycopg2), they had to build in more explicit handling of busy database connections to work around this issue: Tsumanga-Studios/momoko@e4752c9

This was supposed to be a short experiment to benchmark multi-process uwsgi+flask-peewee against multi-process+async+flask-peewee. I think I should let it go, as they say. :)

Thanks in any case!

coleifer · 2014-05-13T15:39:52Z

The explanation offered is that the issue occurs when two queries are done using two different cursors on the same database connection. Does that make sense?

Are you using threadlocals=True with your database connection? e.g.

db = PostgresqlDatabase('foo', threadlocals=True)

cpbotha · 2014-05-13T15:59:07Z

If I could fly over there and buy you a beer I would!!

You even documented it: https://github.com/coleifer/flask-peewee/blob/master/docs/gevent.rst

I don't understand why my searching didn't take me there, but I'm happy that you've solved it! (In my app.py, I added a 'threadlocals' key to the DATABASE configuration dictionary.)

What impact does that setting have in a non-greened environment?

Thanks again,
Charl

coleifer · 2014-05-13T16:00:19Z

In a non-greened environment if you were using a multi-threaded WSGI server, then your connections would be opened per-thread.
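
As a rough illustration of what "opened per-thread" means, here is a minimal sketch built on the standard library's threading.local. It is not peewee's actual implementation; PerThreadConnections is a hypothetical helper, and a bare object() stands in for a real database connection. Under gevent's monkey patching, thread-local storage is, as far as I understand, made greenlet-local, which would explain why the same setting keeps concurrent greenlets from sharing one psycopg2 connection.

```python
import threading


class PerThreadConnections:
    """Hands each thread its own connection object (illustrative only)."""

    def __init__(self, connect):
        self._connect = connect          # factory that opens a new connection
        self._local = threading.local()  # separate attribute storage per thread

    def get(self):
        conn = getattr(self._local, "conn", None)
        if conn is None:
            conn = self._local.conn = self._connect()
        return conn


# object() stands in for a real connection in this sketch.
pool = PerThreadConnections(connect=object)
seen = {}


def worker(name):
    # Repeated calls within one thread return the same connection.
    first, second = pool.get(), pool.get()
    seen[name] = (first, first is second)


threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

same_within_thread = seen["a"][1] and seen["b"][1]
distinct_across_threads = seen["a"][0] is not seen["b"][0]
```

Each thread reuses its own connection and never sees another thread's, so two cursors from different threads cannot end up on the same connection.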

rudyryk · 2014-09-26T08:13:13Z

I think we just need a separate package, let's say 'aiopeewee', providing asyncio interface. We're using peewee with Tornado+asyncio, so I think we'll start porting it shortly. Meanwhile we use 'run_in_executor' for performing slow requests asynchronously.
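
A minimal sketch of the run_in_executor approach mentioned above, assuming nothing about the poster's actual code: blocking_query is a hypothetical stand-in for any synchronous database call, and sqlite3 is used only so the example is self-contained.

```python
import asyncio
import sqlite3


def blocking_query():
    # Ordinary synchronous DB code. It opens its own connection so that
    # the connection is only ever used inside the worker thread.
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute("SELECT 1 + 1").fetchone()[0]
    finally:
        conn.close()


async def main():
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, blocking_query)


value = asyncio.run(main())
```

The event loop stays free while the query runs in a worker thread, but the query itself is still blocking, so concurrency is bounded by the size of the thread pool.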

rudyryk · 2014-09-26T09:18:20Z

And I'm not sure yet whether we actually need a port, or just a couple of wrappers and asyncio-powered database backend classes.

sametmax · 2014-09-26T09:37:01Z

You will need more than that actually because the nature of the API
itself is synchronous.

E.g. when you access an attribute in peewee, it can fire a new query:
product.shop.name will make a query to get the shop by default.

You'll probably need to return a promise for everything that should
return an object or queryset.

Now if you do that in a template:

Things get funny, because most template engines don't have a way to handle
promises (no callback, no yield, etc).

Plus, there are not just promises, but also deferreds and futures,
depending on the framework. So you'll need a different backend for the
async event loop AND for the result wrapper.
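
To make the attribute-access problem concrete, here is a minimal sketch of a lazily loaded relation. It is not peewee's code: LazyForeignKey, load_shop, and the queries list are all hypothetical, and the "query" is simulated by appending a string to a list.

```python
class LazyForeignKey:
    """Descriptor that loads the related object on first access (hypothetical)."""

    def __init__(self, loader):
        self.loader = loader  # plain function that runs the "query"

    def __set_name__(self, owner, name):
        self.cache_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if not hasattr(instance, self.cache_name):
            # A blocking query happens here, hidden behind attribute access.
            setattr(instance, self.cache_name, self.loader(instance))
        return getattr(instance, self.cache_name)


queries = []  # records every simulated query


class Shop:
    def __init__(self, name):
        self.name = name


def load_shop(product):
    queries.append("SELECT ... FROM shop WHERE id = %d" % product.shop_id)
    return Shop("corner store")


class Product:
    shop = LazyForeignKey(load_shop)

    def __init__(self, shop_id):
        self.shop_id = shop_id


product = Product(shop_id=1)
name = product.shop.name        # looks like a plain attribute read, runs a query
name_again = product.shop.name  # cached, so no second query
```

Python has no way to await inside a plain attribute read, so in an async design product.shop would have to return something awaitable, or the related object would have to be fetched explicitly beforehand.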

It's a lot of work, and it's hard. But if you do it, peewee will actually
be the ONLY Python ORM dealing with these issues properly. And hence, I
guarantee it will be used much, much more. Technically, you'll be on the
front page of the Python subreddit and Hacker News the day of the release.
Everybody is waiting for something like this, because right now,
everybody uses hacks (defer to threads, MongoDB Motor, gevent monkey
patching...) and big solutions like the Django ORM or SQLAlchemy have stated
officially that they don't want to do it in the current context.

Unfortunately, I don't know any way to do async with most DBs without
using a compiled driver for it. The sqlite3 stdlib driver is synchronous, if
I recall, so you won't have the "wow" out-of-the-box experience you
should have.

I would make it something like peewee.async, instead of a separate lib. It
would make adoption much easier. But maybe that's just me.

rudyryk · 2014-09-28T13:57:51Z

Hi everyone :) I've just published a kind of working proto: https://github.com/05bit/peewee-async

I think basic async support may also be useful; we can deal with related objects by sending explicit prefetch queries. And yes, async queries are difficult to support in templates (without rewriting the template engine), so I think it's more suitable for API services where we generally just need to serialise to JSON.

rudyryk · 2014-10-11T20:01:16Z

Just published alpha v0.0.2 on PyPI and here are the docs: https://python-aiopeewee.readthedocs.org The interface seems to work and is simple, and I think it's now rather close to a stable version. But the internals don't really shine, so I've opened an issue to discuss better integration with peewee: 05bit/peewee-async#1

Serkan-devel · 2019-08-01T15:50:12Z

Has anything happened since then?

coleifer · 2019-08-01T15:52:39Z

Use gevent

coleifer closed this as completed Dec 2, 2013

coleifer reopened this Apr 4, 2014

coleifer closed this as completed Apr 5, 2014

dalf mentioned this issue Jan 28, 2015

allow deactivation of engine only for specific category searx/searx#205

Closed

coleifer mentioned this issue May 26, 2020

any plan to support async/await #2189

Closed

Repository owner deleted a comment from lucasgadams Nov 22, 2022
