Asynchronous Driver For Python <Tornado and Tulip> #2622
Comments
@v3ss0n Thanks for opening an issue. With the current driver you can use threads to perform asynchronous queries. Something like this should work:
A Python driver with an async API would be better. |
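As a rough illustration of the thread-based approach mentioned above, the sketch below hands a blocking call to a thread pool so the event loop stays free to serve other work. `blocking_query` is a hypothetical stand-in for a synchronous driver call and is not part of any RethinkDB API; asyncio is used here only because it ships with the standard library.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-in for a blocking driver call; not a real RethinkDB function.
def blocking_query(table):
    time.sleep(0.05)  # simulate network latency
    return [{"table": table, "id": 1}]


async def query_in_thread(executor, table):
    loop = asyncio.get_running_loop()
    # The blocking call runs on a worker thread; the loop keeps serving other tasks.
    return await loop.run_in_executor(executor, blocking_query, table)


async def main():
    with ThreadPoolExecutor(max_workers=4) as executor:
        # Both queries run concurrently; gather returns results in argument order.
        return await asyncio.gather(
            query_in_thread(executor, "events"),
            query_in_thread(executor, "users"),
        )


results = asyncio.run(main())
```

This keeps the driver untouched, at the cost of one thread per in-flight query, which is why a natively asynchronous API is still preferable.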
Thanks a lot for the code example, AtnNn. With Motor, the asynchronous MongoDB driver, the equivalent can be done without using threads:

    @gen.coroutine
    def tail_example():
        results = []
        collection = db.my_capped_collection
        cursor = collection.find(tailable=True, await_data=True)
        while True:
            if not cursor.alive:
                now = datetime.datetime.utcnow()
                # While collection is empty, tailable cursor dies immediately
                yield gen.Task(loop.add_timeout, datetime.timedelta(seconds=1))  # gives control back to the Tornado IOLoop
                cursor = collection.find(tailable=True, await_data=True)  # this would block with the normal PyMongo driver
            if (yield cursor.fetch_next):
                results.append(cursor.next_object())
                print results

Motor uses greenlets to address this problem without modifying a single line of MongoDB's PyMongo driver, which is a very interesting approach. The author's approach is detailed in this video and these slides, and it should apply to any blocking network I/O driver without much modification. That's what I am talking about :) http://emptysqua.re/blog/video-slides-and-code-about-async-python-and-mongodb/ It would be a very good feature for RethinkDB to have built in. I will also look into RethinkDB's Python driver when I get time. |
@v3ss0n -- thanks for chiming in. I'm moving this to backlog for now because there are much more pressing issues to take care of first, but I agree this would be valuable. As a data point, at least 5 people (maybe more) have asked me for this. I can't promise when we'll get to it, but I can say with almost complete certainty that there will be support for this eventually. |
I wrote a Twisted connector for the Python driver. It may help... https://github.com/tonich-sh/rethinkdb-twisted.git |
Thanks, looking into it! |
I am now seriously thinking about writing an async Tornado driver for RethinkDB, after finding that I like changefeeds better than MongoDB's tailable cursors. |
The current client is too entangled with socket code, whereas Twisted's protocols are an abstraction. So I took the easiest way... I kept it simple and left only the asynchronous logic. |
I tried two approaches about a month ago, threading & asyncio (Python3). Unfortunately, I don't have the time to make anything more than a hack, but I think the asyncio approach is promising seeing as it's in the standard library. Maybe this will be helpful to someone: |
@tonich-sh, so I think going with the gevent approach (the way Motor did with MongoDB) would be easy for the current client. @csytan thanks a lot Chris, I am looking into it. |
Hi, I'm just evaluating RethinkDB at this point for use in a new project. Having support for an asynchronous Tornado driver would be very compelling for me. Out-of-the-box support for a Python API that mirrors the Node/JavaScript API (using generators/coroutines) would be really great. |
We're going to consider different options (including Tornado and Python 3 asyncio) for making changefeeds work better in Python as part of #3298. This will happen very soon. |
Ideally the outcome would be to refactor the Python code to be event-loop agnostic, with the same "protocol" code reused across the event loop drivers so their feature compatibility is ensured. I feel I have to point this out because unfortunately few libraries are written this way. I don't want to see a future where I'll be installing with The Autobahn project has a great structure for this kind of thing, https://github.com/tavendo/AutobahnPython/ The Autobahn code is structured so that if/when people need to use the protocol with some other unsupported event loop, they just have to rewrite the much smaller driver code to use their event loop. |
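To make the event-loop-agnostic idea concrete, here is a minimal sketch of protocol code that does no I/O at all: it only turns messages into bytes and bytes back into messages. The class name `FramedJsonProtocol` and the framing (a 4-byte little-endian length followed by a JSON body) are assumptions made for illustration; this is not RethinkDB's actual wire format, nor how Autobahn structures its code.

```python
import json
import struct


class FramedJsonProtocol:
    """Pure protocol logic: no sockets and no event loop.

    The framing used here is a hypothetical example, not RethinkDB's wire format.
    """

    def __init__(self):
        self._buffer = b""

    def encode(self, message):
        """Serialize one message into a length-prefixed frame."""
        body = json.dumps(message).encode("utf-8")
        return struct.pack("<I", len(body)) + body

    def feed(self, data):
        """Accept raw bytes from any transport and return the complete messages."""
        self._buffer += data
        messages = []
        while len(self._buffer) >= 4:
            (length,) = struct.unpack("<I", self._buffer[:4])
            if len(self._buffer) < 4 + length:
                break  # incomplete frame: wait for more bytes
            body = self._buffer[4:4 + length]
            self._buffer = self._buffer[4 + length:]
            messages.append(json.loads(body.decode("utf-8")))
        return messages


proto = FramedJsonProtocol()
wire = proto.encode({"q": 1}) + proto.encode({"q": 2})
first = proto.feed(wire[:7])  # only part of the first frame has arrived
rest = proto.feed(wire[7:])   # the remainder completes both frames
```

With this split, a Tornado, Twisted, or asyncio driver would only need to call `feed` from its read callback and write the output of `encode` to its own transport.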
I think the cleanest way to get the driver to run on both Python 2 & 3 would be to use Tornado. Tornado coroutines are very similar to Asyncio's and are also compatible with Asyncio's event loop. |
@techdragon We'd like to avoid duplication as well. Will see what we can do. Moving this to 2.1. We'll try to get Tornado integration done soon. |
Just a note from my personal experience, Tornado is the node equivalent right now in the python world. Twisted has been around longer, but isn't as widely used in the web-app community. In terms of which async loop to support first, Tornado would be a solid choice. |
@deontologician Ok that's good to know. I have no personal preference with respect to whether to support Twisted or Tornado first. My understanding from what @larkost said is that Twisted tends to be slightly easier to install due to more widely available packages. If Tornado is easier to support than Twisted that sounds like we should support Tornado first. Otherwise we should just pick one and then follow up the other one soon (maybe we can even do both for 2.0). |
I can't comment on which one is easier to implement, just in terms of who is most likely to make use of it |
It has again been a while since I did bleeding edge Python, but if @csytan is right then it would be much less work to support Tornado-and-asyncio-etc than it would be to support only Twisted. @techdragon That's interesting! I'm going to have to look at it in more detail. Certainly a possibility. |
Ok let's do Tornado first then. |
@gchpaco & @danielmewes - If you refactor the logic to separate the drivers from the core logic the drivers rely on, then at least in theory, each driver should be simpler to build. Tornado is the current favourite among a lot of developers, and would be great, but when you refactor the logic you need to take into account that you need to maintain the existing synchronous driver. Twisted has a LOT of documentation regarding its use. I would try to refactor the current code to be broken apart like Autobahn has theirs, supporting twisted and regular synchronous python first, then adding tornado & asyncio in whichever order works best. |
Thanks a lot for moving this forward. I haven't checked in for a long time because I was busy with an async chat implementation on Tornado and Mongo. @danielmewes, what @csytan suggested is true: Tornado's coroutines are compatible with Python 3's asyncio. It gives you clean code using yield, without needing to worry about callbacks, and that's the reason web devs choose Tornado over Twisted. One stone, a flock of birds. Nice? :) |
@gchpaco -- a couple of questions:
|
Specifically, with sample code it would be great to mirror @mlucy's mini-tutorial here #3678 (comment) in Python, so we understand how to use the API in different circumstances, how error handling works, etc. |
Right now I'm trying to get it through review, but I'll try to do that. As a temporary stopgap, see test/rql_test/connections/tornado_connection.py which is an analog of connection.py. |
Yup. Note that @Tryneus is working on refactoring some of it, so there may be some large changes when that goes through. |
The refactor @gchpaco mentioned is up in review 2732, which is based off his branch and eclipses the previous review. It is available in the branch |
Thank you so much. I like the way the RethinkDB team handles projects, with the whole official workflow integrated with GitHub. That's so cool. |
Out of curiosity @gchpaco:

    # Print every row in the table.
    for future in (yield r.table('test').order_by(index='id').run(connection)):
        item = yield future
        print(item)

More specifically: how do we know in advance whether the cursor will have another result in the for loop? Does |
@danielmewes, I just tested this and it isn't the expected behavior (this is my fault due to some of the cursor changes in the refactor). Opened #3974 for this issue. |
So based on my suggestion in #3974, that loop would throw |
Having to handle the Should we add a

    cursor = yield r.table('test').order_by(index='id').run(connection)
    while (yield cursor.fetch_next):
        item = yield cursor.next()
        print(item)

|
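The `fetch_next` pattern proposed above can be sketched without any driver. In the example below, `SketchCursor` is a hypothetical class, not the RethinkDB driver's cursor, and asyncio's `await` is swapped in for Tornado's `yield`. The point is only that the loop condition itself waits for data, so the caller never has to guess whether another row exists.

```python
import asyncio

_DONE = object()  # sentinel marking the end of the result stream


class SketchCursor:
    """Hypothetical cursor used for illustration; not the driver's class."""

    def __init__(self, queue):
        self._queue = queue
        self._pending = []

    async def fetch_next(self):
        """Return True if a row is available, waiting for one if needed."""
        if not self._pending:
            item = await self._queue.get()
            if item is _DONE:
                self._queue.put_nowait(_DONE)  # keep the end marker for later calls
                return False
            self._pending.append(item)
        return True

    async def next(self):
        """Return the next row, or raise if the cursor is exhausted."""
        if not await self.fetch_next():
            raise IndexError("cursor exhausted")
        return self._pending.pop(0)


async def main():
    queue = asyncio.Queue()
    for row in ({"id": 1}, {"id": 2}, _DONE):
        queue.put_nowait(row)
    cursor = SketchCursor(queue)
    rows = []
    while await cursor.fetch_next():
        rows.append(await cursor.next())
    return rows


rows = asyncio.run(main())
```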
@danielmewes, I think that would be useful, I'll do it alongside #3974. |
👍 |
Note in Motor, you can create the cursor without a yield since it hasn't begun I/O yet, just stored the query in a MotorCursor object. Only on the first http://motor.readthedocs.org/en/stable/api/motor_cursor.html#motor.MotorCursor.fetch_next Also, I chose the method name https://docs.python.org/2/library/stdtypes.html#iterator-types |
Thanks @ajdavis . Good point about the incompatibility with the iterator interface. |
@danielmewes, it would be fine to iterate over a |
No problem now, this code works well:

    con = yield conn
    curs = yield evt.run(con)
    messages = []
    while (yield curs.fetch_next()):
        item = yield curs.next()
        messages.append(item)

But I have a few other questions. Right now insert performance is not good with async: each insert takes 200-400 ms locally, for small documents of < 2 KB each (chat messages). Take a look at this code, I may be doing something wrong:

    import logging
    import tornado.escape
    import tornado.ioloop
    import tornado.web
    import os.path
    import uuid
    import rethinkdb as r
    from tornado.concurrent import Future
    from tornado import gen
    from tornado.options import define, options, parse_command_line

    r.set_loop_type("tornado")

    define("port", default=8080, help="run on the given port", type=int)
    define("debug", default=False, help="run in debug mode")

    # With the Tornado loop type, connect() returns a future that handlers yield on.
    conn = r.connect("localhost")
    evt = r.db("rechat").table("events")

    # Making this a non-singleton is left as an exercise for the reader.
    class MainHandler(tornado.web.RequestHandler):
        @gen.coroutine
        def get(self):
            con = yield conn
            curs = yield evt.run(con)
            messages = []
            while (yield curs.fetch_next()):
                item = yield curs.next()
                messages.append(item)
            self.render("index.html", messages=messages)

    class MessageNewHandler(tornado.web.RequestHandler):
        @gen.coroutine
        def post(self):
            con = yield conn
            message = {
                "body": self.get_argument("body")
            }
            messages = (yield evt.insert(message).run(con))
            message['id'] = messages['generated_keys'][0]
            # to_basestring is necessary for Python 3's json encoder,
            # which doesn't accept byte strings.
            message["html"] = tornado.escape.to_basestring(
                self.render_string("message.html", message=message))
            if self.get_argument("next", None):
                self.redirect(self.get_argument("next"))
            else:
                self.write(message)

|
I am at 01fa526 |
Which branch should I pull? next or v2.0x? |
After reviewing a lot of amazing work by David Beazley (https://github.com/dabeaz), and with the improvements in async Python (mainly introduction of I have created a gist here with a working example. It gives a simple structure to listen to multiple changefeeds asynchronously (tested on Python 3.6). |
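A minimal sketch of the general pattern of listening to several feeds concurrently with asyncio follows. It is not the gist referenced above: `fake_feed` is a hypothetical async generator standing in for a changefeed, and the feed names and delays are made up for illustration.

```python
import asyncio


# Hypothetical stand-in for a changefeed; a real feed would come from the driver.
async def fake_feed(name, count, delay):
    for i in range(count):
        await asyncio.sleep(delay)
        yield {"feed": name, "seq": i}


async def listen(name, feed, seen):
    # Each listener suspends while its feed is idle, letting the others run.
    async for change in feed:
        seen.append((name, change["seq"]))


async def main():
    seen = []
    await asyncio.gather(
        listen("messages", fake_feed("messages", 3, 0.01), seen),
        listen("users", fake_feed("users", 2, 0.015), seen),
    )
    return seen


seen = asyncio.run(main())
```

All listeners share one thread; the order in which changes from different feeds arrive depends on timing, so code consuming `seen` should not rely on it.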
Changefeeds in 1.13 are a very interesting feature. Multimedia real-time HTML5 projects are getting very popular these days, and I am building one right now too, a multimedia web chat. It needs a lot of server-side notifications.
However, RethinkDB still needs an async driver for Tornado, like the Motor driver for MongoDB; that is the only deal-breaking feature keeping me from switching from Mongo.
I love ReQL and I really believe that RethinkDB is NoSQL done right.
Any pointers on modifying the current driver to work asynchronously would be very welcome; I would like to contribute. (I am still quite new to async and Tornado.)