Tutorial: Using Motor With Tornado

Setup assumed by the examples that run before the 2000 test documents are inserted:

import pymongo
import motor.motor_tornado
import tornado.web
from tornado.ioloop import IOLoop
from tornado import gen

db = motor.motor_tornado.MotorClient().test_database

Setup assumed by the examples that run after the 2000 test documents are inserted:

import pymongo
import motor.motor_tornado
import tornado.web
from tornado.ioloop import IOLoop
from tornado import gen

db = motor.motor_tornado.MotorClient().test_database
sync_db = pymongo.MongoClient().test_database
sync_db.test_collection.drop()
sync_db.test_collection.insert_many([{"i": i} for i in range(2000)])

Cleanup once the examples have run:

import pymongo

pymongo.MongoClient().test_database.test_collection.delete_many({})

A guide to using MongoDB and Tornado with Motor.

Tutorial Prerequisites

You can learn about MongoDB with the MongoDB Tutorial before you learn Motor.

Install pip and then do:

$ pip install tornado motor

Once done, the following should run in the Python shell without raising an exception:

>>> import motor.motor_tornado

This tutorial also assumes that a MongoDB instance is running on the default host and port. Assuming you have downloaded and installed MongoDB, you can start it like so:

$ mongod

Object Hierarchy

Motor, like PyMongo, represents data with a 4-level object hierarchy:

  • MotorClient represents a mongod process, or a cluster of them. You explicitly create one of these client objects, connect it to a running mongod or mongods, and use it for the lifetime of your application.
  • MotorDatabase: Each mongod has a set of databases (distinct sets of data files on disk). You can get a reference to a database from a client.
  • MotorCollection: A database has a set of collections, which contain documents; you get a reference to a collection from a database.
  • MotorCursor: Calling find on a MotorCollection returns a MotorCursor, which represents the set of documents matching a query.

Creating a Client

You typically create a single instance of MotorClient at the time your application starts up.

>>> client = motor.motor_tornado.MotorClient()

This connects to a mongod listening on the default host and port. You can specify the host and port like:

>>> client = motor.motor_tornado.MotorClient("localhost", 27017)

Motor also supports connection URIs:

>>> client = motor.motor_tornado.MotorClient("mongodb://localhost:27017")

Connect to a replica set like:

>>> client = motor.motor_tornado.MotorClient('mongodb://host1,host2/?replicaSet=my-replicaset-name')

Getting a Database

A single instance of MongoDB can support multiple independent databases. From an open client, you can get a reference to a particular database with dot-notation or bracket-notation:

>>> db = client.test_database
>>> db = client["test_database"]

Creating a reference to a database does no I/O and does not require an await expression.

Tornado Application Startup Sequence

Now that we can create a client and get a database, we're ready to start a Tornado application that uses Motor:

db = motor.motor_tornado.MotorClient().test_database

application = tornado.web.Application([
    (r'/', MainHandler)
], db=db)

application.listen(8888)
tornado.ioloop.IOLoop.current().start()

There are two things to note in this code. First, the MotorClient constructor doesn't actually connect to the server; the client will initiate a connection when you attempt the first operation. Second, passing the database as the db keyword argument to Application makes it available to request handlers:

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        db = self.settings['db']

It is a common mistake to create a new client object for every request; this comes at a dire performance cost. Create the client when your application starts and reuse that one client for the lifetime of the process, as shown in these examples.

The start method of Tornado's HTTPServer class is a simple way to fork multiple web server processes and use all of your machine's CPUs. However, you must create your MotorClient after forking:

import motor.motor_tornado
import tornado.httpserver
import tornado.web
from tornado.ioloop import IOLoop

# Create the application before creating a MotorClient.
application = tornado.web.Application([
    (r'/', MainHandler)
])

server = tornado.httpserver.HTTPServer(application)
server.bind(8888)

# Forks one process per CPU.
server.start(0)

# Now, in each child process, create a MotorClient.
application.settings['db'] = motor.motor_tornado.MotorClient().test_database
IOLoop.current().start()

For production-ready, multiple-CPU deployments of Tornado there are better methods than HTTPServer.start(). See the "Running and deploying" section of Tornado's user guide.

Getting a Collection

A collection is a group of documents stored in MongoDB, and can be thought of as roughly the equivalent of a table in a relational database. Getting a collection in Motor works the same as getting a database:

>>> collection = db.test_collection
>>> collection = db["test_collection"]

Just like getting a reference to a database, getting a reference to a collection does no I/O and doesn't require an await expression.

Inserting a Document

As in PyMongo, Motor represents MongoDB documents with Python dictionaries. To store a document in MongoDB, call insert_one in an await expression:

>>> async def do_insert():
...     document = {"key": "value"}
...     result = await db.test_collection.insert_one(document)
...     print("result %s" % repr(result.inserted_id))
...
>>>
>>> IOLoop.current().run_sync(do_insert)
result ObjectId('...')

>>> # Clean up from previous insert
>>> pymongo.MongoClient().test_database.test_collection.delete_many({})
DeleteResult({'n': 1, 'ok': 1.0}, acknowledged=True)

A typical beginner's mistake with Motor is to insert documents in a loop, not waiting for each insert to complete before beginning the next:

>>> for i in range(2000):
...     db.test_collection.insert_one({'i': i})

In PyMongo this would insert each document in turn using a single socket, but Motor attempts to run all the insert_one operations at once. This requires up to max_pool_size open sockets connected to MongoDB, which taxes the client and server. To ensure instead that all inserts run in sequence, use await:

>>> async def do_insert():
...     for i in range(2000):
...         await db.test_collection.insert_one({"i": i})
...
>>> IOLoop.current().run_sync(do_insert)
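The difference between the two loops can be seen without a database. The sketch below is not Motor code: `fake_insert_one` is a hypothetical stand-in coroutine that only counts how many calls are in flight at once. It compares awaiting each call in turn with starting every call at once, and with one possible middle ground, an `asyncio.Semaphore` that caps concurrency (the limit of 5 is an arbitrary choice for illustration).

```python
import asyncio


class InFlightCounter:
    """Tracks how many fake operations are running at the same moment."""

    def __init__(self):
        self.current = 0
        self.peak = 0

    async def fake_insert_one(self, document):
        # Hypothetical stand-in for an awaitable insert; it does no I/O.
        self.current += 1
        self.peak = max(self.peak, self.current)
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        self.current -= 1


async def sequential(n):
    counter = InFlightCounter()
    for i in range(n):
        # Each call finishes before the next one starts.
        await counter.fake_insert_one({"i": i})
    return counter.peak


async def all_at_once(n):
    counter = InFlightCounter()
    # Every call is started before any of them has finished.
    await asyncio.gather(*(counter.fake_insert_one({"i": i}) for i in range(n)))
    return counter.peak


async def bounded(n, limit):
    counter = InFlightCounter()
    semaphore = asyncio.Semaphore(limit)

    async def guarded(document):
        async with semaphore:  # at most `limit` calls run concurrently
            await counter.fake_insert_one(document)

    await asyncio.gather(*(guarded({"i": i}) for i in range(n)))
    return counter.peak


sequential_peak = asyncio.run(sequential(50))
all_at_once_peak = asyncio.run(all_at_once(50))
bounded_peak = asyncio.run(bounded(50, limit=5))
print(sequential_peak, all_at_once_peak, bounded_peak)
```

With a real driver the peak number of in-flight operations corresponds roughly to how many sockets are busy at once, which is why the sequential loop is the gentlest on the client and server.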

See also: examples/bulk.

>>> # Clean up from previous insert
>>> pymongo.MongoClient().test_database.test_collection.delete_many({})
DeleteResult({'n': 2000, 'ok': 1.0}, acknowledged=True)

For better performance, insert documents in large batches with insert_many:

>>> async def do_insert():
...     result = await db.test_collection.insert_many([{"i": i} for i in range(2000)])
...     print("inserted %d docs" % (len(result.inserted_ids),))
...
>>> IOLoop.current().run_sync(do_insert)
inserted 2000 docs
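When the documents come from a generator that is too large to hold in a single list, one option is to split the stream into fixed-size batches and pass each batch to insert_many. The helper below is a sketch of that splitting step only. The name `batched` and the batch size of 750 are assumptions made for illustration, not part of Motor.

```python
from itertools import islice


def batched(documents, batch_size):
    """Yield lists of at most batch_size documents from any iterable."""
    iterator = iter(documents)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# A generator stands in for a document source that is produced lazily.
documents = ({"i": i} for i in range(2000))
batch_sizes = [len(batch) for batch in batched(documents, 750)]
print(batch_sizes)
```

Inside a coroutine, each batch could then be written with `await db.test_collection.insert_many(batch)`, one batch at a time.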

Getting a Single Document With find_one

Use find_one to get the first document that matches a query. For example, to get a document where the value for key "i" is less than 1:

>>> import pprint
>>> async def do_find_one():
...     document = await db.test_collection.find_one({"i": {"$lt": 1}})
...     pprint.pprint(document)
...
>>> IOLoop.current().run_sync(do_find_one)
{'_id': ObjectId('...'), 'i': 0}

The result is a dictionary matching the one that we inserted previously.

The returned document contains an "_id", which was automatically added on insert.

(We use pprint here instead of print to ensure the document's key names are sorted the same in your output as ours.)

Querying for More Than One Document

Use find to query for a set of documents. find does no I/O and does not require an await expression; it merely creates a MotorCursor instance. The query is actually executed on the server when you call to_list or execute an async for loop.

To find all documents with "i" less than 5:

>>> async def do_find():
...     cursor = db.test_collection.find({"i": {"$lt": 5}}).sort("i")
...     for document in await cursor.to_list(length=100):
...         pprint.pprint(document)
...
>>> IOLoop.current().run_sync(do_find)
{'_id': ObjectId('...'), 'i': 0}
{'_id': ObjectId('...'), 'i': 1}
{'_id': ObjectId('...'), 'i': 2}
{'_id': ObjectId('...'), 'i': 3}
{'_id': ObjectId('...'), 'i': 4}

A length argument is required when you call to_list to prevent Motor from buffering an unlimited number of documents.

async for

You can handle one document at a time in an async for loop:

>>> async def do_find():
...     c = db.test_collection
...     async for document in c.find({"i": {"$lt": 2}}):
...         pprint.pprint(document)
...
>>> IOLoop.current().run_sync(do_find)
{'_id': ObjectId('...'), 'i': 0}
{'_id': ObjectId('...'), 'i': 1}

You can apply a sort, limit, or skip to a query before you begin iterating:

>>> async def do_find():
...     cursor = db.test_collection.find({"i": {"$lt": 4}})
...     # Modify the query before iterating
...     cursor.sort("i", -1).skip(1).limit(2)
...     async for document in cursor:
...         pprint.pprint(document)
...
>>> IOLoop.current().run_sync(do_find)
{'_id': ObjectId('...'), 'i': 2}
{'_id': ObjectId('...'), 'i': 1}

The cursor does not actually retrieve each document from the server individually; it gets documents efficiently in large batches.
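To make the batching idea concrete, here is a toy model of a cursor that serves documents one at a time from a local buffer and refills that buffer a whole batch at a time. It is a sketch under loud assumptions: `ToyCursor`, its batch size of 4, and the in-memory list that stands in for the server are all hypothetical and say nothing about how Motor is implemented internally.

```python
import asyncio


class ToyCursor:
    """Toy model of a batching cursor; not Motor's implementation."""

    def __init__(self, documents, batch_size):
        self._documents = list(documents)  # stands in for the server's results
        self._batch_size = batch_size
        self._position = 0
        self._buffer = []
        self.round_trips = 0

    async def _fetch_batch(self):
        # Stand-in for one network round trip that returns a whole batch.
        await asyncio.sleep(0)
        self.round_trips += 1
        end = self._position + self._batch_size
        self._buffer = self._documents[self._position:end]
        self._position = end

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self._buffer:
            if self._position >= len(self._documents):
                raise StopAsyncIteration
            await self._fetch_batch()
        # Most iterations are served from the buffer with no round trip.
        return self._buffer.pop(0)


async def consume():
    cursor = ToyCursor(({"i": i} for i in range(10)), batch_size=4)
    seen = [document["i"] async for document in cursor]
    return seen, cursor.round_trips


seen, round_trips = asyncio.run(consume())
print(seen, round_trips)
```

In this model, ten documents cost three simulated round trips rather than ten, which is the effect the paragraph above describes.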

Counting Documents

Use count_documents to determine the number of documents in a collection, or the number of documents that match a query:

>>> async def do_count():
...     n = await db.test_collection.count_documents({})
...     print("%s documents in collection" % n)
...     n = await db.test_collection.count_documents({"i": {"$gt": 1000}})
...     print("%s documents where i > 1000" % n)
...
>>> IOLoop.current().run_sync(do_count)
2000 documents in collection
999 documents where i > 1000

Updating Documents

replace_one changes a document. It requires two parameters: a query that specifies which document to replace, and a replacement document. The query follows the same syntax as for find or find_one. To replace a document:

>>> async def do_replace():
...     coll = db.test_collection
...     old_document = await coll.find_one({"i": 50})
...     print("found document: %s" % pprint.pformat(old_document))
...     _id = old_document["_id"]
...     result = await coll.replace_one({"_id": _id}, {"key": "value"})
...     print("replaced %s document" % result.modified_count)
...     new_document = await coll.find_one({"_id": _id})
...     print("document is now %s" % pprint.pformat(new_document))
...
>>> IOLoop.current().run_sync(do_replace)
found document: {'_id': ObjectId('...'), 'i': 50}
replaced 1 document
document is now {'_id': ObjectId('...'), 'key': 'value'}

You can see that replace_one replaced everything in the old document except its _id with the new document.

Use update_one with MongoDB's modifier operators to update part of a document and leave the rest intact. We'll find the document whose "i" is 51 and use the $set operator to set "key" to "value":

>>> async def do_update():
...     coll = db.test_collection
...     result = await coll.update_one({"i": 51}, {"$set": {"key": "value"}})
...     print("updated %s document" % result.modified_count)
...     new_document = await coll.find_one({"i": 51})
...     print("document is now %s" % pprint.pformat(new_document))
...
>>> IOLoop.current().run_sync(do_update)
updated 1 document
document is now {'_id': ObjectId('...'), 'i': 51, 'key': 'value'}

"key" is set to "value" and "i" is still 51.
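The contrast between replacing a document and applying $set can be modeled with plain dictionaries. This is only an illustration of the semantics described above, not driver code: `replace_document` and `apply_set` are hypothetical helpers, the integer `_id` is a placeholder for an ObjectId, and the model ignores details such as dotted field paths.

```python
def replace_document(stored, replacement):
    """Model replace_one: keep only _id from the stored document."""
    return {"_id": stored["_id"], **replacement}


def apply_set(stored, fields):
    """Model update_one with $set: overwrite or add the given fields only."""
    return {**stored, **fields}


stored = {"_id": 1, "i": 51}
replaced = replace_document(stored, {"key": "value"})
updated = apply_set(stored, {"key": "value"})
print(replaced)
print(updated)
```

The replaced document loses "i" while the updated one keeps it, matching the outputs of the two examples above.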

update_one only affects the first document it finds. To update all matching documents, use update_many:

await coll.update_many({'i': {'$gt': 100}},
                       {'$set': {'key': 'value'}})

Removing Documents

delete_one takes a query with the same syntax as find. It immediately removes the first matching document.

>>> async def do_delete_one():
...     coll = db.test_collection
...     n = await coll.count_documents({})
...     print("%s documents before calling delete_one()" % n)
...     result = await db.test_collection.delete_one({"i": {"$gte": 1000}})
...     print("%s documents after" % (await coll.count_documents({})))
...
>>> IOLoop.current().run_sync(do_delete_one)
2000 documents before calling delete_one()
1999 documents after

delete_many takes a query with the same syntax as find. It immediately removes all matching documents.

>>> async def do_delete_many():
...     coll = db.test_collection
...     n = await coll.count_documents({})
...     print("%s documents before calling delete_many()" % n)
...     result = await db.test_collection.delete_many({"i": {"$gte": 1000}})
...     print("%s documents after" % (await coll.count_documents({})))
...
>>> IOLoop.current().run_sync(do_delete_many)
1999 documents before calling delete_many()
1000 documents after

Commands

All operations on MongoDB are implemented internally as commands. Run them using the command method on MotorDatabase:

>>> from bson import SON
>>> async def use_distinct_command():
...     response = await db.command(SON([("distinct", "test_collection"), ("key", "i")]))
...
>>> IOLoop.current().run_sync(use_distinct_command)

Since the order of command parameters matters, don't use a Python dict to pass the command's parameters. Instead, make a habit of using bson.SON, from the bson module included with PyMongo.

Many commands have special helper methods, such as create_collection or aggregate, but these are just conveniences atop the basic command method.

Further Reading

The handful of classes and methods introduced here are sufficient for daily tasks. The API documentation for MotorClient, MotorDatabase, MotorCollection, and MotorCursor provides a reference to Motor's complete feature set.

Learning to use the MongoDB driver is just the beginning, of course. For in-depth instruction in MongoDB itself, see The MongoDB Manual.