merge: rename collection.
Sam Kleinman committed Oct 1, 2012
2 parents be4adad + 59042bd commit 5c6ecc8
Showing 2 changed files with 218 additions and 11 deletions.
222 changes: 211 additions & 11 deletions draft/core/read-operations.txt
@@ -4,29 +4,228 @@ Read Operations

.. default-domain:: mongodb

Synopsis
--------
Read operations determine how MongoDB returns collection data when you issue a query.

Queries
-------
This document describes how MongoDB performs read operations and how
different factors affect the efficiency of reads.

- :doc:`/reference/operators`
- :method:`find <db.collection.find()>`
- :dbcommand:`findOne`
.. TODO for information about queries, see ???.

.. index:: read operation; query
.. index:: query; read operations
.. _read-operations-query-operators:

Query Operations
----------------

Queries retrieve data from your database collections. How a query
retrieves data is dependent on MongoDB read operations and on the
indexes you have created.

.. _read-operations-query-syntax:

Query Syntax
~~~~~~~~~~~~

For a list of query operators, see :doc:`/reference/operators`.

.. TODO see the yet-to-be created query operations doc

.. _read-operations-indexing:

Indexes
~~~~~~~

Indexes improve the efficiency of read operations by optimizing queries,
pre-sorting query results, and making it possible to store fewer
documents in memory.

The most selective indexes return the fastest results. The most
selective index possible for a given query is one for which every
document that matches the indexed portion of the query criteria also
matches the entire query.

.. example::

   Given a collection with the following indexes, data, and query:

   Indexes:

   .. code-block:: javascript

      { x : 1 } , { y : 1 }

   Data:

   .. code-block:: javascript

      { x : 1 , y : 2 }
      { x : 2 , y : 1 }
      { x : 3 , y : 0 }
      { x : 4 , y : 0 }

   Query:

   .. code-block:: javascript

      { x : { $gte : 1 } , y : { $gte : 1 } }

   The ``{ y : 1 }`` index is more selective because all the documents
   that match the query's ``y`` key value also match the entire query.
   Conversely, not all the documents that match the query's ``x`` key
   value also match the entire query.
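
The comparison in the example above can be checked with a few lines of
plain JavaScript. This is only an illustrative sketch that filters an
in-memory array; the ``selectivity`` helper and the predicate names are
hypothetical and are not part of MongoDB or the :program:`mongo` shell.

.. code-block:: javascript

   // Sample documents from the example above.
   const docs = [
     { x: 1, y: 2 },
     { x: 2, y: 1 },
     { x: 3, y: 0 },
     { x: 4, y: 0 },
   ];

   // Each predicate mirrors one clause of the query
   // { x : { $gte : 1 } , y : { $gte : 1 } }.
   const matchesX = (doc) => doc.x >= 1;
   const matchesY = (doc) => doc.y >= 1;
   const matchesQuery = (doc) => matchesX(doc) && matchesY(doc);

   // Hypothetical helper: counts the documents that a single-key index
   // would hand back, and how many of those satisfy the whole query.
   function selectivity(keyPredicate) {
     const candidates = docs.filter(keyPredicate);
     const matched = candidates.filter(matchesQuery);
     return { candidates: candidates.length, matched: matched.length };
   }

   const viaX = selectivity(matchesX); // { candidates: 4, matched: 2 }
   const viaY = selectivity(matchesY); // { candidates: 2, matched: 2 }
   console.log(viaX, viaY);

Every candidate found through ``y`` satisfies the full query, while half
of the candidates found through ``x`` are discarded.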

.. seealso::

- The :doc:`/indexes` documentation, in particular :doc:`/applications/indexes`
- :doc:`/reference/operators`
- :method:`find <db.collection.find()>`
- :method:`findOne <db.collection.findOne()>`

.. index:: query optimizer
.. _read-operations-query-optimization:

Query Optimization
~~~~~~~~~~~~~~~~~~

MongoDB provides a query optimizer that matches a query to the index
that performs the fastest read operation for that query.

When you issue a query for the first time, the query optimizer runs the
query against several indexes to find the most efficient. The optimizer
then creates a "query plan" that specifies the index for future runs of
the query.

The MongoDB query optimizer deletes a query plan when a collection has
changed to the point that the specified index might no longer provide
the fastest results.

Query plans take advantage of MongoDB's indexing features. You should
always create indexes that use the same fields, and sort in the same
order, as your queries. For more information, see :doc:`/applications/indexes`.

MongoDB creates a query plan as follows: When you run a query for which
there is no query plan, either because the query is new or because the
old plan is obsolete, the query optimizer runs the query against several
indexes in parallel but records the results in a single common buffer,
as though the results all came from the same index. As each index yields
a match, MongoDB records the match in the buffer. If an index returns a
result already returned by another index, the optimizer recognizes the
duplication and skips the duplicate match.

The optimizer determines a "winning" index when either of
the following occurs:

- The optimizer exhausts an index, which means that the index has
provided the full result set. At this point, the optimizer stops
querying.

- The optimizer reaches 101 results. At this point, the optimizer
  chooses the plan that has provided the most results *first* and
  continues reading only from that plan. Note that another index might
  have provided all those results as duplicates, but because the
  "winning" index provided them first, it is more efficient.

The "winning" index now becomes the index specified in the query plan as
the one to use the next time the query is run.
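
The selection process described above can be modeled in plain
JavaScript. The following is a simplified sketch and not MongoDB's
actual implementation: the ``pickWinningPlan`` function is hypothetical,
each candidate index is represented as an array of scan steps (a
document id for a match, ``null`` for a scanned entry that does not
match), and "parallel" execution is approximated by taking one step per
index in turn.

.. code-block:: javascript

   // Hypothetical model of the plan race. `plans` maps an index name to
   // its scan steps; `limit` is the number of buffered results that
   // ends the race.
   function pickWinningPlan(plans, limit = 101) {
     const names = Object.keys(plans);
     if (names.length === 0) {
       throw new Error("at least one candidate plan is required");
     }
     const seen = new Set(); // shared buffer of recorded document ids
     const contributed = {}; // results each plan recorded first
     names.forEach((name) => { contributed[name] = 0; });

     for (let step = 0; ; step++) {
       for (const name of names) {
         const steps = plans[name];
         if (step >= steps.length) {
           // Index exhausted: the full result set is in the buffer.
           return { winner: name, reason: "exhausted", buffered: seen.size };
         }
         const id = steps[step];
         if (id !== null && !seen.has(id)) { // skip misses and duplicates
           seen.add(id);
           contributed[name] += 1;
         }
         if (seen.size >= limit) {
           // Limit reached: pick the plan that provided the most results first.
           const winner = names.reduce((best, other) =>
             contributed[other] > contributed[best] ? other : best);
           return { winner, reason: "limit", buffered: seen.size };
         }
       }
     }
   }

   // The { x : 1 } index scans four entries; the { y : 1 } index scans two.
   const exhausted = pickWinningPlan({ x: [1, 2, null, null], y: [2, 1] });

   // With a small limit, the plan that fills the buffer fastest wins.
   const limited = pickWinningPlan(
     { a: [10, 11, 12, 13], b: [null, 10, null, 11] }, 3);
   console.log(exhausted, limited);

In the first call the ``y`` plan wins because its index is exhausted
first; in the second call the ``a`` plan wins because it supplied all
three buffered results before the limit was reached.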

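To evaluate the optimizer's choice of query plan, run the query again
with the :method:`hint() <cursor.hint()>` and
:method:`explain() <cursor.explain()>` methods appended, in that order.
:method:`hint() <cursor.hint()>` takes the index to use, and
:method:`explain() <cursor.explain()>` must come last because, instead of
returning query results, it returns statistics about how the query runs.
For example, assuming an index on the ``name`` field:

.. code-block:: javascript

   db.people.find( { name: "John" } ).hint( { name: 1 } ).explain()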

For details on the output, see :method:`explain() <cursor.explain()>`.

.. note::

   If you run :method:`explain() <cursor.explain()>` without including
   :method:`hint() <cursor.hint()>`, the query optimizer will
   re-evaluate the query and run against multiple indexes before
   returning the query statistics. Unless you want the optimizer to
   re-evaluate the query, do not leave off :method:`hint()
   <cursor.hint()>`.

Because your collections will likely change over time, the query
optimizer deletes a query plan and re-evaluates the indexes when any
of the following occurs:

- The number of writes to the collection reaches 1,000.

- You run the :dbcommand:`reIndex` command on the collection.

- You restart :program:`mongod`.
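
The invalidation rules above can be pictured as a small in-memory cache.
This is a hedged sketch rather than MongoDB's internal data structure:
the ``QueryPlanCache`` class, its method names, and the string used to
identify a query are all hypothetical. A restart of :program:`mongod`
corresponds to constructing a new, empty cache.

.. code-block:: javascript

   // Hypothetical cache of query plans for a single collection.
   class QueryPlanCache {
     constructor(writeThreshold = 1000) {
       this.writeThreshold = writeThreshold;
       this.writesSinceClear = 0;
       this.plans = new Map(); // query identifier -> index name
     }

     remember(queryId, indexName) {
       this.plans.set(queryId, indexName);
     }

     // Returns undefined when the optimizer must re-evaluate the query.
     lookup(queryId) {
       return this.plans.get(queryId);
     }

     clear() {
       this.plans.clear();
       this.writesSinceClear = 0;
     }

     // Every write counts toward the threshold that drops all plans.
     recordWrite() {
       this.writesSinceClear += 1;
       if (this.writesSinceClear >= this.writeThreshold) {
         this.clear();
       }
     }

     // Rebuilding the indexes also discards the cached plans.
     reIndex() {
       this.clear();
     }
   }

   const cache = new QueryPlanCache();
   cache.remember("people.name", "name_1");
   for (let i = 0; i < 999; i++) cache.recordWrite();
   const beforeThreshold = cache.lookup("people.name"); // "name_1"
   cache.recordWrite(); // the 1,000th write clears the cache
   const afterThreshold = cache.lookup("people.name"); // undefined
   console.log(beforeThreshold, afterThreshold);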

When the optimizer re-evaluates a query, the query returns the same
results (assuming no data has changed) but might return them in a
different order, and the statistics that
:method:`explain() <cursor.explain()>` reports, with or without
:method:`hint() <cursor.hint()>`, might differ. This is because the
optimizer retrieves the results from several indexes at once during
re-evaluation, and the order in which results appear depends on the
order of the indexes within the parallel querying.

.. _read-operations-projection:

Projection
~~~~~~~~~~

A projection specifies which fields, or which elements of an array
field, a query should return for matching documents. If you run a query
*without* a projection, the query returns all fields and values for
matching documents, which can add unnecessary network and
deserialization costs.

To run the most efficient queries on array values, use the following
projection operators whenever possible. For documentation on each
operator, follow the link on the operator name:

- :projection:`$elemMatch`

- :projection:`$slice`
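
To show what a ``$slice`` projection returns for a single array field,
the following plain JavaScript function emulates its behavior on an
in-memory array. It is a sketch for illustration only: the
``sliceProjection`` name and the sample ``comments`` array are
hypothetical, and the function does not talk to a database.

.. code-block:: javascript

   // Emulates a $slice projection on one array. `spec` is either a
   // number n (first n elements, or the last |n| when n is negative)
   // or a two-element array [skip, limit].
   function sliceProjection(values, spec) {
     if (Array.isArray(spec)) {
       const [skip, limit] = spec;
       // A negative skip counts back from the end of the array.
       const start = skip < 0 ? Math.max(values.length + skip, 0) : skip;
       return values.slice(start, start + limit);
     }
     return spec >= 0 ? values.slice(0, spec) : values.slice(spec);
   }

   const comments = ["a", "b", "c", "d", "e"];
   const firstTwo = sliceProjection(comments, 2);       // ["a", "b"]
   const lastTwo = sliceProjection(comments, -2);       // ["d", "e"]
   const middle = sliceProjection(comments, [1, 2]);    // ["b", "c"]
   const fromEnd = sliceProjection(comments, [-3, 2]);  // ["c", "d"]
   console.log(firstTwo, lastTwo, middle, fromEnd);

Returning only part of a large array in this way is what reduces the
network and deserialization costs described above.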

.. _read-operations-aggregation:

Aggregation
-----------
~~~~~~~~~~~

.. Probably short, but there's no docs for old-style aggregation so.

.. - basic aggregation (count, distinct)
.. - legacy agg: group
.. - big things: mapreduce, aggregation

.. seealso:: :doc:`/applications/aggregation`

Indexing
--------
.. index:: read operation; architecture
.. _read-operations-architecture:

Query Operators that Cannot Use Indexes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some query operators cannot take advantage of indexes and require a
collection scan. When using these operators you can narrow the documents
scanned by combining the operator with another operator that does use an
index.

Operators that cannot use indexes include the following:

.. seealso:: :doc:`/core/indexes`
- :operator:`$nin`

- :operator:`$ne`

.. TODO Regular expressions queries also do not use an index.
.. TODO :method:`cursor.skip()` can cause paginating large numbers of docs

Architecture
------------

.. index:: read operation; connection pooling
.. index:: connection pooling; read operations
.. _read-operations-connection-pooling:

Connection Pooling
~~~~~~~~~~~~~~~~~~

@@ -35,3 +234,4 @@ Shard Clusters

Replica Sets
~~~~~~~~~~~~

7 changes: 7 additions & 0 deletions source/reference/glossary.txt
@@ -858,3 +858,10 @@ Glossary
standalone
In MongoDB, a standalone is an instance of :program:`mongod` that
is running as a single server and not as part of a :term:`replica set`.

query optimizer
For each query, the MongoDB query optimizer generates a query plan
that matches the query to the index that produces the fastest
results. The optimizer then uses the query plan each time the
query is run. If a collection changes significantly, the optimizer
creates a new query plan.
