Skip to content
This repository has been archived by the owner on Feb 20, 2019. It is now read-only.

Latest commit

 

History

History
352 lines (217 loc) · 8.44 KB

django.rst

File metadata and controls

352 lines (217 loc) · 8.44 KB

elasticutils.contrib.django: Using with Django

Django helpers are all located in elasticutils.contrib.django.

This chapter covers using ElasticUtils Django bits.

ElasticUtils depends on the following settings in your Django settings file:

.. module:: django.conf.settings

.. data:: ES_DISABLED

   If `ES_DISABLED = True`, then Any method wrapped with
   `es_required` will return and log a warning. This is useful while
   developing, so you don't have to have ElasticSearch running.

.. data:: ES_DUMP_CURL

   If set to a file path all the requests that `ElasticUtils` makes
   will be dumped into the designated file.

   If set to a class instance, calls the ``.write()`` method with
   the curl equivalents.

   See :ref:`django-debugging` for more details.

.. data:: ES_HOSTS

   This is a list of ES hosts. In development this will look like::

       ES_HOSTS = ['127.0.0.1:9200']

.. data:: ES_INDEXES

   This is a mapping of doctypes to indexes. A `default` mapping is
   required for types that don't have a specific index.

   When ElasticUtils queries the index for a model, by default it
   derives the doctype from `Model._meta.db_table`. When you build
   your indexes and mapping types, make sure to match the indexes and
   mapping types you're using.

   Example 1::

       ES_INDEXES = {'default': 'main_index'}

   This only has a default, so all ElasticUtils queries will look in
   `main_index` for all mapping types.

   Example 2::

       ES_INDEXES = {'default': 'main_index',
                     'splugs': 'splugs_index'}

   Assuming you have a `Splug` model which has a
   `Splug._meta.db_table` value of `splugs`, then ElasticUtils will
   run queries for `Splug` in the `splugs_index`.  ElasticUtils will
   run queries for other models in `main_index` because that's the
   default.

   Example 3::

       ES_INDEXES = {'default': ['main_index'],
                     'splugs': ['splugs_index']}

   FIXME: The API allows for this. Pretty sure it should query
   multiple indexes, but we have no tests for that and I haven't
   tested it, either.


.. data:: ES_TIMEOUT

   Defines the timeout for the `ES` connection.  This defaults to 5
   seconds.


The get_es() in the Django contrib will helpfully cache your ES objects thread-local.

It is built with the settings from your django.conf.settings.

Note

get_es() only caches the ES if you don't pass in any override arguments. If you pass in override arguments, it doesn't cache it and instead creates a new one.

Requirements:Django

The elasticutils.contrib.django.S class takes a MappingType in the constructor. That allows you to tie Django ORM models to ElasticSearch index search results.

In elasticutils.contrib.django.models is DjangoMappingType which has some additional Django ORM-specific code in it to make it easier.

Define a DjangoMappingType subclass for your model. The minimal you need to define is get_model.

Further, you can use the Indexable mixin to get a bunch of helpful indexing-related code.

For example, here's a minimal DjangoMappingType subclass:

from django.models import Model
from elasticutils.contrib.django.models import DjangoMappingType


class MyModel(Model):
    ...


class MyMappingType(DjangoMappingType):
    @classmethod
    def get_model(cls):
        return MyModel

searcher = MyMappingType.search()

Here's one that uses Indexable and handles indexing:

from django.models import Model
from elasticutils.contrib.django.models import DjangoMappingType


class MyModel(Model):
    ...


class MyMappingType(DjangoMappingType, Indexable):
    @classmethod
    def get_model(cls):
        return MyModel

    @classmethod
    def extract_document(cls, obj_id, obj=None):
        if obj is None:
            obj = cls.get_model().get(pk=obj_id)

        return {
            'id': obj.id,
            'name': obj.name,
            'bio': obj.bio,
            'age': obj.age
            }


searcher = MyMappingType.search()

This example doesn't specify a mapping. That's ok because ElasticSearch will infer from the shape of the data how it should analyze and store the data.

If you want to specify this explicitly (and I suggest you do for anything that involves strings), then you want to additionally override .get_mapping(). Let's refine the above example by explicitly specifying .get_mapping().

from django.models import Model
from elasticutils.contrib.django.models import DjangoMappingType


class MyModel(Model):
    ...


class MyMappingType(DjangoMappingType, Indexable):
    @classmethod
    def get_model(cls):
        return MyModel

    @classmethod
    def get_mapping(cls):
        """Returns an ElasticSearch mapping."""
        return {
            # The id is an integer, so store it as such. ES would have
            # inferred this just fine.
            'id': {'type': 'integer'},

            # The name is a name---so we shouldn't analyze it
            # (de-stem, tokenize, parse, etc).
            'name': {'type': 'string', 'index': 'not_analyzed'},

            # The bio has free-form text in it, so analyze it with
            # snowball.
            'bio': {'type': 'string', 'analyzer': 'snowball'},

            # Age is an integer
            'age': {'type': 'integer'}
            }

    @classmethod
    def extract_document(cls, obj_id, obj=None):
        if obj is None:
            obj = cls.get_model().get(pk=obj_id)

        return {
            'id': obj.id,
            'name': obj.name,
            'bio': obj.bio,
            'age': obj.age
            }


searcher = MyMappingType.search()
.. autoclass:: elasticutils.contrib.django.models.DjangoMappingType
   :members:


.. autoclass:: elasticutils.contrib.django.models.Indexable
   :members:


.. seealso::

   http://www.elasticsearch.org/guide/reference/mapping/
     The ElasticSearch guide on mapping types.

   http://www.elasticsearch.org/guide/reference/mapping/core-types.html
     The ElasticSearch guide on mapping type field types.


Requirements:Django, Celery

You can then utilize things such as :func:`elasticutils.contrib.django.tasks.index_objects` to automatically index all new items.

.. autofunction:: elasticutils.contrib.django.es_required

.. autofunction:: elasticutils.contrib.django.es_required_or_50x


.. automodule:: elasticutils.contrib.django.tasks

   .. autofunction:: index_objects(model, ids=[...])


.. automodule:: elasticutils.contrib.django.cron

   .. autofunction:: reindex_objects(model, chunk_size[=150])


Requirements:Django, test_utils, nose

In elasticutils.contrib.django.estestcase, is ESTestCase which can be subclassed in your app's test cases.

It does the following:

  • If ES_HOSTS is empty it raises a SkipTest.
  • self.es is available from the ESTestCase class and any subclasses.
  • At the end of the test case the index is wiped.

Example:

from elasticutils.djangolib import ESTestCase


class TestQueries(ESTestCase):
    def test_query(self):
        ...

    def test_locked_filters(self):
        ...

You can set the settings.ES_DUMP_CURL to a few different things all of which can be helpful in debugging ElasticUtils.

  1. a file path

    This will cause PyES to write the curl equivalents of the commands it's sending to ElasticSearch to a file.

    Example setting:

    ES_DUMP_CURL = '/var/log/es_curl.log'
    

    Note

    The file is not closed until the process ends. Because of that, you don't see much in the file until it's done.

  2. a class instance that has a .write() method

    PyES will call the .write() method with the curl equivalent and then you can do whatever you want with it.

    For example, this writes curl equivalent output to stdout:

    class CurlDumper(object):
        def write(self, s):
            print s
    ES_DUMP_CURL = CurlDumper()