Reworked custom lookups docs.

Mostly just formatting and rewording, but also replaced the example
using ``YearExtract`` to use an example which is unlikely to ever be
possible directly in the ORM.
commit f2dc4429a1da04c858364972eea57a35a868dab4 1 parent 2509006
Marc Tamlyn authored January 12, 2014

Showing 1 changed file with 192 additions and 150 deletions.

  1. 342  docs/ref/models/custom_lookups.txt
@@ -2,37 +2,33 @@
2 2
 Custom lookups
3 3
 ==============
4 4
 
  5
+.. versionadded:: 1.7
  6
+
5 7
 .. module:: django.db.models.lookups
6 8
    :synopsis: Custom lookups
7 9
 
8 10
 .. currentmodule:: django.db.models
9 11
 
10  
-By default Django offers a wide variety of different lookups for filtering
11  
-(for example, `exact` and `icontains`). This documentation explains how to
12  
-write custom lookups and how to alter the working of existing lookups. In
13  
-addition how to transform field values is explained. fFor example how to
14  
-extract the year from a DateField. By writing a custom `YearExtract`
15  
-transformer it is possible to filter on the transformed value, for example::
16  
-
17  
-  Author.objects.filter(birthdate__year__lte=1981)
18  
-
19  
-Currently transformers are only available in filtering. So, it is not possible
20  
-to use it in other parts of the ORM, for example this will not work::
21  
-
22  
-  Author.objects.values_list('birthdate__year')
  12
+By default Django offers a wide variety of :ref:`built-in lookups
  13
+<field-lookups>` for filtering (for example, ``exact`` and ``icontains``). This
  14
+documentation explains how to write custom lookups and how to alter the working
  15
+of existing lookups.
23 16
 
24 17
 A simple Lookup example
25 18
 ~~~~~~~~~~~~~~~~~~~~~~~
26 19
 
27  
-Lets start with a simple custom lookup. We will write a custom lookup `ne`
28  
-which works opposite to `exact`. A `Author.objects.filter(name__ne='Jack')`
29  
-will translate to::
  20
+Let's start with a simple custom lookup. We will write a custom lookup ``ne``
  21
+which works opposite to ``exact``. ``Author.objects.filter(name__ne='Jack')``
  22
+will translate to the SQL::
30 23
 
31 24
   "author"."name" <> 'Jack'
32 25
 
33  
-A custom lookup will need an implementation and Django needs to be told
34  
-the existence of the lookup. The implementation for this lookup will be
35  
-simple to write::
  26
+This SQL is backend independent, so we don't need to worry about different
  27
+databases.
  28
+
  29
+There are two steps to making this work. Firstly we need to implement the
  30
+lookup, then we need to tell Django about it. The implementation is quite
  31
+straightforward::
36 32
 
37 33
   from django.db.models import Lookup
38 34
 
@@ -45,131 +41,165 @@ simple to write::
45 41
           params = lhs_params + rhs_params
46 42
           return '%s <> %s' % (lhs, rhs), params
47 43
 
48  
-To register the `NotEqual` lookup we will just need to call register_lookup
49  
-on the field class we want the lookup to be available::
  44
+To register the ``NotEqual`` lookup we will just need to call
  45
+``register_lookup`` on the field class we want the lookup to be available for. In
  46
+this case, the lookup makes sense on all ``Field`` subclasses, so we register
  47
+it with ``Field`` directly::
50 48
 
51 49
   from django.db.models.fields import Field
52 50
   Field.register_lookup(NotEqual)
53 51
 
54  
-Now Field and all its subclasses have a NotEqual lookup.
55  
-
56  
-The first notable thing about `NotEqual` is the lookup_name. This name must
57  
-be supplied, and it is used by Django in the register_lookup() call so that
58  
-Django knows to associate `ne` to the NotEqual implementation.
59  
-`
60  
-An Lookup works against two values, lhs and rhs. The abbreviations stand for
61  
-left-hand side and right-hand side. The lhs is usually a field reference,
62  
-but it can be anything implementing the query expression API. The
63  
-rhs is the value given by the user. In the example `name__ne=Jack`, the
64  
-lhs is reference to Author's name field and Jack is the value.
65  
-
66  
-The lhs and rhs are turned into values that are possible to use in SQL.
67  
-In the example above lhs is turned into "author"."name", [], and rhs is
68  
-turned into "%s", ['Jack']. The lhs is just raw string without parameters
69  
-but the rhs is turned into a query parameter 'Jack'.
70  
-
71  
-Finally we combine the lhs and rhs by adding ` <> ` in between of them,
72  
-and supply all the parameters for the query.
73  
-
74  
-A Lookup needs to implement a limited part of query expression API. See
75  
-the query expression API for details.
  52
+We can now use ``foo__ne`` for any field ``foo``. You will need to ensure that
  53
+this registration happens before you try to create any querysets using it. You
  54
+could place the implementation in a ``models.py`` file, or register the lookup
  55
+in the ``ready()`` method of an ``AppConfig``.
  56
+
  57
+Taking a closer look at the implementation, the first required attribute is
  58
+``lookup_name``. This allows the ORM to understand how to interpret ``name__ne``
  59
+and use ``NotEqual`` to generate the SQL. By convention, these names are always
  60
+lowercase strings containing only letters, but the only hard requirement is
  61
+that it must not contain the string ``__``.
  62
+
  63
+A ``Lookup`` works against two values, ``lhs`` and ``rhs``, standing for
  64
+left-hand side and right-hand side. The left-hand side is usually a field
  65
+reference, but it can be anything implementing the :ref:`query expression API
  66
+<query-expression>`. The right-hand side is the value given by the user. In the
  67
+example ``Author.objects.filter(name__ne='Jack')``, the left-hand side is a
  68
+reference to the ``name`` field of the ``Author`` model, and ``'Jack'`` is the
  69
+right-hand side.
  70
+
  71
+We call ``process_lhs`` and ``process_rhs`` to convert them into the values we
  72
+need for SQL. In the above example, ``process_lhs`` returns
  73
+``('"author"."name"', [])`` and ``process_rhs`` returns ``('%s', ['Jack'])``.
  74
+In this example there were no parameters for the left hand side, but this would
  75
+depend on the object we have, so we still need to include them in the
  76
+parameters we return.
  77
+
  78
+Finally we combine the parts into a SQL expression with ``<>``, and supply all
  79
+the parameters for the query. We then return a tuple containing the generated
  80
+SQL string and the parameters.
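
The data flow described above can be traced without a database. The following is a minimal sketch in plain Python, not Django's implementation: ``FakeNotEqual``, its hard-coded ``process_lhs``/``process_rhs`` return values, and the unused ``qn`` and ``connection`` arguments are assumptions made purely for illustration.

```python
class FakeNotEqual:
    """Stand-in that mirrors the shape of the NotEqual lookup's as_sql()."""

    lookup_name = 'ne'

    def process_lhs(self, qn, connection):
        # Hypothetical fixed result: SQL for the column, with no parameters.
        return '"author"."name"', []

    def process_rhs(self, qn, connection):
        # Hypothetical fixed result: a bare placeholder plus the user's value.
        return '%s', ['Jack']

    def as_sql(self, qn, connection):
        lhs, lhs_params = self.process_lhs(qn, connection)
        rhs, rhs_params = self.process_rhs(qn, connection)
        # Parameters are concatenated in the order their placeholders appear.
        params = lhs_params + rhs_params
        return '%s <> %s' % (lhs, rhs), params


sql, params = FakeNotEqual().as_sql(qn=None, connection=None)
print(sql, params)
```

The value ``'Jack'`` never appears in the SQL string itself; it travels separately in ``params`` so the database driver can quote it safely.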
76 81
 
77 82
 A simple transformer example
78 83
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
79 84
 
80  
-We will next write a simple transformer. The transformer will be called
81  
-`YearExtract`. It can be used to extract the year part from `DateField`.
  85
+The custom lookup above is great, but in some cases you may want to be able to
  86
+chain lookups together. For example, let's suppose we are building an
  87
+application where we want to make use of the ``abs()`` operator.
  88
+We have an ``Experiment`` model which records a start value, end value and the
  89
+change (start - end). We would like to find all experiments where the change
  90
+was equal to a certain amount (``Experiment.objects.filter(change__abs=27)``),
  91
+or where it did not exceed a certain amount
  92
+(``Experiment.objects.filter(change__abs__lt=27)``).
  93
+
  94
+.. note::
  95
+    This example is somewhat contrived, but it demonstrates nicely the range of
  96
+    functionality which is possible in a database backend independent manner,
  97
+    and without duplicating functionality already in Django.
82 98
 
83  
-Lets start by writing the implementation::
  99
+We will start by writing an ``AbsoluteValue`` transformer. This will use the SQL
  100
+function ``ABS()`` to transform the value before comparison::
84 101
 
85 102
   from django.db.models import Extract
86 103
 
87  
-  class YearExtract(Extract):
88  
-      lookup_name = 'year'
89  
-      output_type = IntegerField()
  104
+  class AbsoluteValue(Extract):
  105
+      lookup_name = 'abs'
90 106
 
91 107
       def as_sql(self, qn, connection):
92 108
           lhs, params = qn.compile(self.lhs)
93  
-          return "EXTRACT(YEAR FROM %s)" % lhs, params
  109
+          return "ABS(%s)" % lhs, params
94 110
 
95  
-Next, lets register it for `DateField`::
  111
+Next, let's register it for ``IntegerField``::
96 112
 
97  
-  from django.db.models import DateField
98  
-  DateField.register_lookup(YearExtract)
  113
+  from django.db.models import IntegerField
  114
+  IntegerField.register_lookup(AbsoluteValue)
99 115
 
100  
-Now any DateField in your project will have `year` transformer. For example
101  
-the following query::
  116
+We can now run the queries we had before.
  117
+``Experiment.objects.filter(change__abs=27)`` will generate the following SQL::
102 118
 
103  
-  Author.objects.filter(birthdate__year__lte=1981)
  119
+    SELECT ... WHERE ABS("experiments"."change") = 27
104 120
 
105  
-would translate to the following query on PostgreSQL::
  121
+Using ``Extract`` instead of ``Lookup`` means we are able to chain
  122
+further lookups afterwards. So
  123
+``Experiment.objects.filter(change__abs__lt=27)`` will generate the following
  124
+SQL::
106 125
 
107  
-  SELECT ...
108  
-    FROM "author"
109  
-    WHERE EXTRACT(YEAR FROM "author"."birthdate") <= 1981
  126
+    SELECT ... WHERE ABS("experiments"."change") < 27
110 127
 
111  
-An YearExtract class works only against self.lhs. Usually the lhs is
112  
-transformed in some way. Further lookups and extracts work against the
113  
-transformed value.
  128
+Subclasses of ``Extract`` usually only operate on the left-hand side of the
  129
+expression. Further lookups will work on the transformed value. Note that in
  130
+this case where there is no other lookup specified, Django interprets
  131
+``change__abs=27`` as ``change__abs__exact=27``.
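
One way to picture the chaining and the implicit ``exact`` is to split the keyword on ``__``. This is a simplified sketch in plain Python, not how Django resolves lookups internally: ``split_lookup`` and the ``transforms`` set are hypothetical names, and relation traversal and per-field registration are ignored.

```python
def split_lookup(path, transforms):
    """Split e.g. 'change__abs__lt' into (field, [transforms], lookup).

    `transforms` is the set of names treated as transformers; anything
    left over after them is taken to be the final lookup.
    """
    field, *rest = path.split('__')
    applied = []
    # Consume leading names that are known transformers.
    while rest and rest[0] in transforms:
        applied.append(rest.pop(0))
    if len(rest) > 1:
        raise ValueError('unsupported lookup: %s' % path)
    # With no final lookup given, fall back to 'exact'.
    lookup = rest[0] if rest else 'exact'
    return field, applied, lookup


print(split_lookup('change__abs', {'abs'}))
print(split_lookup('change__abs__lt', {'abs'}))
```

Under these assumptions both calls report the same field and transformer; only the final lookup differs, which is the sense in which ``change__abs=27`` and ``change__abs__exact=27`` are the same query.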
114 132
 
115  
-Note the definition of output_type in the `YearExtract`. The output_type is
116  
-a field instance. It informs Django that the Extract class transformed the
117  
-type of the value to an int. This is currently used only to check which
118  
-lookups the extract has.
  133
+When looking for which lookups are allowable after the ``Extract`` has been
  134
+applied, Django uses the ``output_type`` attribute. We didn't need to specify
  135
+this here as it didn't change, but supposing we were applying ``AbsoluteValue``
  136
+to some field which represents a more complex type (for example a point
  137
+relative to an origin, or a complex number) then we may have wanted to specify
  138
+``output_type = FloatField``, which will ensure that further lookups like
  139
+``abs__lte`` behave as they would for a ``FloatField``.
119 140
 
120  
-The used SQL in this example works on most databases. Check you database
121  
-vendor's documentation to see if EXTRACT(year from date) is supported.
  141
+Writing an efficient abs__lt lookup
  142
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
122 143
 
123  
-Writing an efficient year__exact lookup
124  
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  144
+When using the above written ``abs`` lookup, the SQL produced will not use
  145
+indexes efficiently in some cases. In particular, when we use
  146
+``change__abs__lt=27``, this is equivalent to ``change__gt=-27`` AND
  147
+``change__lt=27``. (For the ``lte`` case we could use the SQL ``BETWEEN``).
125 148
 
126  
-When using the above written `year` lookup, the SQL produced will not use
127  
-indexes efficiently. We will fix that by writing a custom `exact` lookup
128  
-for YearExtract. For example if the user filters on
129  
-`birthdate__year__exact=1981`, then we want to produce the following SQL::
  149
+So we would like ``Experiment.objects.filter(change__abs__lt=27)`` to generate
  150
+the following SQL::
130 151
 
131  
-  birthdate >= to_date('1981-01-01') AND birthdate <= to_date('1981-12-31')
  152
+    SELECT ... WHERE "experiments"."change" < 27 AND "experiments"."change" > -27
132 153
 
133 154
 The implementation is::
134 155
 
135 156
   from django.db.models import Lookup
136 157
 
137  
-  class YearExact(Lookup):
138  
-      lookup_name = 'exact'
  158
+  class AbsoluteValueLessThan(Lookup):
  159
+      lookup_name = 'lt'
139 160
 
140 161
       def as_sql(self, qn, connection):
141 162
           lhs, lhs_params = qn.compile(self.lhs.lhs)
142 163
           rhs, rhs_params = self.process_rhs(qn, connection)
143 164
           params = lhs_params + rhs_params + lhs_params + rhs_params
144  
-          return '%s >= to_date(%s || '-01-01') AND %s <= to_date(%s || '-12-31') % (lhs, rhs, lhs, rhs), params
  165
+          return '%s < %s AND %s > -%s' % (lhs, rhs, lhs, rhs), params
145 166
 
146  
-  YearExtract.register_lookup(YearExact)
  167
+  AbsoluteValue.register_lookup(AbsoluteValueLessThan)
147 168
 
148  
-There are a couple of notable things going on. First, `YearExact` isn't
149  
-calling process_lhs(). Instead it skips and compiles directly the lhs used by
150  
-self.lhs. The reason this is done is to skip `YearExtract` from adding the
151  
-EXTRACT clause to the query. Referring directly to self.lhs.lhs is safe as
152  
-`YearExact` can be accessed only from `year__exact` lookup, that is the lhs
153  
-is always `YearExtract`.
  169
+There are a couple of notable things going on. First, ``AbsoluteValueLessThan``
  170
+isn't calling ``process_lhs()``. Instead it skips the transformation of the
  171
+``lhs`` done by ``AbsoluteValue`` and uses the original ``lhs``. That is, we
  172
+want to get ``"experiments"."change"`` not ``ABS("experiments"."change")``.
+Referring directly to ``self.lhs.lhs`` is
  173
+safe as ``AbsoluteValueLessThan`` can be accessed only from the
  174
+``AbsoluteValue`` lookup, that is the ``lhs`` is always an instance of
  175
+``AbsoluteValue``.
154 176
 
155  
-Next, as both the lhs and rhs are used multiple times in the query the params
156  
-need to contain lhs_params and rhs_params multiple times.
  177
+Notice also that as both sides are used multiple times in the query the params
  178
+need to contain ``lhs_params`` and ``rhs_params`` multiple times.
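
The parameter ordering is easiest to check with a left-hand side that carries parameters of its own. The sketch below is plain Python under stated assumptions: ``abs_less_than_sql`` is a hypothetical helper, and the ``COALESCE`` left-hand side is invented only to give ``lhs_params`` a non-empty value.

```python
def abs_less_than_sql(lhs, lhs_params, rhs, rhs_params):
    """Build the range form of ABS(lhs) < rhs from already-compiled parts."""
    sql = '%s < %s AND %s > -%s' % (lhs, rhs, lhs, rhs)
    # lhs and rhs each appear twice in the SQL, so their parameters
    # must be repeated in exactly the same order.
    params = lhs_params + rhs_params + lhs_params + rhs_params
    return sql, params


# Plain column: no parameters on the left-hand side.
sql, params = abs_less_than_sql('"experiments"."change"', [], '%s', [27])
print(sql, params)

# Hypothetical left-hand side that brings its own parameter.
sql2, params2 = abs_less_than_sql(
    'COALESCE("experiments"."change", %s)', [0], '%s', [27])
print(sql2, params2)
```

In both cases the number of ``%s`` placeholders in the SQL equals the length of the parameter list, which is the invariant the repeated parameters preserve.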
157 179
 
158  
-The final query does string manipulation directly in the database. The reason
159  
-for doing this is that if the self.rhs is something else than a plain integer
160  
-value (for exampel a `F()` reference) we can't do the transformations in
161  
-Python.
  180
+The final query does the inversion (``27`` to ``-27``) directly in the
  181
+database. The reason for doing this is that if ``self.rhs`` is something other
  182
+than a plain integer value (for example an ``F()`` reference) we can't do the
  183
+transformations in Python.
  184
+
  185
+.. note::
  186
+    In fact, most lookups with ``__abs`` could be implemented as range queries
  187
+    like this, and on most database backends it is likely to be more sensible to
  188
+    do so as you can make use of the indexes. However with PostgreSQL you may
  189
+    want to add an index on ``abs(change)`` which would allow these queries to
  190
+    be very efficient.
162 191
 
163 192
 Writing alternative implementations for existing lookups
164 193
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
165 194
 
166 195
 Sometimes different database vendors require different SQL for the same
167 196
 operation. For this example we will rewrite a custom implementation for
168  
-MySQL for the NotEqual operator. Instead of `<>` we will be using `!=`
169  
-operator.
  197
+MySQL for the NotEqual operator. Instead of ``<>`` we will be using ``!=``
  198
+operator. (Note that in reality almost all databases support both, including
  199
+all the official databases supported by Django).
170 200
 
171  
-There are two ways to do this. The first is to write a subclass with a
172  
-as_mysql() method and registering the subclass over the original class::
  201
+We can change the behaviour on a specific backend by creating a subclass of
  202
+``NotEqual`` with an ``as_mysql`` method::
173 203
 
174 204
   class MySQLNotEqual(NotEqual):
175 205
       def as_mysql(self, qn, connection):
@@ -179,80 +209,92 @@ as_mysql() method and registering the subclass over the original class::
179 209
           return '%s != %s' % (lhs, rhs), params
180 210
   Field.register_lookup(MySQLNotEqual)
181 211
 
182  
-The alternate is to monkey-patch the existing class in place::
183  
-
184  
-  def as_mysql(self, qn, connection):
185  
-      lhs, lhs_params = self.process_lhs(qn, connection)
186  
-      rhs, rhs_params = self.process_rhs(qn, connection)
187  
-      params = lhs_params + rhs_params
188  
-      return '%s != %s' % (lhs, rhs), params
189  
-  NotEqual.as_mysql = as_mysql
190  
-
191  
-The subclass way allows one to override methods of the lookup if needed. The
192  
-monkey-patch way allows writing different implementations for the same class
193  
-in different locations of the project.
194  
-
195  
-The way Django knows to call as_mysql() instead of as_sql() is as follows.
196  
-When qn.compile(notequal_instance) is called, Django first checks if there
197  
-is a method named 'as_%s' % connection.vendor. If that method doesn't exist,
198  
-the as_sql() will be called.
199  
-
200  
-The vendor names for Django's in-built backends are 'sqlite', 'postgresql',
201  
-'oracle' and 'mysql'.
  212
+We can then register it with ``Field``. It takes the place of the original
  213
+``NotEqual`` class as it has the same ``lookup_name``.
202 214
 
203  
-The Lookup API
204  
-~~~~~~~~~~~~~~
  215
+When compiling a query, Django first looks for ``'as_%s' % connection.vendor``
  216
+methods, and then falls back to ``as_sql``. The vendor names for the in-built
  217
+backends are ``sqlite``, ``postgresql``, ``oracle`` and ``mysql``.
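
The dispatch rule can be sketched in a few lines of plain Python. This is an illustration only, not Django's compiler: ``compile_node`` and the two ``*Sketch`` classes are hypothetical, and the real methods take ``(qn, connection)`` arguments that are left out here.

```python
def compile_node(node, vendor):
    """Prefer a vendor-specific method, falling back to as_sql()."""
    vendor_impl = getattr(node, 'as_' + vendor, None)
    if vendor_impl is not None:
        return vendor_impl()
    return node.as_sql()


class NotEqualSketch:
    def as_sql(self):
        return '"author"."name" <> %s', ['Jack']


class MySQLNotEqualSketch(NotEqualSketch):
    def as_mysql(self):
        return '"author"."name" != %s', ['Jack']


node = MySQLNotEqualSketch()
print(compile_node(node, 'mysql'))   # vendor-specific method is used
print(compile_node(node, 'sqlite'))  # no as_sqlite, so as_sql is used
```

Because the subclass only adds ``as_mysql``, every other backend keeps the inherited ``as_sql`` behaviour.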
205 218
 
206  
-An lookup has attributes lhs and rhs. The lhs is something implementing the
207  
-query expression API and the rhs is either a plain value, or something that
208  
-needs to be compiled into SQL. Examples of SQL-compiled values include `F()`
209  
-references and usage of `QuerySets` as value.
  219
+.. note::
  220
+    If for some reason you need to change the lookup just for a specific query,
  221
+    you can do that and reregister the original lookup afterwards. However you
  222
+    need to be careful to ensure that your patch is in place until the queryset
  223
+    is evaluated, not just created.
210 224
 
211  
-A lookup needs to define lookup_name as a class level attribute. This is used
212  
-when registering lookups.
213  
-
214  
-A lookup has three public methods. The as_sql(qn, connection) method needs
215  
-to produce a query string and parameters used by the query string. The qn has
216  
-a method compile() which can be used to compile self.lhs. However usually it
217  
-is better to call self.process_lhs(qn, connection) instead, which returns
218  
-query string and parameters for the lhs. Similary process_rhs(qn, connection)
219  
-returns query string and parameters for the rhs.
  225
+.. _query-expression:
220 226
 
221 227
 The Query Expression API
222 228
 ~~~~~~~~~~~~~~~~~~~~~~~~
223 229
 
224 230
 A lookup can assume that the lhs responds to the query expression API.
225  
-Currently direct field references, aggregates and `Extract` instances respond
  231
+Currently direct field references, aggregates and ``Extract`` instances respond
226 232
 to this API.
227 233
 
228 234
 .. method:: as_sql(qn, connection)
229 235
 
230  
-Responsible for producing the query string and parameters for the expression.
231  
-The qn has a compile() method that can be used to compile other expressions.
232  
-The connection is the connection used to execute the query. The
233  
-connection.vendor attribute can be used to return different query strings
234  
-for different backends.
  236
+    Responsible for producing the query string and parameters for the
  237
+    expression. The ``qn`` has a ``compile()`` method that can be used to
  238
+    compile other expressions. The ``connection`` is the connection used to
  239
+    execute the query.
235 240
 
236  
-Calling expression.as_sql() directly is usually an error - instead
237  
-qn.compile(expression) should be used. The qn.compile() method will take
238  
-care of calling vendor-specific methods of the expression.
  241
+    Calling ``expression.as_sql()`` directly is usually incorrect - instead
  242
+    ``qn.compile(expression)`` should be used. The ``qn.compile()`` method will take
  243
+    care of calling vendor-specific methods of the expression.
239 244
 
240 245
 .. method:: as_vendorname(qn, connection)
241 246
 
242  
-Works like as_sql() method. When an expression is compiled by qn.compile()
243  
-Django will first try to call as_vendorname(), where vendorname is the vendor
244  
-name of the backend used for executing the query. The vendorname is one of
245  
-'postgresql', 'oracle', 'sqlite' or 'mysql' for Django's inbuilt backends.
  247
+    Works like ``as_sql()`` method. When an expression is compiled by
  248
+    ``qn.compile()``, Django will first try to call ``as_vendorname()``, where
  249
+    vendorname is the vendor name of the backend used for executing the query.
  250
+    The vendorname is one of ``postgresql``, ``oracle``, ``sqlite`` or
  251
+    ``mysql`` for Django's built-in backends.
246 252
 
247  
-.. method:: get_lookup(lookup_name)::
  253
+.. method:: get_lookup(lookup_name)
248 254
 
249  
-The get_lookup() method is used to fetch lookups. By default the lookup
250  
-is fetched from the expression's output type, but it is possible to override
251  
-this method to alter that behaviour.
  255
+    The ``get_lookup()`` method is used to fetch lookups. By default the lookup
  256
+    is fetched from the expression's output type, but it is possible to
  257
+    override this method to alter that behaviour.
252 258
 
253 259
 .. attribute:: output_type
254 260
 
255  
-The output_type attribute is used by the get_lookup() method to check for
256  
-lookups. The output_type should be a field instance.
  261
+    The ``output_type`` attribute is used by the ``get_lookup()`` method to check for
  262
+    lookups. The ``output_type`` should be a field.
257 263
 
258 264
 Note that this documentation lists only the public methods of the API.
  265
+
  266
+Lookup reference
  267
+~~~~~~~~~~~~~~~~
  268
+
  269
+.. class:: Lookup
  270
+
  271
+    In addition to the attributes and methods below, lookups also support
  272
+    ``as_sql`` and ``as_vendorname`` from the query expression API.
  273
+
  274
+.. attribute:: lhs
  275
+
  276
+    The ``lhs`` (left-hand side) of a lookup tells us what we are comparing the
  277
+    rhs to. It is an object which implements the query expression API. This is
  278
+    likely to be a field, an aggregate or a subclass of ``Extract``.
  279
+
  280
+.. attribute:: rhs
  281
+
  282
+    The ``rhs`` (right-hand side) of a lookup is the value we are comparing the
  283
+    left hand side to. It may be a plain value, or something which compiles
  284
+    into SQL, for example an ``F()`` object or a ``QuerySet``.
  285
+
  286
+.. attribute:: lookup_name
  287
+
  288
+    This class level attribute is used when registering lookups. It determines
  289
+    the name used in queries to trigger this lookup. For example, ``contains``
  290
+    or ``exact``. This should not contain the string ``__``.
  291
+
  292
+.. method:: process_lhs(qn, connection)
  293
+
  294
+    This returns a tuple of ``(lhs_string, lhs_params)``. In some cases you may
  295
+    wish to compile ``lhs`` directly in your ``as_sql`` methods using
  296
+    ``qn.compile(self.lhs)``.
  297
+
  298
+.. method:: process_rhs(qn, connection)
  299
+
  300
+    Behaves the same as ``process_lhs`` but acts on the right-hand side.
