Docs: Rewrote the term query docs to explain analyzed vs not_analyzed
clintongormley committed May 8, 2015
1 parent 748a040 commit a536bd5
Showing 1 changed file with 146 additions and 11 deletions.
157 changes: 146 additions & 11 deletions docs/reference/query-dsl/term-query.asciidoc
@@ -1,31 +1,166 @@
[[query-dsl-term-query]]
== Term Query

-Matches documents that have fields that contain a term (*not analyzed*).
-The term query maps to Lucene `TermQuery`. The following matches
-documents where the user field contains the term `kimchy`:
The `term` query finds documents that contain the *exact* term specified
in the inverted index. For instance:

[source,js]
--------------------------------------------------
{
-"term" : { "user" : "kimchy" }
-}
  "term" : { "user" : "Kimchy" } <1>
}
--------------------------------------------------
<1> Finds documents which contain the exact term `Kimchy` in the inverted index
of the `user` field.

-A boost can also be associated with the query:
A `boost` parameter can be specified to give this `term` query a higher
relevance score than another query, for instance:

[source,js]
--------------------------------------------------
GET /_search
{
-"term" : { "user" : { "value" : "kimchy", "boost" : 2.0 } }
-}
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "status": {
              "value": "urgent",
              "boost": 2.0 <1>
            }
          }
        },
        {
          "term": {
            "status": "normal" <2>
          }
        }
      ]
    }
  }
}
--------------------------------------------------

-Or :
<1> The `urgent` query clause has a boost of `2.0`, meaning it is twice as important
as the query clause for `normal`.
<2> The `normal` clause has the default neutral boost of `1.0`.

.Why doesn't the `term` query match my document?
**************************************************
String fields can be `analyzed` (treated as full text, like the body of an
email), or `not_analyzed` (treated as exact values, like an email address or a
zip code). Exact values (like numbers, dates, and `not_analyzed` strings) have
the exact value specified in the field added to the inverted index in order
to make them searchable.

By default, however, `string` fields are `analyzed`. This means that their
values are first passed through an <<analysis,analyzer>> to produce a list of
terms, which are then added to the inverted index.

There are many ways to analyze text: the default
<<analysis-standard-analyzer,`standard` analyzer>> drops most punctuation,
breaks up text into individual words, and lower cases them. For instance,
the `standard` analyzer would turn the string ``Quick Brown Fox!'' into the
terms [`quick`, `brown`, `fox`].

This analysis process makes it possible to search for individual words
within a big block of full text.

The `term` query looks for the *exact* term in the field's inverted index --
it doesn't know anything about the field's analyzer. This makes it useful for
looking up values in `not_analyzed` string fields, or in numeric or date
fields. When querying full text fields, use the
<<query-dsl-match-query,`match` query>> instead, which understands how the field
has been analyzed.

To demonstrate, try out the example below. First, create an index, specifying the field mappings, and index a document:

[source,js]
--------------------------------------------------
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "full_text": {
          "type": "string" <1>
        },
        "exact_value": {
          "type": "string",
          "index": "not_analyzed" <2>
        }
      }
    }
  }
}

PUT my_index/my_type/1
{
-"term" : { "user" : { "term" : "kimchy", "boost" : 2.0 } }
-}
  "full_text": "Quick Foxes!", <3>
  "exact_value": "Quick Foxes!" <4>
}
--------------------------------------------------
// AUTOSENSE
<1> The `full_text` field is `analyzed` by default.
<2> The `exact_value` field is set to be `not_analyzed`.
<3> The `full_text` inverted index will contain the terms: [`quick`, `foxes`].
<4> The `exact_value` inverted index will contain the exact term: [`Quick Foxes!`].

Now, compare the results for the `term` query and the `match` query:

[source,js]
--------------------------------------------------
GET my_index/my_type/_search
{
  "query": {
    "term": {
      "exact_value": "Quick Foxes!" <1>
    }
  }
}

GET my_index/my_type/_search
{
  "query": {
    "term": {
      "full_text": "Quick Foxes!" <2>
    }
  }
}

GET my_index/my_type/_search
{
  "query": {
    "term": {
      "full_text": "foxes" <3>
    }
  }
}

GET my_index/my_type/_search
{
  "query": {
    "match": {
      "full_text": "Quick Foxes!" <4>
    }
  }
}
--------------------------------------------------
// AUTOSENSE
<1> This query matches because the `exact_value` field contains the exact
term `Quick Foxes!`.
<2> This query does not match, because the `full_text` field only contains
the terms `quick` and `foxes`. It does not contain the exact term
`Quick Foxes!`.
<3> A `term` query for the term `foxes` matches the `full_text` field.
<4> This `match` query on the `full_text` field first analyzes the query string,
then looks for documents containing `quick` or `foxes` or both.
**************************************************
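
The behaviour described in the sidebar can be mimicked with a toy model of an
inverted index. The Python below is only an illustrative sketch, not how
Elasticsearch or Lucene is implemented: `standard_analyze` is a hypothetical,
much simplified stand-in for the `standard` analyzer (it splits on anything
that is not an ASCII letter or digit and lower cases the result), and
`index_terms`, `term_query` and `match_query` are made-up helper names. The
`match_query` helper assumes an `analyzed` field.

[source,python]
--------------------------------------------------
import re


def standard_analyze(text):
    """Simplified stand-in for the `standard` analyzer (an assumption):
    split on non-alphanumeric characters and lower case each token."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def index_terms(value, analyzed):
    """Return the set of terms stored in the toy inverted index for a value."""
    if analyzed:
        return set(standard_analyze(value))
    # not_analyzed: the whole value is stored as a single exact term.
    return {value}


def term_query(terms, term):
    """Exact lookup: the query term is used as-is, with no analysis."""
    return term in terms


def match_query(terms, text):
    """Analyze the query text first, then match if any resulting term exists."""
    return any(token in terms for token in standard_analyze(text))


# Index the same value twice, mirroring the two field mappings above.
full_text = index_terms("Quick Foxes!", analyzed=True)
exact_value = index_terms("Quick Foxes!", analyzed=False)

# The four searches from the example, in the same order.
results = [
    term_query(exact_value, "Quick Foxes!"),
    term_query(full_text, "Quick Foxes!"),
    term_query(full_text, "foxes"),
    match_query(full_text, "Quick Foxes!"),
]
print(results)
--------------------------------------------------

In this model the only difference between the two query types is whether the
query input is analyzed before the lookup, which is the distinction the sidebar
draws between `term` and `match`.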

