forked from elastic/elasticsearch
Docs: Rewrote the term query docs to explain analyzed vs not_analyzed
1 parent 748a040, commit a536bd5
Showing 1 changed file with 146 additions and 11 deletions.
@@ -1,31 +1,166 @@
 [[query-dsl-term-query]]
 == Term Query
 
-Matches documents that have fields that contain a term (*not analyzed*).
-The term query maps to Lucene `TermQuery`. The following matches
-documents where the user field contains the term `kimchy`:
+The `term` query finds documents that contain the *exact* term specified
+in the inverted index. For instance:
 
 [source,js]
 --------------------------------------------------
 {
-    "term" : { "user" : "kimchy" }
-}
+    "term" : { "user" : "Kimchy" } <1>
+}
 --------------------------------------------------
+<1> Finds documents which contain the exact term `Kimchy` in the inverted index
+    of the `user` field.
 
-A boost can also be associated with the query:
+A `boost` parameter can be specified to give this `term` query a higher
+relevance score than another query, for instance:
 
 [source,js]
 --------------------------------------------------
+GET /_search
 {
-    "term" : { "user" : { "value" : "kimchy", "boost" : 2.0 } }
-}
+  "query": {
+    "bool": {
+      "should": [
+        {
+          "term": {
+            "status": {
+              "value": "urgent",
+              "boost": 2.0 <1>
+            }
+          }
+        },
+        {
+          "term": {
+            "status": "normal" <2>
+          }
+        }
+      ]
+    }
+  }
+}
 --------------------------------------------------
 
-Or :
+<1> The `urgent` query clause has a boost of `2.0`, meaning it is twice as important
+    as the query clause for `normal`.
+<2> The `normal` clause has the default neutral boost of `1.0`.
 
+.Why doesn't the `term` query match my document?
+**************************************************
+
+String fields can be `analyzed` (treated as full text, like the body of an
+email), or `not_analyzed` (treated as exact values, like an email address or a
+zip code). Exact values (like numbers, dates, and `not_analyzed` strings) have
+the exact value specified in the field added to the inverted index in order
+to make them searchable.
+
+By default, however, `string` fields are `analyzed`. This means that their
+values are first passed through an <<analysis,analyzer>> to produce a list of
+terms, which are then added to the inverted index.
+
+There are many ways to analyze text: the default
+<<analysis-standard-analyzer,`standard` analyzer>> drops most punctuation,
+breaks up text into individual words, and lower cases them. For instance,
+the `standard` analyzer would turn the string ``Quick Brown Fox!'' into the
+terms [`quick`, `brown`, `fox`].
+
+This analysis process makes it possible to search for individual words
+within a big block of full text.
+
+The `term` query looks for the *exact* term in the field's inverted index --
+it doesn't know anything about the field's analyzer. This makes it useful for
+looking up values in `not_analyzed` string fields, or in numeric or date
+fields. When querying full text fields, use the
+<<query-dsl-match-query,`match` query>> instead, which understands how the field
+has been analyzed.
+
+To demonstrate, try out the example below. First, create an index, specifying the field mappings, and index a document:
+
 [source,js]
 --------------------------------------------------
+PUT my_index
+{
+  "mappings": {
+    "my_type": {
+      "properties": {
+        "full_text": {
+          "type": "string" <1>
+        },
+        "exact_value": {
+          "type": "string",
+          "index": "not_analyzed" <2>
+        }
+      }
+    }
+  }
+}
+
+PUT my_index/my_type/1
 {
-    "term" : { "user" : { "term" : "kimchy", "boost" : 2.0 } }
-}
+  "full_text": "Quick Foxes!", <3>
+  "exact_value": "Quick Foxes!" <4>
+}
 --------------------------------------------------
+// AUTOSENSE
+
+<1> The `full_text` field is `analyzed` by default.
+<2> The `exact_value` field is set to be `not_analyzed`.
+<3> The `full_text` inverted index will contain the terms: [`quick`, `foxes`].
+<4> The `exact_value` inverted index will contain the exact term: [`Quick Foxes!`].
+
+Now, compare the results for the `term` query and the `match` query:
+
+[source,js]
+--------------------------------------------------
+GET my_index/my_type/_search
+{
+  "query": {
+    "term": {
+      "exact_value": "Quick Foxes!" <1>
+    }
+  }
+}
+
+GET my_index/my_type/_search
+{
+  "query": {
+    "term": {
+      "full_text": "Quick Foxes!" <2>
+    }
+  }
+}
+
+GET my_index/my_type/_search
+{
+  "query": {
+    "term": {
+      "full_text": "foxes" <3>
+    }
+  }
+}
+
+GET my_index/my_type/_search
+{
+  "query": {
+    "match": {
+      "full_text": "Quick Foxes!" <4>
+    }
+  }
+}
+--------------------------------------------------
+// AUTOSENSE
+
+<1> This query matches because the `exact_value` field contains the exact
+    term `Quick Foxes!`.
+<2> This query does not match, because the `full_text` field only contains
+    the terms `quick` and `foxes`. It does not contain the exact term
+    `Quick Foxes!`.
+<3> A `term` query for the term `foxes` matches the `full_text` field.
+<4> This `match` query on the `full_text` field first analyzes the query string,
+    then looks for documents containing `quick` or `foxes` or both.
+**************************************************
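
The sidebar's reasoning can also be traced without a cluster. The following is a minimal Python sketch, not Elasticsearch or Lucene code: `toy_standard_analyze` is a hypothetical stand-in that only approximates the `standard` analyzer (split on non-alphanumeric characters, then lowercase), and `build_index`, `term_query`, and `match_query` are illustrative names invented for this example.

```python
import re
from collections import defaultdict


def toy_standard_analyze(text):
    """Rough approximation of the `standard` analyzer (assumption, not the
    real implementation): keep alphanumeric runs and lowercase them."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def build_index(docs, analyzed):
    """Build a term -> set(doc ids) inverted index for one field.

    analyzed=True  mimics an `analyzed` string field.
    analyzed=False mimics a `not_analyzed` field: the whole value is one term.
    """
    index = defaultdict(set)
    for doc_id, value in docs.items():
        terms = toy_standard_analyze(value) if analyzed else [value]
        for term in terms:
            index[term].add(doc_id)
    return index


def term_query(index, term):
    """Look up the exact term; the query string is never analyzed."""
    return set(index.get(term, set()))


def match_query(index, text):
    """Analyze the query string first, then OR the per-term lookups."""
    hits = set()
    for term in toy_standard_analyze(text):
        hits |= term_query(index, term)
    return hits


docs = {1: "Quick Foxes!"}
full_text = build_index(docs, analyzed=True)     # terms: quick, foxes
exact_value = build_index(docs, analyzed=False)  # term: "Quick Foxes!"

print(term_query(exact_value, "Quick Foxes!"))  # exact term is present
print(term_query(full_text, "Quick Foxes!"))    # no such term in this index
print(term_query(full_text, "foxes"))           # single analyzed term is present
print(match_query(full_text, "Quick Foxes!"))   # query string analyzed first
```

The four lookups mirror the four queries in the sidebar: only the `term` lookup for the unanalyzed string against the analyzed index comes back empty, because that index holds `quick` and `foxes` rather than `Quick Foxes!`.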