Add new 'exact' DSL query #92351
base: main
Changes from all commits
9fef06e
4486065
f0c0af7
52b2931
0974c3d
ae22768
ad7fe8b
4a3bf90
6d0c8e0
b09b770
f0b0c8c
c3b6b26
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
pr: 92351 | ||
summary: Add new 'exact' DSL query | ||
area: Search | ||
type: feature | ||
issues: [] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
[[query-dsl-exact-query]] | ||
=== Exact query | ||
++++ | ||
<titleabbrev>Exact</titleabbrev> | ||
++++ | ||
|
||
Returns documents that contain an exact value for a field. | ||
|
||
For most field types, this behaves the same as a <<query-dsl-term-query, term query>>. For | ||
analyzed fields such as <<text-field-type, text>> or <<keyword-field-type, keyword>> | ||
fields with a normalizer configured, it will only return documents where | ||
the query value exactly matches the entire content of the field in the source. | ||
|
||
[[exact-query-ex-request]] | ||
==== Example request | ||
|
||
[source,console] | ||
---- | ||
GET /_search | ||
{ | ||
"query": { | ||
"exact": { | ||
"text": "this is some text" | ||
} | ||
} | ||
} | ||
---- | ||
|
||
This will return documents that have precisely `this is some text` in their `text` field, | ||
but it will not return documents with values of `some text`, `this is some text again`, or | ||
`This is some Text`. | ||
|
||
[[exact-query-top-level-params]] | ||
==== Top-level parameters for `exact` | ||
`field`:: | ||
(Required, string) Name of the field you wish to search. | ||
|
||
[[exact-query-notes]] | ||
==== Notes | ||
|
||
[[exact-query-notes-multivalued]] | ||
===== Multi-valued fields | ||
|
||
If a document has multiple values for a field, then `exact` matches the document if the | ||
query value exactly matches any one of those values. |
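The difference between `term` and `exact` on a normalized field, as described in the documentation above, can be sketched with a toy model. This is not Elasticsearch code: the class `ExactSemanticsSketch`, its methods, and the use of `toLowerCase` as a stand-in for a lowercase normalizer are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Locale;

/** Toy model of the documented semantics; hypothetical names, not Elasticsearch code. */
final class ExactSemanticsSketch {

    /** Stand-in for a lowercase normalizer (an assumption for illustration). */
    static String normalize(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    /** term-style match: both sides are normalized, so case differences are ignored. */
    static boolean termMatches(List<String> sourceValues, String queryValue) {
        String normalizedQuery = normalize(queryValue);
        return sourceValues.stream().anyMatch(v -> normalize(v).equals(normalizedQuery));
    }

    /** exact-style match: the query must equal one source value verbatim. */
    static boolean exactMatches(List<String> sourceValues, String queryValue) {
        return sourceValues.stream().anyMatch(v -> v.equals(queryValue));
    }

    public static void main(String[] args) {
        List<String> stored = List.of("HELLO");
        System.out.println(termMatches(stored, "hello"));   // true
        System.out.println(exactMatches(stored, "hello"));  // false
        // Multi-valued field: a match on any one value is enough.
        System.out.println(exactMatches(List.of("foo", "bar"), "bar")); // true
    }
}
```

The model compares whole source values only, which is why a partial value such as `some text` would not match a stored `this is some text`.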
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -120,6 +120,11 @@ protected Object getSampleValueForDocument() { | |
return "new york city"; | ||
} | ||
|
||
@Override | ||
protected boolean supportsExactQuery() { | ||
return false; // TODO: support this? Needs fielddata script access | ||
Review comment: Before adding this and other TODOs that are difficult to remove without more context in the future, should we collect reasons for potential support of this and other field types in a follow up issue? To me it gives a better overview, has more space for context than a short TODO and is more visible in the backlog. |
||
} | ||
|
||
@Override | ||
protected Collection<? extends Plugin> getPlugins() { | ||
return List.of(new MapperExtrasPlugin()); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
setup: | ||
- skip: | ||
version: " - 8.7.0" | ||
reason: exact query introduced in 8.7 | ||
|
||
- do: | ||
indices.create: | ||
index: exact_query_test | ||
body: | ||
settings: | ||
analysis: | ||
normalizer: | ||
lowercase: | ||
type: custom | ||
filter: lowercase | ||
mappings: | ||
properties: | ||
text: | ||
type: text | ||
Review comment: I'm wondering if it would also make sense to add a test with a text field with a non-standard analyzer, e.g. something that changes case here. |
||
keyword: | ||
type: keyword | ||
nkeyword: | ||
type: keyword | ||
normalizer: lowercase | ||
|
||
- do: | ||
bulk: | ||
refresh: true | ||
body: | ||
- '{ "index" : { "_index" : "exact_query_test", "_id" : "1" } }' | ||
- '{ "text" : "here is some text", "keyword" : ["foo", "bar"], "nkeyword" : "hello" }' | ||
- '{ "index" : { "_index" : "exact_query_test", "_id" : "2" } }' | ||
- '{ "text" : "here is some different text", "keyword" : ["foo", "baz"], "nkeyword" : "HELLO" }' | ||
- '{ "index" : { "_index" : "exact_query_test", "_id" : "3" } }' | ||
- '{ "text" : "there is some text", "keyword" : "bar", "nkeyword" : "HELLO" }' | ||
- '{ "index" : { "_index" : "exact_query_test", "_id" : "4" } }' | ||
- '{ "text" : "here is some text", "keyword" : ["foo", "bar"], "nkeyword" : "hello" }' | ||
- '{ "index" : { "_index" : "exact_query_test", "_id" : "5" } }' | ||
- '{ "text" : "text", "keyword" : "foo" }' | ||
- '{ "index" : { "_index" : "exact_query_test", "_id" : "6" } }' | ||
- '{ "text" : "here", "keyword" : ["foo", "bar"], "nkeyword" : "hello" }' | ||
- '{ "index" : { "_index" : "exact_query_test", "_id" : "7" } }' | ||
- '{ "text" : "there is some text", "keyword" : ["foo", "bar"] }' | ||
- '{ "index" : { "_index" : "exact_query_test", "_id" : "8" } }' | ||
- '{ "text" : "here is some text", "keyword" : ["baz", "bar"], "nkeyword" : "Hello" }' | ||
|
||
--- | ||
"test exact query on text fields": | ||
- do: | ||
search: | ||
index: exact_query_test | ||
body: | ||
query: | ||
exact: | ||
text: "here is some text" | ||
|
||
- match: { hits.total: 3 } | ||
|
||
- do: | ||
search: | ||
index: exact_query_test | ||
body: | ||
query: | ||
exact: | ||
text: "here" | ||
|
||
- match: { hits.total: 1 } | ||
|
||
--- | ||
"test exact query on keyword fields": | ||
- do: | ||
search: | ||
index: exact_query_test | ||
body: | ||
query: | ||
exact: | ||
keyword: baz | ||
|
||
- match: { hits.total: 2 } | ||
|
||
--- | ||
"test exact query on normalized keyword fields": | ||
- do: | ||
search: | ||
index: exact_query_test | ||
body: | ||
query: | ||
exact: | ||
nkeyword: hello | ||
|
||
- match: { hits.total: 3 } | ||
|
||
- do: | ||
search: | ||
index: exact_query_test | ||
body: | ||
query: | ||
exact: | ||
nkeyword: HELLO | ||
|
||
- match: { hits.total: 2 } |
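The expected hit counts in the YAML test above can be cross-checked against the bulk fixture with a small in-memory sketch. It assumes that `exact` reduces to verbatim string equality against any source value, as the documentation in this PR describes. The class `ExactQueryFixtureSketch` and its helpers are hypothetical and do not touch Elasticsearch.

```java
import java.util.List;

/** Replays the eight fixture documents in memory; hypothetical names, not Elasticsearch code. */
final class ExactQueryFixtureSketch {

    static final class Doc {
        final String text;
        final List<String> keyword;
        final String nkeyword; // null when the document has no nkeyword value

        Doc(String text, List<String> keyword, String nkeyword) {
            this.text = text;
            this.keyword = keyword;
            this.nkeyword = nkeyword;
        }
    }

    // Documents 1 to 8, copied from the bulk request in the test setup.
    static final List<Doc> DOCS = List.of(
        new Doc("here is some text", List.of("foo", "bar"), "hello"),
        new Doc("here is some different text", List.of("foo", "baz"), "HELLO"),
        new Doc("there is some text", List.of("bar"), "HELLO"),
        new Doc("here is some text", List.of("foo", "bar"), "hello"),
        new Doc("text", List.of("foo"), null),
        new Doc("here", List.of("foo", "bar"), "hello"),
        new Doc("there is some text", List.of("foo", "bar"), null),
        new Doc("here is some text", List.of("baz", "bar"), "Hello")
    );

    static long countText(String query) {
        return DOCS.stream().filter(d -> d.text.equals(query)).count();
    }

    static long countKeyword(String query) {
        return DOCS.stream().filter(d -> d.keyword.contains(query)).count();
    }

    static long countNkeyword(String query) {
        return DOCS.stream().filter(d -> query.equals(d.nkeyword)).count();
    }

    public static void main(String[] args) {
        System.out.println(countText("here is some text")); // 3
        System.out.println(countText("here"));              // 1
        System.out.println(countKeyword("baz"));            // 2
        System.out.println(countNkeyword("hello"));         // 3
        System.out.println(countNkeyword("HELLO"));         // 2
    }
}
```

Under this model the counts agree with the `hits.total` assertions in the test. In particular, document 8 (`"Hello"`) matches neither `hello` nor `HELLO`.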
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -430,6 +430,14 @@ public Query termQuery(Object value, SearchExecutionContext context) { | |
} | ||
} | ||
|
||
@Override | ||
public Query exactQuery(Object value, SearchExecutionContext context) { | ||
if (normalizer == Lucene.KEYWORD_ANALYZER) { | ||
return super.exactQuery(value, context); | ||
} | ||
return new TextFieldExactQuery(this, context.getForField(this, FielddataOperation.SOURCE), value.toString()); | ||
} | ||
|
||
@Override | ||
public Query termsQuery(Collection<?> values, SearchExecutionContext context) { | ||
failIfNotIndexedNorDocValuesFallback(context); | ||
|
@@ -703,11 +711,8 @@ public IndexFieldData.Builder fielddataBuilder(FieldDataContext fieldDataContext | |
failIfNoDocValues(); | ||
return fieldDataFromDocValues(); | ||
} | ||
if (operation != FielddataOperation.SCRIPT) { | ||
throw new IllegalStateException("unknown operation [" + operation.name() + "]"); | ||
} | ||
|
||
if (hasDocValues()) { | ||
if (operation != FielddataOperation.SOURCE && hasDocValues()) { | ||
Review comment: If I read this correctly, for operation == FielddataOperation.SEARCH this falls through until here now, formerly it would have raised an exception. Is this intended? |
||
return fieldDataFromDocValues(); | ||
} | ||
if (isSyntheticSource) { | ||
|
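The branch order in the `fielddataBuilder` hunk above can be modelled in isolation to see which operations reach the changed line. This is a sketch under stated assumptions, not the actual method. It assumes the first visible block (`failIfNoDocValues(); return fieldDataFromDocValues();`) is guarded by `operation == FielddataOperation.SEARCH`, which the hunk does not show. It also collapses everything after the `isSyntheticSource` check into a single `"source"` result. The class `FielddataBranchSketch` and the string return values are hypothetical.

```java
/** Toy model of the branch order; hypothetical names, not the real KeywordFieldMapper code. */
final class FielddataBranchSketch {

    enum FielddataOperation { SEARCH, SCRIPT, SOURCE }

    /** Branch order before the change, as far as the hunk shows it. */
    static String before(FielddataOperation operation, boolean hasDocValues) {
        if (operation == FielddataOperation.SEARCH) { // assumed guard, not visible in the hunk
            if (!hasDocValues) {
                throw new IllegalArgumentException("no doc values"); // stand-in for failIfNoDocValues()
            }
            return "doc_values";
        }
        if (operation != FielddataOperation.SCRIPT) {
            throw new IllegalStateException("unknown operation [" + operation.name() + "]");
        }
        return hasDocValues ? "doc_values" : "source";
    }

    /** Branch order after the change: SOURCE skips doc values even when they exist. */
    static String after(FielddataOperation operation, boolean hasDocValues) {
        if (operation == FielddataOperation.SEARCH) { // same assumed guard
            if (!hasDocValues) {
                throw new IllegalArgumentException("no doc values");
            }
            return "doc_values";
        }
        if (operation != FielddataOperation.SOURCE && hasDocValues) {
            return "doc_values";
        }
        return "source"; // stand-in for the synthetic-source and source-based paths
    }

    public static void main(String[] args) {
        for (FielddataOperation op : FielddataOperation.values()) {
            System.out.println(op + " -> " + after(op, true));
        }
    }
}
```

If the guard is as assumed, `SEARCH` returns before the changed condition, and the edit only alters what `SCRIPT` and `SOURCE` do.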
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -85,8 +85,12 @@ public MappedFieldType( | |
* field data from and generate a representation of doc values. | ||
*/ | ||
public enum FielddataOperation { | ||
// Fielddata to be used as part of a search or aggregation | ||
SEARCH, | ||
SCRIPT | ||
// Fielddata to be used as part of a script | ||
SCRIPT, | ||
// Fielddata that must be read from source | ||
SOURCE | ||
Review comment: For my understanding: I see this constant set in FieldDataContext three times but I cannot find a spot where the value is actually used to execute - say another code path - than the existing ones. Is it solely here to mark field data usage other than SCRIPT or SEARCH? |
||
} | ||
|
||
/** | ||
|
@@ -214,6 +218,13 @@ public Query termQueryCaseInsensitive(Object value, @Nullable SearchExecutionCon | |
); | ||
} | ||
|
||
/** | ||
* Generates a query that will only match documents with a field that contains exactly this value | ||
*/ | ||
public Query exactQuery(Object value, SearchExecutionContext context) { | ||
Review comment: Add javadocs? |
||
return new ConstantScoreQuery(termQuery(value, context)); | ||
} | ||
|
||
/** Build a constant-scoring query that matches all values. The default implementation uses a | ||
* {@link ConstantScoreQuery} around a {@link BooleanQuery} whose {@link Occur#SHOULD} clauses | ||
* are generated with {@link #termQuery}. */ | ||
|
Review comment: nit: maybe use UnsupportedOperationException here and all other similar implementations?