New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added cross_fields
type to multi_match query
#5005
Conversation
@@ -36,15 +36,35 @@ individual clauses inside the top level query. | |||
|
|||
* `fields` - Fields to be used in the query. | |||
* `use_dis_max` - Boolean indicating to either create a `dis_max` query | |||
or a `bool` query. Defaults to `true`. | |||
or a `bool` query. Defaults to `true`. (Deprecated since coming[1.1.0] use type `best_fields` or `most_fields` instead) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defaults to true
. deprecated[1.1.0,Use type best_fields
or most_fields
instead]
I merged in the changes from clinton and updated some minor stuff in the docs. I also added some more test randomization. I think it's ready |
+1, so many people on the mailing list have been hit by this "misunderstanding" of the multi-match query. The doc updates in this PR alone will help considerably! I love how you group the fields by analyzer without requiring the user to do it by hand (not that it is hard). Is is going to be in 1.0? I only see a 1.1 tag. |
What is the expected behaviour when (stupidly) querying across a combo of number and text fields? Currently get a NumberFormatException from parser on text search terms. Might user expect e.g. "37 the high street" to work on house number (numeric) and street name (text) fields without error? |
@markharwood if the user specifies an analyzer for the query, then i would have expected this to work, but apparently it doesn't.
returns an explanation of:
But it doesn't:
|
@clintongormley I've tried specifying 'standard' analyzer and it looks like it doesn't get used because the mapping for integer fields doesn't want to use any analyzer on the input (due to IntegerFieldMapper.useTermQueryWithQueryString returns true). So the query string does not get tokenized for this field and IntegerFieldMapper just barfs on the full string "37 the high st" as an invalid number. |
@clintongormley @markharwood I just pushed a fix for the numeric problem your test should work now. I will pull this in if nobody objects. |
|
||
BlendedTermQuery that = (BlendedTermQuery) o; | ||
|
||
if (!Arrays.equals(terms, that.terms)) return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should ignore the order of the terms here?
@jpountz I fixed the last two comments and squashed after rebase to master. Can you take a last look? |
@mattweber |
+1! |
@s1monw Just tried this again and was still getting java.lang.NumberFormatException: For input string: "song 2"
This is on the musicbrainz dataset I shared where field "Album" is type string and "Ignore2" is type integer. |
@markharwood can you confirm that last commit fixed your issues? |
Following latest changes I can confirm that #5005 (comment) no longer throws exceptions and the logic for checking numeric fields works in my test |
`cross_fields` attemps to treat fields with the same analysis configuration as a single field and uses maximum score promotion or combination of the scores based depending on the `use_dis_max` setting. By default scores are combined. `cross_fields` can also search across fields of hetrogenous types for instance if numbers can be part of the query it makes sense to search also on numeric fields if an analyzer is provided in the reqeust. Relates to elastic#2959
pushed to 1.x and master |
This fixes a regression introduced by elastic#5005 where the query slop was simply ignored when a `match_phrase_prefix` type was set. Closes elastic#5437
cross_fields
attempts to treat fields with the same analysisconfiguration as a single field and uses maximum score promotion or
combination of the scores based depending on the
use_dis_max
setting.By default scores are combined.
Relates to #2959