Alternate model for field AND logic within MultiMatch query #2959
this looks interesting... I hope I can get to it soon!
hey @tarass I have been playing with similar ideas in many projects where you have enough knowledge / structure about your data to match Does this make sense?
It does make sense in concept, but I'll have to do some testing on how PositionLengthAttribute works. The delimiter case is something I was concerned about but didn't know how to fix. I'll get out an update in a few days.
Hi @tarass any update on this patch? We've run into the same problem, and I'd rather have a solution that is built into ES than hack together an ugly query on the client side. Let me know if I can help get this over the finish line.
Have really not had the time to finish the switch to PositionLengthAttribute. Since someone else needs it, I'll try and find the time.
Thanks @tarass that would be really great. I'd hoped to get to working on it myself this month, but I'm bogged down with other things at the moment.
+1
+1 Just ran into this problem at a client and I think the described assumption is very valid, especially as {multi_}match is propagated as an alternative to query_string, which can easily have this behaviour.
+1
for those of you that are interested I linked some WIP that I have ^^ and if anybody is up for some feedback that would be much appreciated
`cross_fields` attempts to treat fields with the same analysis configuration as a single field and uses either maximum score promotion or a combination of the scores, depending on the `use_dis_max` setting. By default scores are combined. Relates to elastic#2959
`cross_fields` attempts to treat fields with the same analysis configuration as a single field and uses either maximum score promotion or a combination of the scores, depending on the `use_dis_max` setting. By default scores are combined. `cross_fields` can also search across fields of heterogeneous types; for instance, if numbers can be part of the query, it makes sense to also search numeric fields if an analyzer is provided in the request. Relates to elastic#2959
`cross_fields` attempts to treat fields with the same analysis configuration as a single field and uses either maximum score promotion or a combination of the scores, depending on the `use_dis_max` setting. By default scores are combined. `cross_fields` can also search across fields of heterogeneous types; for instance, if numbers can be part of the query, it makes sense to also search numeric fields if an analyzer is provided in the request. Relates to #2959
I am closing this since
I wrote a patch to the MultiMatch query that provides more natural AND processing when considering multiple fields.
Consider a document with these fields:
title: Something
description: featured on their 1969 album Abbey Road
author: Beatles
Now if I take a user's input and run a query to match my documents, it would be natural to consider either the dreaded _all field or a multi_match query like:
multi_match:{"query":"Something Beatles", "fields":["title", "description", "author"], "operator":"and"}
Which would get transformed into a boolean query such as:
(+title:something +title:beatles) (+description:something +description:beatles) (+author:something +author:beatles)
There is no match for our document, because no single field contains both terms! From the perspective of human input, the most natural way to AND a multi-field search is often to require that each term is matched somewhere across all fields, such as:
+(title:something description:something author:something) +(title:beatles description:beatles author:beatles)
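To make the difference between the two rewrites concrete, here is a minimal Python sketch that only builds the two Lucene-style query strings shown above. The helper names `per_field_and` and `per_term_and` are hypothetical and chosen for illustration; this is not the patch itself, and it assumes the terms have already been analyzed.

```python
def per_field_and(terms, fields):
    """Existing behaviour: all terms must match inside the same field."""
    groups = []
    for field in fields:
        clauses = " ".join(f"+{field}:{term}" for term in terms)
        groups.append(f"({clauses})")
    return " ".join(groups)


def per_term_and(terms, fields):
    """Proposed behaviour: each term must match in at least one field."""
    groups = []
    for term in terms:
        clauses = " ".join(f"{field}:{term}" for field in fields)
        groups.append(f"+({clauses})")
    return " ".join(groups)


terms = ["something", "beatles"]  # assumed to be already analyzed
fields = ["title", "description", "author"]

print(per_field_and(terms, fields))
print(per_term_and(terms, fields))
```

The first form groups by field and requires every term inside each group; the second groups by term and requires every group, which is why the example document matches only the second.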
My patch does exactly that, and it also accounts for the use of multiple analyzers, which may remove tokens from some fields (e.g. "The Beatles", where a stop-word filter drops "The"). If a token is skipped by an analyzer, it is turned into a should requirement on the remaining fields instead of a must.
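The must/should downgrade can be sketched roughly as follows. This is a toy Python illustration, not the patch: the per-field "analyzers" are hypothetical stand-in functions that return `None` when they drop a token, and the choice to omit a term that every analyzer drops is an assumption the description above does not cover.

```python
STOPWORDS = {"the", "a", "an"}  # toy stop-word list, assumed for illustration


def keep_all(token):
    """Stand-in analyzer that lowercases and keeps every token."""
    return token.lower()


def drop_stopwords(token):
    """Stand-in analyzer that lowercases and drops stop words."""
    lowered = token.lower()
    return None if lowered in STOPWORDS else lowered


def across_query(text, analyzers):
    """Build one clause per input term across all fields.

    A term kept by every field's analyzer becomes a must (+) clause.
    A term dropped by at least one analyzer becomes a should clause
    on the fields that kept it. A term dropped everywhere is omitted
    (an assumption of this sketch).
    """
    clauses = []
    for raw in text.split():
        kept = []
        for field, analyze in analyzers.items():
            token = analyze(raw)
            if token is not None:
                kept.append(f"{field}:{token}")
        if not kept:
            continue
        prefix = "+" if len(kept) == len(analyzers) else ""
        clauses.append(f"{prefix}({' '.join(kept)})")
    return " ".join(clauses)


analyzers = {
    "title": drop_stopwords,
    "description": drop_stopwords,
    "author": keep_all,
}
print(across_query("The Beatles", analyzers))
# prints: (author:the) +(title:beatles description:beatles author:beatles)
```

Here "the" survives only in `author`, so it can influence scoring without excluding documents, while "beatles" stays mandatory.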
I am using the match query's facilities for minimum should match as well as fuzzy processing, so a new match type felt natural:
multi_match:{"query":"Something Beatles", "fields":["title", "description", "author"], "type":"across"}