CommonTermsQuery should support querying fields other than its source of commonness #5024

Closed
nik9000 opened this issue Feb 5, 2014 · 2 comments

Comments

@nik9000
Member

nik9000 commented Feb 5, 2014

I'd like to be able to do a CommonTermsQuery against multiple fields. One field would determine if the term is common and the others would be ORed together. Something like:

{ "common": {
  "body": {
    "high_freq_operator": "and",
    "fields": ["title", "body"],
    "query": "nelly the elephant not as a cartoon"
  }
} }

would become something like:

{ "bool": {
  "must": {
    "bool": {
      "must": [
        { "bool": {
          "should": [{ "term": { "title": "nelly"}}, { "term": { "body": "nelly"}}],
          "minimum_should_match": 1
        } },
        { "bool": {
          "should": [{ "term": { "title": "elephant"}}, { "term": { "body": "elephant"}}],
          "minimum_should_match": 1
        } },
        { "bool": {
          "should": [{ "term": { "title": "cartoon"}}, { "term": { "body": "cartoon"}}],
          "minimum_should_match": 1
        } }
      ]
    }
  },
  "should": {
    "bool": {
      "should": [
        { "bool": {
          "should": [{ "term": { "title": "the"}}, { "term": { "body": "the"}}],
          "minimum_should_match": 1
        } },
        { "bool": {
          "should": [{ "term": { "title": "not"}}, { "term": { "body": "not"}}],
          "minimum_should_match": 1
        } },
        { "bool": {
          "should": [{ "term": { "title": "as"}}, { "term": { "body": "as"}}],
          "minimum_should_match": 1
        } },
        { "bool": {
          "should": [{ "term": { "title": "a"}}, { "term": { "body": "a"}}],
          "minimum_should_match": 1
        } }
      ]
    }
  }
} }
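
To make the intended rewrite concrete, here is a rough Python sketch of the expansion. It is only an illustration, not how Elasticsearch or Lucene implements anything: `expand_common_terms` and its `common_terms` argument are hypothetical names, the set of common terms is passed in by hand (real code would derive it from each term's document frequency in the source field), the text is split on whitespace rather than analyzed, and it mirrors the expansion shown above literally without modelling `high_freq_operator`.

```python
import json


def expand_common_terms(query, fields, common_terms):
    """Build a bool query shaped like the expansion in this issue.

    `common_terms` is a hand-supplied stand-in for the commonness decision;
    a real implementation would use document frequencies from one field.
    """
    def any_field(term):
        # The term may match in any of the fields (the per-field clauses are ORed).
        return {"bool": {
            "should": [{"term": {field: term}} for field in fields],
            "minimum_should_match": 1,
        }}

    terms = query.split()  # naive whitespace split, for illustration only
    rare = [any_field(t) for t in terms if t not in common_terms]
    common = [any_field(t) for t in terms if t in common_terms]
    return {"bool": {
        "must": {"bool": {"must": rare}},         # rare terms are required
        "should": {"bool": {"should": common}},   # common terms only add to the score
    }}


expanded = expand_common_terms(
    "nelly the elephant not as a cartoon",
    ["title", "body"],
    {"the", "not", "as", "a"},
)
print(json.dumps(expanded, indent=2))
```

The only point is that each term fans out across every listed field, while the rare/common split is decided once per term.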

This bug has a Lucene twin: https://issues.apache.org/jira/browse/LUCENE-5435

nik9000 added a commit to nik9000/elasticsearch that referenced this issue Feb 5, 2014
Add support for matching multiple fields to the common term query. Like this:
{ "common": {
  "body": {
    "high_freq_operator": "and",
    "fields": ["title", "body"],
    "query": "nelly the elephant not as a cartoon"
  }
} }
The extra fields come from the "fields" array and the name of the object
containing the query denotes the field from which the common-ness is
derived.

WIP needs:
1.  A review of the funky syntax for adding the fields.  Feels bolted on
because it is.
2.  Gotta get the multi match query to use it.
3.  Docs.  Including a working rest example.
4.  More tests!?

Closes elastic#5024
@s1monw
Contributor

s1monw commented Feb 5, 2014

This seems very much related to what #5005 does? I think it can support the same behavior?

@nik9000
Member Author

nik9000 commented Feb 5, 2014

Wow, that looks like a much larger effort I should have been paying attention to. I'll set this aside for a while and look there.

@nik9000 nik9000 closed this as completed Mar 31, 2014