Add the `field_value_factor` function to the function_score query #5519

Closed
wants to merge 14 commits into
base: master
from

7 participants
@dakrone
Member

dakrone commented Mar 24, 2014

The field_value_factor function uses the value of a field in the document to influence the score. This is a common case that script scoring was previously used for.

For example, a query that looks like:

{
  "query": {
    "function_score": {
      "query": {"match": { "body": "foo" }},
      "functions": [
        {
          "field_value_factor": {
            "field": "popularity",
            "factor": 1.1,
            "modifier": "square"
          }
        }
      ],
      "score_mode": "max",
      "boost_mode": "sum"
    }
  }
}

Would have the score modified for each document by:

square(1.1 * doc['popularity'].value)
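As a rough illustration of the arithmetic above, the following sketch computes the function value and combines it with the query score. The helper names are hypothetical, only the two modifiers that appear in this PR ("none" and "square") are modeled, and a default factor of 1.0 is an assumption. With a single function, "score_mode": "max" simply yields that function's value, and "boost_mode": "sum" is modeled as adding it to the query score.

```python
# Minimal sketch, not the Elasticsearch implementation.
# Helper names are hypothetical; only "none" and "square" come from the PR text.
MODIFIERS = {
    "none": lambda x: x,
    "square": lambda x: x * x,
}


def field_value_factor(value, factor=1.0, modifier="none"):
    """Return modifier(factor * value), e.g. square(1.1 * popularity)."""
    return MODIFIERS[modifier](factor * value)


def combined_score(query_score, function_score, boost_mode="sum"):
    """Combine the query score with the function score."""
    if boost_mode == "sum":
        return query_score + function_score
    if boost_mode == "multiply":
        return query_score * function_score
    raise ValueError("unsupported boost_mode in this sketch: " + boost_mode)


# A document with popularity 10 and a query score of 2.0:
fn = field_value_factor(10, factor=1.1, modifier="square")  # about 121.0
total = combined_score(2.0, fn, boost_mode="sum")           # about 123.0
```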

This is faster and less error-prone than using scripting to influence the score. Speed-wise, I used the IMDB movie database and tested with two different queries:

{
  "query": {
    "function_score": {
      "query": {"match_all": {}},
      "functions": [
        {
          "field_value_factor": {
            "field": "runtime",
            "modifier": "none"
          }
        }
      ],
      "score_mode": "max",
      "boost_mode": "sum"
    }
  }
}

vs:

{
  "query": {
    "function_score": {
      "query": {"match_all": {}},
      "functions": [
        {
          "script_score": {
            "script": "_score * doc['runtime'].value"
          }
        }
      ],
      "score_mode": "max",
      "boost_mode": "sum"
    }
  }
}

The field_value_factor version took about 75ms on average; the script_score version took about 145ms on average (both measured after field data was loaded).
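A comparison like this can be sketched as a small timing harness. This is only an illustration of "warm up first, then average"; `average_ms`, `speedup`, and the `run_query` callable are hypothetical names, and the actual benchmark setup used for the numbers above is not described in this PR.

```python
import time


def average_ms(run_query, repeats=20, warmup=3):
    """Average wall-clock time of run_query() in milliseconds.

    The warm-up calls stand in for "after field data was loaded":
    they are executed but not timed.
    """
    for _ in range(warmup):
        run_query()
    start = time.perf_counter()
    for _ in range(repeats):
        run_query()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / repeats


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# Using the averages reported above (145ms for script_score, 75ms for
# field_value_factor), the ratio is a little under 2x.
ratio = speedup(145, 75)
```

Here `run_query` would be whatever callable sends one of the two request bodies to the cluster; it is left abstract on purpose.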

Review comments on ...lasticsearch/common/lucene/search/function/FieldValueFactorFunction.java:

brwe Mar 25, 2014

Contributor

Maybe check if "field" was actually set and throw an exception if not?
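The suggested check could look roughly like the sketch below. It is written in Python for brevity rather than in the Java of the actual parser; `parse_field_value_factor` is a hypothetical name, and the defaults for `factor` and `modifier` are assumptions.

```python
def parse_field_value_factor(body):
    """Validate a field_value_factor body (a dict) and fill in defaults.

    Raises ValueError when "field" is absent or empty, which is the
    check suggested in the review comment.
    """
    field = body.get("field")
    if not field:
        raise ValueError("field_value_factor requires 'field' to be set")
    return {
        "field": field,
        "factor": float(body.get("factor", 1.0)),   # assumed default
        "modifier": body.get("modifier", "none"),   # assumed default
    }


parsed = parse_field_value_factor({"field": "popularity", "factor": 1.1})
```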

dakrone Mar 25, 2014

Member

Good idea, I'll do that!

dakrone Mar 25, 2014

Member

@brwe I removed the lenient flag entirely, but I did add in an ignore_missing flag that returns the original score unmodified if the field is missing a value (defaulting to false, which throws an exception if the field is missing). What do you think of this solution?
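The behavior described in this comment can be sketched as follows. This is an illustration only: the function and exception names are hypothetical, `None` stands in for "the field is missing a value", and the score is combined with "boost_mode": "sum" as in the PR's example queries. The merged implementation may differ.

```python
class MissingFieldValueError(Exception):
    """Hypothetical error for a document with no value for the field."""


def score_with_field_value(query_score, value, factor=1.0, ignore_missing=False):
    """Apply factor * value to the score, honoring ignore_missing.

    value is None when the document has no value for the field.
    """
    if value is None:
        if ignore_missing:
            # Return the original score unmodified.
            return query_score
        # Default (ignore_missing=False): fail loudly.
        raise MissingFieldValueError("document is missing a value for the field")
    return query_score + factor * value  # boost_mode "sum" assumed


unchanged = score_with_field_value(2.0, None, ignore_missing=True)
```

Defaulting `ignore_missing` to false means a mapping or data problem surfaces as an error instead of silently producing unboosted scores.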

brwe Mar 26, 2014

Contributor

I have no strong opinion about the ignore_missing if it is switched off per default, it might make sense in some cases.

brwe Mar 26, 2014

Contributor

+1

imotov Mar 26, 2014

Member

LGTM

s1monw Mar 27, 2014

Contributor

I like the feature. I left some comments on the implementation; speed is important here :)

dakrone added some commits Mar 20, 2014

Add the `field_value_factor` function to the function_score query
The `field_value_factor` function uses the value of a field in the
document to influence the score.

A query that looks like:
{
  "query": {
    "function_score": {
      "query": {"match": { "body": "foo" }},
      "functions": [
        {
          "field_value_factor": {
            "field": "popularity",
            "factor": 1.1,
            "modifier": "square"
          }
        }
      ],
      "score_mode": "max",
      "boost_mode": "sum"
    }
  }
}

Would have the score modified by:

_score * square(1.1 * doc['popularity'])

Replace the `lenient` flag with an `ignore_missing` flag
Also makes the `field_value_factor` not ignore missing fields by default

dakrone Mar 27, 2014

Member

@s1monw updated to use IndexNumericFieldData and load the values more efficiently as you suggested. Also updated the docs like @clintongormley asked.

s1monw Mar 27, 2014

Contributor

LGTM

dakrone added a commit that referenced this pull request Mar 27, 2014

Add the `field_value_factor` function to the function_score query
The `field_value_factor` function uses the value of a field in the
document to influence the score.

A query that looks like:
{
  "query": {
    "function_score": {
      "query": {"match": { "body": "foo" }},
      "functions": [
        {
          "field_value_factor": {
            "field": "popularity",
            "factor": 1.1,
            "modifier": "square"
          }
        }
      ],
      "score_mode": "max",
      "boost_mode": "sum"
    }
  }
}

Would have the score modified by:

square(1.1 * doc['popularity'].value)

Closes #5519

@dakrone dakrone closed this in 8fbd1bd Mar 27, 2014

ncolomer Apr 10, 2014

Nice work! 👍
Any idea when this will be available?

@dakrone dakrone deleted the dakrone:function-score-factor branch Apr 21, 2014

@ruflin ruflin referenced this pull request Jun 22, 2014

Closed

field_value_factor support #639

dakrone added a commit that referenced this pull request Jul 23, 2014

Reflect that 'field_value_factor' is only in 1.2.x
While the blog post http://www.elasticsearch.org/blog/2014-04-02-this-week-in-elasticsearch/ states that feature #5519 was
added to 1.x, the release notes for e.g. v1.1.2 say otherwise.
Only the release notes for 1.2.0 list #5519 as a new feature.

Since the 1.x docs deprecate/discourage using `_boost` and seemingly give a migration example at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-boost-field.html#function-score-instead-of-boost
users of 1.1.x should be warned.

dakrone added a commit that referenced this pull request Jul 23, 2014

Reflect that 'field_value_factor' is only in 1.2.x
While the blogpost http://www.elasticsearch.org/blog/2014-04-02-this-week-in-elasticsearch/ states, that feature #5519 was
added to 1.x, the release notes for, e.g. v1.1.2, however tell otherwise.
Only the release notes for 1.2.0 list #5519 as a new feature.

Since the 1.x docs deprecate/discourage from using `_boost`, and seemingly give a migration example at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-boost-field.html#function-score-instead-of-boost
users of 1.1.x should be warned.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment