Decay functions should allow to specify a value in case a field is missing #18892

Open
FlorianWilhelm opened this issue Jun 15, 2016 · 17 comments

@FlorianWilhelm FlorianWilhelm commented Jun 15, 2016

Describe the feature:

When using a decay function in a function_score query, the official documentation says: "If the numeric field is missing in the document, the function will return 1." In many cases this is not intended, since documents missing the field are then scored with the highest possible value.

Similar to field_value_factor, decay functions should provide a missing parameter that defines the score to use when the field is missing.
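
To make the problem concrete, here is a minimal Python sketch of the Gaussian decay formula from the function_score documentation. The `missing` argument is hypothetical: it only models the parameter requested in this issue and is not an existing Elasticsearch option. Its default of 1.0 reproduces the current behaviour.

```python
import math
from typing import Optional


def gauss_decay(value: Optional[float], origin: float, scale: float,
                offset: float = 0.0, decay: float = 0.5,
                missing: float = 1.0) -> float:
    """Gaussian decay score; `missing` is a hypothetical parameter."""
    if value is None:
        # Current behaviour corresponds to missing=1.0 (the top score).
        return missing
    # sigma^2 is chosen so that the score equals `decay` at distance `scale`.
    sigma_sq = -scale ** 2 / (2.0 * math.log(decay))
    distance = max(0.0, abs(value - origin) - offset)
    return math.exp(-distance ** 2 / (2.0 * sigma_sq))


scores = {
    "age=22": gauss_decay(22, origin=22, scale=5),
    "age=27": gauss_decay(27, origin=22, scale=5),
    "no age": gauss_decay(None, origin=22, scale=5),
    "no age, missing=0": gauss_decay(None, origin=22, scale=5, missing=0.0),
}
for label, score in scores.items():
    print(f"{label}: {score:.3f}")
```

A document without the field ties with a document sitting exactly at the origin, which is the behaviour this issue asks to make configurable.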

Contributor

@clintongormley clintongormley commented Jun 15, 2016

I think this makes sense. @brwe what do you think?

Contributor

@brwe brwe commented Jun 16, 2016

Last time we had this discussion we decided not to do anything because there is a workaround: #7788. We can re-evaluate this decision. It might make sense to make it consistent with field_value_factor. I have no strong opinion though.

Contributor

@clintongormley clintongormley commented Jun 16, 2016

+1 to consistency

Author

@FlorianWilhelm FlorianWilhelm commented Jun 16, 2016

@brwe Thank you, I hadn't thought of that option, but I guess I am not the only one, and consistency is always better. For completeness, I am posting your solution here.

{
  "query": {
    "function_score": {
      "score_mode": "first",
      "functions": [
        {
          "filter": {
            "exists": {
              "field": "age"
            }
          },
          "gauss": {
            "age": {
              "origin": 22,
              "scale": 5,
              "decay": 0.5
            }
          }
        },
        {
          "script_score": {
            "script": "0"
          }
        }
      ]
    }
  }
}

@magicleo magicleo commented Jun 17, 2016

What if I want to use other functions and need "score_mode": "sum"?

Author

@FlorianWilhelm FlorianWilhelm commented Jun 20, 2016

@magicleo I would say that score_mode: first is only an optimization at that point. If you use sum instead, the fallback function just adds 0 to the score.
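
A toy Python model of the two score modes may help here. It is only a sketch: `combine`, `has_age`, and the constant stand-in functions are hypothetical names, and the neutral score of 1.0 when no function matches is an assumption about function_score rather than something stated in this thread.

```python
from typing import Callable, List, Optional, Tuple

Doc = dict
# (filter, function): a filter of None matches every document.
ScoreFunction = Tuple[Optional[Callable[[Doc], bool]], Callable[[Doc], float]]


def combine(doc: Doc, functions: List[ScoreFunction], score_mode: str) -> float:
    """Toy model of how function_score combines the matching functions."""
    matched = [fn(doc) for flt, fn in functions if flt is None or flt(doc)]
    if not matched:
        return 1.0  # assumption: neutral score when no function matches
    if score_mode == "first":
        return matched[0]
    if score_mode == "sum":
        return sum(matched)
    raise ValueError(f"unsupported score_mode: {score_mode}")


def has_age(doc: Doc) -> bool:
    """Stand-in for the exists filter on "age"."""
    return "age" in doc


functions = [
    (has_age, lambda doc: 0.5),  # stand-in for the gauss decay on "age"
    (None, lambda doc: 0.0),     # fallback that always yields 0
    (None, lambda doc: 2.0),     # some other, unrelated function
]

for doc in ({"age": 30}, {}):
    print(doc, combine(doc, functions, "sum"))
```

In this model, a document without "age" receives only the contributions of the other functions under sum, because the exists filter keeps the decay function from applying to it.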

@kaka19ace kaka19ace commented Sep 28, 2016

In most use cases we have scripting disabled for security reasons, so we cannot use the decay function when the document field does not exist.

Using a missing value is a good idea :)

@matthuhiggins matthuhiggins commented Jan 2, 2017

kaka19ace - I'm having the same pain. Scripting is not enabled, so the suggested workaround is not available. Makes it tough to use the feature on coordinate fields.

@tuzz tuzz commented Feb 17, 2017

I wasn't able to use the workaround above because I have more than one function in function_score. Instead, I found another workaround which is to copy the nullable field to a new field in the index mapping to guarantee a value:

"mappings": {
  "name_of_type": {
    "properties": {
      "field_that_might_be_null": {
        "type": "float",
        "copy_to": "field_that_definitely_wont_be_null"
      },
      "field_that_definitely_wont_be_null": {
        "type": "float",
        "null_value": 0
      }
    }
  }
}

Depending on the type of decay, you may need to pick a default value that's far enough out of range to result in a value of 0. Hopefully that helps someone.
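
How far out of range the default has to be depends on the decay type. The sketch below uses the linear, exp, and gauss formulas from the function_score documentation to score a default value of 0 against origin 22 and scale 5 (values borrowed from the earlier example in this thread). `decay_score` is a hypothetical helper, not an Elasticsearch API.

```python
import math


def decay_score(kind: str, value: float, origin: float, scale: float,
                offset: float = 0.0, decay: float = 0.5) -> float:
    """Score of `value` under one of the three documented decay shapes."""
    distance = max(0.0, abs(value - origin) - offset)
    if kind == "linear":
        # Reaches exactly 0 at distance scale / (1 - decay).
        s = scale / (1.0 - decay)
        return max((s - distance) / s, 0.0)
    if kind == "exp":
        return math.exp(math.log(decay) / scale * distance)
    if kind == "gauss":
        sigma_sq = -scale ** 2 / (2.0 * math.log(decay))
        return math.exp(-distance ** 2 / (2.0 * sigma_sq))
    raise ValueError(f"unknown decay type: {kind}")


default_value = 0  # the null_value chosen in the mapping
for kind in ("linear", "exp", "gauss"):
    score = decay_score(kind, default_value, origin=22, scale=5)
    print(f"{kind}: {score:.6f}")
```

Only the linear decay hits exactly 0. The exp and gauss curves only approach 0, so with those the default needs to sit much further from the origin before its score becomes negligible.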

@brooks brooks commented Jun 8, 2017

+1 for missing 👍

@adamdunkley adamdunkley commented Jul 1, 2017

There is actually a workaround that does not need a null value or scripting to be enabled.

If you make the second score function something that will always yield 0, for example:

{
  "query": {
    "function_score": {
      "score_mode": "first",
      "functions": [
        {
          "filter": {
            "exists": {
              "field": "age"
            }
          },
          "gauss": {
            "age": {
              "origin": 22,
              "scale": 5,
              "decay": 0.5
            }
          }
        },
        {
          "gauss": {
            "age": {
              "origin": "0",
              "offset": "0",
              "scale": "100"
            }
          }
        }
      ]
    }
  }
}

(where age is never going to be 0)

This still does not solve the case where you want multiple functions, since it will skew averages (if using avg as the score_mode), but at least it doesn't require scripting or changes to mappings :)

@kemcon kemcon commented Nov 15, 2017

I would like the option to add 1 to the result of the decay score (this should be an easy change), so that the results fall between 1 and 2 rather than between 0 and 1. Combined with an exists filter (if that is necessary), documents missing the field will not be scored (which is equivalent to a factor of 1) and all others will score higher (between 1 and 2).

Member

@javanna javanna commented Mar 16, 2018

@elastic/es-search-aggs

@cristi23 cristi23 commented Oct 23, 2018

I wonder if there is something currently being done about this. We need to use multiple functions on a search over 2 indices (with two different, but similar doc types).
One of the functions is the gauss decay function on a field missing in one of the two doc types. We've tried the suggested workarounds but they don't work in our situation.

Contributor

@mayya-sharipova mayya-sharipova commented Oct 23, 2018

@cristi23 Thanks for your input. We are in the process of redesigning Function Score Query, and are planning to replace it with Script Score Query: #34533

With this new type of query, you would use a Painless script to check for a missing value:

doc['field'].size() == 0 ? <your value for missing> : <decayFunction(doc['field'].value)> ;

for example:

"script" : {
	"source" : "doc['dval'].size() == 0 ? 0 : decayNumericExp(params.origin, params.scale, params.offset, params.decay, doc['dval'].value)",
	"params": {
		"origin": 20,
		"scale": 10,
		"decay" : 0.5,
		"offset" : 0
	}
}
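
For anyone who wants to sanity-check the numbers before the new query type is available, the logic of that script can be mirrored in plain Python. This is only a sketch: `decay_numeric_exp` is a stand-in that follows the documented exp decay formula, and `score` and its `missing` argument are hypothetical names.

```python
import math
from typing import Sequence


def decay_numeric_exp(origin: float, scale: float, offset: float,
                      decay: float, doc_value: float) -> float:
    """Python stand-in for the decayNumericExp call in the script above."""
    distance = max(0.0, abs(doc_value - origin) - offset)
    return math.exp(math.log(decay) / scale * distance)


def score(doc_values: Sequence[float], params: dict,
          missing: float = 0.0) -> float:
    # Mirrors doc['dval'].size() == 0: the document has no value for the field.
    if len(doc_values) == 0:
        return missing
    # Mirrors doc['dval'].value by taking the first stored value.
    return decay_numeric_exp(params["origin"], params["scale"],
                             params["offset"], params["decay"], doc_values[0])


params = {"origin": 20, "scale": 10, "decay": 0.5, "offset": 0}
for values in ([20], [30], []):
    print(values, round(score(values, params), 3))
```

The value returned for an empty field is chosen explicitly in the script, which is what the requested missing parameter would otherwise provide.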

@impguard impguard commented Nov 5, 2018

@cristi23 Until the script score query comes out, have you found a solution to your situation? We have the same problem except we have multiple indices and we simply want to run the function score query on one index. The missing field in other indices throws a parsing error.

@cristi23 cristi23 commented Nov 5, 2018

@impguard Yes, we did. We added the missing field to the mapping of the index that threw errors when using the decay function.
We didn't add any values for that field, but this seems to be enough for ES to run the query without errors and, seemingly, without causing any problems for the scoring.
