Scoring description and details for boost inaccurate & incomplete #46571

Closed
dschneiter opened this issue Sep 10, 2019 · 2 comments
Labels
>bug :Search/Ranking Scoring, rescoring, rank evaluation. Team:Search Meta label for search team

Comments

@dschneiter
Contributor

dschneiter commented Sep 10, 2019

Elasticsearch version (bin/elasticsearch --version):
Version: 7.3.1, Build: default/tar/4749ba6/2019-08-19T20:19:25.651794Z, JVM: 1.8.0_201

Plugins installed: []
none

JVM version (java -version):
OpenJDK 64-Bit Server VM (build 25.201-b09, mixed mode)

OS version (uname -a if on a Unix-like system):
Linux server1 4.4.0-1090-aws #101-Ubuntu SMP Fri Aug 2 15:21:01 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
When asking for the score explanation by setting "explain": true, a boost factor of 2.2 is reported for queries without a boost parameter and 22.0 for queries with a boost of 10. It seems as if (1 + k1) from BM25 was factored into the boost factor without that being reflected in either the name or the description.

I expect a description similar to "boost factor, computed as (1 + k1) * boost value", with the k1 saturation parameter and the actual boost value listed in the details.
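
For reference, the two reported values match the arithmetic below, assuming the default BM25 k1 of 1.2 (the index in this report does not override the similarity settings). This is only a sketch of the suspected computation, not the actual Lucene code; `explained_boost` is a hypothetical helper name.

```python
# Assumption: Elasticsearch's default BM25 saturation parameter k1 = 1.2.
K1_DEFAULT = 1.2


def explained_boost(boost: float = 1.0, k1: float = K1_DEFAULT) -> float:
    """Value suspected to appear under "boost" in the explain output."""
    return (1.0 + k1) * boost


# Query without a boost parameter (boost defaults to 1) and with boost 10.
print(f"{explained_boost():.1f}")
print(f"{explained_boost(10.0):.1f}")
```

With these assumptions the two lines print 2.2 and 22.0, the same values as in the explain output.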

Steps to reproduce:

Please include a minimal but complete recreation of the problem, including
(e.g.) index creation, mappings, settings, query etc. The easier you make for
us to reproduce it, the more likely that somebody will take the time to look at it.

  1. Index sample document
PUT reltest/_doc/1
{
  "title": "My Test Document"
}
  2. Search for that document
GET reltest/_search
{
  "explain": true,
  "query": {
    "match": {
      "title": {
        "query": "Test",
        "boost": 10
      }
    }
  }
}
  3. Output snippet
             "details" : [
                {
                  "value" : 22.0,
                  "description" : "boost",
                  "details" : [ ]
                },

Provide logs (if relevant):

@dschneiter dschneiter changed the title from "Scoring description and details for boost inaccurate / incomplete" to "Scoring description and details for boost inaccurate & incomplete" Sep 10, 2019
@danielmitterdorfer danielmitterdorfer added the :Search/Ranking Scoring, rescoring, rank evaluation. label Sep 11, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-search

@jtibshirani
Contributor

jtibshirani commented Feb 1, 2020

It looks like we didn't maintain the exact same explain output when introducing LegacyBM25Similarity in apache/lucene-solr#511. It doesn't look so simple to fix, and I don't think we'd consider it high priority since we plan to switch to the new BM25Similarity which doesn't contain the 1 + k1 term (#36431).

I'll leave this issue open in case others run into the same problem and we get more feedback that it's important to fix.
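
The difference between the two similarities mentioned above can be sketched as follows. This is a simplified illustration of the term-frequency part of BM25 only, with assumed default parameters k1 = 1.2 and b = 0.75; `bm25_tf_part` is a hypothetical helper, not a Lucene API, and the real implementations differ in details such as how document length is encoded.

```python
def bm25_tf_part(freq: float, doc_len: float, avg_doc_len: float,
                 k1: float = 1.2, b: float = 0.75,
                 legacy: bool = False) -> float:
    """Term-frequency component of BM25 (idf and boost are left out).

    legacy=True keeps the constant (1 + k1) factor in the numerator;
    legacy=False drops it, as in the newer formulation.
    """
    length_norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    tf = freq / (freq + length_norm)
    return tf * (1.0 + k1) if legacy else tf


# The two variants differ only by the constant factor (1 + k1),
# so the relative ordering of documents is unchanged.
new = bm25_tf_part(freq=1, doc_len=3, avg_doc_len=3)
old = bm25_tf_part(freq=1, doc_len=3, avg_doc_len=3, legacy=True)
print(old / new)
```

Because the factor is constant per query, folding it into the reported boost changes the absolute score and the explain values but not the ranking.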

@rjernst rjernst added the Team:Search Meta label for search team label May 4, 2020