Merged
3 changes: 0 additions & 3 deletions muted-tests.yml
@@ -80,9 +80,6 @@ tests:
 - class: org.elasticsearch.xpack.restart.CoreFullClusterRestartIT
   method: testSnapshotRestore {cluster=UPGRADED}
   issue: https://github.com/elastic/elasticsearch/issues/111799
-- class: org.elasticsearch.xpack.inference.InferenceRestIT
-  method: test {p0=inference/80_random_rerank_retriever/Random rerank retriever predictably shuffles results}
-  issue: https://github.com/elastic/elasticsearch/issues/111999
 - class: org.elasticsearch.smoketest.SmokeTestMultiNodeClientYamlTestSuiteIT
   issue: https://github.com/elastic/elasticsearch/issues/112147
 - class: org.elasticsearch.smoketest.WatcherYamlRestIT
@@ -8,6 +8,8 @@ setup:
       indices.create:
         index: test-index
         body:
+          settings:
+            number_of_shards: 1
Comment on lines +11 to +12
Member

@pmpailis just so you know, in the CCS test cases you can effectively end up with 2 shards: one remote and one local.

What makes this sensitive to shard count?

Contributor Author

The main issue is the idf component of the query_string score. Looking at the explain output for the first doc, we have:

• 2 shards case:

{
  "value": 0.13353139,
  "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
  "details": [
    {
      "value": 3,
      "description": "n, number of documents containing term",
      "details": []
    },
    {
      "value": 3,
      "description": "N, total number of documents with field",
      "details": []
    }
  ]
},

• 1 shard case:

{
  "value": 0.087011375,
  "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
  "details": [
    {
      "value": 5,
      "description": "n, number of documents containing term",
      "details": []
    },
    {
      "value": 5,
      "description": "N, total number of documents with field",
      "details": []
    }
  ]
},

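With 2 shards, idf is computed from the statistics of the shard holding the doc (N = n = 3), whereas with 1 shard it sees all 5 docs (N = n = 5). This is why the final computed score differs between the two cases.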
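
As a quick sanity check, both idf values can be reproduced by plugging the N and n from the explain output into the formula shown in its description. This is only a minimal sketch: the helper name `bm25_idf` is hypothetical and not part of Elasticsearch or Lucene, and the explain values are single-precision floats, so the comparison is approximate.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """idf = log(1 + (N - n + 0.5) / (n + 0.5)), as printed in the explain output.

    doc_count is N (total number of documents with the field),
    doc_freq is n (number of documents containing the term).
    """
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# 2 shards case: the shard that holds the doc reports N = 3, n = 3.
idf_two_shards = bm25_idf(doc_count=3, doc_freq=3)

# 1 shard case: all docs live on one shard, so N = 5, n = 5.
idf_one_shard = bm25_idf(doc_count=5, doc_freq=5)

print(f"2 shards: {idf_two_shards:.6f}")
print(f"1 shard:  {idf_one_shard:.6f}")
```

Since idf enters the score as a multiplicative factor, a different N and n per shard layout is enough to change the final score even though the documents themselves are identical.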

           mappings:
             properties:
               text: