Change of behaviour for the "query_string": "*" filter after upgrading to 8.14 #110133

Closed
dej611 opened this issue Jun 25, 2024 · 15 comments
Labels
>bug :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team


@dej611
Contributor

dej611 commented Jun 25, 2024

Elasticsearch Version

8.14

Installed Plugins

No response

Java Version

bundled

OS Version

linux

Problem Description

As documented in this Kibana issue, an integration dashboard stopped returning results after upgrading to 8.14: elastic/kibana#186616 (comment)

The affected visualization produces the following query:

POST /index/_async_search?batched_reduce_size=64&ccs_minimize_roundtrips=true&wait_for_completion_timeout=200ms&keep_on_completion=true&keep_alive=60000ms&ignore_unavailable=true&preference=1719317130638
{
  "aggs": {
    "0": {
      "terms": {
        "field": "kubernetes.pod.name",
        "order": {
          "4-bucket>4-metric[pending]": "desc"
        },
        "size": 10000
      },
      "aggs": {
        "1": {
          "filters": {
            "filters": {
              "Status": {
                "bool": {
                  "must": [],
                  "filter": [
                    {
                      "query_string": {
                        "query": "*"
                      }
                    }
                  ],
                  "should": [],
                  "must_not": []
                }
              }
            }
          },
          "aggs": {
            "2-bucket": {
              "filter": {
                "bool": {
                  "filter": [
                    {
                      "range": {
                        "@timestamp": {
                          "format": "strict_date_optional_time",
                          "gte": "2024-06-25T12:04:32.017Z",
                          "lte": "2024-06-25T12:05:32.017Z"
                        }
                      }
                    },
                    {
                      "bool": {
                        "must": [],
                        "filter": [
                          {
                            "bool": {
                              "should": [
                                {
                                  "exists": {
                                    "field": "running"
                                  }
                                }
                              ],
                              "minimum_should_match": 1
                            }
                          }
                        ],
                        "should": [],
                        "must_not": []
                      }
                    }
                  ]
                }
              },
              "aggs": {
                "2-metric": {
                  "top_metrics": {
                    "metrics": {
                      "field": "running"
                    },
                    "size": 1,
                    "sort": {
                      "@timestamp": "desc"
                    }
                  }
                }
              }
            },
            "3-bucket": {
              "filter": {
                "bool": {
                  "filter": [
                    {
                      "range": {
                        "@timestamp": {
                          "format": "strict_date_optional_time",
                          "gte": "2024-06-25T12:04:32.017Z",
                          "lte": "2024-06-25T12:05:32.017Z"
                        }
                      }
                    },
                    {
                      "bool": {
                        "must": [],
                        "filter": [
                          {
                            "bool": {
                              "should": [
                                {
                                  "exists": {
                                    "field": "succeeded"
                                  }
                                }
                              ],
                              "minimum_should_match": 1
                            }
                          }
                        ],
                        "should": [],
                        "must_not": []
                      }
                    }
                  ]
                }
              },
              "aggs": {
                "3-metric": {
                  "top_metrics": {
                    "metrics": {
                      "field": "succeeded"
                    },
                    "size": 1,
                    "sort": {
                      "@timestamp": "desc"
                    }
                  }
                }
              }
            },
            "4-bucket": {
              "filter": {
                "bool": {
                  "filter": [
                    {
                      "range": {
                        "@timestamp": {
                          "format": "strict_date_optional_time",
                          "gte": "2024-06-25T12:04:32.017Z",
                          "lte": "2024-06-25T12:05:32.017Z"
                        }
                      }
                    },
                    {
                      "bool": {
                        "must": [],
                        "filter": [
                          {
                            "bool": {
                              "should": [
                                {
                                  "exists": {
                                    "field": "pending"
                                  }
                                }
                              ],
                              "minimum_should_match": 1
                            }
                          }
                        ],
                        "should": [],
                        "must_not": []
                      }
                    }
                  ]
                }
              },
              "aggs": {
                "4-metric": {
                  "top_metrics": {
                    "metrics": {
                      "field": "pending"
                    },
                    "size": 1,
                    "sort": {
                      "@timestamp": "desc"
                    }
                  }
                }
              }
            },
            "5-bucket": {
              "filter": {
                "bool": {
                  "filter": [
                    {
                      "range": {
                        "@timestamp": {
                          "format": "strict_date_optional_time",
                          "gte": "2024-06-25T12:04:32.017Z",
                          "lte": "2024-06-25T12:05:32.017Z"
                        }
                      }
                    },
                    {
                      "bool": {
                        "must": [],
                        "filter": [
                          {
                            "bool": {
                              "should": [
                                {
                                  "exists": {
                                    "field": "failed"
                                  }
                                }
                              ],
                              "minimum_should_match": 1
                            }
                          }
                        ],
                        "should": [],
                        "must_not": []
                      }
                    }
                  ]
                }
              },
              "aggs": {
                "5-metric": {
                  "top_metrics": {
                    "metrics": {
                      "field": "failed"
                    },
                    "size": 1,
                    "sort": {
                      "@timestamp": "desc"
                    }
                  }
                }
              }
            }
          }
        },
        "4-bucket": {
          "filter": {
            "bool": {
              "filter": [
                {
                  "range": {
                    "@timestamp": {
                      "format": "strict_date_optional_time",
                      "gte": "2024-06-25T12:04:32.017Z",
                      "lte": "2024-06-25T12:05:32.017Z"
                    }
                  }
                },
                {
                  "bool": {
                    "must": [],
                    "filter": [
                      {
                        "bool": {
                          "should": [
                            {
                              "exists": {
                                "field": "pending"
                              }
                            }
                          ],
                          "minimum_should_match": 1
                        }
                      }
                    ],
                    "should": [],
                    "must_not": []
                  }
                }
              ]
            }
          },
          "aggs": {
            "4-metric": {
              "top_metrics": {
                "metrics": {
                  "field": "pending"
                },
                "size": 1,
                "sort": {
                  "@timestamp": "desc"
                }
              }
            }
          }
        }
      }
    }
  },
  "size": 0,
  "fields": [
    {
      "field": "@timestamp",
      "format": "strict_date_time"
    },
    {
      "field": "event.ingested",
      "format": "date_time"
    },
    {
      "field": "k8s.pod.start_time",
      "format": "date_time"
    },
    {
      "field": "kubernetes.container.start_time",
      "format": "date_time"
    },
    {
      "field": "kubernetes.node.start_time",
      "format": "date_time"
    },
    {
      "field": "kubernetes.pod.start_time",
      "format": "date_time"
    },
    {
      "field": "kubernetes.service.created",
      "format": "date_time"
    },
    {
      "field": "kubernetes.storageclass.created",
      "format": "date_time"
    },
    {
      "field": "kubernetes.system.start_time",
      "format": "date_time"
    },
    {
      "field": "process.cpu.start_time",
      "format": "date_time"
    },
    {
      "field": "system.process.cpu.start_time",
      "format": "date_time"
    }
  ],
  "script_fields": {},
  "stored_fields": [
    "*"
  ],
  "runtime_mappings": {
    "failed": {
      "script": {
        "source": "if (doc['kubernetes.pod.status.phase'].value == \"failed\") { emit(1) }"
      },
      "type": "long"
    },
    "not_running": {
      "script": {
        "source": "if (doc['kubernetes.pod.status.phase'].value == \"pending\" || doc['kubernetes.pod.status.phase'].value == \"failed\") { emit(1) }"
      },
      "type": "long"
    },
    "pending": {
      "script": {
        "source": "if (doc['kubernetes.pod.status.phase'].value == \"pending\") { emit(1) }"
      },
      "type": "long"
    },
    "running": {
      "script": {
        "source": "if (doc['kubernetes.pod.status.phase'].value == \"running\") { emit(1) }"
      },
      "type": "long"
    },
    "succeeded": {
      "script": {
        "source": "if (doc['kubernetes.pod.status.phase'].value == \"succeeded\") { emit(1) }"
      },
      "type": "long"
    }
  },
  "_source": {
    "excludes": []
  },
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "match_phrase": {
            "data_stream.dataset": "kubernetes.state_pod"
          }
        },
        {
          "range": {
            "@timestamp": {
              "format": "strict_date_optional_time",
              "gte": "2024-06-25T11:50:32.017Z",
              "lte": "2024-06-25T12:05:32.017Z"
            }
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}

In 8.13.1 this query returned results:

{
    ...
    "hits": {
      "total": {
        "value": 1260,
        "relation": "eq"
      },
      "max_score": null,
      "hits": []
    },
    "aggregations": {
      "0": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          // ...many buckets
        ]
      }
    }
  }
}

In 8.14 it returns:

{
     ...
    "hits": {
      "total": {
        "value": 0,
        "relation": "eq"
      },
      "max_score": null,
      "hits": []
    },
    "aggregations": {
      "0": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": []
      }
    }
  }
}

In 8.14, removing the following clause from Status > bool > filter:

{
   "query_string": {
     "query": "*"
   }
}

Then the same response as in 8.13.1 comes back.
The problem is that many Kibana Lens visualizations within dashboards apply that filter in their queries and are affected by this change of behaviour.

Steps to Reproduce

See above.

Logs (if relevant)

No response

@dej611 dej611 added >bug needs:triage Requires assignment of a team area label labels Jun 25, 2024
@dej611 dej611 added the :Analytics/Aggregations Aggregations label Jun 26, 2024
@elasticsearchmachine elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) and removed needs:triage Requires assignment of a team area label labels Jun 26, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@not-napoleon
Member

I've been trying for a bit, and I can't get this to reproduce with a more minimal example. Here's the script I just used:

PUT http://localhost:9200/test

{
  "mappings": {
    "properties": {
      "number": {
        "type": "long"
      },
      "text": {
        "type": "keyword"
      }
    }
  }
}

PUT http://localhost:9200/test/_doc/1

{ "number": 100, "text": "foo" }

PUT http://localhost:9200/test/_doc/2

{ "number": 101, "text": "foo" }

PUT http://localhost:9200/test/_doc/3

{ "number": 11, "text": "bar" }

POST http://localhost:9200/test/_search

{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggregations": {
    "match_star": {
      "filters": {
        "filters": {
          "Status": {
            "bool": {
              "must": [],
              "filter": [
                {
                  "query_string": {
                    "query": "*"
                  }
                }
              ],
              "should": [],
              "must_not": []
            }
          }
        }
      }
    }
  }
}

Which returns exactly what I would expect:

  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "match_star": {
      "buckets": {
        "Status": {
          "doc_count": 3
        }
      }
    }
  }

i.e. one bucket with three matches. So I don't think query_string: query: * is the problem.

@dej611 can you trim this query down to the minimum needed to reproduce the issue, please? If you need help doing that, please reach out to me next week and I can find some time to pair with you on it.

Alternatively, that query_string: query: * clause should be functionally a no-op; you could just omit the whole thing. It shouldn't make a difference, and my testing indicates that it won't, but if that fixes your chart I don't see that it would hurt anything.
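For anyone who wants to try omitting the clause on a captured request, here is a minimal Python sketch. It assumes the request body is available as a plain dict and that a bare `query_string` `"*"` in filter context really is a match-all for the index in question. The helper names are hypothetical, not part of any Elasticsearch or Kibana API.

```python
from typing import Any


def is_star_query_string(clause: Any) -> bool:
    """True only for a clause that is exactly {"query_string": {"query": "*"}}."""
    return isinstance(clause, dict) and clause == {"query_string": {"query": "*"}}


def _drop_star(clauses: Any) -> Any:
    """Remove bare "*" clauses from a list; leave a single clause object untouched."""
    if isinstance(clauses, list):
        return [c for c in clauses if not is_star_query_string(c)]
    return clauses


def strip_star_filters(node: Any) -> Any:
    """Return a copy of a query DSL tree with "*" clauses removed from bool.filter lists."""
    if isinstance(node, list):
        return [strip_star_filters(item) for item in node]
    if not isinstance(node, dict):
        return node
    cleaned = {}
    for key, value in node.items():
        if key == "bool" and isinstance(value, dict):
            # Only filter context is touched; must/should/must_not keep their clauses.
            value = {
                occur: _drop_star(clauses) if occur == "filter" else clauses
                for occur, clauses in value.items()
            }
        cleaned[key] = strip_star_filters(value)
    return cleaned


# The "Status" filter from the reported query.
status_filter = {
    "bool": {
        "must": [],
        "filter": [{"query_string": {"query": "*"}}],
        "should": [],
        "must_not": [],
    }
}
print(strip_star_filters(status_filter))
```

Clauses under `must_not` are deliberately left alone, since removing a match-all there would change which documents match.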

@not-napoleon
Member

I still can't reproduce this locally, but @dej611 was able to get me access to the cluster where this is occurring. Based on analysis on that cluster, I now believe this is an issue with the query_string search and has nothing to do with the aggregation. I ran two queries:

POST /[REDACTED]/_search
{
  "profile": true, 
  "query": {
    "query_string": {
      "query": "*"
    }
  }
}

POST /[REDACTED]/_search
{
  "profile": true, 
  "query": {
   "bool": {
      "must": []
    }
  }
}

I would expect these to return the same results; however, the query_string version returns no hits, while the empty boolean query returns ~1100 results. Profiling the query_string version shows that it rewrites to a MatchNoDocsQuery:

           "query": [
              {
                "type": "MatchNoDocsQuery",
                "description": """MatchNoDocsQuery("unmapped fields [*]")""",

I'm going to pull the search team in on this to get some domain expertise about what might be happening here.
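To check a saved profile response for the same symptom, a small Python sketch like the one below could walk the profile tree and list every MatchNoDocsQuery. It assumes the usual profile layout (`profile.shards[].searches[].query[]` with optional `children`); the `sample` dict is a hypothetical, trimmed-down response and the function names are illustrative.

```python
from typing import Any, Iterator


def iter_profiled_queries(nodes: list[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield every query node in a profile tree, depth first."""
    for node in nodes:
        yield node
        yield from iter_profiled_queries(node.get("children", []))


def find_match_no_docs(response: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (shard id, description) for each MatchNoDocsQuery in the profile."""
    found = []
    for shard in response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            for node in iter_profiled_queries(search.get("query", [])):
                if node.get("type") == "MatchNoDocsQuery":
                    found.append((shard.get("id", "?"), node.get("description", "")))
    return found


# Hypothetical response shaped like the profile output quoted above.
sample = {
    "profile": {
        "shards": [
            {
                "id": "[node][index][0]",
                "searches": [
                    {
                        "query": [
                            {
                                "type": "MatchNoDocsQuery",
                                "description": 'MatchNoDocsQuery("unmapped fields [*]")',
                            }
                        ]
                    }
                ],
            }
        ]
    }
}
print(find_match_no_docs(sample))
```

Running this per index makes it easier to see which backing indices rewrite the query to match nothing.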

@not-napoleon not-napoleon added the :Search/Search Search-related issues that do not fall into other categories label Jul 8, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Jul 8, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@benwtrent
Member

@dej611 for the index settings, is there a default_field supplied?

@benwtrent
Member

@dej611 another setting that could change this behavior is index.query_string.lenient. Did default_field, index.query.default_field, or any index.query_string setting change in Kibana between the two versions?

@dej611
Contributor Author

dej611 commented Jul 9, 2024

I can see a default_field configured in settings:

GET /[REDACTED]/_settings
...
"query": {
  "default_field": [
    "message"
  ]
}
...

We've also checked for changes in the index template used for the integration. The final index template is derived from https://github.com/elastic/elasticsearch/blame/main/x-pack/plugin/core/template-resources/src/main/resources/metrics%40settings.json (the default_field setting there was last changed 4 years ago).

@javanna
Member

javanna commented Jul 9, 2024

The profile output says it rather clearly: the field targeted by the query string is not mapped. Can you check the mapping for the message field on the index being queried?

Just in case, what happens if you specify message:* in the query string query?

@dej611
Contributor Author

dej611 commented Jul 9, 2024

I managed to set up an 8.13.1 instance with the same integration that does not show the issue.

Checking it, I see that some indices on the 8.13.1 instance have a different default_field configuration than their 8.14 counterparts...

> The profile output says it rather clearly: the field targeted by the query string is not mapped.

I ran the profile on both 8.13.1 and 8.14, and I see MatchNoDocsQuery on both, on different indices.

> Can you check the mapping for the message field on the index being queried?

In both cases there's:

"message": {
  "type": "match_only_text"
},

> Just in case, what happens if you specify message:* in the query string query?

0 results in both instances.

@dej611
Contributor Author

dej611 commented Jul 9, 2024

@gizas found this discussion about removing the default_field for logs-*: #99872 (PR #102456)
That discussion led to the removal of default_field from the Fleet index template in Kibana, falling back to the default [*], which affects the index used in this issue: elastic/kibana#177605 (PR elastic/kibana#178020)

@benwtrent
Member

So, elastic/kibana#178020 is most likely the cause here? Do those vector changes explain the discrepancy?

@dej611 what happens if you revert the default field calculation in 8.14 (just do it manually by updating the index settings directly)? Do you get the hits you expect again?

@dej611
Contributor Author

dej611 commented Jul 9, 2024

I've manually set default_field: ["*"] on the indices in 8.14.0, and results started to flow again.
If I understand correctly, the changes in elastic/kibana#178020 expected default_field to fall back to the Elasticsearch default of ["*"], but the default index template for metrics in https://github.com/elastic/elasticsearch/blame/main/x-pack/plugin/core/template-resources/src/main/resources/metrics%40settings.json still applies its own default_field, which limits the actual search space when using the wildcard.
Is that correct?
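For reference, the manual override could be expressed as an update-settings request along these lines. This is only a sketch: the index name `my-metrics-index` is a placeholder, and the helper just builds the request line and JSON body for the dynamic `index.query.default_field` setting without sending anything.

```python
import json


def default_field_update(index: str, fields: list[str]) -> tuple[str, str, str]:
    """Build (method, path, body) for an update-settings call that sets default_field."""
    body = {"index": {"query": {"default_field": fields}}}
    return "PUT", f"/{index}/_settings", json.dumps(body)


# "my-metrics-index" is a hypothetical name; substitute the affected backing index.
method, path, body = default_field_update("my-metrics-index", ["*"])
print(method, path)
print(body)
```

An override applied this way only affects existing indices; new backing indices would still pick up whatever the index templates resolve to.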

@benwtrent
Member

If multiple templates are being utilized and the default_field value was previously overwritten so that this metrics template was ignored, that is likely the cause of this failure.
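As a toy model of that explanation, the Python sketch below treats template resolution as a "last layer wins" merge of flat settings. It is a simplification, not the actual Elasticsearch template resolution code, and the override value `["kubernetes.pod.name"]` is a made-up stand-in for whatever the higher-precedence template previously set.

```python
from typing import Any


def merge_settings(layers: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge flat settings dicts in order; later layers override earlier ones."""
    merged: dict[str, Any] = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Base layer modeled on the default_field value seen in the index settings above.
metrics_settings = {"index.query.default_field": ["message"]}

# Hypothetical higher-precedence layer that used to set its own default_field.
old_override = {"index.query.default_field": ["kubernetes.pod.name"]}
# After the override is removed, that layer no longer mentions the setting.
new_override: dict[str, Any] = {}

before = merge_settings([metrics_settings, old_override])
after = merge_settings([metrics_settings, new_override])
print(before["index.query.default_field"])
print(after["index.query.default_field"])
```

In this model, dropping the override does not fall back to the built-in default; it exposes the value from the lower-precedence layer instead.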

@wchaparro wchaparro removed :Analytics/Aggregations Aggregations Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Jul 9, 2024
@benwtrent
Member

@dej611 this seems like an unintended behavior change on the Kibana side. No behavior changed directly in the Elasticsearch server core.

Do you think we can close this?

@dej611
Contributor Author

dej611 commented Jul 9, 2024

Yes, I think so. Thanks for your help 👍

@dej611 dej611 closed this as completed Jul 9, 2024