ES|QL: Add min competitive optimization for lucene operators

We currently only push down ES|QL conditions that have an exact equivalent Lucene query.

Take the following query as an example, where all conditions are pushed down to Lucene and we use a SORT that is also pushed down:
```
FROM wikipedia METADATA _score
| WHERE title:"europe"
| sort _score desc
| limit 10
```
This query will use the `LuceneTopNSourceOperator` and should be close in performance to running a match query in the DSL. The `LuceneTopNSourceOperator` will emit at most 10 rows that need to be processed on the compute service side.

As soon as we have a filter condition that is not pushed down, we no longer use the `LuceneTopNSourceOperator` but the `LuceneSourceOperator`. We will still push down the match query, but `LuceneSourceOperator` will output all docs that match the query string. These docs will then be processed on the compute service side to apply the non-pushable filter and then sorted to get the top 10.

```
FROM wikipedia METADATA _score
| WHERE title:"europe" and length(title) > 10
| sort _score desc
| limit 10
```

~~We should look into whether we can push down more WHERE conditions as filters.
We will likely need a custom Lucene query for this that can evaluate an ES|QL expression.
We can start with something simple, such as pushing down only conditions that depend on literals and indexed fields (not on other runtime columns resulting from EVALs).
If we want to first validate whether this would improve things, we can start with a simple prototype that pushes down a set of conditions like `length(title) > 10` as a painless script query and then run a simple benchmark on a larger dataset (where it is more likely that we will see an improvement).~~

EDIT: We should look into adding a min competitive optimization.
For example, we can have a callback between the `LuceneSourceOperator` and the `TopNOperator`, so that once the priority queue in `TopNOperator` is full we can set a min competitive score back on `LuceneSourceOperator` and let it skip docs that can no longer make it into the top N.
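
To make the idea concrete, here is a minimal, self-contained sketch of such a callback. It does not use the real operator classes: `MinCompetitiveSketch`, `Source`, `TopN`, `MinCompetitiveListener` and `topN` are hypothetical names invented for illustration, and the toy source skips rows one by one. In a real implementation the threshold would presumably be forwarded to Lucene through `Scorable.setMinCompetitiveScore`, which lets the scorer skip whole blocks of docs. The sketch also assumes the sort is on `_score desc` only and that everything between the source and the top N only drops rows (as a filter does), otherwise the threshold would not be safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Consumer;
import java.util.function.Predicate;

final class MinCompetitiveSketch {

    /** A scored row, standing in for a doc emitted by the source operator. */
    static final class Row {
        final String title;
        final float score;
        Row(String title, float score) { this.title = title; this.score = score; }
    }

    /** Hypothetical callback: the top-N side reports the lowest score that can still enter the queue. */
    interface MinCompetitiveListener {
        void onMinCompetitiveScore(float minScore);
    }

    /** Toy stand-in for the source operator: stops emitting rows that are no longer competitive. */
    static final class Source implements MinCompetitiveListener {
        private final List<Row> matches;
        private float minCompetitive = Float.NEGATIVE_INFINITY;
        private int emitted = 0;

        Source(List<Row> matches) { this.matches = matches; }

        @Override
        public void onMinCompetitiveScore(float minScore) {
            // The threshold only ever increases as the queue fills with better rows.
            minCompetitive = Math.max(minCompetitive, minScore);
        }

        void forEachCompetitive(Consumer<Row> downstream) {
            for (Row row : matches) {
                if (row.score < minCompetitive) continue; // cannot reach the top N anymore
                emitted++;
                downstream.accept(row);
            }
        }
    }

    /** Toy stand-in for the top-N operator: a bounded min-heap ordered by score. */
    static final class TopN {
        private final int limit;
        private final MinCompetitiveListener listener;
        private final PriorityQueue<Row> queue =
            new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

        TopN(int limit, MinCompetitiveListener listener) {
            this.limit = limit;
            this.listener = listener;
        }

        void add(Row row) {
            if (queue.size() < limit) {
                queue.add(row);
            } else if (row.score > queue.peek().score) {
                queue.poll();      // evict the current worst row
                queue.add(row);
            } else {
                return;            // ties keep the row seen first, so nothing changed
            }
            if (queue.size() == limit) {
                // Only a strictly better score can still displace the worst row.
                listener.onMinCompetitiveScore(Math.nextUp(queue.peek().score));
            }
        }

        List<Row> result() {
            List<Row> out = new ArrayList<>(queue);
            out.sort((a, b) -> Float.compare(b.score, a.score)); // best score first
            return out;
        }
    }

    static final class Result {
        final List<Row> top;
        final int emittedBySource;
        Result(List<Row> top, int emittedBySource) { this.top = top; this.emittedBySource = emittedBySource; }
    }

    /** Wires source -> non-pushable filter -> top N, with the callback going back to the source. */
    static Result topN(List<Row> matches, Predicate<Row> nonPushableFilter, int limit) {
        Source source = new Source(matches);
        TopN topN = new TopN(limit, source);
        source.forEachCompetitive(row -> {
            if (nonPushableFilter.test(row)) topN.add(row);
        });
        return new Result(topN.result(), source.emitted);
    }
}
```

Calling `MinCompetitiveSketch.topN(matches, row -> row.title.length() > 10, 10)` mirrors the second query above: the filter still runs after the source, but once the queue holds 10 rows the source stops emitting rows whose score is below the current worst entry.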



ES|QL: Add min competitive optimization for lucene operators #136267
