Optimize query filtering on the result of another query #5925

@Jackie-Jiang

Description

Example query:
SELECT COUNT(*) FROM table1 WHERE id1 IN (SELECT id2 FROM table2 WHERE ...) ...

A naive solution would be to send the sub-query first, gather its result, and then use that result to construct the main query. This works when the sub-query result is small (< 100 values), but when the result becomes large (> 1000 values), the cost of serialization/deserialization, query compilation, and query processing becomes too high.
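As a rough illustration of the naive plan described above, the following Python sketch inlines the sub-query result into the main query as an IN list. The helper name `build_main_query` is hypothetical and is not part of any existing API; it only shows that the main query text grows linearly with the number of ids, and that text is what has to be serialized, compiled, and processed.

```python
def build_main_query(ids):
    """Inline the sub-query result into the main query (naive plan).

    `ids` stands for the values returned by the sub-query; this helper is
    a hypothetical illustration, not an existing function.
    """
    ids = list(ids)
    if not ids:
        # An empty IN list is not valid SQL, so the caller has to handle it.
        raise ValueError("sub-query returned no ids")
    in_list = ", ".join(str(i) for i in ids)
    return f"SELECT COUNT(*) FROM table1 WHERE id1 IN ({in_list})"


small = build_main_query(range(100))
large = build_main_query(range(100_000))
# The query text grows with the size of the sub-query result.
print(len(small), len(large))
```

The larger query is several orders of magnitude longer than the small one before any processing has started, which is the overhead the issue wants to avoid.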

To optimize this query, we need to reduce the number of ids each query has to process. We can rely on partitioning to achieve that: when all the segments for a partition are on a single server, that server can solve the query for the partition locally, without the broker having to gather the sub-query result and send it back out.
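The partition-local idea can be sketched as a small simulation. This is only a sketch under stated assumptions, not the actual implementation: it assumes both tables are partitioned on the filtered column with the same partition function and that matching partitions are placed on the same server. All names (`partition_of`, `assign_to_servers`, `server_side_count`, `broker_count`) and the modulo partition function are hypothetical. `table2_ids` stands for the ids that already passed the sub-query's filter.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_of(key):
    # Assumption: both tables use the same partition function on the id.
    return key % NUM_PARTITIONS


def assign_to_servers(ids, num_servers):
    """Place every id of a partition on exactly one server."""
    servers = defaultdict(list)
    for key in ids:
        servers[partition_of(key) % num_servers].append(key)
    return servers


def server_side_count(table1_ids, table2_ids):
    """Solve the IN filter locally, using only the ids held on this server."""
    matching = set(table2_ids)
    return sum(1 for key in table1_ids if key in matching)


def broker_count(table1_ids, table2_ids, num_servers=2):
    """The broker merges one count per server instead of the full id list."""
    t1 = assign_to_servers(table1_ids, num_servers)
    t2 = assign_to_servers(table2_ids, num_servers)
    return sum(server_side_count(t1[s], t2[s]) for s in range(num_servers))


table1 = [1, 2, 2, 3, 8, 9]
table2 = [2, 3, 9, 10]  # ids that passed the sub-query filter
print(broker_count(table1, table2))
```

Because an id can only match ids in its own partition, each server needs to see only its local share of the sub-query result, and the broker merges small partial counts rather than a large id list.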

Labels: feature (New functionality)