We are seeing high arrangement record counts for chbench queries, for example:
The largest ones,
However, our ability to compact things is compromised by the arrangement containing primary keys, even though the query does not use them. For example, in both queries we only use five fields from
We could project down to the demanded columns and arrange just those, particularly for private arrangements. This could hurt sharing within a query (e.g. delta queries would need to be careful about doing this), but the potential upside is massive: large collections with relatively few distinct values of interest would collapse to far fewer records.
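To illustrate the effect, here is a minimal sketch in plain Rust (standard library only, not the actual differential dataflow or Materialize arrangement code). The `Row` layout, the `consolidate` helper, and the sample data are all hypothetical; the point is only that a unique primary key keeps every record distinct, while projecting it away first lets equal records merge into one entry with a larger count.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical wide row: a primary key plus two columns the query demands.
type Row = (u64, u32, u32); // (primary_key, col_a, col_b)

/// Consolidate `(data, diff)` updates: sum the diffs of equal records and
/// drop records whose diffs cancel to zero. The number of entries left
/// stands in for the arrangement's record count.
fn consolidate<T: Hash + Eq>(updates: impl IntoIterator<Item = (T, i64)>) -> HashMap<T, i64> {
    let mut acc: HashMap<T, i64> = HashMap::new();
    for (data, diff) in updates {
        *acc.entry(data).or_insert(0) += diff;
    }
    acc.retain(|_, diff| *diff != 0);
    acc
}

/// Arrange full rows: the unique primary key prevents any compaction.
fn arrange_full(rows: &[Row]) -> HashMap<Row, i64> {
    consolidate(rows.iter().map(|row| (*row, 1)))
}

/// Project down to the demanded columns first, then arrange.
fn arrange_projected(rows: &[Row]) -> HashMap<(u32, u32), i64> {
    consolidate(rows.iter().map(|&(_pk, a, b)| ((a, b), 1)))
}

/// Made-up input: 1000 rows with unique keys but only a handful of
/// distinct `(col_a, col_b)` combinations.
fn sample_rows() -> Vec<Row> {
    (0..1000u64)
        .map(|pk| (pk, (pk % 4) as u32, (pk % 2) as u32))
        .collect()
}
```

With `sample_rows()`, `arrange_full` keeps one record per input row, whereas `arrange_projected` keeps one record per distinct pair of demanded columns, with the multiplicity carried in the count. The trade-off mentioned above still applies: another operator in the same query that needs the full rows could not share the projected arrangement.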
For bonus points, if we noticed that
cc @wangandi as relevant to your interests.
A more probable way to win partial bonus points: if we push down the reduction (summing