Statistics-based query optimisation #269

kal · 2016-01-20T14:16:23Z

Related to #111

This is a note to myself for future reference...

I tried using the existing store stats for optimization purposes, but we currently aren't collecting enough statistics: we only have a total statement count for each predicate. Ideally we should also collect the number of distinct subjects and distinct objects for each predicate, as that would allow a rough estimate of the weight of a pattern with any two of s, p, and o bound. With the current statistics we have to use default weightings for s and o, which can throw off the optimizer quite badly.
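As a rough illustration of the kind of estimate the extra counts would make possible, here is a minimal Python sketch. It is not the store's actual code: `PredicateStats` and `estimate_weight` are hypothetical names, it only covers patterns where the predicate is bound, and it assumes statements are spread uniformly across subjects and objects, which real data often isn't.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredicateStats:
    """Hypothetical per-predicate statistics (not an existing store structure)."""
    triple_count: int        # total statements using this predicate
    distinct_subjects: int   # distinct subjects seen with this predicate
    distinct_objects: int    # distinct objects seen with this predicate


def estimate_weight(stats: PredicateStats, s_bound: bool, o_bound: bool) -> float:
    """Estimate how many statements match a pattern whose predicate is bound.

    Assumes a uniform distribution of statements over subjects and objects.
    """
    if stats.triple_count == 0:
        return 0.0
    estimate = float(stats.triple_count)
    if s_bound:
        # Average number of statements per distinct subject.
        estimate /= max(stats.distinct_subjects, 1)
    if o_bound:
        # Average number of statements per distinct object.
        estimate /= max(stats.distinct_objects, 1)
    if s_bound and o_bound:
        # A fully bound pattern can match at most one statement.
        estimate = min(estimate, 1.0)
    return estimate


# Made-up numbers: 1000 statements, 100 distinct subjects, 10 distinct objects.
stats = PredicateStats(triple_count=1000, distinct_subjects=100, distinct_objects=10)
p_only = estimate_weight(stats, s_bound=False, o_bound=False)
s_and_p = estimate_weight(stats, s_bound=True, o_bound=False)
p_and_o = estimate_weight(stats, s_bound=False, o_bound=True)
```

With only the total count available, the `s_bound` and `o_bound` divisions would have to fall back on fixed default factors, which is the situation described above.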

It would be worth looking at updating collection of store statistics and then revisiting this.

This work is still incomplete, though, as testing shows that the optimizer isn't really doing a good job. This is primarily (I think) because we don't have enough stats. See issue #269.

kal added the enhancement label Jan 20, 2016
