Fix cost estimate for transitive path (#383)
Make sure that the query plan which computes the full transitive hull is
always the most expensive. Achieved by setting the size estimate for
<predicate>+ to 10000 times the size estimate for <predicate>.
hannahbast committed Apr 19, 2021
1 parent a43b0e2 commit 917eadd
Showing 1 changed file with 13 additions and 0 deletions.
13 changes: 13 additions & 0 deletions src/engine/TransitivePath.cpp
@@ -150,6 +150,19 @@ size_t TransitivePath::getSizeEstimate() {
   if (_rightSideTree != nullptr) {
     return _rightSideTree->getSizeEstimate();
   }
+  // Set costs to something very large, so that we never compute the complete
+  // transitive hull (unless the variables on both sides are not bound in any
+  // other way, so that the only possible query plan is to compute the complete
+  // transitive hull).
+  //
+  // NOTE: _subtree->getSizeEstimate() is the number of triples of the
+  // predicate, for which the transitive hull operator (+) is specified. On
+  // Wikidata, the predicate with the largest blowup when taking the
+  // transitive hull is wdt:P2789 (connects with). The blowup is then from 90K
+  // (without +) to 110M (with +), so about 1000 times larger.
+  if (_leftIsVar && _rightIsVar) {
+    return _subtree->getSizeEstimate() * 10000;
+  }
   // TODO(Florian): this is not necessarily a good estimator
   if (_leftIsVar) {
     return _subtree->getSizeEstimate() / _subtree->getMultiplicity(_leftSubCol);
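
The arithmetic behind the factor can be checked in isolation. The following is a minimal standalone sketch, not code from the repository: `fullHullSizeEstimate`, `observedBlowup`, and `kFullHullFactor` are hypothetical names introduced here for illustration, and the only numbers used are the ones quoted in the comment above (90K triples without `+`, 110M with `+`) and the factor 10000 from the diff.

```cpp
#include <cstddef>

// Hypothetical constant mirroring the literal 10000 used in the diff.
constexpr std::size_t kFullHullFactor = 10000;

// Hypothetical helper: the size estimate returned when both sides of
// <predicate>+ are unbound variables, given the size estimate of <predicate>.
constexpr std::size_t fullHullSizeEstimate(std::size_t predicateSizeEstimate) {
  return predicateSizeEstimate * kFullHullFactor;
}

// Hypothetical helper: the measured blowup factor of a transitive hull, i.e.
// the result size with + divided by the result size without +.
constexpr double observedBlowup(double sizeWithoutPlus, double sizeWithPlus) {
  return sizeWithPlus / sizeWithoutPlus;
}

// Figures quoted in the code comment for wdt:P2789 on Wikidata.
constexpr std::size_t kConnectsWithTriples = 90000;       // without +
constexpr std::size_t kConnectsWithHullSize = 110000000;  // with +

// The estimate deliberately overshoots the largest blowup quoted above.
static_assert(fullHullSizeEstimate(kConnectsWithTriples) > kConnectsWithHullSize,
              "estimate should exceed the largest quoted hull size");
```

With these figures the measured blowup is roughly 1200, so a factor of 10000 leaves about one order of magnitude of headroom, which is consistent with the stated goal that the plan computing the full transitive hull is always the most expensive one.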
