Use Predicates with Uncorrelated Subqueries for Dynamic Pruning #2588

Merged
merged 40 commits into from
Aug 25, 2023
Changes from 2 commits
Commits
34bc947
[WIP] subquery pruning
dey4ss Apr 3, 2023
e63f675
ensure deep_copy works for LQPNodes and Operators
dey4ss Apr 3, 2023
eed2c18
set lqp_node for deep copied operators
dey4ss Apr 3, 2023
ec47d98
only count additionally pruned chunks in description
dey4ss Apr 3, 2023
0346490
move visit_tasks to utility
dey4ss Apr 4, 2023
b58657b
refactor
dey4ss Apr 4, 2023
b387197
add some tests
dey4ss Apr 4, 2023
df2b05b
ensure acyclic task graphs
dey4ss Apr 4, 2023
4d828c6
minor
dey4ss Apr 4, 2023
414b92b
add some tests
dey4ss Apr 5, 2023
95163af
more tests
dey4ss Apr 5, 2023
28ca64b
test subquery not executed for subquery pruning
dey4ss Apr 5, 2023
dc6776e
polish
dey4ss Apr 5, 2023
6d9f719
add prunable subquery predicates to hash and equality of StoredTableNode
dey4ss May 11, 2023
2017f3e
Merge branch 'master' into dey4ss/subquery_pruning
dey4ss Jun 15, 2023
ec9eb5e
add include
dey4ss Jun 15, 2023
455465e
merge
dey4ss Jun 28, 2023
2c15006
add include
dey4ss Jun 28, 2023
247966d
simplify StoredTableNode equality check
dey4ss Jun 28, 2023
18950bc
Merge branch 'dey4ss/subquery_pruning' of https://github.com/hyrise/h…
dey4ss Jun 28, 2023
83d6b84
unused var
dey4ss Jun 30, 2023
f3d482d
merge, merge, merge
dey4ss Jul 17, 2023
e2a6533
-.-
dey4ss Jul 17, 2023
7aa6f08
merge
dey4ss Jul 27, 2023
79b443a
some feedback
dey4ss Jul 27, 2023
697f688
helper for prunable subquery mapping
dey4ss Jul 27, 2023
567e81d
should not code late
dey4ss Jul 28, 2023
67c0b1e
wtf gcc o.O
dey4ss Jul 28, 2023
dd39b51
I think for my next paper, I just need to copy Hyrise comments.
dey4ss Aug 1, 2023
f759622
minor
dey4ss Aug 9, 2023
bf26aa8
merge
dey4ss Aug 11, 2023
5416ede
review
dey4ss Aug 14, 2023
3bc22b7
sacre bleu
dey4ss Aug 14, 2023
00f8538
merge
dey4ss Aug 15, 2023
196f3c5
remove code paths for potential task cycles
dey4ss Aug 15, 2023
d9c59c2
where is my mind?
dey4ss Aug 16, 2023
e514e72
Merge branch 'master' into dey4ss/subquery_pruning
dey4ss Aug 17, 2023
914b027
Merge branch 'master' into dey4ss/subquery_pruning
dey4ss Aug 17, 2023
634e979
memory leak? who said memory leak...?
dey4ss Aug 23, 2023
c40c8b0
trigger
dey4ss Aug 23, 2023
16 changes: 7 additions & 9 deletions src/lib/operators/get_table.cpp
@@ -359,10 +359,9 @@ std::set<ChunkID> GetTable::_prune_chunks_dynamically() {
continue;
}

-// Ignore the subquery if it has not been executed yet. A reason might be that scheduling the subquery before the
-// GetTable operator would create a cycle. For instance, this can happen for a query like this:
-// SELECT * FROM a_table WHERE x > (SELECT AVG(x) FROM a_table);
-// The PQP of the query looks like the following:
+// It might happen that scheduling the subquery before the GetTable operator would create a cycle. For instance,
+// this can happen for a query like this: SELECT * FROM a_table WHERE x > (SELECT AVG(x) FROM a_table);
+// The PQP of the query could look like the following:
//
// [TableScan] x > SUBQUERY
// | *
@@ -373,11 +372,10 @@ std::set<ChunkID> GetTable::_prune_chunks_dynamically() {
// [GetTable] a_table
//
// We cannot schedule the AggregateHash operator before the GetTable operator to obtain the subquery result for
-// pruning: the OperatorTasks wrapping both operators would be in a circular wait for each other.
-if (subquery.pqp->state() != OperatorState::ExecutedAndAvailable) {
-  continue;
-}
-
+// pruning: the OperatorTasks wrapping both operators would be in a circular wait for each other. We avoid this
+// circular wait by including the prunable_subquery_predicates of StoredTableNodes in their equality checks. Thus,
+// the LQPTranslator creates two GetTable operators rather than deduplicating them. resolve_uncorrelated_subquery()
+// asserts that the subquery has already been executed.
argument = value_(resolve_uncorrelated_subquery(subquery.pqp));
}
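
The comment above relies on equality-based deduplication: two nodes that compare equal share one operator, while nodes that differ in their prunable predicates get separate operators. The following is a minimal sketch of that idea, not Hyrise's actual code. `TableNode`, `TableNodeHash`, `ScanOperator`, and `Translator` are hypothetical stand-ins, and predicates are modelled as plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a StoredTableNode: a table name plus the predicates whose
// subquery results may be used for dynamic pruning (plain strings in this sketch).
struct TableNode {
  std::string table_name;
  std::vector<std::string> prunable_subquery_predicates;

  // The prunable predicates are part of the equality check (assumption mirroring the PR text).
  bool operator==(const TableNode& other) const {
    return table_name == other.table_name &&
           prunable_subquery_predicates == other.prunable_subquery_predicates;
  }
};

// Hash must be consistent with operator==, so it also covers the prunable predicates.
struct TableNodeHash {
  std::size_t operator()(const TableNode& node) const {
    auto hash = std::hash<std::string>{}(node.table_name);
    for (const auto& predicate : node.prunable_subquery_predicates) {
      hash ^= std::hash<std::string>{}(predicate) + 0x9e3779b9 + (hash << 6) + (hash >> 2);
    }
    return hash;
  }
};

// Hypothetical stand-in for a GetTable operator.
struct ScanOperator {
  std::string table_name;
};

// Hypothetical translator that creates one operator per distinct node and reuses it for equal nodes.
class Translator {
 public:
  std::shared_ptr<ScanOperator> translate(const TableNode& node) {
    auto [iter, inserted] = _operator_by_node.try_emplace(node, nullptr);
    if (inserted) {
      iter->second = std::make_shared<ScanOperator>(ScanOperator{node.table_name});
    }
    return iter->second;
  }

 private:
  std::unordered_map<TableNode, std::shared_ptr<ScanOperator>, TableNodeHash> _operator_by_node;
};
```

In this sketch, a node for `a_table` that carries the predicate `x > SUBQUERY` and a node for `a_table` without prunable predicates are unequal, so `Translator::translate()` returns two distinct operators. The subquery's operator can then run before the outer one without waiting on itself.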

22 changes: 7 additions & 15 deletions src/lib/scheduler/operator_task.cpp
@@ -63,21 +63,13 @@ void link_tasks_for_subquery_pruning(const std::unordered_set<std::shared_ptr<Op
const auto& subquery_root = subquery->get_or_create_operator_task();
Assert(tasks.contains(subquery_root), "Unknown OperatorTask.");

-// Cycles in the task graph would lead to deadlocks during execution. To make sure we do not introduce cycles,
-// we only set the subquery task as predecessor of the GetTable task if it is not a successor of the GetTable
-// task.
-auto is_acyclic = true;
-visit_tasks_upwards(task, [&](const auto& successor) {
-  if (successor == subquery_root) {
-    is_acyclic = false;
-    return TaskUpwardVisitation::DoNotVisitSuccessors;
-  }
-  return TaskUpwardVisitation::VisitSuccessors;
-});
-
-if (is_acyclic) {
-  subquery_root->set_as_predecessor_of(task);
-}
+// Cycles in the task graph would lead to deadlocks during execution. This could happen if a table can be pruned
+// using a predicate on itself (e.g., `SELECT * FROM a_table WHERE x > (SELECT AVG(x) FROM a_table)`). To make
+// sure we do not introduce cycles, we include the prunable_subquery_predicates of a StoredTableNode in its
+// equality check. Thus, we have two unequal nodes that are translated to distinct operators by the
+// LQPTranslator (and no further sanity check should be necessary). However, we still check for cycles after
+// linking all tasks in debug builds.
+subquery_root->set_as_predecessor_of(task);
}
}
}
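
The comment mentions that debug builds still check for cycles after all tasks are linked. One common way to do such a check is a depth-first search that reports a cycle when it reaches a task that is still on the current search path. The sketch below illustrates this under stated assumptions: `Task`, `VisitState`, `has_cycle_from()`, and `has_cycle()` are hypothetical names, and this is not the check used in Hyrise.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an OperatorTask: it only knows the tasks that have to wait for it.
struct Task {
  std::vector<const Task*> successors;  // Non-owning pointers, tasks are owned by the caller.
};

enum class VisitState { InProgress, Done };

// Depth-first search. Reaching a task that is still InProgress means we found a back edge, i.e., a cycle.
bool has_cycle_from(const Task* task, std::unordered_map<const Task*, VisitState>& states) {
  const auto iter = states.find(task);
  if (iter != states.end()) {
    return iter->second == VisitState::InProgress;
  }

  states[task] = VisitState::InProgress;
  for (const auto* successor : task->successors) {
    if (has_cycle_from(successor, states)) {
      return true;
    }
  }
  states[task] = VisitState::Done;
  return false;
}

// Returns true if any of the given tasks is part of (or leads into) a cycle.
bool has_cycle(const std::vector<const Task*>& tasks) {
  auto states = std::unordered_map<const Task*, VisitState>{};
  for (const auto* task : tasks) {
    if (has_cycle_from(task, states)) {
      return true;
    }
  }
  return false;
}
```

For the query from the comment, a single shared GetTable task would be a predecessor of the aggregate (which reads from it) and, through the pruning link, also its successor. `has_cycle()` reports exactly that situation, whereas the graph with two distinct GetTable tasks passes the check.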