Iterative optimizer #6956

Merged
merged 2 commits into from Jan 7, 2017

@martint (Contributor) commented Dec 23, 2016

This optimizer decouples the traversal of the plan tree (IterativeOptimizer)
from the transformation logic (Rule). The optimization loop applies rules
recursively until a fixpoint is reached.

It's implemented as PlanOptimizer so that it fits right into the existing
framework.
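
As a rough illustration of the description above, the following sketch keeps the traversal loop separate from the rules and reapplies rules until nothing changes. All names (IterativeSketch, Node, Rule, removeIdentity) are hypothetical stand-ins rather than the classes in this PR, and the sketch walks the tree directly instead of going through a Memo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified stand-ins; not the classes added by this PR.
final class IterativeSketch
{
    // Immutable plan node: a name plus child nodes.
    static final class Node
    {
        final String name;
        final List<Node> children;

        Node(String name, Node... children)
        {
            this(name, Arrays.asList(children));
        }

        Node(String name, List<Node> children)
        {
            this.name = name;
            this.children = Collections.unmodifiableList(new ArrayList<>(children));
        }
    }

    // Transformation logic: inspects one node and returns a replacement, or empty.
    interface Rule
    {
        Optional<Node> apply(Node node);
    }

    // Example rule (assumed for illustration): drop "identity" nodes with one child.
    static Optional<Node> removeIdentity(Node node)
    {
        if (node.name.equals("identity") && node.children.size() == 1) {
            return Optional.of(node.children.get(0));
        }
        return Optional.empty();
    }

    // Traversal logic: apply rules to this node and its children until nothing changes.
    static Node optimize(Node node, List<Rule> rules)
    {
        boolean progress = true;
        while (progress) {
            progress = false;
            for (Rule rule : rules) {
                Optional<Node> replacement = rule.apply(node);
                if (replacement.isPresent()) {
                    node = replacement.get();
                    progress = true;
                }
            }
            List<Node> newChildren = new ArrayList<>();
            boolean childChanged = false;
            for (Node child : node.children) {
                Node optimized = optimize(child, rules);
                childChanged |= optimized != child;
                newChildren.add(optimized);
            }
            if (childChanged) {
                // Rebuild the parent so the rules get another look at the new shape.
                node = new Node(node.name, newChildren);
                progress = true;
            }
        }
        return node;
    }

    public static void main(String[] args)
    {
        Node plan = new Node("root", new Node("identity", new Node("identity", new Node("leaf"))));
        Rule rule = IterativeSketch::removeIdentity;
        Node optimized = optimize(plan, Collections.singletonList(rule));
        System.out.println(optimized.name + " -> " + optimized.children.get(0).name);
    }
}
```

The rule only ever sees a single node, so new rules can be added without touching the loop. A rule that keeps producing new nodes would make this loop run forever, which is one reason a real implementation may want additional safeguards.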

Extracted from #6700

@martint martint force-pushed the iterative branch 2 times, most recently from 7db2d56 to d7b54c2 Compare December 23, 2016 02:26
This makes it easier to extend the node hierarchy for tests in the
upcoming iterative optimizer.
@martint martint force-pushed the iterative branch 2 times, most recently from eea7ab9 to 1d57f9d Compare December 23, 2016 04:57
@kokosing (Contributor) left a comment

@@ -72,4 +74,11 @@ public Symbol getIdColumn()
{
return visitor.visitAssignUniqueId(this, context);
}

@Override
public PlanNode replaceChildren(List<PlanNode> newChildren)
Contributor

What about renaming this to replaceSources? You have getSources() but replaceChildren, which sounds as if they refer to two different things.

Contributor Author

"sources" is a legacy term. It used to be accurate a long time ago, before we even had support for "index joins". Then, with that feature, and now with Apply, it can mean either source or subplan. So "child" is more correct now.

return node(idAllocator.getNextId(), source);
}

private GenericNode node(PlanNodeId id)
Contributor

I think you could have only two node methods:

node(PlanNodeId id, PlanNode ... sources);
node(PlanNode ... sources);

Contributor Author

Ah, duh! I renamed the old filter/project/join/values methods and forgot to deal with the duplication :)
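
The two-method suggestion above could look roughly like the sketch below. It is only an illustration: NodeFactorySketch, the String ids, and the counter that allocates them are assumptions, and GenericNode here is a simplified stand-in for the test node in the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical test helper; not the code from this PR.
final class NodeFactorySketch
{
    static final class GenericNode
    {
        final String id;
        final List<GenericNode> sources;

        GenericNode(String id, List<GenericNode> sources)
        {
            this.id = id;
            this.sources = Collections.unmodifiableList(new ArrayList<>(sources));
        }
    }

    // Stand-in for an id allocator.
    private int nextId;

    // Explicit id, any number of sources (including none).
    GenericNode node(String id, GenericNode... sources)
    {
        return new GenericNode(id, Arrays.asList(sources));
    }

    // Allocated id, any number of sources (including none).
    GenericNode node(GenericNode... sources)
    {
        return node(String.valueOf(nextId++), sources);
    }
}
```

Because the id type differs from the node type, calls such as node(), node(child) and node("custom", child) each resolve to exactly one of the two overloads.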

int yGroup = getChildGroup(memo, memo.getRootGroup());
int zGroup = getChildGroup(memo, yGroup);

PlanNode rewrittenW = memo.getNode(zGroup).getSources().get(0);
Contributor

It is just w. There is no rewrite here.

Contributor Author

For this node instance in particular, sure. Potentially, it's a rewritten node, because that's just what the memo does with every node in the plan tree in order to replace their children with group references.

In this case, I'm trying to simulate what a rule would see. It would not be correct to use w directly.
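
To make the point about group references concrete, here is a minimal sketch of a memo that stores each node in its own group and replaces its children with references to other groups. MemoSketch, Node and GroupReference are hypothetical, simplified names and do not reproduce the Memo class from this PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of a memo; not the Memo class from this PR.
final class MemoSketch
{
    static class Node
    {
        final String name;
        final List<Node> sources;

        Node(String name, List<Node> sources)
        {
            this.name = name;
            this.sources = Collections.unmodifiableList(new ArrayList<>(sources));
        }
    }

    // Placeholder that points at a group instead of a concrete child node.
    static final class GroupReference extends Node
    {
        final int groupId;

        GroupReference(int groupId)
        {
            super("group-" + groupId, Collections.<Node>emptyList());
            this.groupId = groupId;
        }
    }

    private final Map<Integer, Node> groups = new HashMap<>();
    private final int rootGroup;
    private int nextGroupId;

    MemoSketch(Node plan)
    {
        rootGroup = insert(plan);
    }

    // Stores a copy of the node whose children are group references, one group per node.
    private int insert(Node node)
    {
        int group = nextGroupId++;
        List<Node> references = new ArrayList<>();
        for (Node source : node.sources) {
            references.add(new GroupReference(insert(source)));
        }
        groups.put(group, new Node(node.name, references));
        return group;
    }

    int getRootGroup()
    {
        return rootGroup;
    }

    Node getNode(int group)
    {
        return groups.get(group);
    }

    int getGroupCount()
    {
        return groups.size();
    }
}
```

In this model every stored node is a fresh copy, even a leaf, so a caller that reads a node back from the memo never holds the instance it originally passed in. That is the sense in which a node read from the memo is "potentially rewritten".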


Memo memo = new Memo(idAllocator, x);

assertEquals(memo.getGroupCount(), 4);
Contributor

Why not use IterativeOptimizer here with a simple lambda rule? Then you could also check the traversal of IterativeOptimizer.

Something like:

optimizer(node -> (node == y || node == z) ? node(node.getSource()) : node).optimize(x)

Contributor Author

I could do that, but I wanted to test the Memo in isolation.


import static org.testng.Assert.assertEquals;

public class TestMemo
Contributor

Can you add a test for a cycle? Rules can have bugs, so it would be nice to see how the memo handles that.

@martint (Contributor Author), Dec 23, 2016

I'm not sure how a rule would be able to cause a cycle. Unless rules make up group ids out of thin air, I don't think it's possible to do so. Let me think about it.

@Config("experimental.iterative-optimizer-enabled")
public FeaturesConfig setIterativeOptimizerEnabled(boolean value)
{
this.newOptimizerEnabled = value;
Contributor

Rename the field to match the setter (it is still newOptimizerEnabled).

@martint (Contributor Author) commented Dec 23, 2016

What about these comments:
13e6b6a#r93578038

It's enabled for all tests deriving from AbstractTestQueries. I didn't want to add another instance to avoid increasing build times.

13e6b6a#r93575465

I expect Lookup to evolve beyond just exposing what is in the Memo, e.g., to provide "traits" (effective predicate, sortedness, cost, etc.) that don't belong in the Memo structure. We can always fold it into the Memo if necessary.

13e6b6a#r93576086

Let me play with it. It might be over-engineering it.

13e6b6a#r93576974

I'll take a look. The exception may be a remnant from some intermediate code I was working on. Regardless, at some point we need to get rid of OutputNode. It serves no purpose other than as a glorified ProjectNode.

@kokosing (Contributor) commented

> It's enabled for all tests deriving from AbstractTestQueries. I didn't want to add another instance to avoid increasing build times.

What about testing the other way round? The iterative optimizer is disabled by default, so tests that run with it enabled could hide regressions when it "fixes" a plan that bugs in the legacy rules have broken.

@kokosing (Contributor) left a comment

Just one comment about testing a cycle in Memo, and IMO it is ready to go.

@martint (Contributor Author) commented Dec 23, 2016

> What about testing the other way round? Iterative optimizer is disabled by default which could hide some regressions in case that iterative optimizer "fixes" plan in case of bugs in legacy rules.

Turn on the new optimizer in prod, then? ;)

In all seriousness, this is a challenge. Until we come up with a better story around how to deal with the matrix of features vs number of tests (i.e., #6910), we're going to have to live with compromises like this.

@kokosing (Contributor) commented

> Turn on the new optimizer in prod, then? ;)

Sounds good to me. ;)

> In all seriousness, this is a challenge. Until we come up with a better story around how to deal with the matrix of features vs number of tests (i.e., #6910), we're going to have to live with compromises like this.

Let's give #6910 a higher priority then; I will raise it within TD.

@kokosing (Contributor) commented Dec 23, 2016 via email

@Override
public PlanNode replaceChildren(List<PlanNode> newChildren)
{
return new MetadataDeleteNode(getId(), target, output, tableLayout);
Contributor

assert newChildren.isEmpty()?
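
One way to act on this suggestion is sketched below for a generic leaf node. LeafNodeSketch, Node and Leaf are hypothetical names, and the plain IllegalArgumentException is an assumption made to keep the example self-contained; the PR's own nodes may validate differently.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical leaf node; a stand-in for a node without sources.
final class LeafNodeSketch
{
    interface Node
    {
        List<Node> getSources();

        Node replaceChildren(List<Node> newChildren);
    }

    static final class Leaf implements Node
    {
        final String id;

        Leaf(String id)
        {
            this.id = id;
        }

        @Override
        public List<Node> getSources()
        {
            return Collections.emptyList();
        }

        @Override
        public Node replaceChildren(List<Node> newChildren)
        {
            // A leaf has no sources, so any child passed in signals a caller bug.
            if (!newChildren.isEmpty()) {
                throw new IllegalArgumentException("newChildren is not empty");
            }
            return new Leaf(id);
        }
    }
}
```

Failing fast here means a caller that passes the wrong child list gets an error at the call site instead of having its children silently dropped.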

4 participants