@waynexia
Member

@waynexia waynexia commented Apr 11, 2023

Which issue does this PR close?

Closes #.

Rationale for this change

Just as Optimizer lets users customize its rules, this patch adds a similar interface to Analyzer and makes it public.

What changes are included in this PR?

Allow users to insert their customized analyzer rules into the optimizer.

Are these changes tested?

Are there any user-facing changes?

Yes, but not breaking. This makes Analyzer and AnalyzerRule public and adds the corresponding config interfaces to SessionContext.
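
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a pluggable analyzer. It does not use the DataFusion crate: `ToyPlan`, `ToyAnalyzerRule`, `ToyAnalyzer`, and `AppendStep` are hypothetical stand-ins invented for illustration, and the real `Analyzer` and `AnalyzerRule` signatures may differ.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a logical plan: just a list of step names.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyPlan {
    pub steps: Vec<String>,
}

/// Hypothetical stand-in for a public analyzer rule trait.
pub trait ToyAnalyzerRule {
    /// Rewrite the plan, or return an error message.
    fn analyze(&self, plan: ToyPlan) -> Result<ToyPlan, String>;
    /// Human-readable rule name.
    fn name(&self) -> &str;
}

/// Hypothetical stand-in for an analyzer that owns a user-supplied rule list.
pub struct ToyAnalyzer {
    pub rules: Vec<Arc<dyn ToyAnalyzerRule + Send + Sync>>,
}

impl ToyAnalyzer {
    /// Build an analyzer from caller-provided rules instead of a fixed default set.
    pub fn with_rules(rules: Vec<Arc<dyn ToyAnalyzerRule + Send + Sync>>) -> Self {
        Self { rules }
    }

    /// Apply every rule in order, stopping at the first error.
    pub fn execute(&self, plan: ToyPlan) -> Result<ToyPlan, String> {
        self.rules.iter().try_fold(plan, |p, rule| rule.analyze(p))
    }
}

/// A custom rule that appends a marker step to the plan.
pub struct AppendStep(pub &'static str);

impl ToyAnalyzerRule for AppendStep {
    fn analyze(&self, mut plan: ToyPlan) -> Result<ToyPlan, String> {
        plan.steps.push(self.0.to_string());
        Ok(plan)
    }

    fn name(&self) -> &str {
        "append_step"
    }
}

/// Usage: build an analyzer from two custom rules and run it on a tiny plan.
pub fn run_example() -> Result<ToyPlan, String> {
    let analyzer = ToyAnalyzer::with_rules(vec![
        Arc::new(AppendStep("resolve_names")),
        Arc::new(AppendStep("check_types")),
    ]);
    analyzer.execute(ToyPlan {
        steps: vec!["scan".to_string()],
    })
}
```

The rules run in the order they are supplied, which is why a constructor taking the full rule list gives callers control over both the set and the sequence.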

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
@github-actions github-actions bot added core Core DataFusion crate optimizer Optimizer rules labels Apr 11, 2023
Comment on lines 263 to 271
pub fn with_rules(
    analyzer_rules: Vec<Arc<dyn AnalyzerRule + Send + Sync>>,
    rules: Vec<Arc<dyn OptimizerRule + Send + Sync>>,
) -> Self {
    Self {
        analyzer_rules,
        rules,
    }
}
Member Author

I'm unsure whether this is the proper place to customize analyzer rules. But in the current code structure, Analyzer is part of Optimizer, and only Optimizer is exposed to the context. Please let me know what you think @alamb @jackwener

@jackwener
Member

I think putting analyzer rules in the optimizer isn't a good idea.
They are two separate parts, and I think someday we may move the analyzer into an independent part.

@waynexia
Member Author

I think putting analyzer rules in the optimizer isn't a good idea. They are two separate parts, and I think someday we may move the analyzer into an independent part.

Same concern. To avoid a second breaking change in the future, we'd better separate them from the beginning.

I'll change the config part and add Analyzer to the session context, but leave the code structure unchanged.
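
As a rough illustration of that separation, the sketch below keeps analyzer rules and optimizer rules in distinct fields with distinct setters, so either set can change without touching the other. Everything here is hypothetical and self-contained: `ToySessionState`, the two `Toy*Rule` traits, and the method names are stand-ins, not the session context API added by this PR.

```rust
use std::sync::Arc;

/// Hypothetical rule traits; only a name is needed for this sketch.
pub trait ToyAnalyzerRule {
    fn name(&self) -> &str;
}
pub trait ToyOptimizerRule {
    fn name(&self) -> &str;
}

type AnalyzerRuleRef = Arc<dyn ToyAnalyzerRule + Send + Sync>;
type OptimizerRuleRef = Arc<dyn ToyOptimizerRule + Send + Sync>;

/// Hypothetical session state that keeps the two rule sets apart.
#[derive(Default)]
pub struct ToySessionState {
    analyzer_rules: Vec<AnalyzerRuleRef>,
    optimizer_rules: Vec<OptimizerRuleRef>,
}

impl ToySessionState {
    /// Replace the analyzer rules without touching the optimizer rules.
    pub fn with_analyzer_rules(mut self, rules: Vec<AnalyzerRuleRef>) -> Self {
        self.analyzer_rules = rules;
        self
    }

    /// Replace the optimizer rules without touching the analyzer rules.
    pub fn with_optimizer_rules(mut self, rules: Vec<OptimizerRuleRef>) -> Self {
        self.optimizer_rules = rules;
        self
    }

    /// Append one analyzer rule after the existing ones.
    pub fn add_analyzer_rule(mut self, rule: AnalyzerRuleRef) -> Self {
        self.analyzer_rules.push(rule);
        self
    }

    pub fn analyzer_rule_names(&self) -> Vec<&str> {
        self.analyzer_rules.iter().map(|r| r.name()).collect()
    }

    pub fn optimizer_rule_names(&self) -> Vec<&str> {
        self.optimizer_rules.iter().map(|r| r.name()).collect()
    }
}

/// Example rules (hypothetical names).
pub struct TypeCheck;
impl ToyAnalyzerRule for TypeCheck {
    fn name(&self) -> &str {
        "type_check"
    }
}

pub struct CustomRule;
impl ToyAnalyzerRule for CustomRule {
    fn name(&self) -> &str {
        "custom_rule"
    }
}

pub struct PushDownFilter;
impl ToyOptimizerRule for PushDownFilter {
    fn name(&self) -> &str {
        "push_down_filter"
    }
}

/// Usage: configure each rule set through its own builder method.
pub fn build_state() -> ToySessionState {
    ToySessionState::default()
        .with_optimizer_rules(vec![Arc::new(PushDownFilter)])
        .with_analyzer_rules(vec![Arc::new(TypeCheck)])
        .add_analyzer_rule(Arc::new(CustomRule))
}
```

With this shape, moving the analyzer out of the optimizer later would change internals only; the configuration surface callers depend on stays the same.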

@github-actions github-actions bot added the substrait Changes to the substrait crate label Apr 11, 2023
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
@waynexia waynexia force-pushed the analyzer-no-default branch from 80d3543 to 3c58741 Compare April 11, 2023 14:14
@github-actions github-actions bot removed the substrait Changes to the substrait crate label Apr 11, 2023
Contributor

@alamb alamb left a comment

Thanks @waynexia -- this looks like a good idea to me.

Could you also add an "end user" example of how to use this new analyzer rule framework to datafusion-examples?

Perhaps we can extend the existing one here:

https://github.com/apache/arrow-datafusion/blob/a6dcc2db250ee746a3b19c52c90337f74b020c20/datafusion-examples/examples/rewrite_expr.rs#L49-L56

The rationale is both to document the API for potential users and to help ensure that we don't accidentally remove or break this API during a future refactor.

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
@waynexia
Member Author

The rationale is both to document the API for potential users and to help ensure that we don't accidentally remove or break this API during a future refactor.

Great suggestion! 👍 I have expanded the example to illustrate how to create and implement a basic analyzer rule.
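
To give a feel for what a basic analyzer rule does, here is a toy, self-contained sketch of a rule that rewrites literals inside an expression tree. It is not the code added to `rewrite_expr.rs`: `ToyExpr` and `Int64ToUInt64` are hypothetical stand-ins and do not use DataFusion's `Expr` or `AnalyzerRule` types.

```rust
/// Hypothetical miniature expression tree (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
pub enum ToyExpr {
    Column(String),
    Int64(i64),
    UInt64(u64),
    Gt(Box<ToyExpr>, Box<ToyExpr>),
}

/// Hypothetical rule: turn non-negative Int64 literals into UInt64 literals.
pub struct Int64ToUInt64;

impl Int64ToUInt64 {
    /// Walk the tree bottom-up and rewrite matching literals.
    pub fn analyze(&self, expr: ToyExpr) -> ToyExpr {
        match expr {
            // Only non-negative values can be represented as unsigned.
            ToyExpr::Int64(v) if v >= 0 => ToyExpr::UInt64(v as u64),
            // Recurse into both sides of a comparison.
            ToyExpr::Gt(l, r) => ToyExpr::Gt(
                Box::new(self.analyze(*l)),
                Box::new(self.analyze(*r)),
            ),
            // Columns, negative literals, and unsigned literals are left alone.
            other => other,
        }
    }
}

/// Usage: rewrite the filter `age > 30`.
pub fn rewrite_example() -> ToyExpr {
    let filter = ToyExpr::Gt(
        Box::new(ToyExpr::Column("age".to_string())),
        Box::new(ToyExpr::Int64(30)),
    );
    Int64ToUInt64.analyze(filter)
}
```

Unlike an optimizer rule, a rule of this kind may change the types in the plan, which is one reason to run it in a separate analysis pass before optimization.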

waynexia and others added 2 commits April 12, 2023 21:46
Co-authored-by: jakevin <jakevingoo@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Contributor

@alamb alamb left a comment

Looks great to me -- thank you @waynexia and @jackwener

// run the analyzer with our custom rule
let config = OptimizerContext::default().with_skip_failing_rules(false);
let optimized_plan = optimizer.optimize(&logical_plan, &config, observe)?;
let analyzer = Analyzer::with_rules(vec![Arc::new(MyAnalyzerRule {})]);
Contributor

👍

Member

@jackwener jackwener left a comment

Thanks @waynexia

@alamb alamb merged commit 74a778c into apache:main Apr 13, 2023
@waynexia waynexia deleted the analyzer-no-default branch August 9, 2023 03:32