feat: Add method to add analyzer rules to SessionContext #10849

Merged: 6 commits merged into apache:main on Jun 22, 2024

Conversation

pingsutw
Member

Which issue does this PR close?

Closes #10846

Rationale for this change

The new session context works fine with the rules I added, but there is no way to register the rules to the session context directly.

What changes are included in this PR?

Adds SessionContext::add_analyzer_rule, which simply calls SessionState::add_analyzer_rule.
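
The delegation described above can be pictured with a minimal, self-contained sketch. The types below are hypothetical stand-ins written for illustration, not DataFusion's real definitions; in particular, the assumption that the context keeps its state behind an `Arc<RwLock<...>>` and the helper `analyzer_rule_names` are mine.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for DataFusion's `AnalyzerRule` trait.
trait AnalyzerRule: Send + Sync {
    fn name(&self) -> &str;
}

/// Hypothetical stand-in for `SessionState`: owns the ordered rule list.
#[derive(Default)]
struct SessionState {
    analyzer_rules: Vec<Arc<dyn AnalyzerRule>>,
}

impl SessionState {
    /// Appends `rule` to the end of the rule list.
    fn add_analyzer_rule(&mut self, rule: Arc<dyn AnalyzerRule>) {
        self.analyzer_rules.push(rule);
    }
}

/// Hypothetical stand-in for `SessionContext`: shares its state behind a lock
/// (an assumption made for this sketch).
#[derive(Default)]
struct SessionContext {
    state: Arc<RwLock<SessionState>>,
}

impl SessionContext {
    /// Forwards to the state; taking the write lock makes `&self` sufficient.
    fn add_analyzer_rule(&self, rule: Arc<dyn AnalyzerRule>) {
        self.state.write().unwrap().add_analyzer_rule(rule);
    }

    /// Illustrative helper: names of the registered rules, in registration order.
    fn analyzer_rule_names(&self) -> Vec<String> {
        self.state
            .read()
            .unwrap()
            .analyzer_rules
            .iter()
            .map(|rule| rule.name().to_string())
            .collect()
    }
}

/// A rule that does nothing except report its name.
struct MyAnalyzerRule;

impl AnalyzerRule for MyAnalyzerRule {
    fn name(&self) -> &str {
        "my_analyzer_rule"
    }
}

fn main() {
    let ctx = SessionContext::default();
    ctx.add_analyzer_rule(Arc::new(MyAnalyzerRule));
    println!("{:?}", ctx.analyzer_rule_names());
}
```

The point of the sketch is only that callers no longer need to reach into the state themselves; the context method is a thin forwarder.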

Are these changes tested?

Yes, with unit tests.

Are there any user-facing changes?

Signed-off-by: Kevin Su <pingsutw@apache.org>
github-actions bot added the core (Core datafusion crate) label on Jun 10, 2024
Contributor

@alamb alamb left a comment

Thank you for this contribution @pingsutw 🙏

I was going to say that we should add tests for this API, but it seems as though there are no existing tests for add_analyzer_rule 😢

https://github.com/search?q=repo%3Aapache%2Fdatafusion+add_analyzer_rule&type=code

It is probably time to add an example or test, similar to the user_defined_plan here:

.add_optimizer_rule(Arc::new(TopKOptimizerRule {}));

However, I don't think you have to do it in this PR unless you want to.

I filed #10855 to track the idea

@@ -387,9 +387,9 @@ impl SessionState {
     /// Add `analyzer_rule` to the end of the list of
     /// [`AnalyzerRule`]s used to rewrite queries.
     pub fn add_analyzer_rule(
-        mut self,
+        &mut self,
Contributor

Technically this is an API change, since the method now takes a mutable reference (&mut self) rather than self.

However, I think the change is good: add_analyzer_rule now looks like a standard mutation-style API (one that takes &mut self) rather than a builder-style API (one that takes self).

What do you think about adding a builder-style API as well to make things consistent? (We could do this as a separate PR.)

    pub fn with_analyzer_rule(
        mut self,
        analyzer_rule: Arc<dyn AnalyzerRule + Send + Sync>,
    ) -> Self {
        ..
    }
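
To make the two styles concrete, here is a small sketch under simplifying assumptions: the `SessionState` below is a hypothetical toy whose rules are plain strings, and the body of `with_analyzer_rule` (elided above) is filled in with one plausible choice, namely delegating to the mutating method.

```rust
/// Hypothetical, simplified state: rules are plain names here.
#[derive(Debug, Default)]
struct SessionState {
    analyzer_rules: Vec<String>,
}

impl SessionState {
    /// Mutation style: modifies an existing state in place.
    fn add_analyzer_rule(&mut self, rule: &str) {
        self.analyzer_rules.push(rule.to_string());
    }

    /// Builder style: consumes the state and returns it, so calls can be chained.
    /// Delegating to the mutating method keeps the two in sync.
    fn with_analyzer_rule(mut self, rule: &str) -> Self {
        self.add_analyzer_rule(rule);
        self
    }
}

fn main() {
    // Builder style reads well while the state is being constructed.
    let mut state = SessionState::default()
        .with_analyzer_rule("first")
        .with_analyzer_rule("second");

    // Mutation style fits once the state already exists.
    state.add_analyzer_rule("third");

    println!("{:?}", state.analyzer_rules);
}
```

With a `self`-taking method, existing callers must rebind the result (`state = state.add_...(rule)`); with `&mut self` they call it in place, which is why the signature change counts as an API change.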

Member Author

Should we also update add_optimizer_rule and add_physical_optimizer_rule (mut self -> &mut self)?

Also, do we need to add with_optimizer_rule and with_physical_optimizer_rule to make it consistent?
If so, I can do it in a separate PR.

@@ -331,6 +332,15 @@ impl SessionContext {
         self
     }

+    /// Adds an analyzer rule to the `SessionState` in the current `SessionContext`.
Contributor

It would be nice to have an example or doc test for this method.

Contributor

I agree -- we are tracking adding an example for how to use custom analyzer rules in #10855, so perhaps we can add the example as part of that ticket (I think @goldmedal said he may have some time to work on that eventually).

Contributor

FWIW I have a WIP PR to improve these examples -- I hope to get it up for review sometime this weekend

@alamb
Contributor

alamb commented Jun 11, 2024

@pingsutw please let me know if you have time to add an example for this API, otherwise I think perhaps we can merge it in and add coverage as part of #10855 (or we could hold off on this code until the PR that adds the example)

@pingsutw
Member Author

I'll update my PR and add an example tonight.

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Contributor

@alamb alamb left a comment

Thank you very much @pingsutw -- this is looking good

@@ -619,3 +627,49 @@ impl RecordBatchStream for TopKReader {
         self.input.schema()
     }
 }
+
+struct MyAnalyzerRule {}
Contributor

Would it be possible to add a test that exercises these APIs?

For example, perhaps a test that does something like

select 42, arrow_typeof(42)

With this code, I think that query will print out 42 and UInt?
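
The body of MyAnalyzerRule is not shown in this thread, so the following is only a toy model of the kind of test being suggested. It assumes a rule that rewrites signed integer literals into unsigned ones; `Scalar`, `Expr`, `analyze`, `type_name`, and `eval` are all hypothetical names, and the type labels are deliberately left without bit widths.

```rust
/// Hypothetical literal value; a real engine has many more variants.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Int(i64),
    UInt(u64),
}

/// Hypothetical expression tree with just enough shape for the example.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(Scalar),
    /// Stand-in for a call such as `arrow_typeof(arg)`.
    TypeOf(Box<Expr>),
}

/// Assumed rule: rewrite every non-negative signed literal to unsigned, recursively.
fn analyze(expr: Expr) -> Expr {
    match expr {
        Expr::Literal(Scalar::Int(v)) if v >= 0 => Expr::Literal(Scalar::UInt(v as u64)),
        Expr::TypeOf(inner) => Expr::TypeOf(Box::new(analyze(*inner))),
        other => other,
    }
}

/// Type label of an expression's result.
fn type_name(expr: &Expr) -> &'static str {
    match expr {
        Expr::Literal(Scalar::Int(_)) => "Int",
        Expr::Literal(Scalar::UInt(_)) => "UInt",
        Expr::TypeOf(_) => "Text",
    }
}

/// Evaluates an expression to the text a result row would show.
fn eval(expr: &Expr) -> String {
    match expr {
        Expr::Literal(Scalar::Int(v)) => v.to_string(),
        Expr::Literal(Scalar::UInt(v)) => v.to_string(),
        Expr::TypeOf(inner) => type_name(inner).to_string(),
    }
}

fn main() {
    // Models `select 42, arrow_typeof(42)`.
    let projection = vec![
        Expr::Literal(Scalar::Int(42)),
        Expr::TypeOf(Box::new(Expr::Literal(Scalar::Int(42)))),
    ];
    let row: Vec<String> = projection
        .into_iter()
        .map(analyze)
        .map(|expr| eval(&expr))
        .collect();
    println!("{}", row.join(", ")); // prints "42, UInt"
}
```

A test along these lines checks the rule end to end: the value is unchanged while the reported type shows that the rewrite ran.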

pingsutw and others added 2 commits June 21, 2024 00:55
Contributor

@alamb alamb left a comment

Looks good to me -- thank you @pingsutw

I merged this PR up from main to resolve a conflict

@pingsutw
Member Author

> I merged this PR up from main to resolve a conflict

Thank you for the help!

@alamb alamb merged commit a22423d into apache:main Jun 22, 2024
23 checks passed
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jun 22, 2024
* feat: Add method to add analyzer rules to SessionContext

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add a test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add analyze_plan

Signed-off-by: Kevin Su <pingsutw@apache.org>

* update test

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Labels
core Core datafusion crate
Development

Successfully merging this pull request may close these issues.

API to add analyzer_rules
3 participants