Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further clarification of the supports_filters_pushdown documentation #9988

Merged

Conversation

cisaacson
Copy link
Contributor

Which issue does this PR close?

Closes #9987.

Rationale for this change

What changes are included in this PR?

Doc change only.

Are these changes tested?

Did check that the code compiles.

Are there any user-facing changes?

Doc change

@github-actions github-actions bot added the core Core datafusion crate label Apr 7, 2024
@alamb alamb added the documentation Improvements or additions to documentation label Apr 8, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @cisaacson -- this is a very nice improvment (and first contribution). I have one suggestion for the content of this PR, but I don't think it is needed to merge.

Let me know when you are ready for me to merge the PR

And thanks again!

datafusion/core/src/datasource/provider.rs Outdated Show resolved Hide resolved
@github-actions github-actions bot removed the documentation Improvements or additions to documentation label Apr 8, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cisaacson -- I think this is looking really close. I left some more comments. Let me know what you think. Hopefully they make sense

///
/// Here is an example of how this can be done:
///
/// ```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In rust any example like this is actually compiled and run (more details are here https://doc.rust-lang.org/rustdoc/write-documentation/documentation-tests.html)

So to get this example to work you'll need to add some use etc commands

Comment on lines 179 to 190
/// let result_vec: Vec<TableProviderFilterPushDown> = Vec::with_capacity(&filters.len());
/// for i in 0..filters.len() {
/// // Evaluate a filter
/// let filter = filters[i];
///
/// // Evaluate a filter to support here
/// if filter ... {
/// result_vec.push(TableProviderFilterPushDown::Exact);
/// } else {
/// result_vec.push(TableProviderFilterPushDown::Unsupported);
/// }
/// }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you like a more functional style, I think you can write this same logic in this way:

Suggested change
/// let result_vec: Vec<TableProviderFilterPushDown> = Vec::with_capacity(&filters.len());
/// for i in 0..filters.len() {
/// // Evaluate a filter
/// let filter = filters[i];
///
/// // Evaluate a filter to support here
/// if filter ... {
/// result_vec.push(TableProviderFilterPushDown::Exact);
/// } else {
/// result_vec.push(TableProviderFilterPushDown::Unsupported);
/// }
/// }
/// let result_vec: Vec<TableProviderFilterPushDown> = filters.iter()
/// .map (|filter|
/// // Evaluate a filter
///
/// // Evaluate a filter to support here
/// if can_support_filter(filter){
/// TableProviderFilterPushDown::Exact
/// } else {
/// TableProviderFilterPushDown::Unsupported;
/// }
/// }).collect();

Copy link
Contributor Author

@cisaacson cisaacson Apr 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if the latest commit looks OK to you. I didn't see the functional style example until just now, but my example does compile. I bet you may have a better way to do the whole thing, I'm interested in that feedback too. I can look at this again this evening. And I could easily convert it to functional style too. If I put another fn in to evaluate each filter (which is how I would do it) then that would have to be in the example too? That will make it a lot longer so I put just one Expr eval to show the concept.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇

/// for i in 0..filters.len() {
/// let expr = *filters[i];
///
/// match expr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For examples, I think it helps to add a comment explaining what it is doing for new readers
for example, here you might add:

Suggested change
/// match expr {
/// // Return `Exact` if we have a way to evaluate the expression exactly. In this
/// // we have a way to evaluate expressions on column c1, but not other columns,
/// match expr {

Comment on lines 198 to 200
/// None => {
/// return Err(DataFusionError::Execution(format!("Could not get column. between_expr: {:?}", between_expr)));
/// }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example might be more general if it didn't error but instead returned Unsupported for other exprs

Also, if you want to get your functional Rust programming on, you can use something like this (untested):

  between_expr.expr.try_into_col()
     .map(|column| {
        if &column.name = "c1" {
          TableProviderFilterPushDown::Exact
        } else {
          TableProviderFilterPushDown::Unsupported
        }
     })
     // if not a column reference, return unsupported
    .unwrap_or(TableProviderFilterPushDown::Unsupported)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took out the error, see your point. And I changed it to functional style. Let me know what you think.

@alamb alamb added the documentation Improvements or additions to documentation label Apr 8, 2024
@alamb alamb changed the title Further clarification of the supports_filters_pushdown doc Further clarification of the supports_filters_pushdown documentation Apr 8, 2024
@github-actions github-actions bot removed the documentation Improvements or additions to documentation label Apr 8, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😍

/// }
/// })
/// // If there is no column in the expr set the filter to unsupported.
/// .unwrap_or(TableProviderFilterPushDown::Unsupported);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to remove the ; here so that this match arm is returning something and not returning the unit type ().

Suggested change
/// .unwrap_or(TableProviderFilterPushDown::Unsupported);
/// .unwrap_or(TableProviderFilterPushDown::Unsupported)

/// Expr::Between(between_expr) => {
/// between_expr.expr.try_into_col()
/// .map(|column| {
/// if column.name = "c1".to_string() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be a comparison and not an assignment

Suggested change
/// if column.name = "c1".to_string() {
/// if column.name == "c1".to_string() {

///
/// // Override the supports_filters_pushdown to evaluate which expressions
/// // to accept as pushdown predicates.
/// fn supports_filters_pushdown(&self, filters: &[&Expr]) -> Result<Vec<TableProviderFilterPushDown>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't compile for me when I run cargo test --doc locally and is also failing in CI. One issue here is that this function needs to be part of an impl block for a struct rather than a standalone function, otherwise there is no self to refer to. I also added some comments for smaller issues that caused compilation issues locally.

Copy link
Contributor Author

@cisaacson cisaacson Apr 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @andygrove , all good catches.

To add the impl it would be for TableProvider so do I need to create an entire struct, etc. to do that? Any help on how to contribute a minimal doc example would be great. I use VS Code with Rust Analyzer, I also do not see an easy way to copy code into the doc and make it a fn doc comment.

Any help appreciated, then I'll get it cleaned up. I see other areas I can help with the docs too once I have a fully verified example.

I do have all of the fixes done now other than the impl must have all methods required. I can do that but it seems like it will clutter the doc example, any easy way to do that? If I need to have them I'll just add them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb already said this, but you can prefix lines of code with # to still have them compile, but not appear in the generated docs.

See https://doc.rust-lang.org/rustdoc/write-documentation/documentation-tests.html#hiding-portions-of-the-example for more info

@alamb
Copy link
Contributor

alamb commented Apr 10, 2024

Thanks @cisaacson -- I took the liberty of getting the CI to pass as well as moving a bit of the docs around by pushing some commits to your branch.

I'll leave some comments inline

I think it is looking pretty good now:

Screenshot 2024-04-10 at 7 38 21 AM

/// # Example
///
/// ```rust
/// # use std::any::Any; /// #
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In rustdocs you can use # to have the line not show up in the example but still be included

datafusion/core/src/datasource/provider.rs Show resolved Hide resolved
datafusion/core/src/datasource/provider.rs Show resolved Hide resolved
@andygrove andygrove merged commit 1ec65a4 into apache:main Apr 10, 2024
24 checks passed
@alamb
Copy link
Contributor

alamb commented Apr 10, 2024

Thanks again for all the help and bearing with our back and forth @cisaacson -- I think these docs are getting quite good :bowtie:

@cisaacson cisaacson deleted the 9987-clarify-supports_filters_pushdown-doc branch April 11, 2024 12:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core datafusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add clarifying documentation to the TableProvider supports_filters_pushdown
3 participants