Skip to content

ARROW-10547: [Rust][DataFusion] Do not lose Filters with UserDefined plan nodes#8633

Closed
alamb wants to merge 2 commits intoapache:masterfrom
alamb:alamb/ARROW-10547-user-defined-plans
Closed

ARROW-10547: [Rust][DataFusion] Do not lose Filters with UserDefined plan nodes#8633
alamb wants to merge 2 commits intoapache:masterfrom
alamb:alamb/ARROW-10547-user-defined-plans

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Nov 10, 2020

I have a LogicalPlan like this (where "Extension" is some extension node type):

    Extension [non_null_column:Utf8]
      Filter: #state Eq Utf8("MA") [city:Utf8;N, state:Utf8;N, borough:Utf8;N]
        InMemoryScan: projection=None [city:Utf8;N, state:Utf8;N, borough:Utf8;N]

When I run this plan through {{ExecutionContext::optimize}} the plan that results is as follows (note the filter has been lost):

    Extension [non_null_column:Utf8]
      InMemoryScan: projection=Some([0, 1, 2]) [city:Utf8;N, state:Utf8;N, borough:Utf8;N]

I have debugged the problem and the root cause of the issue is that the FilterPushDown logic is not recursing into the inputs of user defined nodes.

This PR has a fix for the problem

}

// AND filter expressions that would be placed on the same depth
if let Some(existing_expression) = new_filtersnew_filters.get(&new_depth) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what was going on with this variable name, but I figured I would reduce the replication ;)

},
1 => analyze_plan(inputs[0], depth + 1)?,
_ => {
return Err(DataFusionError::NotImplemented(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More generally, I am not sure how / if this filter pushdown logic will work if there are multiple child inputs as no existing LogicalPlan variant his multiple inputs at this time. We will have to handle this when / if we want to support joins as well . For now I returned a "Not Yet Implemented" type error

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree :)

@alamb alamb changed the title ARROW-10547: Do not lose Filters with UserDefined plan nodes ARROW-10547: [Rust][DataFusion] Do not lose Filters with UserDefined plan nodes Nov 10, 2020
@github-actions
Copy link

Copy link
Member

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Nice catch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants