Support transparent ExecutionPlan downcasts#22559
Conversation
2128754 to
cae7bf1
Compare
|
I'll review as soon as it's marked ready, but at a high level looks good so far |
259871e to
6563792
Compare
6563792 to
fc8f083
Compare
@timsaucer Should be ready for review now. I just renamed the API to |
gabotechs
left a comment
There was a problem hiding this comment.
👍 Even if it's not super pretty to have that extra method there, it gets the job done without introducing any breaking API change, and it's landed in a way that only people caring about implementing wrapper types need to care about this, without getting in the way of everyone else, so +1 on my side.
Before moving forward, let's wait for at least @timsaucer to give his opinion here.
| match self.downcast_delegate() { | ||
| Some(delegate) => delegate.is::<T>(), | ||
| None => (self as &dyn Any).is::<T>(), | ||
| } |
There was a problem hiding this comment.
Won't this prevent you from downcasting to the wrapper type?
There was a problem hiding this comment.
Yes, and this is intentional to mimic the old as_any behavior. I've updated the PR description to be clear about this choice.
There was a problem hiding this comment.
It seems like the earlier PR would still work for all use cases, right? Because you're always choosing to downcast to a specific type, it seems like it would enable downcasting to either (or many, depending on levels) type.
There was a problem hiding this comment.
The previous PR would stop on the first match, rather than go all the way to the leaf child. I agree this is being pedantic, as "real world" cases would probably never have a case where you'd need to match the leaf child but get interrupted by an intermediate node in a "wrapper chain". However, I still find the current implementation clearer and easier to reason about: the leaf node is what is returned.
There was a problem hiding this comment.
Should we revert the hew :Any instead? I think it was just supposed to be a cleanup but we didn't realize it would have downstream impacts 🤔
There was a problem hiding this comment.
Also I wonder if there is some way to avoid using as_any / delegating logic at all ... But to do that we would have to remove all uses of as_any / downcasting I suspect
There was a problem hiding this comment.
Should we revert the new
:Anyinstead? I think it was just supposed to be a cleanup but we didn't realize it would have downstream impacts 🤔
@alamb Honestly I prefer the new API with downcast_delegate: it's explicit and clearer.
Also I wonder if there is some way to avoid using as_any / delegating logic at all ... But to do that we would have to remove all uses of as_any / downcasting I suspect
That would be a pretty big breaking change as well, as the is a check is quite widespread.
gene-bordegaray
left a comment
There was a problem hiding this comment.
looks good
I think this is feasible and good as we are gonna get 👍
fc8f083 to
41a9861
Compare
|
Do we need to also open a PR to target |
@timsaucer I opened #22565 on |
| match self.downcast_delegate() { | ||
| Some(delegate) => delegate.is::<T>(), | ||
| None => (self as &dyn Any).is::<T>(), | ||
| } |
There was a problem hiding this comment.
Should we revert the hew :Any instead? I think it was just supposed to be a cleanup but we didn't realize it would have downstream impacts 🤔
| match self.downcast_delegate() { | ||
| Some(delegate) => delegate.is::<T>(), | ||
| None => (self as &dyn Any).is::<T>(), | ||
| } |
There was a problem hiding this comment.
Also I wonder if there is some way to avoid using as_any / delegating logic at all ... But to do that we would have to remove all uses of as_any / downcasting I suspect
| /// instrumentation. The default implementation returns `None`, meaning this | ||
| /// plan's concrete type is used for type introspection. | ||
| /// | ||
| /// The `is` and `downcast_ref` helpers follow the returned delegate instead |
There was a problem hiding this comment.
It might be good to add a specific example here (like a wrapper node that wants to observe changes but still be treated as the underlying node when DataFusion checks for patterns using downcast 🤔
There was a problem hiding this comment.
There's the unit test with DowncastDelegatingExec which already shows a possible implementation. Would you also like an inline comment?
There was a problem hiding this comment.
Yeah, I was thinking an inline comment so that if someone else is reading (just) the ExecutionPlan docs they can more quickly understand if they need to worry about implementing this method or not (probably not)
There was a problem hiding this comment.
done: I amended the commit to be clearer in the comment.
41a9861 to
1851c8a
Compare
|
Looks good to me -- thank you everyone |
Which issue does this PR close?
Rationale for this change
DataFusion 54 changed
ExecutionPlandowncasting to use theAnysupertrait directly. That removesExecutionPlan::as_any, which had also served as a customization point for wrapper nodes: wrappers could identify as themselves internally while exposing the wrapped plan type to normal downcast-based inspection.This draft PR proposes one possible fix: add an explicit
ExecutionPlan::downcast_delegate()hook for wrapper nodes that want their publicExecutionPlandowncast identity to be delegated to another plan.The proposed behavior intentionally preserves the old
as_anyoverride semantics: when a node opts into downcast delegation, intermediate delegating wrappers are invisible todyn ExecutionPlan::is::<T>()anddowncast_ref::<T>(). A different "dual identity" model, where wrappers can downcast both as themselves and as their delegates, may also be useful, but that would be a new behavior rather than a compatibility fix.This is only a suggested implementation for the linked issue. Other solution proposals, including different API names or a different shape for the hook, are very welcome.
What changes are included in this PR?
ExecutionPlan::downcast_delegate()with a default implementation returningNone.dyn ExecutionPlan::is::<T>()anddowncast_ref::<T>()to delegate todowncast_delegate()when present, otherwise use the current concrete plan type.downcast_delegate()is only for type introspection and is independent fromchildren()/ plan traversal.An alternative API shape would make every plan expose an explicit downcast target. A concrete spelling could make the self-target case explicit:
That frames every plan as having a public downcast target, which is close to the old
as_anymental model. A simpler conceptual version would befn downcast_target(&self) -> &dyn ExecutionPlanwith the default target beingself, but the real helper still needs an explicit self-target base case to avoid recursing forever. This PR keeps the primary proposal as an explicitOption-based opt-in.Are these changes tested?
Yes.
cargo test -p datafusion-physical-plan execution_plan_downcastcargo test -p datafusion-physical-plan --libcargo fmt --all -- --checkgit diff --checkAre there any user-facing changes?
Yes. This adds a new public
ExecutionPlantrait method with a default implementation, and it changesExecutionPlandowncast helpers to honor wrappers that explicitly opt into delegating public downcast identity.