Describe the bug
In native/spark-expr/src/json_funcs/to_json.rs, ToJson has two equality impls that compare different sets of fields:
impl PartialEq for ToJson {
fn eq(&self, other: &Self) -> bool {
self.expr.eq(&other.expr)
&& self.timezone.eq(&other.timezone)
&& self.ignore_null_fields.eq(&other.ignore_null_fields)
}
}
impl PartialEq<dyn Any> for ToJson {
fn eq(&self, other: &dyn Any) -> bool {
if let Some(other) = other.downcast_ref::<ToJson>() {
self.expr.eq(&other.expr) && self.timezone.eq(&other.timezone)
} else {
false
}
}
}
The PartialEq<dyn Any> impl omits timezone historically, and after #3875 also omits ignore_null_fields. (Originally only timezone was the discrepancy; the gap widened when ignore_null_fields was added.)
This matters because PartialEq<dyn Any> is what DataFusion uses to compare Arc<dyn PhysicalExpr> values, so two ToJson exprs that differ only in timezone or ignore_null_fields will compare equal through the trait-object path. That can cause incorrect deduplication or caching in the physical planner.
Steps to reproduce
Construct two ToJson exprs with the same child but different timezone or ignore_null_fields, compare them via Arc<dyn PhysicalExpr> equality (e.g. through PhysicalExpr::eq on the Any path), and observe that they compare equal.
Expected behavior
Both equality paths should compare the same fields. The PartialEq<dyn Any> impl should include timezone and ignore_null_fields, matching the regular PartialEq impl. Equivalent fix: derive PartialEq<dyn Any> through the regular PartialEq by downcasting and delegating to self == other.
Additional context
Spotted while reviewing #3875. The PR added ignore_null_fields to the regular PartialEq but did not touch the dyn Any impl, which made an existing inconsistency slightly worse.
Describe the bug
In
native/spark-expr/src/json_funcs/to_json.rs,ToJsonhas two equality impls that compare different sets of fields:The
PartialEq<dyn Any>impl omitstimezonehistorically, and after #3875 also omitsignore_null_fields. (Originally onlytimezonewas the discrepancy; the gap widened whenignore_null_fieldswas added.)This matters because
PartialEq<dyn Any>is what DataFusion uses to compareArc<dyn PhysicalExpr>values, so twoToJsonexprs that differ only intimezoneorignore_null_fieldswill compare equal through the trait-object path. That can cause incorrect deduplication or caching in the physical planner.Steps to reproduce
Construct two
ToJsonexprs with the same child but differenttimezoneorignore_null_fields, compare them viaArc<dyn PhysicalExpr>equality (e.g. throughPhysicalExpr::eqon theAnypath), and observe that they compare equal.Expected behavior
Both equality paths should compare the same fields. The
PartialEq<dyn Any>impl should includetimezoneandignore_null_fields, matching the regularPartialEqimpl. Equivalent fix: derivePartialEq<dyn Any>through the regularPartialEqby downcasting and delegating toself == other.Additional context
Spotted while reviewing #3875. The PR added
ignore_null_fieldsto the regularPartialEqbut did not touch thedyn Anyimpl, which made an existing inconsistency slightly worse.