[Rust] [DataFusion] logical schema = physical schema is not true #25852
Comments
Andy Grove / @andygrove: I would think that the schema should match between the optimized logical plan and the physical plan though?
In tests/sql.rs, we assert that the physical schema and the optimized logical schema match. However, this does not necessarily hold for all our queries. An example:
The physical expression (and schema) of this operation, after optimization, is CAST(c8 as Int64) Plus c9 (this test fails).
AFAIK, the invariant of the optimizer is that the output types and nullability are the same.
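To make the mismatch concrete, here is a minimal sketch in plain Rust. It is a toy model, not DataFusion's actual API: the `Expr`, `Field`, `DataType`, `coerce`, and `logical_and_physical_fields` definitions are hypothetical stand-ins, and the column types (c8 as Int32, c9 as Int64) are assumed for illustration. It only shows how a field name derived from an expression can change once a coercion pass inserts a cast, while the output type stays the same.

```rust
// Toy model only: these types are hypothetical stand-ins, not DataFusion's.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Int64,
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

#[derive(Debug, Clone)]
enum Expr {
    Column(String, DataType),
    Cast(Box<Expr>, DataType),
    Plus(Box<Expr>, Box<Expr>),
}

// Pick the wider of two integer types (assumed coercion rule for this sketch).
fn widest(a: DataType, b: DataType) -> DataType {
    if a == DataType::Int64 || b == DataType::Int64 {
        DataType::Int64
    } else {
        DataType::Int32
    }
}

impl Expr {
    // The output name is derived from the expression tree itself.
    fn name(&self) -> String {
        match self {
            Expr::Column(n, _) => n.clone(),
            Expr::Cast(e, t) => format!("CAST({} as {:?})", e.name(), t),
            Expr::Plus(l, r) => format!("{} Plus {}", l.name(), r.name()),
        }
    }

    fn data_type(&self) -> DataType {
        match self {
            Expr::Column(_, t) | Expr::Cast(_, t) => t.clone(),
            Expr::Plus(l, r) => widest(l.data_type(), r.data_type()),
        }
    }

    fn to_field(&self) -> Field {
        Field { name: self.name(), data_type: self.data_type() }
    }
}

// Rewrite a Plus so both operands share one type, inserting casts as needed.
fn coerce(expr: Expr) -> Expr {
    match expr {
        Expr::Plus(l, r) => {
            let target = widest(l.data_type(), r.data_type());
            let cast_to_target = |e: Expr| {
                if e.data_type() == target {
                    e
                } else {
                    Expr::Cast(Box::new(e), target.clone())
                }
            };
            Expr::Plus(
                Box::new(cast_to_target(coerce(*l))),
                Box::new(cast_to_target(coerce(*r))),
            )
        }
        other => other,
    }
}

// Field derived before coercion vs. field derived after coercion.
fn logical_and_physical_fields() -> (Field, Field) {
    let expr = Expr::Plus(
        Box::new(Expr::Column("c8".to_string(), DataType::Int32)),
        Box::new(Expr::Column("c9".to_string(), DataType::Int64)),
    );
    let logical = expr.to_field();
    let physical = coerce(expr).to_field();
    (logical, physical)
}
```

In this sketch the two fields agree on the data type but not on the name, so a test comparing whole schemas for equality fails even though the type invariant holds.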
Also, note that the reason the optimized logical schema equals the logical schema is that our type coercer does not change the output names of the schema, even though it rewrites logical expressions. I.e. after the optimization, .to_field() of an expression may no longer match the field name or type in the plan's schema. IMO this is currently by (implicit?) design, as we do not want our logical schema's column names to change during optimizations; otherwise column references may point to non-existent columns. This is something that was brought up on the mailing list about polymorphism.
Reporter: Jorge Leitão / @jorgecarleitao
Assignee: Jorge Leitão / @jorgecarleitao
Related issues:
PRs and other links:
Note: This issue was originally created as ARROW-9809. Please see the migration documentation for further details.