Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve consistency of expression names #3330

Open
andygrove opened this issue Sep 1, 2022 · 1 comment
Open

Improve consistency of expression names #3330

andygrove opened this issue Sep 1, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

@andygrove
Copy link
Member

andygrove commented Sep 1, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We have many different ways to create names from expressions with duplicated and sometimes inconsistent code.

Logical Expression:

  • Display trait
  • Debug trait
  • Expr.name() method (wrapper for create_name function)
  • ExprIdentifierVisitor::desc_expr

Physical Expression:

  • Display trait
  • Debug trait
  • create_physical_name function

One example of confusion is that queries sometimes result in field names containing Divide and sometimes /. For example:

  • decimal_simple.c1 / CAST(Float64(0.00001) AS Decimal128(5, 5)) uses /
  • CAST(decimal_simple.c1 AS Decimal128(30, 19)) Divide CAST(decimal_simple.c5 AS Decimal128(30, 19)) uses Divide

Describe the solution you'd like
Make names more consistent and avoid duplicate code

Describe alternatives you've considered
None

Additional context
None

@andygrove
Copy link
Member Author

andygrove commented Sep 1, 2022

Here are some observations after reviewing the logical plan code.

  • Display and Debug produce similar output. Display delegates to Debug for most cases.
  • create_name accepts a schema argument, which is never used. If we remove that, then the logic looks to be about the same as Display and Debug

It seems to me that create_name should be used to determine the name of the expression as it would be represented in a schema, whereas Display would be used to create a representation of the expression to show in the logical plan. In many cases, these will be the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant