Function 'struct' may change the data type of input parameters #8118

Closed
yukkit opened this issue Nov 10, 2023 · 1 comment · Fixed by #8463

Comments

@yukkit
Contributor

yukkit commented Nov 10, 2023

When I construct a struct with the function struct, the data type of my input column is changed. I don't know whether this is a bug or the intended design, but I think it would be better to retain the original type.

To Reproduce

CREATE TABLE values(
    c0 INT,
    c1 String,
    c2 String
) AS VALUES
  (1, 'a', 'a'),
  (2, 'b', 'b'),
  (3, 'c', 'c');

explain verbose select struct(c0, c1, c2) from VALUES;
+------------------------------------------------------------+----------------------------------------------------------------------------------------------------------+
| plan_type                                                  | plan                                                                                                     |
+------------------------------------------------------------+----------------------------------------------------------------------------------------------------------+
| initial_logical_plan                                       | Projection: struct(values.c0, values.c1, values.c2)                                                      |
|                                                            |   TableScan: values                                                                                      |
| logical_plan after inline_table_scan                       | SAME TEXT AS ABOVE                                                                                       |
| logical_plan after type_coercion                           | Projection: struct(CAST(values.c0 AS Utf8), values.c1, values.c2)                                        |
|                                                            |   TableScan: values                                                                                      |
| logical_plan after count_wildcard_rule                     | SAME TEXT AS ABOVE                                                                                       |
| analyzed_logical_plan                                      | SAME TEXT AS ABOVE                                                                                       |

I see that the AnalyzerRule type_coercion rewrites values.c0 as CAST(values.c0 AS Utf8), while values.c1 and values.c2 are left unchanged.

I also have a question: why does struct support only the following types? @Ted-Jiang My idea is that struct could support any type. Would it be possible to use a TypeSignature::VariadicAny function signature?

/// Currently supported types by the struct function.
pub static SUPPORTED_STRUCT_TYPES: &[DataType] = &[
    DataType::Boolean,
    DataType::UInt8,
    DataType::UInt16,
    DataType::UInt32,
    DataType::UInt64,
    DataType::Int8,
    DataType::Int16,
    DataType::Int32,
    DataType::Int64,
    DataType::Float32,
    DataType::Float64,
    DataType::Utf8,
    DataType::LargeUtf8,
];
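
To illustrate why the signature kind matters, here is a toy model of the two behaviours. It is only a sketch: the DataType and Signature enums, can_coerce, and coerce_args are hypothetical stand-ins written for this example, not DataFusion's real types or coercion code, and the coercion rule (integers may be cast to Utf8, but not the reverse) is an assumption chosen to mirror the plan above. With a restricted variadic signature every argument is forced to one common listed type, so the Int32 column ends up as Utf8; with an "any" signature the arguments pass through unchanged.

```rust
// Toy model only: these types are hypothetical stand-ins, not DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Int64,
    Utf8,
}

#[derive(Debug)]
enum Signature {
    /// Any number of arguments, all coerced to one of the listed types.
    Variadic(Vec<DataType>),
    /// Any number of arguments of any type, left untouched.
    VariadicAny,
}

/// Assumed coercion rule for this sketch: a type coerces to itself,
/// Int32 widens to Int64, and both integer types can be cast to Utf8.
fn can_coerce(from: &DataType, to: &DataType) -> bool {
    from == to
        || matches!(
            (from, to),
            (DataType::Int32, DataType::Int64)
                | (DataType::Int32, DataType::Utf8)
                | (DataType::Int64, DataType::Utf8)
        )
}

/// Returns the argument types after coercion, or None if no listed type fits.
fn coerce_args(sig: &Signature, args: &[DataType]) -> Option<Vec<DataType>> {
    match sig {
        // No restriction: every argument keeps its original type.
        Signature::VariadicAny => Some(args.to_vec()),
        // Pick the first listed type that all arguments can be coerced to,
        // then use that single type for every argument.
        Signature::Variadic(valid) => valid
            .iter()
            .find(|target| args.iter().all(|a| can_coerce(a, target)))
            .map(|target| vec![target.clone(); args.len()]),
    }
}

fn main() {
    // Same shape as struct(c0, c1, c2) with c0 INT and c1, c2 String.
    let args = [DataType::Int32, DataType::Utf8, DataType::Utf8];
    let restricted =
        Signature::Variadic(vec![DataType::Int32, DataType::Int64, DataType::Utf8]);

    println!("Variadic:    {:?}", coerce_args(&restricted, &args));
    println!("VariadicAny: {:?}", coerce_args(&Signature::VariadicAny, &args));
}
```

In this model the restricted signature yields three Utf8 arguments, while the "any" signature keeps Int32 for the first argument, which is the behaviour I would expect from struct.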
@alamb
Contributor

alamb commented Nov 13, 2023

I also looked at the implementation in

https://github.com/apache/arrow-datafusion/blob/3df895597d8c2073081fd9d990048c7aefb3b62e/datafusion/physical-expr/src/struct_expressions.rs#L32-L63

I agree there is no reason to restrict the types supported as field types.
