Support StructArray in Cast Kernel #4908

Open
tustvold opened this issue Oct 9, 2023 · 5 comments · Fixed by #4985
Labels
arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog help wanted

Comments

@tustvold
Contributor

tustvold commented Oct 9, 2023

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Support for StructArray in the cast kernel is currently extremely limited.

Describe the solution you'd like

We should support combinations of:

  • Re-ordering columns
  • Adding new nullable fields by adding entirely null columns as appropriate
  • Casting columns within the array
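
One way to read the three bullets together is as a per-field plan: for each field of the target struct, either cast the same-named source column or, if no such column exists and the target field is nullable, emit an all-null column. The sketch below is plain Rust with no arrow dependency. `ToyField`, `ColumnPlan`, and `plan_struct_cast` are hypothetical names used only for illustration, and matching fields by name is an assumption that the request implies but does not state.

```rust
/// Hypothetical stand-in for an Arrow field (not the arrow-rs `Field` type).
#[derive(Debug, Clone, PartialEq)]
struct ToyField {
    name: &'static str,
    data_type: &'static str,
    nullable: bool,
}

/// How one column of the target struct would be produced.
#[derive(Debug, PartialEq)]
enum ColumnPlan {
    /// Take source column `source_index` and cast it to type `to`.
    Cast { source_index: usize, to: &'static str },
    /// No source column has this name: emit an all-null column of type `to`.
    NullFill { to: &'static str },
}

/// Build one plan entry per target field, in target order.
/// Assumption: source and target fields are matched by name.
fn plan_struct_cast(
    source: &[ToyField],
    target: &[ToyField],
) -> Result<Vec<ColumnPlan>, String> {
    target
        .iter()
        .map(|t| match source.iter().position(|s| s.name == t.name) {
            Some(source_index) => Ok(ColumnPlan::Cast { source_index, to: t.data_type }),
            None if t.nullable => Ok(ColumnPlan::NullFill { to: t.data_type }),
            // A missing non-nullable field cannot be filled with nulls.
            None => Err(format!("cannot add non-nullable field '{}'", t.name)),
        })
        .collect()
}
```

Because the plan is emitted in target order, re-ordering falls out for free: a source `{a, b}` and target `{c (nullable), b, a}` would give a null fill for `c`, followed by casts of source columns 1 and 0.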

Describe alternatives you've considered

Additional context

tustvold added the enhancement and help wanted labels on Oct 9, 2023
@fansehep
Contributor

Hi @tustvold, I want to try this. Can you explain in detail, with specific reference to these three points? I am not very sure what you want.

@tustvold
Contributor Author

Re-ordering columns

Casting from

DataType::Struct(Fields::from(vec![
    Field::new("a", DataType::Int32, false),
    Field::new("b", DataType::Int32, false),
]))

To

DataType::Struct(Fields::from(vec![
    Field::new("b", DataType::Int32, false),
    Field::new("a", DataType::Int32, false),
]))

Adding new nullable fields by adding entirely null columns as appropriate

Casting from

DataType::Struct(Fields::from(vec![
    Field::new("a", DataType::Int32, false),
    Field::new("b", DataType::Int32, false),
]))

To

DataType::Struct(Fields::from(vec![
    Field::new("a", DataType::Int32, false),
    Field::new("b", DataType::Int32, false),
    Field::new("c", DataType::Int32, true),
]))

Casting columns within the array

Casting from

DataType::Struct(Fields::from(vec![
    Field::new("a", DataType::Int32, false),
    Field::new("b", DataType::Int32, false),
]))

To

DataType::Struct(Fields::from(vec![
    Field::new("a", DataType::Int32, true),
    Field::new("b", DataType::Int64, false),
]))

We should support combinations of:

So the eventual goal would be to support casting from

DataType::Struct(Fields::from(vec![
    Field::new("a", DataType::Int32, false),
    Field::new("b", DataType::Int32, false),
]))

To

DataType::Struct(Fields::from(vec![
    Field::new("c", DataType::Float32, true),
    Field::new("b", DataType::Int64, false),
    Field::new("a", DataType::Int32, false),
]))
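
To make the combined case concrete at the data level, here is a minimal sketch in plain Rust that applies all three operations to toy columns. It does not use the arrow crate: `Column`, `Type`, `null_column`, `cast_column`, and `cast_struct` are hypothetical helpers, only the conversions needed for this example are handled, and fields are assumed to be matched by name.

```rust
/// Toy column with per-value nulls (hypothetical; not an arrow-rs array).
#[derive(Debug, Clone, PartialEq)]
enum Column {
    Int32(Vec<Option<i32>>),
    Int64(Vec<Option<i64>>),
    Float32(Vec<Option<f32>>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Type {
    Int32,
    Int64,
    Float32,
}

/// All-null column of the requested type and length.
fn null_column(ty: Type, len: usize) -> Column {
    match ty {
        Type::Int32 => Column::Int32(vec![None; len]),
        Type::Int64 => Column::Int64(vec![None; len]),
        Type::Float32 => Column::Float32(vec![None; len]),
    }
}

/// Cast a single column; anything not listed is reported as unsupported.
fn cast_column(col: &Column, to: Type) -> Result<Column, String> {
    match (col, to) {
        (Column::Int32(v), Type::Int32) => Ok(Column::Int32(v.clone())),
        (Column::Int32(v), Type::Int64) => {
            Ok(Column::Int64(v.iter().map(|x| x.map(i64::from)).collect()))
        }
        (c, t) => Err(format!("unsupported cast {:?} -> {:?}", c, t)),
    }
}

/// Produce the target columns in target order: cast matching columns,
/// null-fill missing nullable ones, reject missing non-nullable ones.
fn cast_struct(
    source: &[(&str, Column)],
    len: usize,
    target: &[(&str, Type, bool)],
) -> Result<Vec<(String, Column)>, String> {
    target
        .iter()
        .map(|&(name, ty, nullable)| -> Result<(String, Column), String> {
            let col = match source.iter().find(|(n, _)| *n == name) {
                Some((_, col)) => cast_column(col, ty)?,
                None if nullable => null_column(ty, len),
                None => return Err(format!("missing non-nullable field '{}'", name)),
            };
            Ok((name.to_string(), col))
        })
        .collect()
}
```

With source columns `a` and `b` of type Int32 and the target above, `cast_struct` would return `c` as an all-null Float32 column, `b` widened to Int64, and `a` unchanged, in that order. Relaxing a field from non-nullable to nullable needs no data change in this model, since every value already carries its own null slot.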

@tustvold
Contributor Author

tustvold commented Nov 2, 2023

label_issue.py automatically added labels {'arrow'} from #4902

@tustvold
Contributor Author

Looks like #4985 prematurely closed this

@tustvold
Contributor Author

FYI @my-vegetable-has-exploded this is the broader issue
