Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convenience method to create DataType::List correctly #4544

Closed
alamb opened this issue Jul 19, 2023 · 4 comments · Fixed by #4561
Closed

Convenience method to create DataType::List correctly #4544

alamb opened this issue Jul 19, 2023 · 4 comments · Fixed by #4561
Labels
arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog good first issue Good for newcomers

Comments

@alamb
Copy link
Contributor

alamb commented Jul 19, 2023

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

The embedded field type of a List must always be called "item" -- you can see this is done everywhere in the arrow-rs codebase that a DataType::List is created: https://github.com/search?q=repo%3Aapache%2Farrow-rs+%22%5C%22item%5C%22%22&type=code

While working on https://github.com/influxdata/influxdb_iox/pull/8247 I had to make this change

-       DataType::List(Arc::new(Field::new("state", dt.clone(), false))),
+       DataType::List(Arc::new(Field::new("item", dt.clone(), true))),

to align the type with expected

Having to do this is both verbose/annoying but more to the point means we are exposing some sort of implementation detail that must be cargo-culted in many places

Describe the solution you'd like
I would like a method like this (with good documentation) to create a

let dt = DataType::new_list(DataType::Float64, true);

That would do the same thing as

let df = DataType::List(Arc::new(Field::new("item", DataType::Float64, true))),

cc @tustvold for your thoughts

Describe alternatives you've considered
Do nothing

Additional context

@alamb alamb added arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog good first issue Good for newcomers labels Jul 19, 2023
@alamb
Copy link
Contributor Author

alamb commented Jul 19, 2023

This is a nice self contained project for someone if they wanted to try their hand at the arrow codebase

@tustvold
Copy link
Contributor

tustvold commented Jul 19, 2023

It isn't a requirement that the name be "item", it is purely a convention. If code is relying on this, that code is incorrect

FWIW we provide Field::new_list already which can be used with the various list constructors

@alamb
Copy link
Contributor Author

alamb commented Jul 19, 2023

It isn't a requirement that the name be "item", it is purely a convention. If code is relying on this, that code is incorrect

At the very least perhaps we can agree that the arrow-rs codebase has very little test coverage of a DataType::List that does not follow the convention (I found one example)

@fansehep
Copy link
Contributor

it seems just add a pub fn in arrow-schema/src/datatype.rs ? and the default name is "item" ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants