Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-7325: [Rust] [Parquet] Update to parquet-format 2.6 and thrift 0.12 #5971

Closed
wants to merge 1 commit into from

Conversation

kornholi
Copy link
Contributor

@kornholi kornholi commented Dec 5, 2019

Primary motivation is to pull in newer versions of byteorder, num-traits, and ordered-float.

PTAL @sunchao @nevi-me

@github-actions
Copy link

github-actions bot commented Dec 5, 2019

parquet::Type::DOUBLE => Type::DOUBLE,
parquet::Type::BYTE_ARRAY => Type::BYTE_ARRAY,
parquet::Type::FIXED_LEN_BYTE_ARRAY => Type::FIXED_LEN_BYTE_ARRAY,
parquet::Type::Boolean => Type::BOOLEAN,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kornholi @liurenjie1024 @sunchao do you think it would be disruptive to change the local enums like Type and LogicalType to have camel casing? I'm assuming we've had them as upper case because earlier versions of thrift generated them like that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arrow's enum has camel casing, so I think it may look better to change parquet's local enums to have camel casing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm somewhat confused why these duplicate enums exist in the first place. If we're going to depend on a specific version of parquet-format, why not use the types from there directly?

Copy link
Contributor

@nevi-me nevi-me Dec 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I'll have a look to see why, but my answer won't be authoritative as @sunchao, @sadikovi and @liurenjie1024 are most familiar with the parquet codebase.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with this part, but I guess this is used for better mapping of definitions of parquet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think originally this was to give an isolation between the Thrift definitions and the definitions in parquet-rs, and also allow us to rename things a little differently.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the Thrift was using all upper cases before and changed because of the version bump. We may break dependencies if we change the cases for the local enums. Need to be careful with that. I'd say do that in a separate PR later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants