New CSV transform feature #42

Open
grantnicholas opened this issue Dec 5, 2018 · 2 comments

Comments

@grantnicholas

grantnicholas commented Dec 5, 2018

Are you open to PRs?

I added support for mapping multiple CSV files into a single Parquet file, which improves row group compression and reduces Spectrum query times.

Mapping each CSV to its own Parquet file is a fine default for unloading full tables, but when unloading partitions it tended to produce too many small Parquet files.
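
To make the idea concrete, here is a minimal sketch of how CSV files might be grouped so that each group becomes one Parquet file. It is an illustration only, not the code from the PR: the function name `batch_csv_files`, the `(path, size)` input shape, and the `target_bytes` threshold are all assumptions, and the actual CSV-to-Parquet write step is left out.

```python
from typing import Iterable, List, Tuple


def batch_csv_files(
    files: Iterable[Tuple[str, int]],
    target_bytes: int,
) -> List[List[str]]:
    """Group (path, size_in_bytes) pairs into batches of roughly target_bytes.

    Hypothetical helper: each returned batch is meant to be written
    out as a single Parquet file by whatever writer the project uses.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for path, size in files:
        # Close the current batch if this file would push it past the target.
        # An empty batch always accepts the file, so an oversized CSV
        # still ends up in a batch of its own.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

With a small `target_bytes` this degenerates to one CSV per Parquet file, which matches the existing default; a larger target yields fewer, bigger files for partitioned unloads.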

@c-nichols
Collaborator

Definitely open to PRs! This is something I thought about adding but punted on because it wasn't critical for our use.

@grantatspothero

Awesome, I just opened PR #43 to add the feature.
