s3 select improvements #2399

Merged
findepi merged 3 commits into master from findepi/s3-select-fixes
Jan 11, 2020


@findepi findepi commented Jan 3, 2020

No description provided.

@cla-bot cla-bot bot added the cla-signed label Jan 3, 2020

findepi commented Jan 3, 2020

CI failed -- #2348

@findepi findepi force-pushed the findepi/s3-select-fixes branch 3 times, most recently from d9a444f to 474042e Compare January 8, 2020 15:26
@findepi findepi requested a review from electrum January 8, 2020 15:26

if (TextInputFormat.class.getName().equals(inputFormat)) {
    if (!Objects.equals(schema.getProperty(SKIP_HEADER_COUNT_KEY, "0"), "0")) {
        // S3 Select supports skipping one line of headers, but it was returning incorrect results for presto-hive-hadoop2/conf/files/test_table_with_header.csv.gz
Member


Maybe shorten to "for gzip files"

// S3 Select supports skipping one line of headers, but it was returning incorrect results for gzip files

Or just the file name test_table_with_header.csv.gz

Member Author


I want to refer to an exact file. I checked that it works correctly for some .gz files, so it looks data-dependent.
I will leave this as-is, and will add a TODO + link to the issue.

Member Author


I will leave this as-is,

(primarily because I missed this comment earlier)

S3 Select does not support skipping footer

S3 Select supports skipping one line of header, but it was not always
returning correct results.
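
For readers following the two commit messages above, here is a minimal sketch of the kind of eligibility check they describe: S3 Select is skipped for a text table whenever a header or footer skip count is set. The class name, the method name `canPushDown`, the footer key constant, and the literal property values are assumptions made for illustration; the snippet under review only shows the name `SKIP_HEADER_COUNT_KEY` and the `TextInputFormat` check. Returning false for non-text input formats is also an assumption, since the reviewed snippet does not show that branch.

```java
import java.util.Objects;
import java.util.Properties;

final class S3SelectPushdownSketch {
    // Assumed property names; the reviewed snippet shows only the constant name SKIP_HEADER_COUNT_KEY.
    static final String SKIP_HEADER_COUNT_KEY = "skip.header.line.count";
    static final String SKIP_FOOTER_COUNT_KEY = "skip.footer.line.count";
    // Stand-in for TextInputFormat.class.getName(), so the sketch needs no Hadoop dependency.
    static final String TEXT_INPUT_FORMAT = "org.apache.hadoop.mapred.TextInputFormat";

    // Hypothetical helper: true only when the table is plain text with no header or footer skipping.
    static boolean canPushDown(String inputFormat, Properties schema) {
        if (!TEXT_INPUT_FORMAT.equals(inputFormat)) {
            // Assumption: other input formats are simply treated as not eligible here.
            return false;
        }
        if (!Objects.equals(schema.getProperty(SKIP_HEADER_COUNT_KEY, "0"), "0")) {
            // Header skipping was not always returning correct results, so avoid S3 Select.
            return false;
        }
        if (!Objects.equals(schema.getProperty(SKIP_FOOTER_COUNT_KEY, "0"), "0")) {
            // S3 Select does not support skipping a footer.
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Properties withHeader = new Properties();
        withHeader.setProperty(SKIP_HEADER_COUNT_KEY, "1");
        System.out.println(canPushDown(TEXT_INPUT_FORMAT, new Properties()));
        System.out.println(canPushDown(TEXT_INPUT_FORMAT, withHeader));
    }
}
```

A missing property is treated the same as "0", which mirrors the `getProperty(SKIP_HEADER_COUNT_KEY, "0")` default in the reviewed line.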
@findepi findepi force-pushed the findepi/s3-select-fixes branch 2 times, most recently from 8e67d2e to 9ca4714 Compare January 11, 2020 19:46
@findepi findepi merged commit e7a0291 into master Jan 11, 2020
@findepi findepi deleted the findepi/s3-select-fixes branch January 11, 2020 19:47
@findepi findepi mentioned this pull request Jan 11, 2020
7 tasks
@findepi findepi added this to the 329 milestone Jan 11, 2020
2 participants