Support for "skip.header.line.count" property for HIVE tables #1848

Closed
damiencarol opened this issue Oct 22, 2014 · 24 comments

@damiencarol
Contributor

commented Oct 22, 2014

We have text tables with skip.header.line.count property.

@Downchuck

commented Jan 27, 2016

I hit this as well. Refreshing to note that it's still an issue.

@jakovm

commented Mar 30, 2016

This is still an issue...

@maciejgrzybek

Member

commented Jul 29, 2016

skip.header.line.count is available from the metastore. The record cursor will have to skip lines according to that property. Some changes in BackgroundHiveSplitLoader and ColumnarTextHiveRecordCursor will be necessary.
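
To make the idea concrete, here is a minimal Python sketch of a reader that skips lines according to the table property. It is not Presto's code: `header_line_count` and `read_split` are hypothetical names, and the sketch assumes the header lives only at the very start of a file, so only a split that begins at offset 0 skips anything.

```python
from typing import Iterable, Iterator, Mapping

HEADER_PROPERTY = "skip.header.line.count"


def header_line_count(table_properties: Mapping[str, str]) -> int:
    """Read the header count from the table properties; 0 when the property is absent."""
    return int(table_properties.get(HEADER_PROPERTY, "0"))


def read_split(
    lines: Iterable[str],
    split_start: int,
    table_properties: Mapping[str, str],
) -> Iterator[str]:
    """Yield the data lines of one split (hypothetical helper).

    Assumption: header lines exist only at the beginning of the file,
    so a split that starts later in the file must not skip anything.
    """
    to_skip = header_line_count(table_properties) if split_start == 0 else 0
    for index, line in enumerate(lines):
        if index < to_skip:
            continue  # header line, not a data record
        yield line
```

The offset check matters because a large text file may be read as several splits; skipping in every split would silently drop data rows.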

@pletelli

commented Mar 16, 2017

Any update on this issue?

@dain

Contributor

commented Mar 16, 2017

I would expect this to work for reading and writing to existing tables. I don't think there is a way to set this property when creating new tables.

@d18s

commented Mar 21, 2017

👍 plus one for this issue

@dain

Contributor

commented Mar 21, 2017

If this is still an issue, please reopen with specifics on how to reproduce the problem.

@dain closed this Mar 21, 2017

@rupesh1183

commented Mar 30, 2017

This is still an issue. Hive understands the skip.header.line.count property and skips the header while reading, but Presto displays the header record when querying the same table.

Example to reproduce the error:

Step 1: Create a CSV file with 2 columns, including a header record, with a few records inserted.

Step 2: Create a table in Hive with the following property: TBLPROPERTIES("skip.header.line.count"="1");

Step 3: Query the table (select *) in Hive --> The header is not shown. As an added test, using the column header value in a where clause returns no rows.

Step 4: Query the table (select *) in Presto --> The header is included. As an added test, using the column header value in a where clause returns the header record.
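
The difference described in the steps above can be illustrated without a cluster. The following Python sketch only simulates the two behaviours; it does not talk to Hive or Presto. The sample data, `scan`, and `where_first_column_equals` are made up for illustration, with `honor_property=True` standing in for Hive and `honor_property=False` for the affected Presto versions.

```python
import csv
import io

# Step 1: a two-column CSV with a header record (sample data is invented).
CSV_TEXT = "name,city\nalice,paris\nbob,berlin\n"

# Step 2: the table property set in Hive.
TABLE_PROPERTIES = {"skip.header.line.count": "1"}


def scan(text, properties, honor_property):
    """Return all rows, dropping header lines only when the property is honored."""
    rows = list(csv.reader(io.StringIO(text)))
    skip = int(properties.get("skip.header.line.count", "0")) if honor_property else 0
    return rows[skip:]


def where_first_column_equals(rows, value):
    """Mimic a where clause on the first column."""
    return [row for row in rows if row[0] == value]


# Step 3: engine that honors the property (Hive-like behaviour).
hive_like = scan(CSV_TEXT, TABLE_PROPERTIES, honor_property=True)
# Step 4: engine that ignores the property (behaviour reported for Presto).
presto_like = scan(CSV_TEXT, TABLE_PROPERTIES, honor_property=False)

print(hive_like)    # [['alice', 'paris'], ['bob', 'berlin']]
print(presto_like)  # [['name', 'city'], ['alice', 'paris'], ['bob', 'berlin']]
print(where_first_column_equals(hive_like, "name"))    # []
print(where_first_column_equals(presto_like, "name"))  # [['name', 'city']]
```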

@chamal-sapumohotti

commented May 23, 2017

Facing the same issue. Running on AWS EMR with:
Hive = Hive 2.1.0
Presto = Presto 0.157.1

@cawallin reopened this Jun 19, 2017

@raja-sudhan

commented Oct 10, 2017

Has this issue been solved? I am encountering it as well.

@LouisKottmann

commented Oct 17, 2017

We are experiencing the same issue. @dain, are @rupesh1183's specifics enough to reproduce the problem on your end?

@Micka33

commented Oct 17, 2017

Facing the same issue. Any progress since Oct 22, 2014?

@minion96

commented Oct 26, 2017

Is there any workaround for this?

@gseva

commented Nov 3, 2017

Still an issue.

@dylancis

commented Nov 22, 2017

same issue here

@robmurtagh

commented Dec 8, 2017

yeah, same issue here...

@kokosing

Contributor

commented Apr 3, 2018

This also touches the behaviour when skip.footer.line.count is used
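
Footer skipping is a little different from header skipping, because a reader cannot tell that a line belongs to the footer until it knows how many lines follow it. A minimal Python sketch of one way to handle this, holding back the last N lines in a buffer, is shown below. `skip_footer` is a hypothetical helper, not Presto's implementation, and it assumes the whole file is read as a single stream.

```python
from collections import deque
from typing import Iterable, Iterator


def skip_footer(lines: Iterable[str], footer_count: int) -> Iterator[str]:
    """Yield every line except the last `footer_count` lines (hypothetical helper)."""
    if footer_count <= 0:
        yield from lines
        return
    pending = deque()  # lines that might still turn out to be footer lines
    for line in lines:
        pending.append(line)
        if len(pending) > footer_count:
            # The oldest buffered line now has footer_count lines after it,
            # so it cannot be part of the footer.
            yield pending.popleft()
    # Whatever remains in `pending` at end of input is the footer and is dropped.
```

For example, `list(skip_footer(["1,a", "2,b", "total,2"], 1))` yields the two data lines and drops the trailing summary line.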

@rupeshmalladi

commented Apr 3, 2018

All, a related update: AWS has fixed this issue in Athena, which is AWS's product based on Presto.

@findepi

Contributor

commented Apr 3, 2018

@rupeshmalladi any plans for AWS Athena to contribute this back?

@rupeshmalladi

commented Apr 5, 2018

@findepi Not that I am aware of!

@kokosing

Contributor

commented Apr 13, 2018

Fixed with #10323

@cvandeve

commented May 21, 2018

Presto is still ignoring skip.header.line.count on the latest cluster deployment from AWS (5.13): Presto 0.194 with Hadoop 2.8.3 HDFS and Hive 2.3.2.

The "closing fix #10323" doesn't apply to this ticket. The linked ticket was closed for a reason that had nothing to do with Presto.

@findepi

Contributor

commented May 21, 2018

@cvandeve the fix has been available since Presto 0.199 (https://prestodb.io/docs/current/release/release-0.199.html#hive-changes), while the latest release is 0.201.
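
For anyone unsure whether their deployment includes the fix, a rough Python sketch of the version comparison follows. `parse_version` and `has_header_skip_fix` are hypothetical helpers, and the sketch assumes a plain dotted numeric version string such as "0.194" or "0.157.1"; vendor suffixes are not handled.

```python
# First release reported in this thread to contain the fix.
FIX_VERSION = (0, 199)


def parse_version(text):
    """Turn a dotted numeric version such as '0.157.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def has_header_skip_fix(presto_version):
    """True when the given Presto version is at or after the fixed release."""
    return parse_version(presto_version) >= FIX_VERSION


for version in ("0.157.1", "0.194", "0.199", "0.201"):
    print(version, has_header_skip_fix(version))
```

With the versions mentioned in this thread, 0.157.1 and 0.194 come out as not fixed, while 0.199 and 0.201 come out as fixed.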

@cvandeve

commented May 21, 2018

Thanks @findepi

@shawnzhu referenced this issue Mar 21, 2019

Closed

Presto v0.215 #15
