Support for "skip.header.line.count" property for HIVE tables #1848
This is still an issue. Hive understands the skip.header.line.count property and skips the header when reading, but Presto displays the header record when querying the same table.
Steps to reproduce the error:
Step 1: Create a CSV file with 2 columns, containing a header record followed by a few data records.
Step 2: Create a table over that file in Hive with the following property: TBLPROPERTIES("skip.header.line.count"="1");
Step 3: Query the table in Hive (select *) --> the header is not shown. As an added test, using the column header value in a WHERE clause returns no rows.
Step 4: Query the table in Presto (select *) --> the header is included. As an added test, using the column header value in a WHERE clause returns the header record.
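To make the difference between Step 3 and Step 4 concrete, here is a minimal sketch in plain Python. It is only a model of the two behaviours, not Hive or Presto code: the column names, sample rows, and the helper names (write_sample_csv, read_rows, where_id_equals) are all made up for illustration.

```python
import csv
import io

# Table property from Step 2.
TABLE_PROPERTIES = {"skip.header.line.count": "1"}


def write_sample_csv():
    """Step 1: a two-column CSV with a header record and a few rows (hypothetical data)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["id", "name"])  # header record
    writer.writerows([["1", "alice"], ["2", "bob"]])
    return buf.getvalue()


def read_rows(text, properties, honor_skip_header):
    """Return the rows a reader would see, optionally honoring the skip property."""
    rows = list(csv.reader(io.StringIO(text)))
    skip = int(properties.get("skip.header.line.count", "0")) if honor_skip_header else 0
    return rows[skip:]


def where_id_equals(rows, value):
    """Model of: SELECT * FROM t WHERE id = value."""
    return [row for row in rows if row[0] == value]


data = write_sample_csv()

# Step 3: a reader that honors the property (the behaviour reported for Hive).
expected_rows = read_rows(data, TABLE_PROPERTIES, honor_skip_header=True)

# Step 4: a reader that ignores the property (the behaviour reported for Presto).
reported_rows = read_rows(data, TABLE_PROPERTIES, honor_skip_header=False)

print(expected_rows)
print(reported_rows)
print(where_id_equals(expected_rows, "id"))
print(where_id_equals(reported_rows, "id"))
```

In this model, filtering on the header value returns nothing when the property is honored and returns the header record when it is ignored, which matches the symptom described in Steps 3 and 4.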
referenced this issue
May 11, 2018
Presto is still ignoring skip.header.line.count on the latest cluster deployment from AWS (5.13): Presto 0.194 with Hadoop 2.8.3 HDFS and Hive 2.3.2.
The "closing fix #10323" doesn't apply to this ticket. The linked ticket was closed for a reason that had nothing to do with Presto.
@cvandeve the fix has been available since Presto 0.199 (https://prestodb.io/docs/current/release/release-0.199.html#hive-changes), and the latest release is 0.201.