Skip to content

Conversation

zhangbutao
Copy link
Contributor

What changes were proposed in this pull request?

Why are the changes needed?

CREATE TABLE LIKE syntax was added in #3443, and this PR added the ability to create table based on existing orc files.
It is convenient for user to create a table based on a orc file.

Does this PR introduce any user-facing change?

NO

How was this patch tested?

mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=create_table_like_file_orc.q -Dtest.output.overwrite=true

@jfsii
Copy link
Contributor

jfsii commented Sep 21, 2022

Thanks for doing this work @zhangbutao ! Very happy to see ORC support. I'll take a look and give this a try on my machine.

Copy link
Member

@ayushtkn ayushtkn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a look, and things look good to me, the test covers kind of a lot of things.
Lets wait for @jfsii to have a look

Comment on lines 211 to 213
buffer.append('`');
buffer.append(name.replace("`", "``"));
buffer.append('`');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Concurrent StringBuilder Can be reused.

      buffer.append('`').append(name.replace("`", "``")).append('`');

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thx.

}

static void getStructFieldName(StringBuilder buffer, String name) {
if (UNQUOTED_NAMES.matcher(name).matches()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the intent of this just to ensure there aren't any quoted name, I am not sure, but do explore

!name.startsWith("'") && !name.endsWith("'")

Which lands up being cheaper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this code snippet was borrowed from ORC repo.
https://github.com/apache/orc/blob/6a74eef74b8101c7396f196f1a27f5e124a005f6/java/core/src/java/org/apache/orc/TypeDescription.java#L667-L673

And related ORC jira is https://issues.apache.org/jira/browse/ORC-104. I didn't do much research here, but i think it is better to be sync with ORC code. wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be fine, was just exploring a possibility. :-)

CTLF_MISSING_STORAGE_FORMAT_DESCRIPTOR(20021, "Failed to find StorageFormatDescriptor for file format ''{0}''", true),
PARQUET_FOOTER_ERROR(20022, "Failed to read parquet footer:"),
PARQUET_UNHANDLED_TYPE(20023, "Unhandled type {0}", true),
ORC_FOOTER_ERROR(20024, "Failed to read orc footer:"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the colon is there in PARQUET_FOOTER_ERROR but I am not sure why it is so, when this exception is thrown, post the colon it is the trace, should have been a period according to me, but let it be to be in sync with PARQUET_FOOTER_ERROR. But just out of curiosity if you know the reason behind it do let me know

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just follow the PARQUET_FOOTER_ERROR, and we can let @jfsii give a explanation. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source for the change was a comment on a review I had on the patch:
"nit: add a ":" such that the exception is printed after the delimiter"

I could have pushed back on the comment.

@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug C 1 Bug
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 6 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

Copy link
Contributor

@jfsii jfsii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my end things look good. Thanks again @zhangbutao

CTLF_MISSING_STORAGE_FORMAT_DESCRIPTOR(20021, "Failed to find StorageFormatDescriptor for file format ''{0}''", true),
PARQUET_FOOTER_ERROR(20022, "Failed to read parquet footer:"),
PARQUET_UNHANDLED_TYPE(20023, "Unhandled type {0}", true),
ORC_FOOTER_ERROR(20024, "Failed to read orc footer:"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source for the change was a comment on a review I had on the patch:
"nit: add a ":" such that the exception is printed after the delimiter"

I could have pushed back on the comment.

Copy link
Member

@ayushtkn ayushtkn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jfsii
Copy link
Contributor

jfsii commented Sep 27, 2022

Is this committable? I don't have commit privs and I don't want to see this patch go stale. cc: @ayushtkn

@ayushtkn ayushtkn merged commit fb154b9 into apache:master Sep 27, 2022
rkirtir pushed a commit to rkirtir/hive that referenced this pull request Oct 7, 2022
…angbutao, reviewed by John Sherman, Ayush Saxena)
DongWei-4 pushed a commit to DongWei-4/hive that referenced this pull request Oct 28, 2022
…angbutao, reviewed by John Sherman, Ayush Saxena)
dengzhhu653 pushed a commit to dengzhhu653/hive that referenced this pull request Dec 15, 2022
…angbutao, reviewed by John Sherman, Ayush Saxena)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants