Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Presto iceberg connector #15836

Merged
merged 14 commits into from May 20, 2021
Merged

Add Presto iceberg connector #15836

merged 14 commits into from May 20, 2021

Conversation

ChunxuTang
Copy link
Member

@ChunxuTang ChunxuTang commented Mar 15, 2021

Cherry-pick of trinodb/trino@e82c2d5

Co-Authored-By: Parth Brahmbhatt pbrahmbhatt@netflix.com
Co-Authored-By: Zhenxiao Luo zluo@twitter.com
Co-Authored-By: Beinan Wang beinanw@twitter.com

== RELEASE NOTES ==

General Changes
* Add iceberg connector

@beinan
Copy link
Member

beinan commented Mar 15, 2021

Hello @ChunxuTang could you rebase master? Thanks!

@ChunxuTang
Copy link
Member Author

#15620

@ChunxuTang
Copy link
Member Author

After the rebase, it seems that some modules won't be compiled successfully. Will fix that.

@ChunxuTang
Copy link
Member Author

Provided a quick fix for the parquet conflicts to kick-off the review and testing. Also sent a PR to the prestodb/presto-hive-apache repo for a parquet upgrade. When that PR is merged, we can remove the quick fix safely.

Copy link
Member

@beinan beinan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! @ChunxuTang Just put a few questions and code style issues

@@ -1330,7 +1330,7 @@ private static Table buildTableObject(
return tableBuilder.build();
}

private static PrincipalPrivileges buildInitialPrivilegeSet(String tableOwner)
public static PrincipalPrivileges buildInitialPrivilegeSet(String tableOwner)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to make it public? any usage of this method outside?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Have changed it back to private.

import static com.facebook.presto.iceberg.IcebergErrorCode.ICEBERG_FILESYSTEM_ERROR;
import static java.util.Objects.requireNonNull;

public class HdfsFileIo
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Io or IO? I think using upper case 'O' looks better

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, using the upper case of 'O' is better here. Refactored the code here.


private HiveTableOperations(FileIO fileIo, ExtendedHiveMetastore metastore, String database, String table, Optional<String> owner, Optional<String> location)
{
this.fileIo = requireNonNull(fileIo, "fileIo is null");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fileIO?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also updated~

@@ -245,6 +245,10 @@
<ignoredResourcePattern>parquet.thrift</ignoredResourcePattern>
<ignoredResourcePattern>about.html</ignoredResourcePattern>
</ignoredResourcePatterns>
<ignoredClassPatterns>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this change still needed after we update presto-hive-apache? if not, add a todo or something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Yeah, I think we may still need this after updating the parquet in the presto-hive-apache to avoid dependency conflicts. We can do more tests of compilation to confirm that, avoiding potential changes in other packages' POM files.


@Override
public void testRenameTable()
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leave a comment if the rename table feature has not been supported yet, or add a todo if it's already supported

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we don't support it yet. The reason we have the function is that we need to override the function defined in the AbstractTestDistributedQueries. We leverage that class to test SQL queries against the connector.

Copy link
Collaborator

@zhenxiao zhenxiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work, @ChunxuTang
added a few comments. could you please add cherry-pick info, and original commit in commit message?

PrestoPrincipal owner = new PrestoPrincipal(USER, table.getOwner());
PrincipalPrivileges privileges = new PrincipalPrivileges(
ImmutableMultimap.<String, HivePrivilegeInfo>builder()
.put(table.getOwner(), new HivePrivilegeInfo(HivePrivilege.SELECT, true, owner, owner))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static import HivePrivilege.SELECT
same for INSERT, UPDATE, DELETE

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, updated in the new commit.

try {
io().deleteFile(newMetadataLocation);
}
catch (RuntimeException ex) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ex/exception/g

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name updated. Thanks for figuring it out!

// table disappeared during listing operation
}
catch (UnknownTableTypeException e) {
// ignore table of unknown type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we throw exception or log warning here? ignore silently seems not right

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Added extra logging here.

columns.put(table, getTableMetadata(session, table).getColumns());
}
catch (TableNotFoundException e) {
// table disappeared during listing operation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we throw exception or log warning here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. Added extra logging.

}

Optional<Long> version = Optional.empty();
if (type == TableType.DATA || type == TableType.PARTITIONS || type == TableType.MANIFESTS || type == TableType.FILES) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static import DATA, PARTITIONS, MANIFESTS, FILES

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, updated~


String table = match.group("table");
String typeString = match.group("type");
String ver1 = match.group("ver1");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ver1/version1/g
s/ver2/version2/g

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, updated the variable names.

@rohanpednekar rohanpednekar added the Roadmap A top level roadmap item label May 5, 2021
@ChunxuTang ChunxuTang force-pushed the presto-iceberg branch 2 times, most recently from a479836 to 1a4c150 Compare May 7, 2021 21:10
@ChunxuTang
Copy link
Member Author

nice work, @ChunxuTang
added a few comments. could you please add cherry-pick info, and original commit in commit message?

Sure~ Added the cherry-pick info in both the PR and the commit message.

@beinan
Copy link
Member

beinan commented May 11, 2021

Hello @rongrong , looks like the Facebook Integration test keep failing for this PR.

Facebook Integration — Build finished. No test results found.

is it possible to help us check what is happening to the test? Thanks a lot!

Copy link
Collaborator

@zhenxiao zhenxiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me
thank you @ChunxuTang
a few minor issues

<parent>
<groupId>com.facebook.presto</groupId>
<artifactId>presto-root</artifactId>
<version>0.253-SNAPSHOT</version>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update to latest version, 254-snapshot

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure~ Rebased the master branch and upgraded the presto-iceberg module to 0.254-snapshot.

ICEBERG_CURSOR_ERROR(9, EXTERNAL),
ICEBERG_WRITE_VALIDATION_FAILED(10, INTERNAL_ERROR),
ICEBERG_INVALID_SNAPSHOT_ID(11, USER_ERROR),
/**/;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove /**/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! Removed the unnecessary comment.

@zhenxiao
Copy link
Collaborator

hi @tdcmeehan seems all tests passed, except for Facebook integration. could you please help check if this PR is good to proceed?

@tdcmeehan
Copy link
Contributor

The Presto Facebook test failure is unrelated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Roadmap A top level roadmap item
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants